Comparison of Random Forest and LSTM for Tokopedia Sentiment Analysis

Authors

  • Fahrizal Denta Saputra, Universitas Dian Nuswantoro
  • Fikri Budiman, Universitas Dian Nuswantoro

DOI:

https://doi.org/10.30871/jaic.v10i1.12042

Keywords:

Tokopedia, Sentiment Analysis, Random Forest, LSTM, E-commerce

Abstract

Tokopedia is one of the largest e-commerce platforms in Indonesia, where every transaction generates user reviews containing opinions about the products or services received. These reviews carry important information about product quality, but their sheer volume makes manual analysis inefficient. This study automatically classifies the sentiment of Tokopedia reviews and compares the performance of a machine learning method with that of a deep learning method. The dataset was obtained from Kaggle and underwent initial cleaning, which included removing irrelevant columns and manually labeling each review as one of two sentiment classes, positive or negative. The methodology comprises four stages: data preprocessing (cleaning, case-folding, stopword removal, tokenization, normalization, and stemming); feature extraction using TF-IDF for Random Forest and word embeddings for Long Short-Term Memory (LSTM); implementation of the Random Forest and LSTM models; and model evaluation using a confusion matrix. Experimental results show that LSTM performs best, with 94% accuracy, while Random Forest achieves 92% accuracy. These findings indicate that LSTM is more effective at capturing language context, which yields more accurate sentiment classification and can support decision making in e-commerce.
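As a rough illustration of two of the stages named in the abstract, the sketch below shows a minimal preprocessing function and an accuracy calculation from a two-class confusion matrix. It is not the authors' implementation. The stopword list, the slang map, the function names, and the order of the steps are all assumptions made for the example, and stemming is left out because it would need an Indonesian stemmer that the abstract does not name.

```python
import re
from collections import Counter

# Hypothetical, tiny resources for illustration only; a real study would
# use a full Indonesian stopword list and normalization dictionary.
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}
SLANG = {"bgt": "banget", "gk": "tidak", "ga": "tidak"}


def preprocess(text):
    """Clean, case-fold, tokenize, normalize, and remove stopwords."""
    text = re.sub(r"[^A-Za-z\s]", " ", text)             # cleaning: keep letters only
    text = text.lower()                                  # case-folding
    tokens = text.split()                                # whitespace tokenization
    tokens = [SLANG.get(t, t) for t in tokens]           # normalization of slang
    tokens = [t for t in tokens if t not in STOPWORDS]   # stopword removal
    return tokens                                        # stemming omitted (assumption)


def confusion_counts(y_true, y_pred, positive="positive"):
    """Count TP, FN, FP, TN for a two-class problem."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if pred == positive else "fn"] += 1
        else:
            counts["fp" if pred == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Accuracy = (TP + TN) / total number of predictions."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


if __name__ == "__main__":
    print(preprocess("Barangnya BAGUS bgt, dan cepat!!!"))
    labels = ["positive", "positive", "negative", "negative"]
    preds = ["positive", "negative", "negative", "negative"]
    print(accuracy(confusion_counts(labels, preds)))
```

The token lists returned by `preprocess` would then feed either a TF-IDF vectorizer (for Random Forest) or an embedding layer (for LSTM), and accuracy figures such as those reported in the abstract would come from counts of this kind computed on a held-out test set.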

Published

2026-02-04

How to Cite

[1]
F. D. Saputra and F. Budiman, “Comparison of Random Forest and LSTM for Tokopedia Sentiment Analysis”, JAIC, vol. 10, no. 1, pp. 630–639, Feb. 2026.
