Comparing Machine Learning Models for Sentiment Analysis of Tokopedia Reviews

Authors

  • Afif Langgeng Dhiya Ulhaq, Universitas Dian Nuswantoro
  • Suprayogi Suprayogi, Universitas Dian Nuswantoro

DOI:

https://doi.org/10.30871/jaic.v9i6.11239

Keywords:

Sentiment Analysis, SVM, Random Forest, Neural Network, Multi-Layer Perceptron (MLP)

Abstract

This study presents a comparative evaluation of machine learning models for sentiment analysis of Indonesian-language Tokopedia user reviews. The objective is to assess the effectiveness of three algorithms, Support Vector Machine (SVM), Random Forest (RF), and Multilayer Perceptron (MLP), in classifying customer sentiment in Tokopedia reviews collected from the Google Play Store. The dataset, collected between January and October 2025, consists of 10,236 unique entries after preprocessing, which included text cleaning, case folding, tokenization, stopword removal, normalization using a verified Indonesian word normalization dictionary, and optional stemming with the Sastrawi library. The reviews were divided into positive and negative categories based on rating polarity (4–5 stars as positive; 1–2 stars as negative). Each model was evaluated using both hold-out validation (80:20 split) and 5-fold cross-validation, using metrics such as accuracy, precision, recall, and F1-score. Experimental results indicate that SVM achieved the highest accuracy (0.88), outperforming Random Forest (0.85) and MLP (0.83). These findings indicate that SVM performs more robustly on sparse TF-IDF features and is more resistant to noise in informal Indonesian expressions. The study further discusses the linguistic challenges inherent in Indonesian sentiment analysis, including code-mixing, abbreviations, and non-standard words, and proposes preprocessing strategies to mitigate them. The outcomes contribute to enhancing the reliability of sentiment-based decision support systems on Indonesian e-commerce platforms. The methodological framework developed here can serve as a baseline for future work involving hybrid or deep-learning approaches such as LSTM or IndoBERT for improved contextual understanding.
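
The labeling rule and preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stopword set, the normalization dictionary, the sample reviews, and the function names `label_from_rating` and `preprocess` are all hypothetical stand-ins, and the optional Sastrawi stemming step is left out. Leaving 3-star reviews unlabeled is likewise an assumption, since the abstract only defines the positive and negative ranges.

```python
import re

# Hypothetical, tiny stand-ins for the resources the abstract mentions.
# The study's actual stopword list and normalization dictionary are not shown here.
STOPWORDS = {"yang", "dan", "di", "nya"}
NORMALIZATION = {"gk": "tidak", "bgt": "banget", "brg": "barang", "bgs": "bagus"}


def label_from_rating(stars):
    """Map a star rating to a sentiment label (4-5 positive, 1-2 negative)."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return None  # assumption: 3-star reviews are left unlabeled


def preprocess(text):
    """Apply the steps in the order the abstract lists them (stemming omitted)."""
    text = re.sub(r"[^A-Za-z\s]", " ", text)          # text cleaning: letters only
    text = text.lower()                               # case folding
    tokens = text.split()                             # whitespace tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]  # stopword removal
    return [NORMALIZATION.get(t, t) for t in tokens]    # dictionary normalization


# Made-up sample reviews as (text, star rating) pairs.
reviews = [
    ("Brg bgs bgt, pengiriman cepat!", 5),
    ("Aplikasinya gk bisa dibuka", 1),
    ("Biasa saja", 3),
]

# Keep only reviews that receive a label, paired with their cleaned tokens.
dataset = [
    (preprocess(text), label_from_rating(stars))
    for text, stars in reviews
    if label_from_rating(stars) is not None
]
```

In a fuller pipeline the token lists would be joined back into strings and vectorized with TF-IDF before training the three classifiers. That stage is omitted here because the abstract does not give vectorizer or model settings.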

Published

2025-12-09

How to Cite

[1]
A. L. D. Ulhaq and S. Suprayogi, “Comparing Machine Learning Models for Sentiment Analysis of Tokopedia Reviews”, JAIC, vol. 9, no. 6, pp. 3642–3647, Dec. 2025.
