Comparing Machine Learning Models for Sentiment Analysis of Tokopedia Reviews
DOI:
https://doi.org/10.30871/jaic.v9i6.11239
Keywords:
Sentiment Analysis, SVM, Random Forest, Neural Network, Multi-Layer Perceptron (MLP)
Abstract
This study presents a comparative evaluation of machine learning models for sentiment analysis of Tokopedia user reviews written in Indonesian. The objective is to assess how effectively three algorithms, Support Vector Machine (SVM), Random Forest (RF), and Multi-Layer Perceptron (MLP), classify customer sentiment in Tokopedia reviews collected from the Google Play Store. The dataset, collected between January and October 2025, consists of 10,236 unique entries after preprocessing, which included text cleaning, case folding, tokenization, stopword removal, normalization using a verified Indonesian word normalization dictionary, and optional stemming with the Sastrawi library. The reviews were divided into positive and negative categories based on rating polarity (4–5 stars as positive; 1–2 stars as negative).

Each model was evaluated using both hold-out validation (80:20 split) and 5-fold cross-validation, with accuracy, precision, recall, and F1-score as metrics. Experimental results indicate that SVM achieved the highest accuracy of 0.88, outperforming Random Forest (0.85) and MLP (0.83). These findings suggest that SVM performs more robustly on sparse TF-IDF vector features and is more resistant to noise in informal Indonesian expressions. The research further discusses the linguistic challenges inherent in Indonesian sentiment analysis, including code-mixing, abbreviations, and non-standard words, and proposes preprocessing strategies to mitigate them.

The outcomes of this study contribute to enhancing the reliability of sentiment-based decision support systems on Indonesian e-commerce platforms. The methodological framework developed here can serve as a baseline for future work involving hybrid or deep-learning approaches such as LSTM or IndoBERT for improved contextual understanding.
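The labeling rule, the core preprocessing steps, and the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the tiny `NORMALIZATION` and `STOPWORDS` tables, and the sample review are hypothetical stand-ins, the cleaning rule (keep letters only) is an assumption, and the optional Sastrawi stemming step is omitted. The steps run in the order the abstract lists them.

```python
import re

# Hypothetical miniature resources for illustration only. The study used a
# verified normalization dictionary and a full stopword list, neither of
# which is reproduced here.
NORMALIZATION = {"gk": "tidak", "ga": "tidak", "bgt": "banget"}
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}


def label_from_rating(stars):
    """Map a star rating to a label: 4-5 positive, 1-2 negative, else None."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return None  # 3-star reviews fall outside both categories


def preprocess(text):
    """Cleaning, case folding, tokenization, stopword removal, normalization."""
    text = re.sub(r"[^A-Za-z\s]", " ", text)  # assumed rule: keep letters only
    text = text.lower()                        # case folding
    tokens = text.split()                      # whitespace tokenization
    tokens = [tok for tok in tokens if tok not in STOPWORDS]
    return [NORMALIZATION.get(tok, tok) for tok in tokens]


def binary_metrics(y_true, y_pred, positive="positive"):
    """Accuracy, precision, recall, and F1 with respect to one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(label_from_rating(5), label_from_rating(1), label_from_rating(3))
    print(preprocess("Aplikasinya gk bisa dibuka, lambat bgt!!!"))
    truth = ["positive", "positive", "negative", "negative", "positive"]
    preds = ["positive", "negative", "negative", "positive", "positive"]
    print(binary_metrics(truth, preds))
```

In a full pipeline, the token lists would be converted to TF-IDF vectors and passed to the three classifiers, with `binary_metrics` (or a library equivalent) applied to the hold-out test split and to each of the five cross-validation folds.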
License
Copyright (c) 2025 Afif Langgeng Dhiya Ulhaq, Suprayogi Suprayogi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








