Sentiment Analysis of E-Commerce Product Reviews on Tokopedia Using Support Vector Machine
DOI:
https://doi.org/10.30871/jaic.v9i5.10977
Keywords:
E-commerce, Sentiment Analysis, Support Vector Machine, Text Mining, Tokopedia
Abstract
This research analyzes the performance of the Support Vector Machine (SVM) algorithm in classifying the sentiment of e-commerce product reviews on the Tokopedia platform, using 571 reviews from 2024 collected by web scraping. The data comprise review text, publication date, and username. Reviews were processed through text preprocessing (text cleaning, stopword removal, and stemming with Sastrawi), auto-labeled using a lexicon-based approach, and converted to TF-IDF features with the selected parameters (max_features=5000, ngram_range=(1,2)), yielding 1,187 features. After filtering for binary classification (positive vs. negative), 461 reviews remained; these were divided with a stratified split into training (80%) and testing (20%) sets. An SVM with a linear kernel achieved an accuracy of 95.70%, precision of 95.89%, recall of 95.70%, and F1-score of 94.89% on the testing set. Despite the imbalanced dataset (92.4% positive vs. 7.6% negative), the SVM identified negative sentiment with 100% precision and 42.86% recall, which the study presents as evidence of robustness to skewed class distributions. TF-IDF feature analysis identified the most discriminative words, such as "suitable", "goods", and "good", which are relevant for classifying consumer sentiment towards e-commerce products. The results indicate that the SVM algorithm is highly effective for sentiment classification of e-commerce product reviews, making it suitable for practical use in automated sentiment analysis systems for online marketplaces.
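The reported test-set metrics can be checked against each other with a short calculation. The sketch below is a reconstruction, not the study's own evaluation code: the confusion-matrix counts are an assumption inferred from the abstract (a 20% test split of 461 reviews taken as 93 reviews, of which 7 are negative), and the averaging is assumed to be support-weighted. Under those assumptions, the figures in the abstract are reproduced.

```python
# Assumed test-set counts (NOT reported in the abstract; inferred from the
# 80/20 stratified split of 461 reviews and the negative-class scores).
# Negative class: 3 detected, 4 missed, 0 positives mislabeled as negative.
# Positive class: all 86 detected, 4 negatives mislabeled as positive.
counts = {
    "negative": {"tp": 3, "fp": 0, "fn": 4},
    "positive": {"tp": 86, "fp": 4, "fn": 0},
}

def precision_recall_f1(tp, fp, fn):
    """Per-class precision, recall and F1 from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Support = number of true members of each class in the test set.
support = {label: c["tp"] + c["fn"] for label, c in counts.items()}
total = sum(support.values())
per_class = {label: precision_recall_f1(**c) for label, c in counts.items()}

# Accuracy: correctly classified reviews over all test reviews.
accuracy = sum(c["tp"] for c in counts.values()) / total

# Support-weighted averages (assumed to be the averaging behind the
# overall precision, recall and F1 figures).
weighted_precision, weighted_recall, weighted_f1 = (
    sum(support[label] * per_class[label][i] for label in counts) / total
    for i in range(3)
)

print(f"accuracy           {accuracy:.2%}")
print(f"weighted precision {weighted_precision:.2%}")
print(f"weighted recall    {weighted_recall:.2%}")
print(f"weighted F1        {weighted_f1:.2%}")
print(f"negative precision {per_class['negative'][0]:.2%}")
print(f"negative recall    {per_class['negative'][1]:.2%}")
```

Under these assumed counts, the negative-class recall of 42.86% corresponds to 3 of 7 negative test reviews being detected, which is why the weighted F1-score sits below the accuracy even though no positive review is misclassified.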
License
Copyright (c) 2025 Azna Alaiya, Nurdin Nurdin, Cut Agusniar

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.