Detecting Fake Reviews in E-Commerce: A Case Study on Shopee Using Support Vector Machine and Random Forest

Authors

  • Khoirotulmuadiba Purifyregalia, UIN Walisongo Semarang
  • Khothibul Umam, UIN Walisongo Semarang
  • Nur Cahyo Hendro Wibowo, UIN Walisongo Semarang
  • Maya Rini Handayani, UIN Walisongo Semarang

DOI:

https://doi.org/10.30871/jaic.v9i3.9514

Keywords:

Fake Review Detection, NLP, Shopee, Random Forest, SVM

Abstract

Product reviews strongly influence purchasing decisions on online shopping platforms such as Shopee. Fake reviews generated by non-human agents undermine consumer trust and damage platform credibility. This study detects fake reviews on Shopee with a text classification approach that compares Random Forest and Support Vector Machine (SVM) classifiers. A dataset of 3,686 Shopee product reviews was collected and preprocessed through data cleaning, normalization, tokenization, and TF-IDF weighting. Reviews were labeled automatically using Latent Dirichlet Allocation (LDA), which assigned each review to one of two categories: Original (OR) or Computer-Generated (CG). Model performance was evaluated using accuracy, precision, recall, and F1-score. SVM achieved the higher accuracy at 88.84%, compared with 80.39% for Random Forest. These results suggest that SVM is effective on high-dimensional text data for fake review detection. The study contributes an application of automated topic modeling (LDA) to labeling e-commerce reviews in the Indonesian context, and it points to larger datasets and deep learning-based models as directions for improving classification accuracy and scalability.
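The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `binary_metrics`, the choice of CG as the positive class, and the toy labels are assumptions made only for illustration, and the numbers it produces are unrelated to the results reported in the paper.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="CG"):
    """Compute accuracy, precision, recall and F1 for one positive class.

    Treating "CG" (Computer-Generated) as the positive class is an
    assumption of this sketch, not something stated in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels invented for illustration; not data from the study.
y_true = ["CG", "CG", "CG", "OR", "OR", "OR", "OR", "CG"]
y_pred = ["CG", "CG", "OR", "OR", "OR", "CG", "OR", "CG"]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

With these toy labels there are three true positives, one false positive, one false negative, and three true negatives, so all four metrics equal 0.75. On imbalanced review data the metrics would generally diverge, which is one reason to report precision, recall, and F1-score alongside accuracy.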

Published

2025-06-19

How to Cite

Khoirotulmuadiba Purifyregalia, Khothibul Umam, Nur Cahyo Hendro Wibowo, and Maya Rini Handayani, “Detecting Fake Reviews in E-Commerce: A Case Study on Shopee Using Support Vector Machine and Random Forest,” JAIC, vol. 9, no. 3, pp. 955–965, Jun. 2025.
