Application of Multinomial Naïve Bayes for Sentiment Classification on Bukalapak Reviews

Authors

  • Dona Yuliawati, Institut Informatika dan Bisnis Darmajaya
  • Musyafa Faeang Ogya Widi, Institut Informatika dan Bisnis Darmajaya

DOI:

https://doi.org/10.30871/jaic.v9i6.11671

Keywords:

Sentiment Analysis, Multinomial Naïve Bayes, E-Commerce, Bukalapak, Customer Reviews

Abstract

This study applies the Multinomial Naïve Bayes (MNB) classifier to sentiment analysis of user reviews of Bukalapak, a major Indonesian e-commerce platform. It addresses two challenges: class imbalance and the linguistic features of Indonesian that are common in user reviews, such as slang, affixes, and negation. The data were collected by web scraping reviews of Bukalapak's app on the Google Play Store, yielding a dataset of 19,999 reviews. A structured preprocessing pipeline prepared the data: text normalization, tokenization, stopword removal, stemming, and term frequency-inverse document frequency (TF-IDF) weighting. The results show that the model performs well on neutral reviews (81% accuracy) but struggles with positive and negative reviews because of class imbalance, which lowers accuracy for these categories. The study also applies SMOTE (Synthetic Minority Over-sampling Technique) to handle the imbalance and k-fold cross-validation for model evaluation, which significantly improved the model's reliability. The findings suggest that Multinomial Naïve Bayes is effective for large-scale sentiment analysis in the e-commerce domain, particularly for platforms with large volumes of user-generated content. The research concludes that sentiment analysis can benefit e-commerce platforms by improving customer service, informing product management decisions, and providing insights for business strategy.
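As a rough illustration of the TF-IDF weighting and Multinomial Naïve Bayes steps named in the abstract, the following is a minimal standard-library sketch, not the authors' implementation. The toy reviews, the whitespace tokenizer (a stand-in for normalization, stopword removal, and stemming), the smoothed IDF formula, and the Laplace smoothing value `alpha=1.0` are all assumptions made for illustration. SMOTE and k-fold cross-validation are not shown.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for the full
    # normalization / stopword removal / stemming pipeline.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    # Term frequency times IDF; terms unseen during fitting are dropped.
    counts = Counter(doc)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def train_mnb(vectors, labels, vocab, alpha=1.0):
    # Multinomial NB: per-class log prior plus smoothed log likelihoods
    # estimated from the summed TF-IDF weight of each term in the class.
    class_docs = Counter(labels)
    feature_sum = {c: defaultdict(float) for c in class_docs}
    for vec, y in zip(vectors, labels):
        for t, w in vec.items():
            feature_sum[y][t] += w
    model = {}
    for c, n_c in class_docs.items():
        total = sum(feature_sum[c].values()) + alpha * len(vocab)
        log_lik = {t: math.log((feature_sum[c].get(t, 0.0) + alpha) / total)
                   for t in vocab}
        model[c] = (math.log(n_c / len(labels)), log_lik)
    return model

def predict(model, vec):
    # Pick the class with the highest log posterior (up to a constant).
    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(w * log_lik[t] for t, w in vec.items())
    return max(model, key=score)

# Hypothetical toy data, invented for this sketch (not from the dataset).
train = [
    ("fast delivery good seller", "positive"),
    ("good product great price", "positive"),
    ("app crashes slow bad", "negative"),
    ("bad service slow refund", "negative"),
    ("order arrived today", "neutral"),
    ("package arrived as described", "neutral"),
]
docs = [tokenize(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs)
vectors = [tfidf(doc, idf) for doc in docs]
model = train_mnb(vectors, labels, vocab=set(idf))

print(predict(model, tfidf(tokenize("slow app bad refund"), idf)))
```

With balanced toy classes the class priors are equal, so the prediction is driven entirely by the term likelihoods. On an imbalanced dataset the priors and the per-class term totals would both shift toward the majority class, which is the effect the abstract attributes to data imbalance.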



Published

2025-12-15

How to Cite

[1]
D. Yuliawati and M. Faeang Ogya Widi, “Application of Multinomial Naïve Bayes for Sentiment Classification on Bukalapak Reviews”, JAIC, vol. 9, no. 6, pp. 3883–3891, Dec. 2025.
