Comparative Analysis of CNN, Transformers, and Traditional ML for Classifying Online Gambling Spam Comments in Indonesian
DOI:
https://doi.org/10.30871/jaic.v9i3.9468
Keywords:
indonesian language, deep learning, spam detection, transformer, wordformer
Abstract
The rise of user-generated content on social media and live-streaming platforms has intensified the spread of spam, particularly promotions for online gambling (Judi Online), which remain prevalent in Indonesian comment sections. This study investigates how effectively various machine learning (ML) and deep learning (DL) approaches classify such spam content in Bahasa Indonesia. We compare five models: Support Vector Machine (SVM), Random Forest (RF), a CNN-based model, IndoBERT, and a custom lightweight transformer model named Wordformer. IndoBERT achieves the highest performance across all metrics, but at a high computational cost. Wordformer, in contrast, balances accuracy and efficiency: it achieves 0.9975 accuracy and macro F1-score, surpassing SVM (0.9578) and Random Forest (0.9729), while having a significantly smaller model size and requiring fewer multiply-add operations than IndoBERT. An extensive ablation study further explores the architectural and training design choices that influence Wordformer’s performance. The findings suggest that lightweight transformer models can offer practical, scalable solutions for spam detection in low-resource language settings without the need for large pretrained backbones.
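For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and macro F1-score are computed for a binary spam classifier. It is not the authors' evaluation code; the function names and the toy labels (1 = gambling spam, 0 = not spam) are hypothetical and chosen only for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of comments whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 2 spam comments, 8 non-spam; one spam comment is missed.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

Because macro F1 averages the per-class scores without weighting by class frequency, a missed spam comment lowers it more than it lowers accuracy when spam is the minority class, which is one reason to report both metrics.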
Copyright (c) 2025 Martin Clinton Tosima Manullang, Arkham Zahri Rakhman, Hartanto Tantriawan, Andika Setiawan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.