Deep Learning-Based Detection of Online Gambling Promotion Spam in Indonesian YouTube Comments
DOI:
https://doi.org/10.30871/jaic.v9i6.11240
Keywords:
Online Gambling, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Deep Learning
Abstract
Online gambling promotion has increasingly penetrated social media platforms, and YouTube comments have become a frequent target for spam-based advertising. Such activity violates platform policies and exposes users to harmful content. Addressing it requires automated detection systems that can handle noisy, informal, and highly imbalanced text data. This study investigates the effectiveness of four recurrent neural architectures (LSTM, GRU, BiLSTM, and BiGRU) for detecting gambling promotion comments in Indonesian YouTube data. To address class imbalance, four experimental scenarios were explored: the original distribution, undersampling, oversampling, and class weighting. Model performance was evaluated using accuracy, precision, recall, F1-score, ROC-AUC, and confusion matrix analysis. The results show that the bidirectional models outperformed their unidirectional counterparts, with BiGRU achieving the best overall performance. When combined with class weighting, BiGRU reached 98% accuracy, an F1-score of 0.83, and a ROC-AUC of 0.971, demonstrating a superior ability to detect minority-class instances. Oversampling improved recall substantially but increased false positives, while undersampling reduced accuracy; class weighting provided the most balanced performance across metrics. These findings indicate that BiGRU with class weighting offers the most practical balance between accuracy, recall, and computational efficiency, making it well suited to real-time moderation systems. The study provides a foundation for future research on transformer-based architectures and cross-platform spam detection in Indonesian social media environments.
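The abstract does not spell out how class weighting was computed. As a rough illustration only, the sketch below uses the common inverse-frequency heuristic (weight = n_samples / (n_classes * class_count)), which is an assumption and not necessarily the authors' formula. It also shows why accuracy and F1-score can diverge on imbalanced data. The label counts and the function names `balanced_class_weights` and `f1_score` are hypothetical and chosen for this example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumed heuristic for illustration; rarer classes get larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1-score of the positive class; 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = gambling promotion, 0 = ordinary comment.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)

# A trivial classifier that never flags spam looks accurate but detects nothing.
always_negative = [0] * len(labels)
baseline_accuracy = accuracy(labels, always_negative)
baseline_f1 = f1_score(labels, always_negative)

print(weights)
print(baseline_accuracy, baseline_f1)
```

In this toy setting the minority class receives a much larger weight than the majority class, so misclassified spam comments contribute more to the training loss. The always-negative baseline illustrates why the study reports F1-score and ROC-AUC alongside accuracy.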
License
Copyright (c) 2025 Muhammad Zhafran Ammar, Ricky Eka Putra, Yuni Yamasari

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








