Deep Learning-Based Detection of Online Gambling Promotion Spam in Indonesian YouTube Comments

Authors

  • Muhammad Zhafran Ammar, Universitas Negeri Surabaya
  • Ricky Eka Putra, Universitas Negeri Surabaya
  • Yuni Yamasari, Universitas Negeri Surabaya

DOI:

https://doi.org/10.30871/jaic.v9i6.11240

Keywords:

Online Gambling, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Deep Learning

Abstract

Online gambling promotion has increasingly penetrated social media platforms, and YouTube comment sections have become a frequent target for spam-based advertising. Such activity not only violates platform policies but also exposes users to harmful content. Addressing this issue requires automated detection systems capable of handling noisy, informal, and highly imbalanced text data. This study investigates the effectiveness of four recurrent neural architectures (LSTM, GRU, BiLSTM, and BiGRU) for detecting gambling promotion comments in Indonesian YouTube data. To address class imbalance, four experimental scenarios were explored: the original distribution, undersampling, oversampling, and class weighting. Model performance was evaluated using accuracy, precision, recall, F1-score, ROC-AUC, and confusion matrix analysis. The results show that the bidirectional models outperformed their unidirectional counterparts, with BiGRU achieving the best overall performance. When combined with class weighting, BiGRU reached 98% accuracy, an F1-score of 0.83, and a ROC-AUC of 0.971, demonstrating a superior ability to detect minority-class instances. Oversampling improved recall substantially but increased false positives, and undersampling reduced accuracy, whereas class weighting provided the most balanced performance across metrics. These findings indicate that BiGRU with class weighting offers the most practical balance between accuracy, recall, and computational efficiency, making it well suited to real-time moderation systems. The study provides a foundation for future research on transformer-based architectures and cross-platform spam detection in Indonesian social media environments.
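
The abstract does not state how the class weights were computed, so the following is only a minimal sketch under assumptions: it uses the common "balanced" heuristic (n_samples / (n_classes * class_count)) and a hypothetical toy label set of 95 normal comments and 5 gambling-promotion comments. The function names and data are illustrative and are not taken from the paper. The sketch also shows why accuracy alone is misleading on imbalanced data, which is the reason F1-score and recall are reported alongside it.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumption: this is the widely used "balanced" heuristic, not
    necessarily the weighting scheme used in the study.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: 0 = normal comment, 1 = gambling promotion.
y_true = [0] * 95 + [1] * 5

# The minority class receives a much larger weight than the majority class.
weights = balanced_class_weights(y_true)

# A trivial baseline that never flags spam still scores high accuracy,
# but its recall and F1 on the spam class are zero.
y_pred_majority = [0] * len(y_true)
baseline = binary_metrics(y_true, y_pred_majority)

print(weights)
print(baseline)
```

In a Keras-style training loop, a dictionary like `weights` could be passed through the `class_weight` argument of `Model.fit`, so that errors on the minority class contribute more to the loss. Whether the study configured training this way is not stated in the abstract.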

Published

2025-12-08

How to Cite

[1]
M. Z. Ammar, R. E. Putra, and Y. Yamasari, “Deep Learning-Based Detection of Online Gambling Promotion Spam in Indonesian YouTube Comments”, JAIC, vol. 9, no. 6, pp. 3632–3641, Dec. 2025.