Implementation of CNN Algorithm for Indonesian Hoax News Detection on Online News Portals

Authors

  • Clifansi Remi Siwi Hati, Universitas Teknokrat Indonesia
  • Heni Sulistiani, Universitas Teknokrat Indonesia

DOI:

https://doi.org/10.30871/jaic.v9i3.9403

Keywords:

Political Hoax News, Convolutional Neural Network (CNN), Deep Learning, FastText

Abstract

The spread of hoax news in the Industrial Revolution 4.0 era affects societies worldwide, including Indonesia, so an effective detection method is needed. This research applies deep learning with the Convolutional Neural Network (CNN) algorithm to detect text-based hoax news in Indonesian. The dataset is taken from Kaggle; it was scraped from CNN Indonesia, Tempo, and Turnbackhoax, and each article is labeled as valid or hoax. The workflow consists of dataset input, data pre-processing using pre-trained GloVe embeddings, data processing, model evaluation, and deployment of the model to a simple web application. The data is divided into 80% training data and 20% test data for CNN model development. The results show that the CNN model detects hoaxes with high accuracy: training accuracy reaches 99.65% and validation accuracy reaches 99.88%, with losses of 0.0477 and 0.0435, respectively, indicating that the model is effective at classifying text-based hoax news. The model is evaluated using a confusion matrix, precision, and recall, with a heatmap to visualize the results. For further research, it is recommended to add more variation to the training data so that the model can learn the underlying patterns better.
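
The 80/20 split and the confusion-matrix metrics mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the fixed random seed, and the toy labels are all hypothetical, and the CNN itself is left out so that the example runs with the standard library alone.

```python
import random


def train_test_split(items, test_ratio=0.2, seed=42):
    """Shuffle items and split them into (train, test) lists.

    test_ratio=0.2 mirrors the 80/20 split described in the abstract;
    the seed is an arbitrary value chosen for reproducibility.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def confusion_matrix(y_true, y_pred, positive="hoax"):
    """Count true/false positives and negatives for a binary task."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def precision_recall(cm):
    """Return (precision, recall); 0.0 when a denominator is zero."""
    predicted_pos = cm["tp"] + cm["fp"]
    actual_pos = cm["tp"] + cm["fn"]
    precision = cm["tp"] / predicted_pos if predicted_pos else 0.0
    recall = cm["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical labels standing in for real model predictions.
    y_true = ["hoax", "hoax", "valid", "valid", "hoax"]
    y_pred = ["hoax", "valid", "valid", "hoax", "hoax"]
    cm = confusion_matrix(y_true, y_pred)
    print(cm, precision_recall(cm))
```

Treating "hoax" as the positive class is an assumption made for this sketch; the abstract does not state which class the reported precision and recall refer to.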

Published

2025-06-05

How to Cite

[1] C. R. S. Hati and H. Sulistiani, “Implementation of CNN Algorithm for Indonesian Hoax News Detection on Online News Portals,” JAIC, vol. 9, no. 3, pp. 765–774, Jun. 2025.

Section

Articles
