Comparative Analysis of IndoBERT and Classic Machine Learning Models for Sentiment Classification of Education Policy on Social Media X

Authors

  • Gabriella Fani Suciarti Medantoro Universitas Dian Nuswantoro
  • Muljono Muljono Universitas Dian Nuswantoro

DOI:

https://doi.org/10.30871/jaic.v10i1.11723

Keywords:

Sentiment Analysis, Social Media X, Machine Learning, Implicit, Education

Abstract

Leadership changes open the door to new education policies, which generate complex public opinion on the social media platform X. These opinions often carry implicit sentiment such as satire, which makes automated analysis challenging. This study addresses that challenge through a comparative analysis that evaluates how effectively the IndoBERT model captures nuanced, implicit sentiment compared with traditional machine learning classifiers (SVM, Naïve Bayes, Logistic Regression, KNN, and Random Forest). The dataset consists of Indonesian-language tweets collected via crawling. The data were pre-processed (cleaning, case folding, etc.) and labeled as positive or negative using a hybrid Lexicon-LLM approach. TF-IDF was used for feature extraction for the machine learning models, while IndoBERT used its own internal tokenization. Models were evaluated using accuracy, precision, recall, and F1-score. IndoBERT performed best, with an accuracy of 97%, outperforming the strongest traditional models, Random Forest and SVM, which each reached 95%. The study concludes that IndoBERT is a superior and more robust solution for analyzing nuanced public sentiment on education policies, demonstrating a greater ability to understand complex context and implicit language than traditional TF-IDF-based methods.
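
As an illustration of the four evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, and F1-score for a binary positive/negative task. The label lists are hypothetical toy data, not the study's results, and the helper name `binary_metrics` is an assumption introduced here. The abstract does not state which averaging scheme the authors used, so this sketch scores the positive class only.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive="positive"):
    """Return accuracy, precision, recall and F1 for the given positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Predicted positive: true positive if correct, else false positive.
            counts["tp" if truth == positive else "fp"] += 1
        else:
            # Predicted negative: false negative if it was positive, else true negative.
            counts["fn" if truth == positive else "tn"] += 1

    tp, fp, fn, tn = (counts[key] for key in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn

    # Guard against division by zero when a class is never predicted or present.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical gold labels and model predictions for ten tweets (toy data).
y_true = ["positive"] * 5 + ["negative"] * 5
y_pred = [
    "positive", "positive", "positive", "negative", "negative",
    "positive", "negative", "negative", "negative", "negative",
]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On this toy data the classifier gets 7 of 10 tweets right, so accuracy is 0.70, while precision (3 of 4 predicted positives are correct) and recall (3 of 5 actual positives are found) differ. This is why the abstract's practice of reporting all four metrics is more informative than accuracy alone.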

Published

2026-02-04

How to Cite

[1]
G. F. S. Medantoro and M. Muljono, “Comparative Analysis of IndoBERT and Classic Machine Learning Models for Sentiment Classification of Education Policy on Social Media X”, JAIC, vol. 10, no. 1, pp. 548–557, Feb. 2026.
