Comparative Analysis of IndoBERT and Classic Machine Learning Models for Sentiment Classification of Education Policy on Social Media X
DOI:
https://doi.org/10.30871/jaic.v10i1.11723
Keywords:
Sentiment Analysis, Social Media X, Machine Learning, Implicit, Education
Abstract
Leadership changes create an opening for new education policies, and those policies generate complex public opinion on social media X. Much of that opinion carries implicit sentiment such as satire, which makes automated analysis difficult. This study addresses the problem with a comparative analysis that evaluates how well the IndoBERT model captures nuanced, implicit sentiment compared with traditional machine learning classifiers (SVM, Naïve Bayes, Logistic Regression, KNN, and Random Forest). The dataset consists of Indonesian-language tweets collected by crawling. The data were pre-processed (cleaning, case folding, etc.) and labeled as positive or negative using a hybrid Lexicon-LLM approach. TF-IDF was used for feature extraction in the machine learning models, while IndoBERT used its own internal tokenization. All models were evaluated using accuracy, precision, recall, and F1-score. IndoBERT performed best with an accuracy of 97%, ahead of the strongest classic models, Random Forest and SVM, which each reached 95%. The study concludes that IndoBERT is a superior and more robust solution for analyzing nuanced public sentiment on education policy, showing a greater ability to understand complex context and implicit language than traditional TF-IDF-based methods.
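The four evaluation metrics named in the abstract can be illustrated with a minimal sketch for the two-class (positive/negative) setting. This is not the authors' evaluation code: the function name `binary_metrics`, the toy labels, and the choice to score the "positive" class only (rather than macro- or weighted-averaging across both classes, which the abstract does not specify) are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "positive",
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for one target class.

    Assumption: metrics are reported for the `positive` class only;
    the study may have averaged over both classes instead.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the chosen positive class.
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy labels, not data from the study.
y_true = ["positive", "negative", "positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "positive", "positive", "negative"]

scores = binary_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Accuracy alone can hide poor performance on a minority class, which is presumably why the study reports precision, recall, and F1 alongside it.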
License
Copyright (c) 2026 Gabriella Fani Suciarti Medantoro, Muljono Muljono

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (Attribution-ShareAlike 4.0 International, CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








