Public Sentiment Analysis on Corruption Issues in Indonesia Using IndoBERT Fine-Tuning, Logistic Regression, and Linear SVM
DOI:
https://doi.org/10.30871/jaic.v9i5.10537
Keywords:
Sentiment Analysis, IndoBERT, Fine-Tuning, Logistic Regression, Linear SVM, Social Media, Corruption, SMOTE
Abstract
Sentiment analysis is a Natural Language Processing (NLP) method for understanding public perception from textual data such as social media posts. Opinions expressed on digital platforms matter because they reflect public trust and attitudes toward strategic issues in Indonesia. This study compares the performance of three IndoBERT-based approaches to sentiment classification: IndoBERT with full fine-tuning, IndoBERT as a feature extractor combined with Logistic Regression, and IndoBERT as a feature extractor combined with Linear SVM. The dataset consists of 2,012 tweets collected through the Twitter API; after preprocessing and balancing, it contained 2,252 samples labeled as positive or negative. Preprocessing comprised cleansing, normalization, tokenization, stopword removal, and stemming. The dataset was then split into 80% training, 10% validation, and 10% test data. In the experiments, IndoBERT with full fine-tuning performed best, with an accuracy of 82.67%, an F1-score of 82.35%, and an AUC of 0.87. The Logistic Regression and Linear SVM classifiers reached lower accuracies of 80.20% and 78.22%, respectively. These findings indicate that fine-tuned IndoBERT captures the semantic nuances of the Indonesian language more effectively, while the approaches without fine-tuning are computationally cheaper at the cost of reduced accuracy. The study contributes to the development of NLP methods for Indonesian, particularly sentiment analysis, and highlights the potential of transformer-based models for analyzing strategic issues on social media.
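The 80/10/10 split described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the abstract does not say whether the split was stratified by class or whether it was applied before or after balancing, so the per-class (stratified) split, the function name `stratified_split`, the seed, and the toy data below are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(texts, labels, seed=42):
    """Split samples 80/10/10 into train/val/test, separately per class.

    Hypothetical helper: stratifying by label is an assumption, not a
    detail confirmed by the abstract.
    """
    # Group texts by their sentiment label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = len(items) * 8 // 10  # 80% of this class
        n_val = len(items) // 10        # 10% of this class
        splits["train"] += [(t, label) for t in items[:n_train]]
        splits["val"] += [(t, label) for t in items[n_train:n_train + n_val]]
        # Whatever remains (about 10%) becomes the test set.
        splits["test"] += [(t, label) for t in items[n_train + n_val:]]
    return splits


# Toy stand-in data: 1,000 placeholder "tweets", balanced across two classes.
texts = [f"tweet {i}" for i in range(1000)]
labels = ["positive" if i % 2 == 0 else "negative" for i in range(1000)]

splits = stratified_split(texts, labels)
sizes = {name: len(rows) for name, rows in splits.items()}
print(sizes)  # {'train': 800, 'val': 100, 'test': 100}
```

Splitting per class keeps the positive/negative ratio identical across the three subsets, which makes accuracy and F1-score on the validation and test sets easier to compare.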
License
Copyright (c) 2025 Maria Fatima Kono, Ika Nur Fajri, Yoga Pristyanto

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








