Optimization of IndoBERT for Sentiment Analysis of FOMO on Social Media Through Fine-Tuning and Hybrid Labeling
DOI:
https://doi.org/10.30871/jaic.v9i6.11686
Keywords:
Sentiment Analysis, FOMO, IndoBERT, Hybrid Labeling, Fine-Tuning
Abstract
The rapid growth of social media in Indonesia has given rise to social phenomena such as Fear of Missing Out (FOMO). Expressions of FOMO on platforms like X (formerly Twitter) are often written informally and filled with abbreviations, slang, and emotional nuance, which poses challenges for traditional Natural Language Processing (NLP) methods. This research aims to develop an optimized sentiment classification model for FOMO-related posts by fine-tuning the IndoBERT architecture and applying comprehensive data enhancement strategies. The study introduces three key innovations: (1) systematic text normalization to handle informal expressions; (2) a hybrid labeling framework that combines automated model prediction, lexicon-based validation, and manual annotation to construct high-quality ground-truth data; and (3) hyperparameter tuning, using GridSearchCV for the traditional machine learning models and Bayesian Optimization (Optuna) for the deep learning models. The optimized IndoBERT achieved an Accuracy of 94.50%, an F1-Score of 94.52%, and a Macro AUC of 0.987, outperforming the comparison models: BiLSTM (Accuracy 86.60%), Support Vector Machine (Accuracy 88.06%), and Naive Bayes (Accuracy 80.73%). These results confirm that combining hybrid labeling with a fine-tuned IndoBERT substantially improves sentiment classification performance. The findings contribute to the development of reliable sentiment analysis systems for detecting social anxiety dynamics and to computational social science research in Indonesian contexts.
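The abstract names text normalization and hybrid labeling but does not give their details, so the following is only a minimal sketch of how such a pipeline could be wired together. The slang map, the lexicon, the 0.8 confidence threshold, and the rule "accept the model label only if it is confident and the lexicon does not contradict it" are all illustrative assumptions, not the authors' published procedure.

```python
import re

# Hypothetical resources: the paper's actual slang dictionary and lexicon are not shown here.
SLANG = {"gk": "tidak", "bgt": "banget", "yg": "yang"}
LEXICON = {"senang": 1, "seru": 1, "sedih": -1, "takut": -1, "iri": -1}


def normalize(text):
    """Lowercase, strip URLs and mentions, collapse stretched characters, expand slang."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # drop URLs and @mentions
    text = re.sub(r"(.)\1{2,}", r"\1", text)        # runs of 3+ identical chars become one
    tokens = re.findall(r"[a-z]+", text)
    return " ".join(SLANG.get(t, t) for t in tokens)


def lexicon_label(clean_text):
    """Return a polarity from summed lexicon scores, or None if no lexicon word appears."""
    hits = [LEXICON[t] for t in clean_text.split() if t in LEXICON]
    if not hits:
        return None
    score = sum(hits)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def hybrid_label(text, model_label, model_conf, threshold=0.8):
    """Accept the model's label automatically only when it is confident and the
    lexicon does not contradict it; otherwise route the post to manual annotation.
    (Assumed decision rule, for illustration only.)"""
    lex = lexicon_label(normalize(text))
    if model_conf >= threshold and (lex is None or lex == model_label):
        return model_label, "auto"
    return None, "manual"
```

In this sketch, `model_label` and `model_conf` stand in for the output of whatever pretrained classifier produces the automated prediction; posts returned with the `"manual"` tag would be handed to human annotators.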
License
Copyright (c) 2025 Nadhif Fauzil Adhim, Nuri Cahyono

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License (Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








