Optimization of IndoBERT for Sentiment Analysis of FOMO on Social Media Through Fine-Tuning and Hybrid Labeling

Authors

  • Nadhif Fauzil Adhim, Universitas Amikom Yogyakarta
  • Nuri Cahyono, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.30871/jaic.v9i6.11686

Keywords:

Sentiment Analysis, FOMO, IndoBERT, Hybrid Labeling, Fine-Tuning

Abstract

The rapid growth of social media in Indonesia has given rise to social phenomena such as the Fear of Missing Out (FOMO). Expressions of FOMO on platforms like X (formerly Twitter) are often written informally, filled with abbreviations, slang, and emotional nuance, which poses challenges for traditional Natural Language Processing (NLP) methods. This research develops an optimized sentiment classification model for FOMO-related posts by fine-tuning the IndoBERT architecture and applying comprehensive data enhancement strategies. The study introduces three key innovations: (1) systematic text normalization to handle informal expressions; (2) a hybrid labeling framework combining automated model prediction, lexicon-based validation, and manual annotation to construct high-quality ground-truth data; and (3) hyperparameter tuning, using GridSearchCV for the traditional machine learning models and Bayesian optimization (Optuna) for the deep learning models, to maximize performance. Experimental results show that the optimized IndoBERT achieved superior performance, with an accuracy of 94.50%, an F1-score of 94.52%, and a macro AUC of 0.987, significantly surpassing comparative models including BiLSTM (accuracy 86.60%), Support Vector Machine (88.06%), and Naive Bayes (80.73%). These results confirm that integrating hybrid labeling with a fine-tuned IndoBERT substantially enhances sentiment classification performance. The findings contribute to the development of reliable sentiment analysis systems for detecting social anxiety dynamics and to computational social science research in Indonesian contexts.
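The hybrid labeling framework described above can be sketched as a simple decision rule: a model prediction is accepted automatically only when it is both confident and consistent with a lexicon-based polarity check, and is otherwise routed to a human annotator. The following Python sketch illustrates this idea; the toy lexicon, confidence threshold, and function names are illustrative assumptions, not the authors' actual implementation.

```python
# Hybrid labeling sketch: auto-label only when the model is confident
# AND the lexicon agrees; otherwise flag the text for manual annotation.

POS_WORDS = {"senang", "bahagia", "bangga"}   # toy positive lexicon
NEG_WORDS = {"takut", "cemas", "sedih"}       # toy negative lexicon

def lexicon_polarity(text: str) -> str:
    """Crude lexicon-based polarity: count positive vs. negative tokens."""
    tokens = text.lower().split()
    score = sum(t in POS_WORDS for t in tokens) - sum(t in NEG_WORDS for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def hybrid_label(text: str, model_label: str, model_confidence: float,
                 threshold: float = 0.9) -> str:
    """Return an automatic label, or 'MANUAL' to route to a human annotator."""
    if model_confidence >= threshold and model_label == lexicon_polarity(text):
        return model_label          # confident and lexicon-validated
    return "MANUAL"                 # low confidence or model/lexicon disagreement

# A confident prediction that the lexicon confirms is auto-accepted;
# a disagreement is escalated to manual annotation.
print(hybrid_label("aku takut ketinggalan tren", "negative", 0.95))  # negative
print(hybrid_label("aku takut ketinggalan tren", "positive", 0.95))  # MANUAL
```

In a real pipeline, `model_label` and `model_confidence` would come from the softmax output of the fine-tuned classifier, and the lexicon step would use a full Indonesian sentiment lexicon rather than this toy word list.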

Published

2025-12-15

How to Cite

[1] N. F. Adhim and N. Cahyono, “Optimization of IndoBERT for Sentiment Analysis of FOMO on Social Media Through Fine-Tuning and Hybrid Labeling,” JAIC, vol. 9, no. 6, pp. 3786–3797, Dec. 2025.
