Performance Analysis of LSTM, GRU and IndoBERT Variants for Emotion Detection in Indonesian Text

Authors

  • Putri Innayah Mahmid, Informatics Engineering, Tadulako University
  • Nouval Trezandy Lapatta, Informatics Engineering, Tadulako University

DOI:

https://doi.org/10.30871/jaic.v10i2.12002

Keywords:

Attention Mechanism, Gating Mechanism, Emotion Detection

Abstract

This study evaluates gating mechanisms, specifically Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), in comparison with attention-based models utilizing IndoBERT variants (Base, Large, and Lite) for Indonesian emotion detection across six emotion labels. The evaluation examines accuracy, efficiency, and robustness using both in-distribution and out-of-distribution (OOD) datasets collected from social media. Statistical significance is assessed through confidence interval estimation and bootstrap paired tests, and a detailed error analysis is conducted to identify model limitations. The results indicate that IndoBERT Large achieves superior performance, with a Macro F1-Score of 80.05% and greater robustness to domain shifts, whereas the gating models exhibit substantial performance degradation on unseen data. Among the gating models, GRU outperforms LSTM and achieves the lowest inference latency, with training times up to 131 times faster than IndoBERT Large. Statistical tests confirm that the performance gap between the IndoBERT variants and the RNN-based models is significant. These findings highlight a key trade-off: attention mechanisms provide state-of-the-art accuracy and robustness, while GRU offers a practical and efficient solution for resource-constrained settings.
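The abstract assesses significance with bootstrap paired tests on Macro F1. As a hedged illustration only (not the paper's actual code or data), the sketch below implements a Koehn-style paired bootstrap on the Macro F1 difference between two classifiers; the three-class toy label set, the prediction arrays, and the function names are all assumptions introduced for this example.

```python
import random

LABELS = [0, 1, 2]  # toy stand-in for the paper's six emotion labels

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-label F1 scores (a label absent on both sides scores 0)."""
    scores = []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def paired_bootstrap(y_true, pred_a, pred_b, labels, n_boot=1000, seed=0):
    """One-sided paired bootstrap: p = fraction of resamples where A does not beat B."""
    rng = random.Random(seed)
    n = len(y_true)
    observed = macro_f1(y_true, pred_a, labels) - macro_f1(y_true, pred_b, labels)
    not_better = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample examples with replacement
        t = [y_true[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        if macro_f1(t, a, labels) - macro_f1(t, b, labels) <= 0:
            not_better += 1
    return observed, not_better / n_boot

# Toy comparison: model A is perfect; model B errs on every fourth example.
y_true = [i % 3 for i in range(60)]
pred_a = list(y_true)
pred_b = [(t + 1) % 3 if i % 4 == 0 else t for i, t in enumerate(y_true)]
delta, p_value = paired_bootstrap(y_true, pred_a, pred_b, LABELS)
```

Because the same resampled indices are applied to both models, the test accounts for the paired structure of the comparison; the paper's exact procedure (number of resamples, interval construction) may differ.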


References

[1] P. Nandwani and R. Verma, “A review on sentiment analysis and emotion detection from text,” Soc. Netw. Anal. Min., vol. 11, no. 1, Dec. 2021, doi: 10.1007/S13278-021-00776-6.

[2] C. Wang, “Emotion Recognition of College Students’ Online Learning Engagement Based on Deep Learning,” Int. J. Emerg. Technol. Learn., vol. 17, no. 6, pp. 110–110, 2022, doi: 10.3991/ijet.v17i06.30019.

[3] S. V. Oprea and A. Bâra, “Extracting Emotions from Customer Reviews Using Text Mining, Large Language Models and Fine-Tuning Strategies,” J. Theor. Appl. Electron. Commer. Res., vol. 20, no. 3, p. 221, Sep. 2025, doi: 10.3390/JTAER20030221.

[4] W. K. Sari, D. P. Rini, R. F. Malik, and I. S. B. Azhar, “Multilabel Text Classification in News Articles Using Long-Term Memory with Word2Vec,” J. RESTI (Rekayasa Sist. dan Teknol. Informasi), vol. 4, no. 2, pp. 276–285, Apr. 2020, doi: 10.29207/RESTI.V4I2.1655.

[5] U. Mahesh, A. Prof, R. Jr, R. Kumar, S. Vm, and U. Bd, “Text Classification using RNN,” Int. J. Eng. Res. Technol., vol. 14, no. 5, May 2025, doi: 10.17577/IJERTV14IS050314.

[6] S. M. Al-Selwi et al., “RNN-LSTM: From applications to modeling techniques and beyond—Systematic review,” J. King Saud Univ. - Comput. Inf. Sci., vol. 36, no. 5, p. 102068, Jun. 2024, doi: 10.1016/J.JKSUCI.2024.102068.

[7] I. G. P. M. Yusadara and I. G. A. D. Saryanti, “Classification of User Expressions on Social Media Using LSTM and GRU Models,” J. Sisfokom (Sistem Inf. dan Komputer), vol. 14, no. 1, pp. 49–54, Jan. 2025, doi: 10.32736/sisfokom.v14i1.2370.

[8] K. G. T. Kumar, R. Anoop, S. G. Koolagudi, T. Rao, and A. Kodipalli, “Stratification of Depressed and Non-Depressed Texts from Social Media using LSTM and its Variants,” Procedia Comput. Sci., vol. 235, pp. 1353–1363, Jan. 2024, doi: 10.1016/J.PROCS.2024.04.127.

[9] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” NAACL HLT 2019 - 2019 Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. - Proc. Conf., vol. 1, pp. 4171–4186, Oct. 2018, Accessed: Oct. 22, 2025. [Online]. Available: https://arxiv.org/pdf/1810.04805

[10] Y. A. Singgalen, “Performance Analysis of IndoBERT for Sentiment Classification in Indonesian Hotel Review Data,” J. Inf. Syst. Res., vol. 6, no. 2, pp. 976–986, 2025, doi: 10.47065/josh.v6i2.6505.

[11] U. Khairani, V. Mutiawani, and H. Ahmadian, “Pengaruh Tahapan Preprocessing Terhadap Model Indobert Dan Indobertweet Untuk Mendeteksi Emosi Pada Komentar Akun Berita Instagram [The effect of preprocessing stages on IndoBERT and IndoBERTweet models for detecting emotions in Instagram news account comments],” J. Teknol. Inf. dan Ilmu Komput., vol. 11, no. 4, pp. 887–894, 2024, doi: 10.25126/jtiik.1148315.

[12] Riccosan, K. E. Saputra, G. D. Pratama, and A. Chowanda, “Emotion dataset from Indonesian public opinion,” Data Br., vol. 43, p. 108465, Aug. 2022, doi: 10.1016/j.dib.2022.108465.

[13] “Data Preprocessing - an overview | ScienceDirect Topics.” Accessed: Oct. 10, 2025. [Online]. Available: https://www.sciencedirect.com/topics/engineering/data-preprocessing

[14] H. Chung and K. S. Shin, “Genetic algorithm-optimized long short-term memory network for stock market prediction,” Sustain., vol. 10, no. 10, 2018, doi: 10.3390/su10103765.

[15] M. Waqas and U. W. Humphries, “A critical review of RNN and LSTM variants in hydrological time series predictions,” MethodsX, vol. 13, p. 102946, Dec. 2024, doi: 10.1016/J.MEX.2024.102946.

[16] W. A. Hidayat and V. R. S. Nastiti, “Perbandingan Kinerja Pre-Trained Indobert-Base Dan Indobert-Lite Pada Klasifikasi Sentimen Ulasan Tiktok Tokopedia Seller Center Dengan Model Indobert [Performance comparison of pre-trained IndoBERT-Base and IndoBERT-Lite for sentiment classification of TikTok Tokopedia Seller Center reviews with the IndoBERT model],” JSiI (Jurnal Sist. Informasi), vol. 11, no. 2, pp. 13–20, Sep. 2024, doi: 10.30656/JSII.V11I2.9168.

[17] W. S. Parker, “Model Evaluation,” Routledge Handb. Philos. Sci. Model., pp. 208–219, Jan. 2024, doi: 10.4324/9781003205647-19.

Published

2026-04-16

How to Cite

[1]
P. I. Mahmid and N. T. Lapatta, “Performance Analysis of LSTM, GRU and IndoBERT Variants for Emotion Detection in Indonesian Text”, JAIC, vol. 10, no. 2, pp. 1172–1181, Apr. 2026.
