Integrating IndoBERTweet and GRU for Opinion Classification on X Towards Public Transportation in Jakarta
DOI: https://doi.org/10.30871/jaic.v9i5.10723
Keywords: Gated Recurrent Unit (GRU), IndoBERTweet, Public Transportation, Text Classification, X
Abstract
Jakarta, the capital of Indonesia, faces persistent challenges with its public transportation system due to rapid urbanization, increased use of private vehicles, and poor service quality. Social media platforms such as X (formerly Twitter) offer valuable insight into public opinion, but the unstructured nature of their content complicates analysis. This study uses deep learning models to classify user opinions into six labels covering the positive and negative aspects of comfort, safety, and punctuality. IndoBERTweet alone, used as the baseline, achieved the highest performance, with 95.43% accuracy and a macro F1-score of 0.9545. It also required the shortest training time, at 6 minutes 30 seconds. IndoBERTweet+GRU followed closely, with 94.62% accuracy and a macro F1-score of 0.9460 after 6 minutes 50 seconds of training. Adding a GRU layer therefore gives competitive results but does not surpass the baseline. Error analysis showed that both models performed well on explicit sentiment but struggled with implicit expressions such as sarcasm and mixed opinions. These results demonstrate the potential of sentiment analysis in real-time monitoring systems, which could help policymakers identify urgent issues and support data-driven improvements to Jakarta’s urban transportation services.
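As an illustration of the two metrics reported above, the following minimal sketch computes accuracy and macro F1-score for a six-label classification task. It is not the authors' evaluation code. The label names in `LABELS` and the helper functions `accuracy` and `macro_f1` are hypothetical, since the abstract states only that the six labels cover positive and negative aspects of comfort, safety, and punctuality.

```python
from collections import Counter

# Hypothetical label names (assumption): the abstract only says there are six
# labels covering positive/negative comfort, safety, and punctuality.
LABELS = [
    "comfort_pos", "comfort_neg",
    "safety_pos", "safety_neg",
    "punctuality_pos", "punctuality_neg",
]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-label F1-scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for label t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true label but was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    y_true = LABELS + ["comfort_pos", "safety_neg"]
    y_pred = LABELS + ["comfort_neg", "safety_neg"]
    print("accuracy:", accuracy(y_true, y_pred))
    print("macro F1:", macro_f1(y_true, y_pred, LABELS))
```

Because macro F1 averages the per-label scores without weighting by label frequency, a weak result on a rare label lowers it as much as a weak result on a common one, which is why it is usually reported alongside accuracy for multi-class problems.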
Copyright (c) 2025 Fajria Ulumin Nafiah, Talitha Fujisai Panglima, Mohammad Idhom, Trimono Trimono

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) ) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








