Enhancing Aspect-Based Sentiment Analysis via Hugging Face Fine-Tuned IndoBERT
DOI:
https://doi.org/10.30871/jaic.v9i6.11409
Keywords:
Aspect-Based Sentiment Analysis, IndoBERT, Hotel Reviews, Class Imbalance, Fine-tuning
Abstract
Aspect-Based Sentiment Analysis (ABSA) on hotel reviews is challenged by semantic complexity and severe class imbalance, particularly in low-resource languages such as Indonesian. This study evaluates how well fine-tuning IndoBERT, a pre-trained Transformer model, addresses these issues by benchmarking it against a classical statistical method (TF-IDF) and static embeddings (Sentence-BERT). Using the HoASA dataset, the experiment applies Random Oversampling at the text level to mitigate data sparsity in minority classes. Empirical results show that the fine-tuned IndoBERT outperforms the baselines on the majority of aspects, achieving a global accuracy of 97% and a macro F1-score of 0.92. Per-aspect analysis shows that the model's self-attention mechanism captures linguistic context robustly for tangible aspects (e.g., wifi, service), but the model still struggles with highly ambiguous aspects such as smell (bau) and general. Statistical significance tests (paired t-test and Wilcoxon) indicate that the performance gains over the baselines are statistically significant (p < 0.05). The study concludes that contextual representations from IndoBERT, combined with data balancing, offer a stronger and statistically supported approach to handling linguistic variation and class bias in the Indonesian hospitality domain.
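To make the balancing and evaluation ideas in the abstract concrete, the following is a minimal sketch of text-level random oversampling and of the macro F1-score, written with the Python standard library only. It is not the authors' implementation: the function names `random_oversample` and `macro_f1`, the toy reviews, and the `pos`/`neg` labels are assumptions made for illustration and are not taken from HoASA.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class texts at random until every class
    has as many examples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)
    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Sample with replacement only the number of copies still missing.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up reviews for a single aspect; labels are illustrative only.
texts = ["wifi cepat", "wifi lambat", "kamar bersih", "pelayanan ramah", "bau rokok"]
labels = ["pos", "neg", "pos", "pos", "neg"]
balanced_texts, balanced_labels = random_oversample(texts, labels, seed=1)
print(Counter(labels), Counter(balanced_labels))

# A classifier that always predicts the majority class looks fine on
# accuracy but poor on macro F1, which is why both metrics are reported.
y_true = ["pos"] * 8 + ["neg"] * 2
y_pred = ["pos"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, round(macro_f1(y_true, y_pred), 3))
```

In a sketch like this, oversampling would normally be applied to the training split only, so that duplicated reviews cannot leak into the evaluation data.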
License
Copyright (c) 2025 Thania Aprilah, De Rosal Ignatius Moses Setiadi, Wise Herowati

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








