Enhancing Aspect-Based Sentiment Analysis via Hugging Face Fine-Tuned IndoBERT

Authors

  • Thania Aprilah, Universitas Dian Nuswantoro
  • De Rosal Ignatius Moses Setiadi, Universitas Dian Nuswantoro
  • Wise Herowati, Universitas Dian Nuswantoro

DOI:

https://doi.org/10.30871/jaic.v9i6.11409

Keywords:

Aspect-Based Sentiment Analysis, IndoBERT, Hotel Reviews, Class Imbalance, Fine-tuning

Abstract

Aspect-Based Sentiment Analysis (ABSA) on hotel reviews is complicated by semantic complexity and severe class imbalance, particularly in low-resource languages such as Indonesian. This study evaluates fine-tuning IndoBERT, a pre-trained Transformer model, against two baselines: a classical statistical representation (TF-IDF) and static embeddings (Sentence-BERT). Using the HoASA dataset, the experiment applies Random Oversampling at the text level to mitigate data sparsity in minority classes. Empirical results show that the fine-tuned IndoBERT outperforms the baselines on the majority of aspects, achieving a global accuracy of 97% and a macro F1-score of 0.92. Per-aspect analysis reveals that the model’s self-attention mechanism captures linguistic context robustly for tangible aspects (e.g., wifi, service), yet struggles with highly ambiguous aspects such as smell (bau) and general. Paired t-tests and Wilcoxon tests confirm that the performance gains over the baselines are statistically significant (p < 0.05). The study concludes that contextual representations from IndoBERT, combined with data balancing, offer a superior and statistically robust solution for handling linguistic variation and class bias in the Indonesian hospitality domain.
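
The text-level Random Oversampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything here is an assumption: the function name `random_oversample`, the toy reviews, the `pos`/`neg` labels, and the choice to balance a single aspect's labels up to the size of the largest class. It only shows the general idea of duplicating raw minority-class texts before they are tokenized.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class texts until every class
    has as many examples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the raw texts by their sentiment label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    # The largest class sets the target size for all classes.
    target = max(len(items) for items in by_label.values())

    # Keep the original examples and append sampled duplicates.
    out_texts, out_labels = list(texts), list(labels)
    for label, items in by_label.items():
        for _ in range(target - len(items)):
            out_texts.append(rng.choice(items))
            out_labels.append(label)
    return out_texts, out_labels


# Toy training split for one aspect (illustrative data, not from HoASA).
train_texts = ["wifi cepat", "wifi lancar", "wifi stabil", "wifi lambat"]
train_labels = ["pos", "pos", "pos", "neg"]

bal_texts, bal_labels = random_oversample(train_texts, train_labels)
print(Counter(bal_labels))
```

In a setup like this, oversampling would normally be applied to the training split only, so that duplicated reviews do not leak into validation or test data; whether the study follows this exact protocol is not stated in the abstract.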

Author Biography

Thania Aprilah, Universitas Dian Nuswantoro

Program Studi Teknik Informatika, Universitas Dian Nuswantoro, Semarang, Indonesia

Published

2025-12-15

How to Cite

[1] T. Aprilah, D. R. I. M. Setiadi, and W. Herowati, “Enhancing Aspect-Based Sentiment Analysis via Hugging Face Fine-Tuned IndoBERT,” JAIC, vol. 9, no. 6, pp. 3821–3830, Dec. 2025.
