From Sparse Features to Transformers: A Statistical Evaluation of TF-IDF, FastText, and IndoBERT for Sentiment Classification of Indonesian Travel App Reviews

Claudian Tikulimbong Tangdilomban; Syaifullah Yusuf Ramdhan; Muhammad Rizal; Cici Suhaeni; Bagus Sartono

doi:10.30871/jaic.v10i3.12610

Authors

Claudian Tikulimbong Tangdilomban, IPB University
Syaifullah Yusuf Ramdhan, IPB University
Muhammad Rizal, IPB University
Cici Suhaeni, IPB University
Bagus Sartono, IPB University

DOI:

https://doi.org/10.30871/jaic.v10i3.12610

Keywords:

FastText, TF-IDF, Sentiment Classification, Supervised Machine Learning, User Reviews

Abstract

This study compares three text representation techniques, TF-IDF, FastText, and IndoBERT, for sentiment classification of Indonesian-language user reviews of travel applications. The dataset consists of 4,000 reviews of Traveloka and Tiket.com, scraped from the Google Play Store and manually annotated with sentiment labels. Each representation was combined with three classification algorithms, Support Vector Machine (SVM), Logistic Regression, and Random Forest, yielding nine experimental configurations. Evaluation used stratified 5-fold cross-validation with macro F1-score as the primary metric, supported by hyperparameter tuning with GridSearchCV, paired t-tests, and Cohen’s d effect sizes. IndoBERT generally outperformed TF-IDF and FastText. The best configuration, IndoBERT with Logistic Regression, achieved an F1-score of 0.9261 after tuning. The performance differences among text representations were statistically significant, with large effect sizes for IndoBERT versus TF-IDF (d = −1.36) and for IndoBERT versus FastText (d = −1.10). Nevertheless, TF-IDF combined with either Logistic Regression or SVM remained competitive, achieving an F1-score of approximately 0.892 after tuning, which makes it a lightweight and interpretable alternative. The study concludes that the quality of the text representation influences sentiment classification performance more than the complexity of the classification algorithm.
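The statistical comparison named in the abstract (a paired t-test over cross-validation folds plus Cohen's d) can be illustrated with a minimal sketch. The per-fold scores below are hypothetical placeholders, not the paper's results, and the pooled-standard-deviation form of Cohen's d is an assumption, since the abstract does not state which variant was used. The function names `paired_t_statistic` and `cohens_d` are likewise illustrative.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Paired t statistic for two equal-length score lists (df = n - 1)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


def cohens_d(a, b):
    """Cohen's d with the pooled sample standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Hypothetical per-fold macro F1 scores from 5-fold CV, NOT the paper's data.
tfidf_f1 = [0.88, 0.89, 0.90, 0.89, 0.88]
indobert_f1 = [0.92, 0.93, 0.93, 0.92, 0.91]

t_stat = paired_t_statistic(tfidf_f1, indobert_f1)
d = cohens_d(tfidf_f1, indobert_f1)

# Two-sided critical value of Student's t at alpha = 0.05 with df = 4.
T_CRIT_DF4 = 2.776
significant = abs(t_stat) > T_CRIT_DF4

print(f"t = {t_stat:.3f}, d = {d:.3f}, significant at 0.05: {significant}")
```

With the first argument taken as the weaker representation, both the t statistic and d come out negative, which is one plausible reading of the negative effect sizes reported in the abstract. A paired effect size (mean difference divided by the standard deviation of the differences) would give a different magnitude on the same data.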


