Proboboost: A Hybrid Model for Sentiment Analysis of Kitabisa Reviews

Authors

  • Rakan Shafy Prasetya, Universitas Dian Nuswantoro
  • Amiq Fahmi, Universitas Dian Nuswantoro
  • MY Teguh Sulistyono, Universitas Dian Nuswantoro

DOI:

https://doi.org/10.30871/jaic.v9i6.11138

Keywords:

Gradient Boosting, Kitabisa Application, Naive Bayes, Proboboost, Sentiment Analysis, TF-IDF (Term Frequency-Inverse Document Frequency)

Abstract

The rapid advancement of digital technology has significantly changed how people take part in social activities, particularly online donations and zakat payments. The Kitabisa application was selected for this study not only for its popularity but also for its high user engagement and large volume of reviews on the Google Play Store, which make it a useful indicator of public trust in Indonesia’s digital philanthropy ecosystem. This research analyzes user sentiment toward the Kitabisa application using a hybrid model, Proboboost, which combines Multinomial Naive Bayes (MNB) and a Gradient Boosting classifier through a soft voting mechanism. The model is designed to address class imbalance and improve accuracy in short-text sentiment analysis for the Indonesian language. Preprocessing consisted of case folding, text cleaning, stopword removal, and stemming with the Sastrawi stemmer. Features were extracted using TF-IDF, and the model was evaluated with an 80:20 train-test split and 5-fold cross-validation. Experimental results indicate that Proboboost achieved an accuracy of 89.51% and an F1-score of 87.4%, outperforming the Naive Bayes baseline, which reached 87.98% accuracy. The sentiment distribution is dominated by positive reviews (87.24%), followed by negative (8.53%) and neutral (4.23%) reviews. These findings suggest that users generally express satisfaction with and trust in the Kitabisa platform. The results also indicate that the hybrid Proboboost model balances classification performance between majority and minority sentiment classes, offering deeper insight into user perceptions of digital philanthropic services.
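
The soft voting step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each base classifier (here standing in for Multinomial Naive Bayes and Gradient Boosting) outputs a probability per sentiment class, and that the two are combined with equal weights unless other weights are given. The function names and the probability values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

Probabilities = Dict[str, float]


def soft_vote(model_probs: List[Probabilities],
              weights: Optional[List[float]] = None) -> Probabilities:
    """Return the weighted average of per-class probabilities across models."""
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0] * len(model_probs)
    total = sum(weights)
    labels = model_probs[0].keys()
    return {
        label: sum(w * p[label] for w, p in zip(weights, model_probs)) / total
        for label in labels
    }


def predict(model_probs: List[Probabilities],
            weights: Optional[List[float]] = None) -> str:
    """Pick the class with the highest averaged probability."""
    averaged = soft_vote(model_probs, weights)
    return max(averaged, key=averaged.get)


# Hypothetical class probabilities for a single review (illustrative values,
# not taken from the paper).
nb_probs = {"negative": 0.30, "neutral": 0.10, "positive": 0.60}
gb_probs = {"negative": 0.70, "neutral": 0.10, "positive": 0.20}

combined = soft_vote([nb_probs, gb_probs])
label = predict([nb_probs, gb_probs])
print(combined, label)
```

In this example the two models disagree on the top class, and the averaged probabilities decide the outcome. This is one way a confident prediction from one model can outweigh a weaker prediction from the other, which hard (majority) voting between two models could not resolve.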

Published

2025-12-09

How to Cite

R. S. Prasetya, A. Fahmi, and M. T. Sulistyono, “Proboboost: A Hybrid Model for Sentiment Analysis of Kitabisa Reviews”, JAIC, vol. 9, no. 6, pp. 3657–3668, Dec. 2025.
