Knowledge Discovery Through Topic Modeling on GoPartner User Reviews Using BERTopic, LDA, and NMF
Abstract
Transportation and food delivery services are among the sectors driving Indonesia's digital economy. The e-Conomy SEA 2023 report shows that the sector's gross merchandise value (GMV) fell by 8% in 2023 compared with the previous year, indicating a decline in transaction value. GoPartner is an application developed by GoTo to help driver partners carry out the various services offered in the Gojek application, which operates in the transportation and food delivery sector. Because drivers deliver services directly to consumers, they are one of the factors that influence customer behavior in using those services. To identify the problems drivers face, this research conducts knowledge discovery through topic modeling on GoPartner application reviews using BERTopic, LDA, and NMF, each of which takes a different approach. Based on the research results and the quality of the topics generated, BERTopic and LDA produce higher-quality topics than NMF when analyzing GoPartner user reviews.
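To make concrete how one of the three methods approaches the task, the sketch below shows NMF in miniature: a document-term count matrix V is approximated by two non-negative factors, W (document-topic weights) and H (topic-term weights). This is only an illustration under stated assumptions, not the authors' pipeline. The four reviews are hypothetical, the helper names (`nmf`, `top_terms`, `squared_error`) are invented for this example, and a pure-Python version of the Lee and Seung multiplicative updates stands in for a library implementation.

```python
import random

# Hypothetical toy reviews for illustration only; not from the GoPartner data.
reviews = [
    "app crashes when accepting order",
    "app crashes after latest update",
    "bonus incentive too low",
    "incentive bonus cut again",
]

# Bag-of-words document-term count matrix V (documents x terms).
vocab = sorted({tok for doc in reviews for tok in doc.split()})
V = [[doc.split().count(term) for term in vocab] for doc in reviews]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factorize V ~ W H with non-negative W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


def top_terms(h, vocab, n=3):
    """Highest-weighted terms for each topic row of H."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]


W, H = nmf(V, k=2)
for topic_id, terms in enumerate(top_terms(H, vocab)):
    print(topic_id, terms)
```

Each row of H can be read as a topic through its highest-weighted terms, and each row of W shows how strongly a review loads on each topic. LDA instead fits a probabilistic generative model over word counts, and BERTopic clusters transformer-based document embeddings, which is why the three methods can surface different topics from the same reviews.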
Copyright (c) 2025 Metti Detricia Pratiwi, Ken Ditha Tania
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.