Optimizing Sentiment Classification Models for TikTok Comments using Emotion-Based Preprocessing and Grid Search

Authors

  • Bagas Restya Ermawan, Universitas Amikom Yogyakarta
  • Mahendra Bayu Prayoga, Universitas Amikom Yogyakarta
  • Akmal Rafi Fadhillah, Universitas Amikom Yogyakarta
  • Ema Utami, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.30871/jaic.v10i1.11742

Keywords:

Emotion-Based Preprocessing, Grid Search, Hyperparameter Tuning, Machine Learning, Sentiment Analysis

Abstract

TikTok has become one of the social media platforms with significant influence on the formation of public opinion in Indonesia. However, user comments are expressive and concise, and they carry emotional cues such as emojis, emoticons, and excessive capitalization, all of which pose challenges for sentiment analysis. This research aims to optimize a sentiment classification model for TikTok comments by combining emotion-based preprocessing with hyperparameter optimization via Grid Search. The dataset comprises 4,500 comments, collected over three time periods, that discuss the Minister of Finance, Purbaya Yudhi Sadewa. Three testing scenarios were evaluated: common preprocessing, emotion-based preprocessing, and emotion-based preprocessing combined with Grid Search. The results indicate that emotion-based preprocessing improved model accuracy by 4–5%, while Grid Search optimization provided an additional increase of up to 3%, with the LightGBM model achieving a peak F1-score of 0.92. A per-period analysis shows that sentiment remained predominantly positive across all three periods. The integration of emotion-based preprocessing and parameter tuning proved effective in enhancing the model's ability to capture emotional variation in text and to map periodic changes in public sentiment on Indonesian-language social media.
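
The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the emotion lexicon, the token names (emo_laugh, emo_caps, and so on), the cleaning rules, and the toy scoring function are all assumptions made for illustration, since the abstract does not specify them. The first function converts emojis, emoticons, and shouted words into plain tokens so that generic cleaning does not erase them; the second performs an exhaustive Grid Search over a small parameter grid. In a real experiment the scoring function would presumably be a cross-validated F1-score of the classifier, and learning_rate and num_leaves are used here only because they are well-known LightGBM parameters.

```python
import itertools
import re

# Hypothetical emotion lexicon; the paper's actual mapping is not given in the abstract.
EMOTION_TOKENS = {
    "😂": "emo_laugh",
    "😡": "emo_angry",
    "❤": "emo_love",
    ":)": "emo_smile",
    ":(": "emo_sad",
}


def emotion_preprocess(text):
    """Turn emotional cues into plain tokens before generic cleaning."""
    # 1. Replace emojis/emoticons with word tokens so they survive punctuation removal.
    for symbol, token in EMOTION_TOKENS.items():
        text = text.replace(symbol, f" {token} ")
    # 2. Flag shouted words (3+ capital letters) before lowercasing erases the cue.
    text = re.sub(r"\b[A-Z]{3,}\b", lambda m: f"{m.group(0)} emo_caps", text)
    text = text.lower()
    # 3. Collapse letters repeated three or more times ("baguuus" -> "bagus").
    text = re.sub(r"([a-z])\1{2,}", r"\1", text)
    # 4. Drop everything except letters, digits, underscores, and whitespace.
    text = re.sub(r"[^a-z0-9_\s]", " ", text)
    return " ".join(text.split())


def grid_search(param_grid, score_fn):
    """Score every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Stand-in for a cross-validated F1-score (illustrative values only)."""
    return -(params["learning_rate"] - 0.1) ** 2 - (params["num_leaves"] - 31) ** 2 / 1000


if __name__ == "__main__":
    print(emotion_preprocess("MANTAP pak menteri 😂 baguuus :)"))
    # prints: mantap emo_caps pak menteri emo_laugh bagus emo_smile

    grid = {"learning_rate": [0.01, 0.1, 0.3], "num_leaves": [15, 31, 63]}
    best_params, _ = grid_search(grid, toy_score)
    print(best_params)
    # prints: {'learning_rate': 0.1, 'num_leaves': 31}
```

Emotional cues are converted before lowercasing and punctuation removal because those generic steps would otherwise discard exactly the signals (capitalization, emoticon punctuation) that the emotion-based scenario is meant to preserve.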

Published

2026-02-04

How to Cite

B. R. Ermawan, M. B. Prayoga, A. R. Fadhillah, and E. Utami, “Optimizing Sentiment Classification Models for TikTok Comments using Emotion-Based Preprocessing and Grid Search”, JAIC, vol. 10, no. 1, pp. 535–547, Feb. 2026.
