Comparative Performance of SVM and BERT-Base Using Hybrid Preprocessing for Fast Fashion Sentiment Analysis

Authors

  • Restu Lestari Mulianingrum, Universitas Dian Nuswantoro, Department of Informatics Engineering, Semarang, Indonesia
  • Erwin Yudi Hidayat, Universitas Dian Nuswantoro, Department of Informatics Engineering, Semarang, Indonesia

DOI:

https://doi.org/10.30871/jaic.v9i6.11385

Keywords:

BERT, Fast Fashion, Sentiment Analysis, SVM, TikTok

Abstract

Fast fashion poses major environmental and social challenges, yet public awareness of these issues in Indonesia remains poorly understood. This study compares Support Vector Machine (SVM) and BERT-Base for sentiment analysis of 3,513 TikTok comments on fast fashion sustainability. A hybrid preprocessing pipeline, combining a 404-entry slang dictionary with IndoNLP utilities, addresses informal language, code-mixing, and character elongation. Sentiment labels generated with VADER were validated against 1,747 manually annotated samples, yielding a Cohen's Kappa of 0.7155, which indicates substantial agreement. BERT-Base achieves 92.7% accuracy with F1-scores of 0.86, 0.94, and 0.93 for the negative, neutral, and positive classes, while SVM attains a competitive 90.4% accuracy with F1-scores of 0.84, 0.93, and 0.91. BERT is stronger at detecting negative sentiment, with a recall of 0.87 versus 0.82 for SVM, which matters for identifying sustainability concerns. The computational trade-off is substantial: BERT requires 230.2 seconds of GPU training and 3.449 seconds of inference, whereas SVM runs on CPU with 25.9 seconds of training and 0.051 seconds of inference, making SVM 8.9× faster to train and 67.6× faster at inference. The sentiment distribution of 46.9% neutral, 34.5% positive, and 18.6% negative comments suggests limited critical awareness among Indonesian users. These findings indicate that systematic preprocessing narrows the performance gap between classical and transformer models, so deployment decisions can be based on resource constraints. The study offers methodological insights for low-resource informal text analysis and practical guidance for scalable social listening, greenwashing detection, and evidence-based sustainability communication strategies.
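The abstract names the preprocessing steps but not their implementation. The following is a minimal sketch of two of them, slang normalization and character-elongation handling, under stated assumptions: the `SLANG` entries are hypothetical examples rather than the study's 404-entry dictionary, a plain regular expression stands in for the IndoNLP utilities, and the function name `normalize` is illustrative only.

```python
import re

# Hypothetical slang entries for illustration only; the study's
# 404-entry dictionary is not reproduced here.
SLANG = {
    "bgt": "banget",
    "gk": "tidak",
    "yg": "yang",
    "tp": "tapi",
}

# Assumption: a run of three or more identical characters is treated as
# elongation and collapsed to one character (a stand-in for IndoNLP).
ELONGATION = re.compile(r"(.)\1{2,}")
TOKEN = re.compile(r"[a-z0-9]+")


def normalize(comment: str, slang: dict[str, str] = SLANG) -> str:
    """Lowercase, collapse elongated characters, then expand slang tokens."""
    text = ELONGATION.sub(r"\1", comment.lower())
    tokens = TOKEN.findall(text)  # also drops punctuation
    return " ".join(slang.get(tok, tok) for tok in tokens)


print(normalize("Bajunya murahhhh bgt tp gk awet!!!"))
# prints: bajunya murah banget tapi tidak awet
```

Collapsing only runs of three or more characters leaves legitimate double letters such as "maaf" untouched, but an elongated form like "maaaf" would be reduced to "maf". A production pipeline would likely need a dictionary check after collapsing.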

Published

2025-12-07

How to Cite

R. L. Mulianingrum and E. Y. Hidayat, “Comparative Performance of SVM and BERT-Base Using Hybrid Preprocessing for Fast Fashion Sentiment Analysis,” JAIC, vol. 9, no. 6, pp. 3464–3478, Dec. 2025.
