Emotion Classification of Indonesian Tweets using BERT Embedding
Abstract
Twitter is one of the most widely used social media platforms in the world. Indonesia has the fifth-largest number of Twitter users of any country, which raises the likelihood of conflict between Indonesian Twitter users driven by emotionally charged tweets. In this paper, we compare two classifiers built on BERT embeddings: a CNN and an LSTM. BERT-CNN performs best, reaching an accuracy of 61% and outperforming BERT-LSTM. The experiment comprised data preprocessing, data cleaning, data splitting, and model training, and the results were evaluated using a confusion matrix.
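The cleaning, splitting, and evaluation stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cleaning rules, the 80/20 split ratio, the label set in `LABELS`, and all function names are assumptions made for illustration, and the BERT embedding and CNN/LSTM training steps are left out because the abstract gives no details about them.

```python
import random
import re

# Hypothetical label set; the abstract does not list the emotion classes.
LABELS = ["anger", "fear", "happy", "love", "sadness"]


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, mentions, and punctuation (assumed rules)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"@\w+", " ", text)           # drop user mentions
    text = text.replace("#", "")                # keep the hashtag word, drop '#'
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # drop punctuation and emoji
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace


def split_data(rows, test_ratio=0.2, seed=42):
    """Shuffle rows and split them into (train, test) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_ratio)
    return rows[n_test:], rows[:n_test]


def confusion_matrix(y_true, y_pred, labels):
    """Return matrix[i][j] = number of items with true label i predicted as label j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Accuracy is the sum of the diagonal divided by the total count."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Made-up predictions, used only to show how the functions fit together.
    y_true = ["anger", "happy", "love", "sadness", "fear"]
    y_pred = ["anger", "happy", "sadness", "sadness", "anger"]
    cm = confusion_matrix(y_true, y_pred, LABELS)
    print(clean_tweet("@Budi Aku SENANG banget!!! https://t.co/abc #bahagia"))
    print(accuracy(cm))
```

In this layout the rows of the matrix are true labels and the columns are predictions, so off-diagonal cells show which emotions a classifier confuses with one another, which a single accuracy figure cannot show.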
Copyright (c) 2023 Muhammad Habib Algifari, Eko Dwi Nugroho
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.