Emotion Classification of Indonesian Tweets using BERT Embedding

Keywords: BERT, CNN, LSTM, Twitter

Abstract

Twitter is one of the social media platforms with the largest user bases in the world, and Indonesia has the fifth-largest number of Twitter users. This large user base creates a high potential for conflict between Indonesian Twitter users arising from emotionally charged tweets. In this paper, we compare two emotion classifiers built on BERT embeddings: BERT-CNN and BERT-LSTM. The experiment consisted of data preprocessing, data cleaning, data splitting, and model training, and the results were evaluated using a confusion matrix. BERT-CNN performed best, reaching an accuracy of 61% and outperforming BERT-LSTM.
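As a rough illustration of the evaluation step, the sketch below builds a confusion matrix from true and predicted labels and derives accuracy from its diagonal. The label names, the sample predictions, and the helper functions `confusion_matrix` and `accuracy` are hypothetical and chosen only for illustration; they are not the paper's data, classes, or code.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix whose rows are true labels and columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical emotion labels and predictions, for illustration only.
labels = ["anger", "joy", "sadness"]
y_true = ["anger", "joy", "joy", "sadness", "anger", "sadness"]
y_pred = ["anger", "joy", "sadness", "sadness", "joy", "sadness"]

cm = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, cm):
    print(label, row)
print("accuracy:", round(accuracy(cm), 2))
```

In this toy example four of the six predictions fall on the diagonal, so the accuracy is 4/6. The off-diagonal cells show which emotions are confused with each other, which a single accuracy figure does not reveal.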

Published
2023-11-30
How to Cite
[1] M. Algifari and E. Nugroho, “Emotion Classification of Indonesian Tweets using BERT Embedding,” JAIC, vol. 7, no. 2, pp. 172-176, Nov. 2023.
Section
Articles