Implementation of Information Gain for Sentiment Analysis of PSE Policy using Naïve Bayes Algorithm
Abstract
The Ministry of Communication and Information Technology of Indonesia (Kominfo) has established the Penyelenggara Sistem Elektronik (PSE) policy, which makes registration mandatory for both domestic and foreign Electronic Systems (ES). Under this policy, Kominfo imposes sanctions on any ES that fails to register by July 29, 2022, at 23:59 WIB, by temporarily suspending its access. The policy has drawn both support and opposition from the Indonesian public and has become a topic of discussion among Twitter users. Sentiment analysis is therefore used to identify public concerns about the policy based on negative and positive tweets. The objective of this research is to evaluate Information Gain feature selection combined with the Naïve Bayes Classifier algorithm in analyzing Twitter users' sentiment toward Kominfo's PSE policy. A total of 1,153 tweets were collected from Twitter using the keyword "PSE Kominfo" and then classified with the Naïve Bayes Classifier algorithm and Information Gain feature selection under three train-test split scenarios: 90:10, 80:20, and 70:30. Based on the confusion matrix evaluation, Scenario 1 (90:10 ratio with Information Gain feature selection) performed best overall, achieving an accuracy of 79.7%, a recall of 85%, and an F1-score of 88%. The best precision, however, was observed in Scenario 2 (80:20 ratio), which reached 92%; the authors attribute this to the higher proportion of positive predictions made by the model compared with the other scenarios.
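The abstract names two computations without showing them: scoring terms by Information Gain and deriving accuracy, precision, recall, and F1 from a confusion matrix. The following is a minimal sketch of both, not the authors' implementation. The toy tokens, labels, confusion counts, and the function names `entropy`, `information_gain`, and `confusion_metrics` are all hypothetical and chosen only for illustration; none of them come from the study's data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, term):
    """Entropy of the labels minus the weighted entropy after
    splitting the documents on presence or absence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"blokir", "kecewa"},
    {"blokir", "rugi"},
    {"dukung", "aman"},
    {"dukung", "daftar"},
]
labels = ["neg", "neg", "pos", "pos"]

# Rank every vocabulary term by Information Gain, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(
    vocabulary,
    key=lambda t: information_gain(docs, labels, t),
    reverse=True,
)

# Hypothetical confusion counts, for illustration only.
accuracy, precision, recall, f1 = confusion_metrics(tp=8, fp=2, fn=2, tn=8)
```

In this toy corpus, a term such as "blokir" that appears only in negative documents separates the classes perfectly and so receives the maximum gain of 1 bit, while a term that appears in a single document scores lower. Feature selection would then keep only the top-ranked terms before training the classifier.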
Copyright (c) 2023 Stevanus Ertito Pramudja, Yuyun Umaidah, Aries Suharso
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.