A Comparative Study of Structural Improvement and Hyperparameter Optimization for Enhancing Naïve Bayes Performance
DOI:
https://doi.org/10.30871/jaic.v10i2.12278
Keywords:
Ant Colony Optimization, Genetic Algorithm, Grid Search, Naïve Bayes, Tree Augmented Naïve Bayes
Abstract
The Naïve Bayes (NB) algorithm is widely used due to its simplicity and computational efficiency; however, its performance often degrades on real-world data because the assumption of feature independence is frequently violated. This study evaluates two strategies for improving the performance of Naïve Bayes, namely hyperparameter optimization and feature dependency structure enhancement through Tree Augmented Naïve Bayes (TAN). The Breast Cancer Wisconsin (Diagnostic) dataset was selected because it is a representative and widely used public dataset for evaluating medical classification algorithms, thereby facilitating method validation and comparison with previous studies. Experiments were conducted on 30 numerical features with an imbalanced class distribution (62.74% Benign and 37.26% Malignant). The baseline Naïve Bayes model achieved an accuracy of 0.9386. Applying TAN as a standalone approach improved performance to an accuracy of 0.9474 and an F1-score of 0.9286. In contrast, hyperparameter optimization using Genetic Algorithm (GA), Grid Search (GS), and Ant Colony Optimization (ACO) without TAN did not yield meaningful performance improvements; in fact, GA resulted in a decrease in accuracy compared to the baseline model. The combinations of Naïve Bayes + TAN + GS and Naïve Bayes + TAN + ACO achieved the best performance, with identical metrics: accuracy of 0.9561, precision of 0.9744, recall of 0.9048, F1-score of 0.9383, and AUC of 0.9960. These results indicate that improving the feature dependency structure through TAN has a more fundamental impact on enhancing Naïve Bayes performance than hyperparameter optimization alone.
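As a rough illustration of the baseline and Grid Search settings described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes scikit-learn's copy of the Breast Cancer Wisconsin (Diagnostic) dataset, a Gaussian Naïve Bayes model, `var_smoothing` as the tuned hyperparameter, an 80/20 stratified split, and an arbitrary random seed; none of these choices is stated in the abstract. The reported accuracies are arithmetically consistent with a 114-sample test set (for example, 107/114 ≈ 0.9386), but the split remains an assumption, and the numbers this sketch prints are not expected to match the published ones. TAN, GA, and ACO are not covered here.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB


def evaluate(model, X_test, y_test):
    """Return the five metrics named in the abstract for a fitted classifier."""
    pred = model.predict(X_test)
    prob = model.predict_proba(X_test)[:, 1]  # probability of the positive (malignant) class
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "auc": roc_auc_score(y_test, prob),
    }


data = load_breast_cancer()
X = data.data
# scikit-learn codes malignant as 0 and benign as 1; recode so malignant is the positive class.
y = (data.target == 0).astype(int)

# Assumption: 80/20 stratified hold-out split with a fixed, arbitrary seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Baseline: Gaussian Naive Bayes with default settings.
baseline = GaussianNB().fit(X_train, y_train)

# Grid Search over var_smoothing (assumed search space: 13 log-spaced values).
param_grid = {"var_smoothing": np.logspace(-12, -6, 13)}
search = GridSearchCV(GaussianNB(), param_grid, scoring="f1", cv=5)
search.fit(X_train, y_train)

baseline_metrics = evaluate(baseline, X_test, y_test)
tuned_metrics = evaluate(search.best_estimator_, X_test, y_test)

print("class share (benign, malignant):", 1 - y.mean(), y.mean())
print("baseline:", baseline_metrics)
print("grid search:", tuned_metrics, search.best_params_)
```

Comparing `baseline_metrics` with `tuned_metrics` on the same held-out split is one simple way to check whether tuning alone changes the results, which is the comparison the abstract draws between hyperparameter optimization and structural improvement.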
License
Copyright (c) 2026 Jejen Jaenudin, Novita BR Ginting, Yusup Maolani

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.