Comparative Analysis of Random Forest, SVM, and Naive Bayes for Cardiovascular Disease Prediction

Authors

  • Windy Aldora Rayadhani, Universitas Amikom Yogyakarta
  • Majid Rahardi, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.30871/jaic.v9i6.11451

Keywords:

Cardiovascular Disease, Random Forest, SVM, Naïve Bayes, Clinical Decision Support

Abstract

Cardiovascular disease is one of the leading causes of death worldwide, so accurate early detection is essential to reducing the risk of fatal outcomes. This study compares the performance of three machine learning algorithms, Random Forest, Support Vector Machine (SVM), and Naïve Bayes, in predicting cardiovascular disease risk using the Mendeley Cardiovascular Disease Dataset, which contains 1,000 patient records and 14 clinical attributes. The models were evaluated on accuracy, precision, recall, and F1-score, and the differences in their performance were tested for statistical significance with a paired t-test. Random Forest performed best, with 99% accuracy, 100% recall, 98% precision, and an F1-score of 99%. SVM followed with 98% accuracy and 100% recall, while Naïve Bayes obtained 94.5% accuracy and an F1-score of 95%. The paired t-test yielded p < 0.05, indicating that the performance differences among the three models were statistically significant. From a clinical perspective, a model with high recall, such as Random Forest, is preferable because it reduces the likelihood of false negatives, which are especially costly in heart disease diagnosis. Feature importance analysis showed that age, resting blood pressure, and cholesterol level were the most influential predictors of cardiovascular risk. These findings suggest that machine learning algorithms, particularly Random Forest, have strong potential for use in Clinical Decision Support Systems (CDSS) for accurate and efficient early detection of cardiovascular disease.
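
The evaluation described in the abstract rests on four confusion-matrix metrics and a paired t-test. The sketch below shows how such quantities are commonly computed. It is an illustration only, not the authors' code: the function names `classification_metrics` and `paired_t_statistic` are hypothetical, the pairing is assumed to be over per-fold scores, and the sample numbers in the usage note are made up, not taken from the study.

```python
import math
from statistics import mean, stdev


def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the model found.
    # Missed cases (false negatives) lower this value.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for a paired t-test.

    Assumes scores_a[i] and scores_b[i] come from the same fold or split.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```

As a usage example with made-up counts, `classification_metrics(tp=45, fp=5, fn=0, tn=50)` gives an accuracy of 0.95, a precision of 0.90, and a recall of 1.0, which shows how a model can reach perfect recall while still producing some false positives. Turning the t statistic into a p-value requires the t distribution; `scipy.stats.ttest_rel` returns both values in one call.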




Published

2025-12-06

How to Cite

[1] W. A. Rayadhani and M. Rahardi, “Comparative Analysis of Random Forest, SVM, and Naive Bayes for Cardiovascular Disease Prediction,” JAIC, vol. 9, no. 6, pp. 3234–3243, Dec. 2025.
