Comparative Analysis of the C5.0 Algorithm and Other Machine Learning Models for Early Detection of Multi-Class Heart Disease

Authors

  • Mardhatillah Mardhatillah, Universitas Malikussaleh
  • Hafizh Al-Kautsar Aidilof, Universitas Malikussaleh
  • Asrianda Aidilof, Universitas Malikussaleh

DOI:

https://doi.org/10.30871/jaic.v9i4.9753

Keywords:

Classification, Decision Tree, Heart Diseases, Early Detection

Abstract

Cardiovascular diseases are the leading cause of mortality worldwide, making accurate and early detection critical for effective medical intervention and improved patient prognosis. While machine learning (ML) offers promising tools for predictive diagnostics, many existing studies rely on single-algorithm approaches or weak validation methods, which limits the generalizability and real-world applicability of their findings.

This study conducts a head-to-head comparative evaluation of multiple machine learning algorithms for the multi-class classification of heart disease, with the goal of identifying the most effective and reliable model for this complex clinical task.

We used a private dataset of 300 patient medical records, each described by 11 clinically relevant features. To obtain a robust and unbiased evaluation, we employed stratified 5-fold cross-validation. Five widely used classification algorithms were evaluated: Naïve Bayes (NB), Logistic Regression (LR), Random Forest (RF), a C5.0-analog Decision Tree (DT), and Support Vector Machine (SVM). Model performance was assessed using accuracy, precision, recall, and F1-score.

Naïve Bayes delivered the best performance of the five models, achieving the highest mean accuracy of 43.33% (±4.22%). It also led on the other metrics, with a mean precision of 43.40%, recall of 43.64%, and F1-score of 41.26%. Logistic Regression (40.67% accuracy) and Random Forest (39.33% accuracy) were competitive but were surpassed by Naïve Bayes in this multi-class classification context.

This research underscores the importance of robust validation techniques and comprehensive comparative analyses for identifying suitable models for clinical applications. The Naïve Bayes algorithm emerges as a strong candidate for developing a reliable clinical decision support system for the early differentiation of various heart conditions, providing a foundation for future data-driven diagnostic tools.
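The evaluation protocol named in the abstract (stratified 5-fold cross-validation scored with accuracy, precision, recall, and F1) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The dataset is private, so the labels below are synthetic: the three classes and their sizes (120/100/80) are assumptions chosen only to total 300 records, and the abstract does not state the number of classes or whether the metrics were macro-averaged. The `majority_baseline` function is a hypothetical stand-in for a real classifier, not one of the five models compared in the study.

```python
import random
import statistics
from collections import Counter, defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)  # deal each class round-robin
    return folds


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
        precisions.append(precision)
        recalls.append(recall)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return (accuracy, statistics.mean(precisions),
            statistics.mean(recalls), statistics.mean(f1s))


def majority_baseline(train_labels, n_test):
    """Hypothetical stand-in model: always predict the most common training class."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n_test


def cross_validate(labels, k=5, seed=0):
    """Evaluate the stand-in model on each held-out fold in turn."""
    results = []
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        y_true = [labels[i] for i in test_idx]
        y_pred = majority_baseline(train_labels, len(test_idx))
        results.append(macro_scores(y_true, y_pred))
    return results


# Synthetic labels: 300 records in three assumed classes (illustrative only).
labels = [0] * 120 + [1] * 100 + [2] * 80
fold_scores = cross_validate(labels)
accuracies = [scores[0] for scores in fold_scores]
print(f"accuracy: {statistics.mean(accuracies):.4f} "
      f"± {statistics.stdev(accuracies):.4f}")
```

Because stratification keeps the class mix identical in every fold here, the majority baseline scores the same 40% accuracy on each fold, which is simply the share of the largest assumed class. Comparing a model against such a baseline gives context for multi-class accuracy figures. In practice, a library such as scikit-learn provides `StratifiedKFold` and ready-made metric functions for the same purpose.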

Published

2025-08-06

How to Cite

[1]
M. Mardhatillah, H. A.-K. Aidilof, and A. Aidilof, “Comparative Analysis of the C5.0 Algorithm and Other Machine Learning Models for Early Detection of Multi-Class Heart Disease”, JAIC, vol. 9, no. 4, pp. 1559–1568, Aug. 2025.

Section

Articles
