Early Detection of Type 2 Diabetes Using C4.5 Decision Tree Algorithm on Clinical Health Records

Authors

  • Hani Setiani (Informatika, Universitas Sragen)
  • Muhammad Noor Arridho (Teknologi Rekayasa Logistik, Politeknik Sinar Mas Berau Coal)
  • Supriyanto Supriyanto (Informatika, Universitas Sragen)

DOI:

https://doi.org/10.30871/jaic.v9i4.10190

Keywords:

Classification, Type 2 Diabetes, C4.5 Algorithm

Abstract

Type 2 diabetes is a chronic metabolic disorder marked by elevated blood glucose levels. It is the most prevalent form of diabetes and is commonly associated with poor lifestyle habits and hereditary factors. If left unmanaged, it can lead to serious complications such as hypertension and other chronic conditions, so early detection is critical for limiting long-term harm and encouraging healthier behavior. This study classifies Type 2 diabetes from clinical records using the C4.5 decision tree algorithm. The dataset includes gender, age, height, weight, waist circumference, body mass index (BMI), systolic and diastolic blood pressure, respiratory rate, and pulse rate. The model was evaluated under two scenarios: without class balancing, and after balancing with the Synthetic Minority Over-sampling Technique (SMOTE). Without balancing, the best result came from an 80:20 training-testing split, with an F1 score of 67.76%, but performance varied across split ratios. With SMOTE, results were more consistent across splits, and the 60:40 split gave the highest F1 score, 66.67%. These findings suggest that SMOTE reduces bias toward the majority class and improves sensitivity to the minority class, which makes data balancing an important step in building a reliable classification model for diabetes mellitus diagnosis.
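The abstract names C4.5 without describing how it chooses splits. As a rough illustration only: C4.5 ranks candidate tests by gain ratio, that is, information gain divided by the split information of the partition, and handles numeric attributes such as BMI through binary threshold tests. The sketch below shows that criterion in plain Python. The function names, the toy BMI values, and the 25.0 threshold are hypothetical and are not taken from the study's data or implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of partitioning `labels` by the discrete attribute `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information: entropy of the partition sizes themselves.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def threshold_gain_ratio(numbers, labels, threshold):
    """Gain ratio of the binary test `x <= threshold` on a numeric attribute."""
    return gain_ratio([x <= threshold for x in numbers], labels)


# Hypothetical toy data for illustration; not from the paper's dataset.
bmi = [21.0, 23.5, 24.0, 29.5, 31.0, 33.2]
diabetic = ["no", "no", "no", "yes", "yes", "yes"]
print(threshold_gain_ratio(bmi, diabetic, 25.0))
```

On this toy data the threshold separates the two classes perfectly, so the gain ratio reaches its maximum of 1.0; a threshold that isolates only one record scores much lower. A full C4.5 implementation would also search over candidate thresholds, handle missing values, and prune the tree, none of which is shown here.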

Published

2025-08-07

How to Cite

H. Setiani, M. N. Arridho, and S. Supriyanto, “Early Detection of Type 2 Diabetes Using C4.5 Decision Tree Algorithm on Clinical Health Records,” JAIC, vol. 9, no. 4, pp. 1663–1669, Aug. 2025.