Two-Level Ensemble with Four Meta-Features for Diabetes Classification on Clinical Tabular Data

Farid Ma'ruf; Ifnu Wisma Dwi Prasetya; Ita Aristia Sa’ida

doi:10.30871/jaic.v10i2.12239

Authors

Farid Ma'ruf, Universitas Nahdlatul Ulama Sunan Giri
Ifnu Wisma Dwi Prasetya, Universitas Nahdlatul Ulama Sunan Giri
Ita Aristia Sa’ida, Universitas Nahdlatul Ulama Sunan Giri


Keywords:

Diabetes Classification, Clinical Tabular Data, Two-Level Ensemble, Meta-Features, XGBoost, Deep Neural Network, SHAP, LIME, Calibration

Abstract

Diabetes mellitus remains a major global public health challenge because of its rising prevalence, high risk of chronic complications, and growing burden on healthcare systems. Early detection supported by artificial intelligence has therefore become increasingly important, particularly for large-scale clinical tabular data. However, no single model consistently performs best across all clinical tabular datasets, and models with strong discriminative ability do not always provide reliable probability estimates or sufficient interpretability. This study proposes a two-level ensemble model with four meta-features for diabetes classification on clinical tabular data. At the first level, XGBoost and a baseline Deep Neural Network (DNN) were used as heterogeneous base learners. Their predicted probabilities were then transformed into four meta-features: the XGBoost probability, the DNN probability, the absolute difference between the two probabilities, and their product. At the second level, a Logistic Regression (LR) model was fitted on these four meta-features. The proposed model was evaluated against XGBoost, Random Forest, and the baseline DNN using stratified 5-fold cross-validation and an independent hold-out test set. Performance was assessed using ROC-AUC, accuracy, precision, recall, F1-score, specificity, Brier score, and the confusion matrix; the analysis also covered threshold optimization for a screening mode, isotonic probability calibration, SHAP and LIME explanations, and DeLong statistical testing. On the hold-out test set, the proposed Meta Level-2 LR (4 features) achieved a ROC-AUC of 0.979451, accuracy of 0.97170, F1-score of 0.806562, precision of 0.962480, specificity of 0.997486, and the best Brier score of 0.022476. Although XGBoost obtained the highest ROC-AUC (0.979969), the proposed model demonstrated the most balanced overall performance, particularly in terms of F1-score, precision, specificity, calibration quality, and suitability for clinical decision support. SHAP and LIME further indicated that the most influential features were clinically plausible, especially HbA1c_level, blood_glucose_level, age, and BMI. These findings indicate that the proposed two-level ensemble balances discriminative performance, probability reliability, and interpretability, and therefore has strong potential for clinical decision support in diabetes classification.
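The four meta-features described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `meta_features` and `brier_score` and the sample probabilities are hypothetical, chosen only to show how two base-learner probabilities become a four-column input for a second-level model, and how a Brier score is computed from predicted probabilities and 0/1 labels.

```python
from typing import List, Sequence


def meta_features(p_xgb: Sequence[float], p_dnn: Sequence[float]) -> List[List[float]]:
    """Build the four level-2 inputs from two base-learner probabilities.

    Columns: [XGBoost probability, DNN probability,
              absolute difference, product].
    """
    if len(p_xgb) != len(p_dnn):
        raise ValueError("probability vectors must have the same length")
    rows = []
    for a, b in zip(p_xgb, p_dnn):
        rows.append([a, b, abs(a - b), a * b])
    return rows


def brier_score(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 label."""
    if len(y_true) != len(p_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


# Made-up probabilities for three patients (illustrative only, not study data).
p_xgb = [0.90, 0.20, 0.55]
p_dnn = [0.80, 0.10, 0.35]
X_meta = meta_features(p_xgb, p_dnn)

for row in X_meta:
    print([round(v, 4) for v in row])
```

Each row of `X_meta` would be one training example for the second-level Logistic Regression. In stacking, such rows are commonly built from out-of-fold predictions so that the second-level model never sees probabilities produced on a base learner's own training data; the abstract does not state how the authors handled this. A lower Brier score indicates better-calibrated probabilities, with 0 being a perfect score.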


