Interpretable Ensemble Models for Lifestyle-Based Sleep Disorder Prediction

Authors

  • Farhan Rahardian, Universitas Dian Nuswantoro
  • Sindhu Rakasiwi, Universitas Dian Nuswantoro

DOI:

https://doi.org/10.30871/jaic.v10i1.12125

Keywords:

Ensemble Learning, Hyperparameter, Machine Learning, Sleep Disorder

Abstract

Sleep disorders are a major global health concern, affecting cognitive performance, mental well-being, and long-term physiological health. Conventional diagnostic methods such as polysomnography are time-consuming and resource-intensive, which limits their use for large-scale early detection. Machine learning offers a practical, data-driven alternative for predicting sleep disorders. This study compares four ensemble learning algorithms (Random Forest, Gradient Boosting, AdaBoost, and XGBoost) for predicting sleep disorders from lifestyle and physiological factors, using the Sleep Health and Lifestyle dataset of 374 samples and three classes: insomnia, none, and sleep apnea. The research workflow includes data preprocessing, feature encoding, a 70:30 train-test split, and hyperparameter optimization using Grid Search with 5-fold cross-validation to improve model stability and generalization given the limited dataset size. Models are evaluated using accuracy, precision, recall, and F1-score, with macro-averaging so that each class contributes equally to the multi-class assessment. AdaBoost and XGBoost achieve the highest test accuracy of 90.27%, while Random Forest and Gradient Boosting obtain 89.38%. The differences among models are small (under one percentage point), indicating consistent predictive behavior across the ensembles. Feature importance analysis identifies BMI category and systolic blood pressure as the most influential predictors, followed by occupation and physical activity level, highlighting the relevance of lifestyle and physiological factors in sleep disorder risk. Overall, the study demonstrates that ensemble learning models provide reliable predictive performance and interpretable insights to support early detection of sleep disorders based on lifestyle patterns.
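
To make the reported figures concrete, the following is a minimal sketch, not the authors' code. It relates the 70:30 split and the two test accuracies to sample counts, and shows how macro-averaged precision, recall, and F1 are formed. The rounding of the test split (rounded up, as scikit-learn's train_test_split does), the helper names correct_count and macro_scores, and the toy labels are all assumptions introduced here for illustration.

```python
import math

# Figures taken from the abstract.
N_SAMPLES = 374
TEST_FRACTION = 0.30

# Assumption: the test split is rounded up, as scikit-learn's
# train_test_split does. The abstract does not state the exact counts.
n_test = math.ceil(N_SAMPLES * TEST_FRACTION)
n_train = N_SAMPLES - n_test


def correct_count(accuracy_pct, n):
    """Number of correct predictions implied by a reported accuracy (hypothetical helper)."""
    return round(accuracy_pct / 100 * n)


def macro_scores(y_true, y_pred):
    """Macro-averaged precision, recall and F1: the unweighted mean over classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    k = len(labels)
    return sum(precisions) / k, sum(recalls) / k, sum(f1s) / k


# Toy labels, invented for illustration only (not data from the study).
y_true = ["none"] * 4 + ["insomnia"] * 2 + ["sleep apnea"] * 2
y_pred = ["none"] * 4 + ["insomnia", "none"] + ["sleep apnea"] * 2

if __name__ == "__main__":
    print("train/test sizes:", n_train, n_test)
    print("correct at 90.27%:", correct_count(90.27, n_test))
    print("correct at 89.38%:", correct_count(89.38, n_test))
    print("macro P/R/F1:", macro_scores(y_true, y_pred))
```

Under the rounding assumption, the test set holds 113 samples, and the two reported accuracies correspond to 102 and 101 correct predictions, so the gap between the best and worst models would amount to a single test sample. In the toy example, one misclassified insomnia case lowers macro recall more than it lowers plain accuracy, because each class counts equally regardless of its size.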

Published

2026-02-11

How to Cite

[1] F. Rahardian and S. Rakasiwi, “Interpretable Ensemble Models for Lifestyle-Based Sleep Disorder Prediction,” JAIC, vol. 10, no. 1, pp. 1111–1124, Feb. 2026.
