Interpretable Ensemble Models for Lifestyle-Based Sleep Disorder Prediction
DOI:
https://doi.org/10.30871/jaic.v10i1.12125
Keywords:
Ensemble Learning, Hyperparameter, Machine Learning, Sleep Disorder
Abstract
Sleep disorders are a major global health concern, affecting cognitive performance, mental well-being, and long-term physiological health. Conventional diagnostic methods such as polysomnography are time-consuming and resource-intensive, which limits their use for large-scale early detection. Machine learning offers a practical, data-driven alternative for predicting sleep disorders. This study compares four ensemble learning algorithms (Random Forest, Gradient Boosting, AdaBoost, and XGBoost) for predicting sleep disorders from lifestyle and physiological factors, using the Sleep Health and Lifestyle dataset of 374 samples and three classes: insomnia, none, and sleep apnea. The workflow comprises data preprocessing, feature encoding, a 70:30 train-test split, and hyperparameter optimization using Grid Search with 5-fold cross-validation to improve model stability and generalization given the limited dataset size. Models are evaluated using accuracy, precision, recall, and F1-score, with macro-averaging so that each class contributes equally to the multi-class assessment. AdaBoost and XGBoost achieve the highest test accuracy of 90.27%, while Random Forest and Gradient Boosting obtain 89.38%. The gap between models is small (under one percentage point), indicating consistent predictive behavior across the ensembles. Feature importance analysis identifies BMI category and systolic blood pressure as the most influential predictors, followed by occupation and physical activity level, highlighting the relevance of lifestyle and physiological factors to sleep disorder risk. Overall, the study demonstrates that ensemble learning models provide reliable predictive performance and interpretable insights to support early detection of sleep disorders based on lifestyle patterns.
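To make the macro-average evaluation mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' code: the function name `macro_scores` and the toy labels are hypothetical, chosen only to show how per-class precision, recall, and F1 are computed first and then averaged with equal weight per class.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[guess] += 1   # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    def safe_div(num, den):
        # Assumption: undefined ratios (zero denominator) count as 0.0.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = safe_div(tp[label], tp[label] + fp[label])
        r = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n),  # unweighted class mean
        "recall": safe_div(sum(recalls), n),
        "f1": safe_div(sum(f1s), n),
    }


# Hypothetical toy labels using the three class names; not data from the study.
y_true = ["none", "none", "none", "none", "insomnia", "sleep apnea"]
y_pred = ["none", "none", "none", "none", "insomnia", "none"]
scores = macro_scores(y_true, y_pred)
```

In this toy case accuracy is 5/6, yet macro recall is only 2/3 because the single missed sleep apnea case pulls that class's recall to zero. This is the sense in which macro-averaging gives each class an equal say regardless of its size. As a side note on the reported figures, and purely as an inference: if the 30% test portion of 374 samples is rounded up to 113 rows, then 102/113 ≈ 90.27% and 101/113 ≈ 89.38%, so the two accuracy levels would differ by a single test sample.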
License
Copyright (c) 2026 Farhan Rahardian, Sindhu Rakasiwi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).