Analysis of Stacking Ensemble Method in Machine Learning Algorithms to Predict Student Depression

Naouthla Asia Levina; Majid Rahardi

doi:10.30871/jaic.v9i6.11453

Authors

Naouthla Asia Levina, Universitas Amikom Yogyakarta
Majid Rahardi, Universitas Amikom Yogyakarta


Keywords:

Depression, University Students, Machine Learning, Stacking Ensemble, Prediction

Abstract

Depression among university students calls for early detection and intervention because it affects both academic performance and overall well-being. Although earlier studies have applied machine learning to depression prediction, most rely on a single model and rarely use publicly available datasets that have been comprehensively preprocessed. This study develops a depression prediction model for university students using a two-level stacking ensemble trained with cross-validation: Random Forest, Gradient Boosting, and XGBoost serve as base learners, and Logistic Regression serves as the meta-learner. The model was built on a public Kaggle dataset of 502 student records with 10 multidimensional predictor variables. Preprocessing comprised data cleaning, feature encoding, and standardization. Performance was evaluated using accuracy, precision, recall, F1-score, and ROC-AUC. The stacking ensemble achieved an accuracy of 98.02% and a ROC-AUC of 99.8%, with precision of 96%, recall of 100%, and an F1-score of 98% for the depression class. These results indicate that stacking is effective for early depression detection among university students and could serve as a decision-support tool in academic settings.


