Comparing Decision Tree and Optimized LightGBM for Attrition Prediction

Authors

  • Dhea Maharani (Information Systems, Faculty of Computer Science, Universitas Dian Nuswantoro)
  • Farrikh Alzami (Universitas Dian Nuswantoro)
  • MY. Teguh Sulistyono (Information Systems, Faculty of Computer Science, Universitas Dian Nuswantoro)
  • Aris Nurhindarto (Information Systems, Faculty of Computer Science, Universitas Dian Nuswantoro)
  • Dewi Agustini Santoso (Faculty of Computer Science, Universitas Dian Nuswantoro)
  • Muslih Muslih (Faculty of Computer Science, Universitas Dian Nuswantoro)
  • Henry Bastian (Faculty of Computer Science, Universitas Dian Nuswantoro)

DOI:

https://doi.org/10.30871/jaic.v10i3.12678

Keywords:

Employee Attrition, Feature Importance, Hyperparameter Tuning, LightGBM, Machine Learning

Abstract

Employee turnover poses a considerable challenge for organizations, reducing productivity and raising recruitment costs. This research evaluates the effectiveness of Decision Tree and Light Gradient Boosting Machine (LightGBM) models in predicting employee attrition. The study uses a quantitative experimental design with a secondary dataset sourced from Mendeley. The data were preprocessed before model development, and the models were evaluated using accuracy, precision, recall, F1-score, and ROC-AUC. Each algorithm was assessed under three configurations: baseline, regularization, and hyperparameter tuning through GridSearchCV. The experimental findings indicate that the Decision Tree model is prone to overfitting and has limited ability to detect the attrition class, even though optimization raises its ROC-AUC score to 0.80. In comparison, LightGBM demonstrates more reliable and consistent performance. The tuned LightGBM model achieved the highest performance on the test dataset, with an accuracy of 0.81, a precision of 0.82, a recall of 0.71, an F1-score of 0.76, and an ROC-AUC of 0.85. An analysis of feature importance reveals that job satisfaction, work-life balance, emotional commitment, work experience, and allowances are the key factors influencing attrition prediction. These results indicate that LightGBM not only outperforms the Decision Tree but also offers insight into the factors that matter most for data-driven retention strategies.
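
As a quick arithmetic check on the reported metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. The sketch below is only an illustration and not the authors' code: the function name `f1_score` and the `reported` dictionary are hypothetical, and the inputs are the values rounded to two decimals as stated in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the tuned LightGBM model on the test set.
# (Hypothetical container; the numbers are the rounded figures from the text.)
reported = {"precision": 0.82, "recall": 0.71, "f1": 0.76}

computed_f1 = f1_score(reported["precision"], reported["recall"])

# Compare at the two-decimal precision used in the abstract.
is_consistent = round(computed_f1, 2) == reported["f1"]

print(f"computed F1 = {computed_f1:.4f}, matches reported value: {is_consistent}")
```

Because the inputs are already rounded, the check can only confirm agreement at two decimal places; it says nothing about the underlying confusion matrix.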

Published

2026-06-08

How to Cite

[1]
D. Maharani, “Comparing Decision Tree and Optimized LightGBM for Attrition Prediction”, JAIC, vol. 10, no. 3, pp. 2165–2177, Jun. 2026.
