A Probabilistic Ensemble-Based Decision Support Framework for Teacher Promotion Assessment

Authors

  • Ray Eka Novasani, Universitas Dian Nuswantoro
  • MY Teguh Sulistyono, Universitas Dian Nuswantoro

DOI:

https://doi.org/10.30871/jaic.v10i2.12356

Keywords:

Brier Score, Ensemble Learning, Probabilistic Prediction, Teacher Promotion, Machine Learning

Abstract

This study proposes a probabilistic ensemble-based decision support framework for analyzing teacher promotion eligibility within the institutional Credit Point Assessment system. The dataset consists of 20 finalized teacher promotion records collected retrospectively from the institutional personnel administration unit, covering the 2022–2024 assessment period. All personal identifiers were removed prior to analysis to ensure ethical compliance and data confidentiality. Data preprocessing included One-Hot Encoding of categorical variables and Min–Max normalization of numerical features. The dataset was divided using stratified sampling to preserve class distribution, and preprocessing parameters were estimated exclusively from the training data to prevent data leakage. Probabilistic predictions were generated using Random Forest and Extreme Gradient Boosting (XGBoost) and combined through a soft voting ensemble strategy to enhance robustness. Model performance was evaluated using confusion-matrix-based metrics, ROC-AUC, and the Brier Score as a measure of probability calibration. Among the evaluated models, XGBoost achieved the lowest Brier Score (0.2034), indicating the best probability calibration among them, while the ensemble model demonstrated more stable classification behavior. Feature importance analysis identified cumulative credit points and professional development activities as the dominant predictors, whereas demographic attributes showed minimal influence. Rather than serving as an automated decision-making mechanism, the proposed framework functions as a decision-support tool that provides interpretable probability estimates of promotion eligibility. Given the limited sample size and institutional data constraints, the findings are intended to support analytical interpretation within a specific organizational context rather than broad predictive generalization.
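
As a rough illustration of two ideas named in the abstract, the sketch below averages the positive-class probabilities of two models (soft voting) and scores the result with the Brier Score, the mean squared difference between predicted probabilities and observed 0/1 outcomes. The function names and all numbers are hypothetical; they are not taken from the study and do not reproduce its results or its implementation.

```python
from statistics import fmean


def soft_vote(*model_probs):
    """Average positive-class probabilities across models, sample by sample."""
    if not model_probs:
        raise ValueError("at least one model is required")
    n_samples = len(model_probs[0])
    if any(len(probs) != n_samples for probs in model_probs):
        raise ValueError("all models must score the same samples")
    return [fmean(per_sample) for per_sample in zip(*model_probs)]


def brier_score(probs, labels):
    """Mean squared difference between probabilities and 0/1 outcomes (lower is better)."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    return fmean((p - y) ** 2 for p, y in zip(probs, labels))


# Hypothetical values for illustration only (assumption, not study data).
labels = [1, 0, 1, 0]              # 1 = eligible, 0 = not eligible
rf_probs = [0.8, 0.4, 0.6, 0.2]    # stand-in for Random Forest outputs
xgb_probs = [0.6, 0.2, 0.8, 0.4]   # stand-in for XGBoost outputs

ensemble_probs = soft_vote(rf_probs, xgb_probs)
# A 0.5 cut-off is assumed here to turn probabilities into class labels.
ensemble_labels = [int(p >= 0.5) for p in ensemble_probs]

for name, probs in [("rf", rf_probs), ("xgb", xgb_probs), ("ensemble", ensemble_probs)]:
    print(f"{name}: Brier = {brier_score(probs, labels):.4f}")
```

A Brier Score of 0 corresponds to perfectly confident and correct probabilities, while a constant prediction of 0.5 yields 0.25 on any binary dataset, which gives a reference point for reading reported values.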

Published

2026-04-16

How to Cite

R. E. Novasani and M. T. Sulistyono, “A Probabilistic Ensemble-Based Decision Support Framework for Teacher Promotion Assessment,” JAIC, vol. 10, no. 2, pp. 1369–1382, Apr. 2026.
