Stacking of DT, RF, and Gradient Boosting Algorithms for Classification of Building Damage Due to Earthquakes
DOI: https://doi.org/10.30871/jaic.v9i6.11272

Keywords: Building Damage, Earthquake Classification, Ensemble Stacking, ADASYN

Abstract
Classifying the level of building damage caused by earthquakes is a key element of disaster mitigation and post-disaster risk assessment. This study aims to improve classification accuracy on imbalanced data using an ensemble stacking method that combines Decision Tree, Random Forest, and Gradient Boosting algorithms, with Logistic Regression as the meta-learner. The building damage dataset from the 2015 Gorkha, Nepal earthquake underwent data cleaning, categorical encoding, normalization, and class balancing with ADASYN. Evaluation showed that Random Forest was the strongest single model, while the stacking model achieved the highest accuracy of 91.77% after balancing. These results indicate that stacking improves generalization and classification accuracy on imbalanced data, suggesting strong potential for integration into disaster decision-support systems that require fast, accurate building-damage assessment.
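The stacking setup described in the abstract can be sketched with scikit-learn's `StackingClassifier`. This is a minimal illustrative sketch, not the authors' code: the synthetic imbalanced dataset and all hyperparameters are assumptions, and the ADASYN balancing step (available as `imblearn.over_sampling.ADASYN` in the imbalanced-learn package) is noted in a comment but omitted to keep the example dependency-free.

```python
# Sketch of stacking DT, RF, and Gradient Boosting with a Logistic
# Regression meta-learner, as described in the abstract (assumed
# scikit-learn API; hyperparameters are illustrative only).
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    RandomForestClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the imbalanced building-damage data:
# three damage classes with skewed class proportions.
X, y = make_classification(
    n_samples=600, n_classes=3, n_informative=6,
    weights=[0.6, 0.3, 0.1], random_state=0,
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# In the paper's pipeline, ADASYN would be applied to the training
# split here, e.g.:
#   X_tr, y_tr = ADASYN(random_state=0).fit_resample(X_tr, y_tr)

# Base learners feed out-of-fold predictions to the meta-learner.
stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_tr, y_tr)
accuracy = stack.score(X_te, y_te)
```

The `cv=5` argument makes the meta-learner train on cross-validated predictions of the base models, which is what gives stacking its generalization benefit over simply averaging the base learners.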
License
Copyright (c) 2025 Nur Aqliah Ilmi, Nurul Anisa Sri Winarsih

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).