Flood Status Prediction Based on Water Level Data Using Machine Learning Models

Authors

  • Aisyah Putri Widyastuti, Universitas Dian Nuswantoro
  • Sindhu Rakasiwi, Universitas Dian Nuswantoro

DOI:

https://doi.org/10.30871/jaic.v10i3.12809

Keywords:

ADASYN, Flood, Hyperparameter Tuning, Machine Learning, Prediction

Abstract

Flooding is among the most frequent hydrometeorological disasters in Indonesia and causes a range of social and economic losses. This study compares five machine learning algorithms, namely Random Forest, Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Logistic Regression, together with one deep learning model, Long Short-Term Memory (LSTM), for predicting flood status from water level data recorded at seven observation posts in and around DKI Jakarta. The research stages comprise data preprocessing, handling of class imbalance with ADASYN, hyperparameter tuning, and evaluation using accuracy, precision, recall, and F1-score. To avoid data leakage, the data are split before preprocessing and oversampling are applied. After hyperparameter tuning, XGBoost performs best, with 96.0% accuracy, 95.5% precision, 96.9% recall, and a 96.2% F1-score. The LSTM model is also competitive, with 94.5% accuracy and a 94.5% F1-score. Learning curve analysis shows normal learning patterns for all models, with no indication of data leakage. These results indicate that XGBoost and LSTM have good potential for use in flood early warning systems based on water level data.
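The two methodological points in the abstract, splitting before any fitted preprocessing and scoring with accuracy, precision, recall, and F1-score, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the ordered 80/20 split, min-max scaling as the preprocessing step, the toy water levels, and a binary flood/no-flood label. If the study uses several flood status classes with averaged metrics, the harmonic-mean relation below holds per class only.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def split_ordered(rows, test_fraction=0.2):
    """Split first, so nothing fitted later can see the held-out rows."""
    n_test = int(len(rows) * test_fraction)
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]


def fit_min_max(train_values):
    """Learn scaling statistics from the training portion only."""
    return min(train_values), max(train_values)


def scale(values, lo, hi):
    """Apply already-fitted statistics; test values may fall outside [0, 1]."""
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values]


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return Metrics(accuracy, precision, recall, f1_from(precision, recall))


# Hypothetical water levels in cm; the last two rows are held out.
levels = [10, 20, 30, 40, 50, 60, 70, 80, 90, 200]
train, test = split_ordered(levels)
lo, hi = fit_min_max(train)          # statistics come from train only
scaled_train = scale(train, lo, hi)
scaled_test = scale(test, lo, hi)    # reuse train statistics, never refit
# Oversampling (ADASYN in the study) would likewise touch the training
# portion only, after this split; it is not implemented in this sketch.

# Toy labels and predictions, not the paper's data.
toy = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])

# Consistency check on the reported XGBoost precision and recall.
reported_f1 = f1_from(0.955, 0.969)
print(f"{reported_f1:.3f}")  # 0.962
```

Under the binary-label assumption, the reported 95.5% precision and 96.9% recall give an F1-score of about 96.2%, which agrees with the figure stated in the abstract. The scaled test values exceeding 1 are expected in this toy case: they show that the held-out rows did not influence the fitted minimum and maximum.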

Published

2026-06-18

How to Cite

A. Putri Widyastuti and S. Rakasiwi, “Flood Status Prediction Based on Water Level Data Using Machine Learning Models”, JAIC, vol. 10, no. 3, pp. 3046–3059, Jun. 2026.
