Comparison of CatBoost and LightGBM Models for Air Humidity Prediction

Authors

  • Tangkas Surya Wibawa, Informatics Engineering, Dian Nuswantoro University
  • Novita Kurnia Ningrum, Informatics Engineering, Dian Nuswantoro University
  • Ahmad Syahreza, Informatics Engineering, Dian Nuswantoro University

DOI:

https://doi.org/10.30871/jaic.v9i3.9570

Keywords:

Air Humidity, CatBoost, LightGBM, Machine Learning, Weather Prediction

Abstract

This study uses historical weather data from the Badan Meteorologi, Klimatologi, dan Geofisika (BMKG) to compare the performance of two gradient-boosting ensemble models, LightGBM and CatBoost, in predicting air humidity. The dataset consists of daily weather records covering temperature, humidity, rainfall, daylight duration, and wind characteristics. Preprocessing included handling missing values, label encoding, and normalization with MinMaxScaler. Temporal features were derived from the date fields through feature engineering. The data were divided with an 80/20 split, and both models were tuned using GridSearchCV with three-fold cross-validation. Model performance was evaluated using R², MAE, and RMSE. CatBoost outperformed LightGBM, achieving a higher R² score (0.8191, compared with 0.7981 for LightGBM) and smaller prediction errors (MAE = 0.0570, RMSE = 0.0744). Feature importance analysis indicated that temperature and seasonal features were important predictors, while residual plots supported the models' low bias and good generalization. The results indicate that both models are suitable for humidity forecasting and can support strategic decision-making in salt production and other climate-sensitive businesses.
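The normalization, date-feature, and evaluation steps named in the abstract can be illustrated with a minimal standard-library sketch. This is not the authors' code: the helper names (min_max_scale, date_features, mae, rmse, r2_score), the choice of calendar features, and the toy numbers are all assumptions made for illustration, and the values are unrelated to the BMKG data or the reported scores.

```python
import math
from datetime import date


def min_max_scale(values):
    """Rescale one column to [0, 1], the default behaviour of a min-max scaler."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def date_features(d):
    """Derive simple calendar features from a date (hypothetical feature set)."""
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_year": d.timetuple().tm_yday,
    }


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy humidity readings in percent (made up, not from the study).
humidity = [60.0, 70.0, 80.0, 90.0, 100.0]
y_true = min_max_scale(humidity)

# Made-up predictions on the same normalized scale.
y_pred = [0.1, 0.25, 0.4, 0.75, 1.0]

features = date_features(date(2024, 3, 1))

print("date features:", features)
print("MAE :", round(mae(y_true, y_pred), 4))
print("RMSE:", round(rmse(y_true, y_pred), 4))
print("R2  :", round(r2_score(y_true, y_pred), 4))
```

Because the target is normalized to the [0, 1] range before evaluation, MAE and RMSE values such as those in the abstract are presumably expressed on that scale rather than in percentage points of humidity. In practice the same quantities are available from scikit-learn (MinMaxScaler, mean_absolute_error, mean_squared_error, r2_score).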

Published

2025-06-16

How to Cite

[1]
Tangkas Surya Wibawa, N. K. Ningrum, and Ahmad Syahreza, “Comparison of CatBoost and LightGBM Models for Air Humidity Prediction”, JAIC, vol. 9, no. 3, pp. 803–809, Jun. 2025.

Section

Articles
