Comparison of CatBoost and LightGBM Models for Air Humidity Prediction
DOI:
https://doi.org/10.30871/jaic.v9i3.9570
Keywords:
Air Humidity, CatBoost, LightGBM, Machine Learning, Weather Prediction
Abstract
This study uses historical weather data from the Badan Meteorologi, Klimatologi, dan Geofisika (BMKG), Indonesia's meteorology, climatology, and geophysics agency, to compare two gradient-boosting ensemble models, LightGBM and CatBoost, for predicting air humidity. The dataset consists of daily weather records covering temperature, humidity, rainfall, daylight duration, and wind characteristics. Preprocessing comprised handling missing values, label encoding, and normalization with MinMaxScaler, and temporal features were extracted from the date fields through feature engineering. Both models were trained on an 80/20 train/test split and tuned using GridSearchCV with three-fold cross-validation. Model performance was evaluated using R², MAE, and RMSE. CatBoost outperformed LightGBM, achieving a higher R² (0.8191 versus 0.7981) and smaller prediction errors (MAE = 0.0570, RMSE = 0.0744). Feature importance analysis identified temperature and seasonal features as important predictors, and residual plots indicated low bias and good generalization for both models. The results suggest that both models are suitable for humidity forecasting and can support strategic decision-making in salt production and other climate-sensitive businesses.
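The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names, the choice of date features, and the sample values are assumptions made for illustration, and the scaling function only mirrors the per-column behavior of MinMaxScaler rather than calling it.

```python
import math
from datetime import date


def min_max_scale(values):
    """Rescale values linearly to [0, 1], one column at a time."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
    return [(v - lo) / span if span else 0.0 for v in values]


def date_features(d):
    """Split a date into simple temporal features (an illustrative choice)."""
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_year": d.timetuple().tm_yday,
    }


def regression_metrics(y_true, y_pred):
    """Return R², MAE, and RMSE for paired observations and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "rmse": rmse}


# Made-up humidity readings in percent, not BMKG data.
humidity = [60.0, 70.0, 80.0, 90.0]
scaled = min_max_scale(humidity)

# Hypothetical model outputs on the same [0, 1] scale.
predicted = [0.1, 0.3, 0.6, 1.0]
metrics = regression_metrics(scaled, predicted)

features = date_features(date(2024, 3, 1))
```

When the target is scaled before training, MAE and RMSE are expressed on that scaled range rather than in the original units, which is worth keeping in mind when comparing error values across studies.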
License
Copyright (c) 2025 Tangkas Surya Wibawa, Novita Kurnia Ningrum, Ahmad Syahreza

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








