Comparison of Multiple Linear Regression and Random Forest Methods for Predicting National Rice Production in Indonesia
DOI:
https://doi.org/10.30871/jaic.v9i6.11398
Keywords:
Rice, Prediction, Linear Regression, BPS, Commodity
Abstract
Rice is a strategic commodity that plays an important role in maintaining national food security. However, rice production in Indonesia still fluctuates because of variations in harvested area, productivity, climate conditions, and regional characteristics. This calls for a predictive model that can provide more accurate production estimates to support food policy planning. This research aims to predict national rice production by comparing two methods, Multiple Linear Regression and Random Forest Regression, using data from the Central Bureau of Statistics (BPS) and NASA POWER for the period 2018–2024. The analysis stages are data preprocessing, data exploration, categorical variable transformation, splitting the data into training and testing sets, model training, and evaluation using Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²). The results show that harvested area is the most dominant factor influencing rice production, followed by productivity, year, and province. Random Forest gave the best performance, with an MAE of 40,599.94, an RMSE of 77,153.07, and an R² of 0.9991. The low error values and the closeness of the predictions to the actual data indicate that this model captures non-linear patterns and inter-regional variation better than Multiple Linear Regression. Overall, Random Forest can be an effective method for predicting national rice production, and it can be developed further in subsequent research by incorporating climate variables or other external factors.
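The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function names and the sample values below are assumptions made for illustration only; they are not the authors' code or data, and the numbers do not come from the study.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average absolute difference between actual and predicted."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r2(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical production values, NOT taken from the study.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(f"MAE  = {mae(actual, predicted):.2f}")
print(f"RMSE = {rmse(actual, predicted):.2f}")
print(f"R2   = {r2(actual, predicted):.4f}")
```

MAE and RMSE are expressed in the same unit as the target variable, so they are read relative to the scale of the production figures, while R² is unitless. RMSE penalizes large individual errors more heavily than MAE, which is why the reported RMSE is larger than the reported MAE.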
License
Copyright (c) 2025 Sefrico Aji Nur Cahyo, MY Teguh Sulistyono

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (Attribution-ShareAlike 4.0 International, CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








