Improving the Accuracy of Obesity Classification Using a Stacking Classifier on Imbalanced Data with SMOTE
DOI:
https://doi.org/10.30871/jaic.v10i1.11928
Keywords:
Obesity Classification, Machine Learning, Stacking Classifier, SMOTE, Tuning Hyperparameter
Abstract
Overweight remains a prevalent public health problem linked to lifestyle, eating behavior, and physical activity. This work aims to develop a robust, generalizable machine learning model that classifies obesity levels with high accuracy. The study uses the Obesity Dataset, which contains 1610 records, and applies several preprocessing steps: data cleaning, transformation of categorical attributes, a train/test split, and class-imbalance handling with SMOTE. The model combines two base learners, an optimized Random Forest and Gaussian Naïve Bayes, in a Stacking Classifier that uses Logistic Regression as the meta-model. Experimental results show that stacking performs best, reaching an accuracy of 86.34% and outperforming each individual model. The analysis also shows improvements across several classification metrics, indicating that stacking can capture both complex non-linear relationships in the data and simple linear ones. Overall, the results indicate that stacking-based ensemble learning is a strong approach for predicting obesity levels and holds promise for early risk detection in preventive health care systems.
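The abstract names SMOTE as the class-imbalance step without describing how it works. The following is a minimal sketch of the core idea only, written in plain Python: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function name `smote_oversample`, its parameters, and the toy data are hypothetical and chosen for illustration; the study itself most likely relied on a library implementation, and its exact settings are not stated in the abstract.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not the implementation used in the study.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Interpolate: gap = 0 gives the base point, gap close to 1 the neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up training data with two numeric features (assumption, not from the study).
majority = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1]]
minority = [[2.0, 2.0], [2.2, 2.1], [2.1, 2.4]]

# Generate just enough synthetic samples to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Oversampling of this kind is commonly applied to the training portion only, so that the test set keeps its original class distribution; whether the study followed that order is not stated in the abstract.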
License
Copyright (c) 2026 Sifa Sari, M.Arief Soeleman, Mamay Maida, Hestiana Putri Novitasari

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) ) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








