Comparative Analysis of Logistic Regression, Random Forest, and SVM for Asthma Risk Prediction Using Demographic, Clinical, and Environmental Features

Barnabas Belieffain Fertility Daeli; Ucta Pradema Sanjaya

doi:10.30871/jaic.v9i5.10824

Authors

Barnabas Belieffain Fertility Daeli, Universitas Ngudi Waluyo
Ucta Pradema Sanjaya, Universitas Ngudi Waluyo

DOI:

https://doi.org/10.30871/jaic.v9i5.10824

Keywords:

Asthma Risk Stratification, Clinical Decision Support Systems, Non-Atopic Phenotype, Random Forest Classification, Recall-Precision Tradeoff

Abstract

Asthma prediction requires models that can capture multifactorial interactions among demographic, clinical, and environmental determinants. This study compares Random Forest (RF) with Logistic Regression (LR) and Support Vector Machines (SVM) on a 10,000-patient cohort and identifies RF as the best-performing model. RF achieved 99.55% accuracy, 100% precision, and 98.19% recall with high cross-validation stability (σ = 0.0019), surpassing SVM by 6.86 percentage points in recall, which corresponds to 167 fewer missed diagnoses per 10,000 cases. Hereditary factors dominated feature importance (Gini = 0.20), generating 18.7% greater node purity reduction than BMI, while the paradoxical "No Allergies" signal (3.726) revealed non-atopic phenotypes. Linear correlations were sparse (94% with |r| < 0.02), in contrast to RF’s capture of nonlinear threshold effects, such as sedentarism (2.243) outweighing smoking. Clinical implementation requires: (1) threshold calibration (θ = 0.3) to achieve >99% recall, (2) monthly false-negative audits to mitigate the 24.33% prevalence skew, and (3) dimensionality reduction eliminating 3.256 features. RF’s capacity to resolve hereditary-environmental interactions establishes a new paradigm for asthma risk stratification.
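The threshold calibration and missed-diagnosis figures in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the labels and probabilities below are hypothetical toy values, the helper names are invented for illustration, and the reading that the 167 missed diagnoses follow from cohort size × prevalence × recall difference (10,000 × 0.2433 × 0.0686) is an assumption that happens to reproduce the reported number.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Count (tp, fp, fn, tn), predicting positive when prob >= threshold."""
    tp = fp = fn = tn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(y_true, y_prob, threshold):
    """Precision and recall at a given decision threshold."""
    tp, fp, fn, _ = confusion_counts(y_true, y_prob, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def missed_cases_avoided(cohort_size, prevalence, recall_gain):
    """Expected extra true positives from a recall gain (assumed derivation)."""
    return cohort_size * prevalence * recall_gain


# Hypothetical toy data: 1 = asthma, 0 = no asthma; probabilities are made up.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.35, 0.45, 0.2, 0.1, 0.05]

# Default threshold 0.5 versus the lowered threshold 0.3 from the abstract.
p_default, r_default = precision_recall(y_true, y_prob, 0.5)
p_low, r_low = precision_recall(y_true, y_prob, 0.3)

# Figures from the abstract: 10,000 patients, 24.33% prevalence, 6.86-point gain.
avoided = missed_cases_avoided(10_000, 0.2433, 0.0686)

print(f"threshold 0.5: precision={p_default:.2f}, recall={r_default:.2f}")
print(f"threshold 0.3: precision={p_low:.2f}, recall={r_low:.2f}")
print(f"missed diagnoses avoided per 10,000: {avoided:.1f}")
```

On this toy data, lowering the threshold raises recall at the cost of some precision, which is the recall-precision tradeoff named in the keywords; the actual effect at θ = 0.3 depends on the model's probability outputs.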
