Exploration of Machine Learning Algorithms and Class Imbalance Handling on Plant Disease Detection
DOI:
https://doi.org/10.30871/jaic.v9i5.10338
Keywords:
Plant Disease Detection, Machine Learning, Under-Sampling, Pre-Trained ResNet50, CNN
Abstract
Plant leaf diseases pose a significant threat to agricultural productivity, necessitating accurate and efficient identification systems for timely intervention. This study proposes an approach that leverages deep feature extraction using a pretrained ResNet50 model combined with traditional machine learning algorithms to recognize 38 types of plant leaf diseases. Each image was transformed into a 2048-dimensional feature vector, followed by normalization and dimensionality reduction using Principal Component Analysis (PCA). To mitigate the issue of class imbalance in the dataset, random under-sampling was applied at the feature level to ensure equal representation across all classes. Eleven machine learning models were trained and evaluated using 5-fold cross-validation, with performance assessed through accuracy, precision, recall, F1-score, and ROC AUC score. Among the evaluated models, the Support Vector Machine (SVM) achieved the highest accuracy of 99.63%, followed by Logistic Regression at 97.33%, and LightGBM at 96.25%. These models demonstrated strong generalization capabilities in multiclass settings, while simpler classifiers like AdaBoost and Decision Tree yielded lower performance. A comparative analysis of training and test accuracy further highlighted model robustness and overfitting tendencies. The findings emphasize the potential of combining pretrained convolutional neural networks for feature extraction with conventional classifiers to address complex agricultural classification tasks. Future work may explore the inclusion of healthy leaf samples, alternative CNN architectures, and deployment in real-time diagnostic tools to support precision farming and improve crop health monitoring.
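The feature-level random under-sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `random_undersample`, the toy class names, and the 4-dimensional toy vectors (standing in for the 2048-dimensional ResNet50 features) are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import random
from collections import Counter, defaultdict


def random_undersample(features, labels, seed=0):
    """Randomly drop samples so every class keeps as many as the rarest class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Every class is cut down to the size of the smallest one.
    target = min(len(indices) for indices in by_class.values())

    kept = []
    for label in sorted(by_class):
        # Sample without replacement, so no feature vector is duplicated.
        kept.extend(rng.sample(by_class[label], target))
    rng.shuffle(kept)

    return [features[i] for i in kept], [labels[i] for i in kept]


# Toy stand-in for extracted features: 4-dimensional vectors instead of 2048,
# and three classes instead of 38, with deliberately unequal class sizes.
toy_rng = random.Random(42)
toy_counts = {"class_a": 50, "class_b": 20, "class_c": 8}
toy_labels = [name for name, n in toy_counts.items() for _ in range(n)]
toy_features = [[toy_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in toy_labels]

balanced_features, balanced_labels = random_undersample(toy_features, toy_labels)
print(Counter(balanced_labels))  # every class is reduced to 8 samples
```

Because the sampling operates on already extracted vectors, no image is re-processed; only rows of the feature matrix are discarded. The abstract does not say whether balancing was done before or inside the cross-validation folds. A common precaution is to under-sample only the training portion of each fold, so that evaluation folds keep their original class distribution.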
License
Copyright (c) 2025 Ervin Aditya, Ajie Kusuma Wardhana

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.