A Hybrid Framework Based on YOLOv8 and Vision Transformer for Multi-Class Detection and Classification of Coffee Fruit Maturity Levels

Authors

  • Ahmad Subki, Universitas Teknologi Mataram
  • M. Zulpahmi, Universitas Teknologi Mataram
  • Bahtiar Imran, Universitas Teknologi Mataram

DOI:

https://doi.org/10.30871/jaic.v9i5.10590

Keywords:

YOLOv8, Computer Vision, Object Detection, Multi-Class Classification of Coffee Fruits

Abstract

Detecting and classifying coffee cherries by maturity level is a significant challenge in agricultural product processing, primarily because cherries of different classes within a single bunch look very similar. This study develops a multi-class detection and classification system for coffee cherries that pairs YOLOv8 with a Vision Transformer (ViT) as a classification enhancer. YOLOv8 first detects coffee cherry objects in bunch images and crops them automatically. The cropped images are then re-classified by the ViT to improve prediction accuracy. Training used a learning rate of 0.0001, a batch size of 16, and runs of 50, 100, and 150 epochs. The evaluation shows that combining YOLOv8 and ViT yields higher classification accuracy than YOLOv8 alone. At 100 epochs, the YOLOv8+ViT model achieved an accuracy of 89.52%, a precision of 90.43%, and a recall of 89.52%, whereas the standalone YOLOv8 model reached an accuracy of only 75.44%. These results indicate that the Vision Transformer improves classification performance, particularly for visually similar coffee cherry classes. Integrating the two methods offers a promising approach to image-based multi-class classification in agriculture and in other domains involving complex visual objects.
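
The two-stage flow described in the abstract (detect, crop, then re-classify each crop) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the detector and classifier are plain callables standing in for YOLOv8 and the ViT, the image is a toy grid of pixel values, and the names `Detection`, `crop`, `detect_then_reclassify`, the `min_score` threshold, and the class labels "ripe" and "unripe" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: an image is a 2-D grid of grayscale values (list of rows).
Image = List[List[int]]
# Assumption: boxes are (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    box: Box      # where the detector found an object
    label: str    # class proposed by the detector (stage 1)
    score: float  # detector confidence in [0, 1]


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by `box` out of `image`."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def detect_then_reclassify(
    image: Image,
    detector: Callable[[Image], List[Detection]],
    classifier: Callable[[Image], str],
    min_score: float = 0.25,  # hypothetical confidence threshold
) -> List[Tuple[Box, str, str]]:
    """Return (box, detector_label, reclassified_label) for each kept detection."""
    results = []
    for det in detector(image):
        if det.score < min_score:
            continue  # drop low-confidence detections before stage 2
        patch = crop(image, det.box)
        results.append((det.box, det.label, classifier(patch)))
    return results


# --- Toy stand-ins so the sketch runs without any model weights ---

def toy_detector(image: Image) -> List[Detection]:
    # Pretends to find three objects; labels every confident one "ripe".
    return [
        Detection((0, 0, 2, 2), "ripe", 0.9),
        Detection((2, 0, 4, 2), "ripe", 0.8),
        Detection((0, 2, 2, 4), "unripe", 0.1),
    ]


def toy_classifier(patch: Image) -> str:
    # Classifies by mean brightness; a real system would call the ViT here.
    total = sum(sum(row) for row in patch)
    count = sum(len(row) for row in patch)
    return "ripe" if total / count >= 128 else "unripe"


toy_image: Image = [
    [200, 200, 50, 50],
    [200, 200, 50, 50],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

for box, first_guess, final_label in detect_then_reclassify(
    toy_image, toy_detector, toy_classifier
):
    print(box, first_guess, "->", final_label)
```

In this toy run the second crop is labelled "ripe" by the stand-in detector and "unripe" by the stand-in classifier, which mirrors the role the abstract assigns to the ViT: the second stage has the final say on the class of each cropped cherry, while the first stage only needs to localize it.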

Published

2025-10-04

How to Cite

[1]
A. Subki, M. Zulpahmi, and B. Imran, “A Hybrid Framework Based on YOLOv8 and Vision Transformer for Multi-Class Detection and Classification of Coffee Fruit Maturity Levels”, JAIC, vol. 9, no. 5, pp. 2019–2028, Oct. 2025.
