Benchmarking YOLOv12 Variants for Indonesian Traditional Cuisine Detection
DOI:
https://doi.org/10.30871/jaic.v10i3.12625
Keywords:
YOLOv12, Object Detection, Indonesian Cuisine, Food Detection, Benchmark Analysis
Abstract
YOLOv12 is among the most recent releases in the YOLO family, and several studies report that it outperforms earlier versions. It comes in five variants of increasing architectural complexity: nano, small, medium, large, and extra-large. This study benchmarks the YOLOv12 variants (n, s, m, l, x) on the detection of traditional Indonesian cuisine using a domain-specific object detection dataset. The dataset contains 718 images with 720 annotated bounding-box instances across 20 culinary classes, split into 418/150/150 images for training/validation/testing. Preprocessing was performed in Roboflow with automatic orientation correction and stretch resizing to 640×640, and the training split was augmented with horizontal and vertical flips to increase sample diversity. All variants were trained in the same environment with the same configuration: 50 epochs in the Ultralytics framework with default hyperparameters on an NVIDIA A100-SXM4 80GB GPU. On the validation set, all variants achieved high detection accuracy (mAP@0.5 = 0.985–0.991), while differences emerged under the more stringent localization criterion (mAP@0.5:0.95). YOLOv12-L achieved the best overall localization performance (mAP@0.5:0.95 = 0.874), while YOLOv12-N provided the fastest inference (0.8 ms/image) with competitive accuracy (mAP@0.5:0.95 = 0.822). These findings offer preliminary guidance for selecting a YOLOv12 variant based on the trade-off between accuracy and speed.
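To illustrate why the variants can be nearly indistinguishable at mAP@0.5 yet separate at mAP@0.5:0.95, the following minimal Python sketch computes the intersection over union (IoU) of one predicted box against one ground-truth box and lists which of the ten COCO-style IoU thresholds (0.50 to 0.95 in steps of 0.05) the prediction would satisfy. The box coordinates and the helper names `iou` and `thresholds_passed` are hypothetical and chosen for illustration only; they are not taken from the study or from the Ultralytics API, and the sketch omits confidence ranking and precision-recall integration, which a full mAP computation requires.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixels. Hypothetical convention.
Box = Tuple[float, float, float, float]

# COCO-style IoU thresholds used by mAP@0.5:0.95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS: List[float] = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def thresholds_passed(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which the prediction counts as a correct localization."""
    score = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Illustrative boxes on a 640x640 image: the prediction is shifted by 20 px.
ground_truth: Box = (100.0, 100.0, 300.0, 300.0)
prediction: Box = (120.0, 120.0, 320.0, 320.0)

if __name__ == "__main__":
    print(f"IoU = {iou(prediction, ground_truth):.3f}")
    print("Counts as a match at thresholds:", thresholds_passed(prediction, ground_truth))
```

In this illustrative case the prediction is a match at IoU 0.5 but fails the stricter thresholds from 0.70 upward, so it contributes fully to mAP@0.5 and only partially to mAP@0.5:0.95. Small localization differences between model variants therefore show up mainly in the averaged metric.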
License
Copyright (c) 2026 Fauzan Firdaus, Lidya Ningsih, Aminah Indahsari Marsuki, Angel Metanosa Afinda

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.