Comparative Analysis of MobileNetV3 and EfficientNetv2B0 in BISINDO Hand Sign Recognition Using MediaPipe Landmarks

Authors

  • Alief Khairul Fadzli, Universitas Amikom Yogyakarta
  • Majid Rahardi, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.30871/jaic.v10i1.11878

Keywords:

Sign Language Recognition, EfficientNetV2B0, MobileNetV3, MediaPipe, Deep Learning

Abstract

Sign language is a vital communication medium for individuals with hearing and speech impairments. In Indonesia, more than 2.6 million people experience hearing disabilities, most of whom rely on Bahasa Isyarat Indonesia (BISINDO) for daily interaction. However, limited public understanding and the scarcity of professional interpreters continue to hinder inclusive communication. Recent advances in computer vision and deep learning have enabled camera-based sign language recognition systems that are more affordable and practical than sensor-glove solutions. This study presents a comparative analysis of EfficientNetV2-B0 and MobileNetV3-Large for recognizing BISINDO hand sign alphabets using MediaPipe landmarks. The dataset was derived from BISINDO video recordings, from which hand landmarks were extracted using MediaPipe Hands and then converted into two-dimensional skeletal images. In total, 10,309 skeletal images representing the BISINDO alphabet A–Z were generated and used for model training and evaluation. Both models were trained under identical configurations using TensorFlow. The results show that MobileNetV3-Large achieved 89.67% test accuracy and an F1-score of 89.76%, while EfficientNetV2-B0 achieved 95.98% test accuracy and an F1-score of 95.93%. These findings highlight the trade-off between the higher classification accuracy of EfficientNetV2-B0 and the superior computational efficiency of MobileNetV3-Large. This research contributes to the development of lightweight, high-performance BISINDO recognition systems for assistive communication applications.
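
The landmark-to-skeleton step described in the abstract can be illustrated with a minimal sketch. It assumes each hand is given as 21 normalized (x, y) points in the MediaPipe Hands landmark order and draws the standard hand connections onto a blank grayscale canvas. The function name `render_skeleton`, the 224-pixel default size, and the one-pixel line style are hypothetical choices made for illustration; the abstract does not state the rendering settings the authors used.

```python
# Landmark index pairs of the 21-point hand topology used by MediaPipe Hands
# (wrist = 0, thumb = 1-4, index = 5-8, middle = 9-12, ring = 13-16, pinky = 17-20).
HAND_CONNECTIONS = [
    (0, 1), (1, 2), (2, 3), (3, 4),          # thumb
    (0, 5), (5, 6), (6, 7), (7, 8),          # index finger
    (5, 9), (9, 10), (10, 11), (11, 12),     # middle finger
    (9, 13), (13, 14), (14, 15), (15, 16),   # ring finger
    (13, 17), (0, 17),                       # palm edge
    (17, 18), (18, 19), (19, 20),            # pinky
]


def render_skeleton(landmarks, size=224):
    """Rasterize 21 normalized (x, y) landmarks into a size x size image.

    Returns a list of rows; background pixels are 0 and skeleton pixels are 255.
    The image size and line style are assumptions, not values from the paper.
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")

    image = [[0] * size for _ in range(size)]

    def to_pixel(point):
        # Map normalized coordinates to pixel indices, clamped to the canvas.
        x, y = point
        col = min(size - 1, max(0, round(x * (size - 1))))
        row = min(size - 1, max(0, round(y * (size - 1))))
        return row, col

    for start, end in HAND_CONNECTIONS:
        r0, c0 = to_pixel(landmarks[start])
        r1, c1 = to_pixel(landmarks[end])
        # Sample enough points along the segment to leave no gaps.
        steps = max(abs(r1 - r0), abs(c1 - c0), 1)
        for i in range(steps + 1):
            t = i / steps
            image[round(r0 + t * (r1 - r0))][round(c0 + t * (c1 - c0))] = 255

    return image


# Synthetic landmarks (not real data): a wrist plus five straight "fingers".
demo_landmarks = [(0.5, 0.9)] + [
    (0.1 + 0.2 * finger, 0.7 - 0.15 * joint)
    for finger in range(5)
    for joint in range(4)
]
skeleton = render_skeleton(demo_landmarks, size=64)
```

In a real pipeline the (x, y) pairs would come from the hand landmarks detected in each video frame, and the resulting image would be resized or replicated across channels to match the input expected by the classifier.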

Published

2026-02-07

How to Cite

[1] A. K. Fadzli and M. Rahardi, “Comparative Analysis of MobileNetV3 and EfficientNetv2B0 in BISINDO Hand Sign Recognition Using MediaPipe Landmarks,” JAIC, vol. 10, no. 1, pp. 737–746, Feb. 2026.