Evaluating Image Recognition Accuracy in Explicit Content Detection: A Comparative Study with Indonesian Perceptions

Authors

  • Rauhil Fahmi, Universitas Negeri Jakarta
  • Deni Utama, Universitas Negeri Jakarta
  • Muhammad Ridho Kurniawan Pratama, Universitas Negeri Jakarta

DOI:

https://doi.org/10.30871/jaic.v10i1.11934

Keywords:

Image Recognition, Explicit Content, Google Vision SafeSearch, Comparative Analysis, Indonesian Perceptions

Abstract

This study evaluates image recognition accuracy in explicit content detection, using the Indonesian social context as a comparative reference. Google Vision SafeSearch is employed as a representative automated image recognition system widely used in online content moderation. Although such systems detect adult, violent, or racy content efficiently, challenges arise when their outputs must align with more conservative cultural and religious norms, such as those in Indonesia. A quantitative descriptive-comparative method was applied: six representative images covering the SafeSearch explicit-content categories (adult, racy, violence, medical, and spoof) were tested, and the automated detections were compared with Indonesian respondents’ perceptions collected through a Likert-scale questionnaire. Statistical analysis shows a significant difference between the system’s explicit content classifications and human perceptions, with respondents consistently rating explicitness higher than the Google Vision API does. Despite this difference, a strong Spearman rank correlation indicates that Google Vision SafeSearch ranks explicit content levels consistently, although it remains limited in capturing emotional intensity and cultural sensitivity. These findings highlight how Indonesian social and cultural norms shape the perception of explicit imagery, underscoring the need for image recognition systems that incorporate local contextual factors.
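
For orientation, the sketch below shows how such a comparison could be assembled in Python, assuming the google-cloud-vision and scipy packages and valid API credentials: each image’s SafeSearch likelihood is mapped onto the same 1–5 scale as the Likert ratings, and the paired scores are then compared with a Wilcoxon signed-rank test and a Spearman rank correlation, mirroring the tests cited in [22] and [24]. The file names and respondent means are hypothetical placeholders, not the study’s data.

```python
# Minimal sketch of the comparison pipeline described in the abstract.
# Image file names and respondent means below are hypothetical placeholders.
from google.cloud import vision
from scipy.stats import spearmanr, wilcoxon

client = vision.ImageAnnotatorClient()

def safesearch_level(path: str) -> int:
    """Return the SafeSearch 'adult' likelihood for one image on a 1-5 scale."""
    with open(path, "rb") as f:
        image = vision.Image(content=f.read())
    annotation = client.safe_search_detection(image=image).safe_search_annotation
    # Likelihood enum: VERY_UNLIKELY=1 ... VERY_LIKELY=5 (0 means UNKNOWN);
    # the racy, violence, medical, and spoof fields can be scored the same way.
    return int(annotation.adult)

# Hypothetical inputs: six test images and the respondents' mean Likert
# explicitness ratings (1-5) for the same images, in the same order.
image_paths = [f"img_{i}.jpg" for i in range(1, 7)]
api_scores = [safesearch_level(p) for p in image_paths]
human_scores = [4.2, 2.1, 4.8, 1.5, 3.9, 2.7]  # placeholder means

# Wilcoxon signed-rank test: do the paired score levels differ systematically?
w_stat, w_p = wilcoxon(api_scores, human_scores)
# Spearman rank correlation: do system and respondents rank the images alike?
rho, rho_p = spearmanr(api_scores, human_scores)
print(f"Wilcoxon W={w_stat} (p={w_p:.3f}), Spearman rho={rho:.2f} (p={rho_p:.3f})")
```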

References

[1] R. Gorwa, R. Binns, and C. Katzenbach, “Algorithmic content moderation: Technical and political challenges in the automation of platform governance,” Big Data Soc, vol. 7, no. 1, Jan. 2020, doi: 10.1177/2053951719897945.

[2] M. Ruckenstein and L. L. M. Turunen, “Re-humanizing the platform: Content moderators and the logic of care,” New Media Soc, vol. 22, no. 6, pp. 1026–1042, Jun. 2020, doi: 10.1177/1461444819875990.

[3] N. Sambasivan et al., “‘They don’t leave us alone anywhere we go’: Gender and digital abuse in South Asia,” in Conference on Human Factors in Computing Systems - Proceedings, Association for Computing Machinery, May 2019. doi: 10.1145/3290605.3300232.

[4] N. Sambasivan, E. Arnesen, B. Hutchinson, T. Doshi, and V. Prabhakaran, “Re-imagining algorithmic fairness in India and beyond,” in FAccT 2021 - Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Association for Computing Machinery, Inc, Mar. 2021, pp. 315–328. doi: 10.1145/3442188.3445896.

[5] J. Deng, J. Guo, N. Xue, and S. Zafeiriou, “ArcFace: Additive angular margin loss for deep face recognition,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Jun. 2019, pp. 4690–4699.

[6] M. D. Zakir Hossain, F. Sohel, M. F. Shiratuddin, and H. Laga, “A comprehensive survey of deep learning for image captioning,” ACM Computing Surveys, vol. 51, no. 6, Feb. 2019, doi: 10.1145/3295748.

[7] “Detect explicit content (SafeSearch),” Google Cloud Vision API Documentation. [Online]. Available: https://cloud.google.com/vision/docs/detecting-safe-search

[8] United Nations, “The 17 Goals – Sustainable Development.” [Online]. Available: https://sdgs.un.org/goals

[9] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Commun ACM, vol. 60, no. 6, pp. 84–90, Jun. 2017, doi: 10.1145/3065386.

[10] J. S. Lee, Y. M. Kuo, P. C. Chung, and E. L. Chen, “Naked image detection based on adaptive and extensible skin color model,” Pattern Recognit, vol. 40, no. 8, pp. 2261–2270, Aug. 2007, doi: 10.1016/j.patcog.2006.11.016.

[11] K. Yousaf and T. Nawaz, “A Deep Learning-Based Approach for Inappropriate Content Detection and Classification of YouTube Videos,” IEEE Access, vol. 10, pp. 16283–16298, 2022, doi: 10.1109/ACCESS.2022.3147519.

[12] F. Shahid, M. Elswah, and A. Vashistha, “Think Outside the Data: Colonial Biases and Systemic Issues in Automated Moderation Pipelines for Low-Resource Languages,” Aug. 2025, [Online]. Available: http://arxiv.org/abs/2501.13836

[13] E. Mohammadi, Y. Cai, A. Novin, V. Vera, and E. Soltanmohammadi, “Who is a scientist? Gender and racial biases in google vision AI,” AI and Ethics, vol. 5, no. 5, pp. 4993–5010, Oct. 2025, doi: 10.1007/s43681-025-00742-4.

[14] “AI in Automated Content Moderation on Social Media,” International Journal of Artificial Intelligence and Machine Learning in Engineering.

[15] A. Alhakim, “Criminal Control for the Distribution of Pornographic Content on the Internet: An Indonesian Experience,” 2021, [Online]. Available: https://ejournal.undiksha.ac.id/index.php/jkh

[16] N. Meilani, S. S. Hariadi, and F. T. Haryadi, “Social media and pornography access behavior among adolescents,” Int J Publ Health Sci, vol. 12, no. 2, pp. 536–544, Jun. 2023, doi: 10.11591/ijphs.v12i2.22513.

[17] M. P. F. Purwaningtyas and C. K. A. Wibowo, “Negotiating Sexuality: Indonesian Female Audience towards Pornographic Media Content,” IKAT: The Indonesian Journal of Southeast Asian Studies, vol. 5, no. 2, Apr. 2022, doi: 10.22146/ikat.v5i2.70077.

[18] H. Bao et al., “VModA: An Effective Framework for Adaptive NSFW Image Moderation,” May 2025, [Online]. Available: http://arxiv.org/abs/2505.23386

[19] J. W. Creswell, Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, 4th ed. Thousand Oaks, CA, USA: SAGE, 2014.

[20] U. Sekaran, Research Methods for Business: A Skill-Building Approach, 7th ed. Hoboken, NJ, USA: Wiley, 2016.

[21] A. Field, Discovering Statistics Using IBM SPSS Statistics, 5th ed. London, UK: SAGE, 2018.

[22] F. Wilcoxon, “Individual Comparisons by Ranking Methods,” Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, Dec. 1945. [Online]. Available: https://www.jstor.org/stable/3001968

[23] A. Joshi, S. Kale, S. Chandel, and D. Pal, “Likert Scale: Explored and Explained,” Br J Appl Sci Technol, vol. 7, no. 4, pp. 396–403, Jan. 2015, doi: 10.9734/bjast/2015/14975.

[24] C. Spearman, “The Proof and Measurement of Association between Two Things,” The American Journal of Psychology, vol. 100, no. 3/4, Autumn–Winter 1987 (reprint of the 1904 original).

[25] W. J. Potter, “The state of media literacy,” J Broadcast Electron Media, vol. 54, no. 4, pp. 675–696, Oct. 2010, doi: 10.1080/08838151.2011.521462.

[26] H. Hosseini, B. Xiao, M. Jaiswal, and R. Poovendran, “On the Limitation of Convolutional Neural Networks in Recognizing Negative Images,” Aug. 2017, [Online]. Available: http://arxiv.org/abs/1703.06857

[27] A. Apte et al., “Countering Inconsistent Labelling by Google’s Vision API for Rotated Images.”

[28] H. Hosseini, B. Xiao, and R. Poovendran, “Google’s Cloud Vision API Is Not Robust To Noise,” Jul. 2017, [Online]. Available: http://arxiv.org/abs/1704.05051

[29] P. Ricaurte, “Data Epistemologies, The Coloniality of Power, and Resistance,” Television and New Media, vol. 20, no. 4, pp. 350–365, 2019, doi: 10.1177/1527476419831640.

[30] A. Adadi and M. Berrada, “Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI),” IEEE Access, vol. 6, pp. 52138–52160, Sep. 2018, doi: 10.1109/ACCESS.2018.2870052.

[31] B. Krawczyk, “Learning from imbalanced data: open challenges and future directions,” Progress in Artificial Intelligence, vol. 5, no. 4, pp. 221–232, Nov. 2016, doi: 10.1007/s13748-016-0094-0.

[32] K. Holstein, J. W. Vaughan, H. Daumé, M. Dudík, and H. Wallach, “Improving fairness in machine learning systems: What do industry practitioners need?,” in Conference on Human Factors in Computing Systems - Proceedings, Association for Computing Machinery, May 2019. doi: 10.1145/3290605.3300830.

Published

2026-02-04

How to Cite

[1]
R. Fahmi, D. Utama, and M. R. K. Pratama, “Evaluating Image Recognition Accuracy in Explicit Content Detection: A Comparative Study with Indonesian Perceptions”, JAIC, vol. 10, no. 1, pp. 336–347, Feb. 2026.
