SemetonBug: Next-Generation Machine Learning-Powered Code Analyzer for Precision Bug Detection and Dynamic Error Localization

Authors

  • Surni Erniwati, Universitas Teknologi Mataram
  • Bahtiar Imran, Universitas Teknologi Mataram
  • Zumratul Muahidin, Universitas Teknologi Mataram
  • Zaeniah Zaeniah, Universitas Teknologi Mataram
  • Juhartini Juhartini, Universitas Teknologi Mataram

DOI:

https://doi.org/10.30871/jaic.v10i1.11837

Keywords:

Bug Detection, Machine Learning, Python, Random Forest, Abstract Syntax Tree

Abstract

Detecting bugs in Python code is a persistent challenge in software development. This research proposes SemetonBug, a machine learning-based system that automatically detects bugs in Python code. The system uses a Random Forest classifier as its main model, with features extracted from the code's syntactic structure via an Abstract Syntax Tree (AST). The dataset consists of 200 Python files: 100 with bugs and 100 without. The model is tuned using grid search with cross-validation; the best hyperparameter combination is n_estimators = 300, max_depth = 20, min_samples_split = 5, and min_samples_leaf = 2. Evaluation results show that the model achieves 85% accuracy, 0.84 precision, 0.87 recall, and a 0.86 F1-score. Detected bugs are stored in an Excel file for further analysis. By leveraging machine learning, SemetonBug improves efficiency and accuracy in bug identification compared with traditional rule-based methods. These findings highlight the potential of machine learning models to improve software quality and reduce coding errors automatically.
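
The pipeline described in the abstract (AST-derived features fed to a Random Forest tuned by grid search) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not list the AST features used, so the node types in NODE_TYPES, the parse-failure flag, the function names, the fold count, the scoring metric, and every grid candidate other than the reported best values are assumptions made for illustration. The model-building function assumes scikit-learn is installed.

```python
import ast
from collections import Counter

# Hypothetical feature set: the abstract does not say which AST features are used.
NODE_TYPES = ("FunctionDef", "If", "For", "While", "Try", "Call", "Compare", "Return")


def ast_features(source: str) -> list:
    """Map Python source to a fixed-length vector of AST node counts.

    The final element is 1 when the source fails to parse, otherwise 0.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return [0] * len(NODE_TYPES) + [1]
    counts = Counter(type(node).__name__ for node in ast.walk(tree))
    return [counts[name] for name in NODE_TYPES] + [0]


def build_grid_search(cv: int = 5):
    """Return an unfitted GridSearchCV over a Random Forest (needs scikit-learn)."""
    # Imported lazily so ast_features works without scikit-learn installed.
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    # The grid contains the reported best values (300, 20, 5, 2);
    # the remaining candidates are assumptions.
    param_grid = {
        "n_estimators": [100, 300],
        "max_depth": [10, 20],
        "min_samples_split": [2, 5],
        "min_samples_leaf": [1, 2],
    }
    return GridSearchCV(
        RandomForestClassifier(random_state=42), param_grid, cv=cv, scoring="f1"
    )
```

Under these assumptions, each file's source would be passed through ast_features to build a feature matrix, and the object returned by build_grid_search would be fitted on that matrix with 0/1 labels (clean/buggy); its best_params_ attribute then reports the selected hyperparameters.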

Published

2026-02-04

How to Cite

S. Erniwati, B. Imran, Z. Muahidin, Z. Zaeniah, and J. Juhartini, “SemetonBug: Next-Generation Machine Learning-Powered Code Analyzer for Precision Bug Detection and Dynamic Error Localization”, JAIC, vol. 10, no. 1, pp. 224–231, Feb. 2026.
