SemetonBug: Next-Generation Machine Learning-Powered Code Analyzer for Precision Bug Detection and Dynamic Error Localization
DOI:
https://doi.org/10.30871/jaic.v10i1.11837
Keywords:
Bug Detection, Machine Learning, Python, Random Forest, Abstract Syntax Tree
Abstract
Bug detection in Python programming is a crucial challenge in software development. This research proposes SemetonBug, a machine learning-based system for automatically detecting bugs in Python code. The system utilizes a Random Forest Classifier as the main model, with features extracted from the syntactic structure of the code using an Abstract Syntax Tree (AST). The dataset consists of 200 Python files, divided into 100 files with bugs and 100 files without bugs. The model is optimized using Grid Search Cross Validation, with the best combination of n_estimators = 300, max_depth = 20, min_samples_split = 5, and min_samples_leaf = 2. Evaluation results show that the model achieves 85% accuracy, 0.84 precision, 0.87 recall, and 0.86 F1-score. The detected bugs are stored in an Excel file for further analysis. By leveraging machine learning, SemetonBug enhances efficiency and accuracy in bug identification compared to traditional rule-based methods. These findings highlight the potential of machine learning models in improving software quality and reducing coding errors automatically.
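The abstract does not list the exact features taken from the Abstract Syntax Tree, so the following is only a minimal sketch of how such features might be extracted with Python's standard `ast` module. The chosen node types, the depth feature, and the names `NODE_TYPES`, `extract_features`, and `tree_depth` are illustrative assumptions, not the feature set used by SemetonBug.

```python
import ast
from collections import Counter
from typing import List

# Hypothetical feature set: AST node types counted per file.
# This list is an assumption for illustration, not the paper's feature list.
NODE_TYPES = ("FunctionDef", "If", "For", "While", "Try", "Call", "Assign", "Return")


def tree_depth(node: ast.AST) -> int:
    """Return the depth of the subtree rooted at `node` (a leaf has depth 1)."""
    children = list(ast.iter_child_nodes(node))
    return 1 + max((tree_depth(child) for child in children), default=0)


def extract_features(source: str) -> List[int]:
    """Turn Python source text into a fixed-length vector:
    one count per entry in NODE_TYPES, followed by the tree depth."""
    tree = ast.parse(source)  # raises SyntaxError if the source cannot be parsed
    counts = Counter(type(node).__name__ for node in ast.walk(tree))
    return [counts[name] for name in NODE_TYPES] + [tree_depth(tree)]


sample = """
def mean(values):
    total = 0
    for v in values:
        total = total + v
    return total / len(values)
"""

features = extract_features(sample)
print(dict(zip(NODE_TYPES + ("depth",), features)))
```

Vectors of this kind, one per file, could then be passed to scikit-learn's `RandomForestClassifier`, with `GridSearchCV` searching over `n_estimators`, `max_depth`, `min_samples_split`, and `min_samples_leaf` as the abstract describes. Files that fail to parse raise `SyntaxError` here; how the original system treats such files is not stated in the abstract.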
License
Copyright (c) 2026 Surni Erniwati, Bahtiar Imran, Zumratul Muahidin, Zaeniah Zaeniah, Juhartini Juhartini

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








