Classification Analysis of Single Tuition Fees Using the Random Forest Method with K-Fold Cross Validation
DOI:
https://doi.org/10.30871/jaic.v10i1.11798
Keywords:
Random Forest, Single Tuition Fee, Classification, K-Fold Cross Validation
Abstract
Classification is the process of grouping data into categories based on their characteristics or features, and it plays a crucial role in analysis, decision-making, and the prediction of new data. In academic settings, classification is used to determine the Single Tuition Fee (Uang Kuliah Tunggal, UKT) so that students are placed according to their economic ability. Lhokseumawe State Polytechnic has implemented the UKT system since 2020 with eight categories, but some students are still placed in UKT groups that do not match their circumstances as a result of the manual process, which has limited accuracy. This study uses the Random Forest method as a technology-based solution to improve the accuracy and objectivity of UKT classification. The dataset consists of 10,000 student records with 10 variables covering economic and social information. The research process includes data preprocessing, Random Forest model training, performance evaluation using accuracy, precision, recall, and F1-score, and model stability testing through K-Fold Cross Validation with 10 folds. The results show that Random Forest classifies most UKT classes well, especially classes 0–5 and 7. Class 6 performs worse, with a recall of 0.39 and an F1-score of 0.56, due to its limited number of samples. The overall accuracy of the model reaches 96%, while K-Fold Cross Validation yields an average accuracy of 95.50% with a standard deviation of 0.66%, indicating that the model is stable and able to generalize to new data. This study shows that Random Forest is effective for UKT classification and can support an objective, fair, and efficient system. The implemented model supports data-driven decision-making in higher education and increases transparency in UKT determination.
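The evaluation steps named in the abstract (per-class precision, recall, and F1-score, plus a 10-fold split summarized by mean and standard deviation) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `per_class_metrics`, `k_fold_indices`, and `summarize_folds` are hypothetical, the labels are toy data rather than the study's dataset, the folds are contiguous and unshuffled, and the population standard deviation is assumed because the abstract does not say which variant was used.

```python
from statistics import mean, pstdev


def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) for k contiguous, non-overlapping folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # Spread any remainder over the first `extra` folds.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def summarize_folds(fold_accuracies):
    """Mean and population standard deviation of per-fold accuracies."""
    return mean(fold_accuracies), pstdev(fold_accuracies)


# Toy labels (hypothetical, not the study's data): class 6 is rare and
# the classifier misses three of its five members.
y_true = [6, 6, 6, 6, 6, 0, 0, 0, 0, 0]
y_pred = [6, 6, 0, 0, 0, 0, 0, 0, 0, 0]
precision, recall, f1 = per_class_metrics(y_true, y_pred, label=6)
print(f"class 6: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# Ten folds over 10,000 records: 9,000 for training, 1,000 for testing each.
folds = list(k_fold_indices(n_samples=10_000, k=10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))
```

In the toy example, every prediction of class 6 is correct but most of its members are missed, so precision is 1.0, recall is 0.4, and F1 is about 0.57. The same arithmetic shows that a recall of 0.39 combined with an F1-score of 0.56 would be consistent with a precision close to 1.0, although the abstract does not report the class 6 precision. In practice the records would usually be shuffled or stratified before splitting, which this sketch omits.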
License
Copyright (c) 2026 Al Khaidar, Nurdin Nurdin, Fajriana Fajriana, Taufiq Taufiq, Defry Hamdhana

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (Attribution-ShareAlike 4.0 International, CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








