Classification Analysis of Single Tuition Fees Using the Random Forest Method with K-Fold Cross Validation

Al Khaidar; Nurdin Nurdin; Fajriana Fajriana; Taufiq Taufiq; Defry Hamdhana

doi:10.30871/jaic.v10i1.11798

Authors

Al Khaidar, Program Studi Magister Teknologi Informasi, Universitas Malikussaleh
Nurdin Nurdin, Program Studi Magister Teknologi Informasi, Universitas Malikussaleh
Fajriana Fajriana, Program Studi Magister Teknologi Informasi, Universitas Malikussaleh
Taufiq Taufiq, Program Studi Magister Teknologi Informasi, Universitas Malikussaleh
Defry Hamdhana, Program Studi Magister Teknologi Informasi, Universitas Malikussaleh

Keywords:

Random Forest, Single Tuition Fee, Classification, K-Fold Cross Validation

Abstract

Classification is the process of grouping data into categories based on their features, and it plays a central role in analysis, decision-making, and the prediction of new data. In academic settings, classification is used to assign each student a Single Tuition Fee (Uang Kuliah Tunggal, UKT) group that matches their economic ability. Lhokseumawe State Polytechnic has implemented the UKT system since 2020 with eight categories, but some students are still placed in mismatched UKT groups because the manual process has limited accuracy. This study uses the Random Forest method as a technology-based solution to improve the accuracy and objectivity of UKT classification. The dataset consists of 10,000 student records with 10 variables covering economic and social information. The research process comprises data preprocessing, Random Forest model training, performance evaluation using accuracy, precision, recall, and F1-score, and model stability testing with 10-fold cross-validation. The results show that Random Forest classifies most UKT classes well, especially classes 0–5 and 7. Class 6 performs worse, with a recall of 0.39 and an F1-score of 0.56, due to its limited number of samples. The overall accuracy of the model reaches 96%, while 10-fold cross-validation yields an average accuracy of 95.50% with a standard deviation of 0.66%, indicating that the model is stable and able to generalize to new data. This study shows that Random Forest is effective for UKT classification and can produce an objective, fair, and efficient system. The resulting model supports data-driven decision-making in higher education and increases transparency in UKT determination.
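
The evaluation steps named in the abstract (10-fold splitting and per-class precision, recall, and F1-score) can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function names, the unshuffled contiguous folds, and the toy labels are assumptions made for illustration, and the abstract does not say whether the study shuffled or stratified its folds. The last helper rearranges F1 = 2PR / (P + R) to estimate the class 6 precision implied by the reported recall and F1-score.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # Spread any remainder over the first `extra` folds.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class, treating it as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for P, given R and F1."""
    return f1 * recall / (2 * recall - f1)


# 10 folds over 10,000 records: each fold holds out 1,000 and trains on 9,000.
fold_sizes = [(len(tr), len(te)) for tr, te in k_fold_indices(10_000, 10)]

# Toy labels (made up, not study data): a rare class 6 often predicted as class 0.
y_true = [6, 6, 6, 6, 6, 0, 0, 0, 0, 0]
y_pred = [6, 6, 0, 0, 0, 0, 0, 0, 0, 0]
precision_6, recall_6, f1_6 = class_metrics(y_true, y_pred, label=6)

# Precision implied by the rounded recall and F1 reported for class 6.
p6_reported = implied_precision(recall=0.39, f1=0.56)

print(fold_sizes[0], precision_6, recall_6, round(f1_6, 2), round(p6_reported, 2))
```

With the reported values, the implied class 6 precision is about 0.99. Because recall and F1-score are rounded to two decimals, the true figure could lie anywhere from roughly 0.93 to 1.0. One plausible reading is that the model rarely assigns class 6 to the wrong student but misses most students who belong to it, which is the same pattern the toy labels show.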
