A Fuzzy C-Means–Based Clustering Model for Analyzing TOEFL Prediction Scores in Higher Education
DOI:
https://doi.org/10.30871/jaic.v10i1.11468
Keywords:
Clustering, Language, Fuzzy C-Means, Learning Analytics, Xie Beni Index, Vocational Education
Abstract
In the era of digital transformation, applying data mining to academic data management has become important for improving the quality of education. One crucial aspect is English proficiency, which is commonly measured with the Test of English as a Foreign Language (TOEFL) Prediction test. The test is administered at universities, including the State Polytechnic of Lhokseumawe. Data mining can turn the resulting TOEFL Prediction scores into a basis for more in-depth learning analysis and for evaluation. This study designs and develops a model that groups the TOEFL scores of students at the State Polytechnic of Lhokseumawe using the Fuzzy C-Means (FCM) algorithm. The research methods comprised observation and interviews, data collection and pre-processing, cluster model design, web-based system development, and system testing. The system was evaluated through Black Box and White Box testing, and cluster quality was validated using the Xie-Beni Index (XB) and the Partition Coefficient. The pre-test dataset of first-year students (651 records) produced three clusters with an XB value of 0.623, while the dataset of final-year students (826 records) produced six clusters with an XB value of 0.181. The developed model was able to map students' English language abilities in a more structured manner and can serve as a basis for academic planning and skill improvement.
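The abstract names Fuzzy C-Means and two validity indices but gives no formulas. The following is a minimal sketch of how such a pipeline is commonly put together, not the authors' implementation. It assumes the textbook FCM update rules with Euclidean distance and a fuzzifier of m = 2, the usual definitions of the Partition Coefficient (higher is better) and the Xie-Beni index (lower is better), and a purely synthetic three-column dataset standing in for section scores. All function and variable names are hypothetical.

```python
import numpy as np


def _centers(X, U, m):
    """Membership-weighted cluster centers, shape (c, n_features)."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]


def _sq_dist(X, centers):
    """Squared Euclidean distances between samples and centers, shape (n, c)."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)


def fuzzy_c_means(X, c, m=2.0, max_iter=300, tol=1e-6, seed=0):
    """Textbook FCM. Returns (centers, U), where each row of U sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # random initial memberships
    for _ in range(max_iter):
        d2 = np.maximum(_sq_dist(X, _centers(X, U, m)), 1e-12)
        inv = d2 ** (-1.0 / (m - 1.0))  # u_ik is proportional to d_ik^(-2/(m-1))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return _centers(X, U, m), U


def partition_coefficient(U):
    """Mean of squared memberships; ranges from 1/c (fuzziest) to 1 (crisp)."""
    return float((U ** 2).sum() / U.shape[0])


def xie_beni(X, centers, U, m=2.0):
    """Compactness divided by (n times the smallest squared center separation)."""
    compactness = ((U ** m) * _sq_dist(X, centers)).sum()
    sep = _sq_dist(centers, centers)
    np.fill_diagonal(sep, np.inf)  # ignore the zero distance of a center to itself
    return float(compactness / (X.shape[0] * sep.min()))


# Synthetic stand-in data (NOT the study's data): three groups of 50 students,
# three hypothetical score columns each.
demo_rng = np.random.default_rng(42)
X_demo = np.vstack(
    [demo_rng.normal(loc=mu, scale=2.0, size=(50, 3)) for mu in (40.0, 50.0, 60.0)]
)

# Scan candidate cluster counts and report both indices for each.
for c in range(2, 7):
    centers, U = fuzzy_c_means(X_demo, c)
    print(c, round(xie_beni(X_demo, centers, U), 3), round(partition_coefficient(U), 3))
```

Under these assumptions, the number of clusters would be chosen by scanning candidate values of c and preferring the one with the lowest XB value, with the Partition Coefficient as a second check. This is one plausible reading of how the first-year and final-year datasets could end up with different cluster counts.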
Copyright (c) 2026 Filipus Mei Tri Boy Gulo, Rahmad Hidayat, Hendrawaty Hendrawaty, Rahmat Isma Hidayat, Muhammad Heikal Fasya, Syifaurrahman Syifaurrahman, Dea Syafira Ananda

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








