A Fuzzy C-Means–Based Clustering Model for Analyzing TOEFL Prediction Scores in Higher Education

Authors

  • Filipus Mei Tri Boy Gulo, Politeknik Negeri Lhokseumawe
  • Rahmad Hidayat, Politeknik Negeri Lhokseumawe
  • Hendrawaty Hendrawaty, Politeknik Negeri Lhokseumawe
  • Rahmat Isma Hidayat, Politeknik Negeri Lhokseumawe
  • Muhammad Heikal Fasya, Politeknik Negeri Lhokseumawe
  • Syifaurrahman Syifaurrahman, Politeknik Negeri Lhokseumawe
  • Dea Syafira Ananda, Politeknik Negeri Lhokseumawe

DOI:

https://doi.org/10.30871/jaic.v10i1.11468

Keywords:

Clustering, Language, Fuzzy C-Means, Learning Analytics, Xie Beni Index, Vocational Education

Abstract

In the era of digital transformation, applying data mining to academic data management has become important for improving the quality of education. One crucial aspect is English proficiency, which is commonly measured with the Test of English as a Foreign Language (TOEFL) Prediction test administered at universities, including the State Polytechnic of Lhokseumawe. Data mining can turn TOEFL Prediction scores into a basis for more in-depth learning analysis and evaluation. This study aims to design and develop a model that groups the TOEFL Prediction scores of students at the State Polytechnic of Lhokseumawe using the Fuzzy C-Means (FCM) algorithm. The research stages were observation and interviews, data collection and pre-processing, cluster model design, web-based system development, and system testing. The system was evaluated through Black Box and White Box testing, and cluster quality was validated using the Xie-Beni Index (XB) and the Partition Coefficient. The pre-test dataset of first-year students (651 records) produced three clusters with an XB value of 0.623, while the dataset of final-year students (826 records) produced six clusters with an XB value of 0.181. The resulting model maps students' English language abilities in a more structured manner and can serve as a basis for academic planning and skill improvement.
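
The clustering and validation steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fuzzifier m = 2, the single total-score feature, and the synthetic scores are all assumptions made for illustration, and the numbers it produces are unrelated to the results reported in the study.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=300, tol=1e-6, seed=0):
    """Plain Fuzzy C-Means. Returns (centers, U), where U[i, k] is the
    membership of sample i in cluster k. All defaults are assumptions."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial membership matrix; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances, shape (n, c); clipped to avoid 0 ** negative.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)
        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def xie_beni(X, centers, U, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by n times the minimum
    squared distance between two distinct centers (lower is better)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    cd2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(cd2, np.inf)  # ignore the zero self-distances
    return compactness / (X.shape[0] * cd2.min())

def partition_coefficient(U):
    """Partition Coefficient: mean of squared memberships, between 1/c and 1."""
    return (U ** 2).sum() / U.shape[0]

# Synthetic stand-in for TOEFL Prediction total scores (NOT the study's data).
rng = np.random.default_rng(42)
scores = np.concatenate([rng.normal(mu, 10.0, 50) for mu in (380.0, 450.0, 520.0)])
X = scores.reshape(-1, 1)

centers, U = fuzzy_c_means(X, c=3)
labels = U.argmax(axis=1)  # hard label = cluster with the highest membership
xb = xie_beni(X, centers, U)
pc = partition_coefficient(U)
print("centers:", np.sort(centers.ravel()))
print("XB:", xb, "PC:", pc)
```

In a sketch like this, the number of clusters would typically be chosen by running the algorithm for several values of c and comparing the validity indices: a lower XB value indicates more compact, better-separated clusters, while a Partition Coefficient closer to 1 indicates crisper memberships.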

Published

2026-02-04

How to Cite

F. M. T. B. Gulo, “A Fuzzy C-Means–Based Clustering Model for Analyzing TOEFL Prediction Scores in Higher Education”, JAIC, vol. 10, no. 1, pp. 115–124, Feb. 2026.
