Improvement of Spelling Correction Accuracy in Indonesian Language through the Application of Hamming Distance Method

  • Mudawil Qulub, Universitas Bumigora
  • Rifqi Hammad, Universitas Bumigora
  • Pahrul Irfan, Universitas Bumigora
  • Yuliana Yuliana, Institut Shanti Bhuana
Keywords: Hamming Distance, Spelling Correction, Indonesian Language

Abstract

Spelling correction is a key software feature for reducing typing errors and is commonly found in document processing software and smartphone keyboards. This research evaluates the accuracy of the Hamming Distance method in correcting Indonesian words, in both standard and non-standard forms. The research data is derived from a previous study and comprises 60 standard and non-standard Indonesian words. Typos are generated by considering the layout of letters on the QWERTY keyboard. The typo data is divided into two groups: words that differ from the correct form by one character and words that differ by two characters. The first test, conducted on standard words, achieves an accuracy of 98.33% for both one- and two-character differences. Subsequent testing on non-standard words shows an accuracy of 100% for one-character differences and 96.67% for two-character differences. These results highlight the potential of the Hamming Distance method for improving the quality of spelling correction in the Indonesian language.
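
The abstract does not include the authors' implementation, so the following is only a minimal sketch of how correction based on Hamming distance and the accuracy measurement could look. The function names (`hamming_distance`, `correct`, `accuracy`), the tiny word list, the `max_distance` threshold, and the sample typos are all hypothetical and are not taken from the study.

```python
from typing import Iterable, Optional, Sequence, Tuple


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance is only defined for equal-length strings")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


def correct(word: str, dictionary: Iterable[str], max_distance: int = 2) -> Optional[str]:
    """Return the same-length dictionary entry closest to `word`, or None.

    Assumption: entries of a different length are skipped, and ties are
    broken alphabetically (a side effect of comparing (distance, word) tuples).
    """
    candidates = [
        (hamming_distance(word, entry), entry)
        for entry in dictionary
        if len(entry) == len(word)
    ]
    if not candidates:
        return None
    best_distance, best_word = min(candidates)
    return best_word if best_distance <= max_distance else None


def accuracy(pairs: Sequence[Tuple[str, str]], dictionary: Sequence[str]) -> float:
    """Percentage of (typo, expected) pairs that are corrected to `expected`."""
    hits = sum(correct(typo, dictionary) == expected for typo, expected in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical word list and typos (not the study's data). The substitutions
# use neighbouring QWERTY keys: n -> b and a -> s.
words = ["makan", "minum", "tidur", "jalan"]
one_char_typo = "makab"   # differs from "makan" in one position
two_char_typo = "mskab"   # differs from "makan" in two positions

print(correct(one_char_typo, words))
print(correct(two_char_typo, words))
print(accuracy([(one_char_typo, "makan"), (two_char_typo, "makan")], words))
```

Because Hamming distance compares strings position by position, this sketch can only handle substitution typos that keep the word length unchanged. That fits the one- and two-character differences described in the abstract, but it would not cover inserted or deleted characters. As a sanity check on the reported figures, and assuming each test group holds 60 typos, 98.33% and 96.67% would correspond to 59 and 58 correct corrections out of 60.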

Published
2023-12-06
How to Cite
M. Qulub, R. Hammad, P. Irfan, and Y. Yuliana, “Improvement of Spelling Correction Accuracy in Indonesian Language through the Application of Hamming Distance Method”, JAIC, vol. 7, no. 2, pp. 271-277, Dec. 2023.
Section
Articles