Enhancing Clustering Accuracy Using K-Means with Seeds Optimization

Authors

  • Adiyah Mahiruna, Institut Teknologi Statistika dan Bisnis Muhammadiyah Semarang
  • Ngatimin Ngatimin, Institut Teknologi Statistika dan Bisnis Muhammadiyah
  • Rachmat Destriana, Universitas Muhammadiyah Tangerang
  • Eko Hari Rachmawanto, Universitas Dian Nuswantoro
  • Herman Yuliansyah, Institut Teknologi Statistika dan Bisnis Muhammadiyah
  • Muhammad Taufiq Hidayat, Universitas Ahmad Dahlan

DOI:

https://doi.org/10.30871/jaic.v9i5.10458

Keywords:

Clustering, Data Mining, Machine Learning, Health, Heredity

Abstract

This study develops the Mean-based method proposed by Goyal and Kumar by changing its initial cluster center determination step: instead of measuring distances from the origin point O (0,0), the step uses the arithmetic mean of the data (a Second Global Average). The resulting method, K-Means with initial center optimization based on the Second Global Average, is applied to clustering lifestyle factors for osteoporosis diagnosis. To assess the proposed method, its performance is compared with the Global K-means method and the Mean-based K-means method, cluster quality is measured with the Davies-Bouldin Index (DBI), and the significance of the differences is assessed with the Friedman Test. Across nine experiments with cluster counts of 2, 3, 4, 5, and 6, the proposed method matches or exceeds the performance of both the Mean-based method and the Global K-means method.
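
The initialization idea can be sketched as follows. Goyal and Kumar's mean-based scheme sorts the data points by their distance to a reference point, splits the sorted list into k equal-sized groups, and takes each group's mean as an initial seed; the modification studied here replaces the origin O (0,0) with the global arithmetic mean as that reference point. The Python sketch below illustrates this reading of the method together with the DBI comparison. The function name `mean_based_seeds`, the synthetic data, and the exact grouping rule are illustrative assumptions, not taken from the paper, which evaluates the methods on an osteoporosis lifestyle dataset.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

def mean_based_seeds(X, k, reference):
    """Sort points by distance to a reference point, split them into k
    equal-sized groups, and return each group's mean as an initial seed
    (the general scheme of mean-based initialization)."""
    distances = np.linalg.norm(X - reference, axis=1)  # distance of every point to the reference
    order = np.argsort(distances)                      # indices sorted by that distance
    groups = np.array_split(X[order], k)               # k roughly equal-sized groups
    return np.vstack([g.mean(axis=0) for g in groups])

# Synthetic stand-in data; the paper's osteoporosis lifestyle dataset is not reproduced here.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=c, scale=0.8, size=(100, 4)) for c in (0.0, 3.0, 6.0)])

k = 3
seed_variants = {
    "origin-based (original Mean-based)": mean_based_seeds(X, k, np.zeros(X.shape[1])),
    "arithmetic-mean-based (proposed)": mean_based_seeds(X, k, X.mean(axis=0)),
}

for name, seeds in seed_variants.items():
    labels = KMeans(n_clusters=k, init=seeds, n_init=1, random_state=0).fit_predict(X)
    # Lower Davies-Bouldin Index means more compact, better-separated clusters.
    print(f"{name}: DBI = {davies_bouldin_score(X, labels):.4f}")
```

The Friedman Test mentioned in the abstract can then be applied to the per-experiment DBI values of the compared methods, for example with scipy.stats.friedmanchisquare.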

References

[1] F. Bagaswara, M. A. Muthalib, and R. Meiyanti, “Clustering of Futsal Interest Level Among Students K-Means Method,” Int. J. Eng. Sci. Inf. Technol., vol. 5, no. 3, pp. 41–50, 2025, doi: 10.52088/ijesty.v5i3.879.

[2] P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining. Boston, MA: Pearson Addison-Wesley, 2006.

[3] B. Zheng, S. W. Yoon, and S. S. Lam, “Breast cancer diagnosis based on feature extraction using a hybrid of K-means and support vector machine algorithms,” Expert Syst. Appl., vol. 41, no. 4 PART 1, pp. 1476–1482, 2014, doi: 10.1016/j.eswa.2013.08.044.

[4] K. M. Kumar and A. R. M. Reddy, “An efficient k-means clustering filtering algorithm using density based initial cluster centers,” Inf. Sci. (Ny)., vol. 418–419, pp. 286–301, 2017, doi: 10.1016/j.ins.2017.07.036.

[5] S. Wang and W. Shi, Data mining and knowledge discovery. 2012. doi: 10.1007/978-3-540-72680-7_5.

[6] C. Zhang, D. Ouyang, and J. Ning, “An artificial bee colony approach for clustering,” Expert Syst. Appl., vol. 37, no. 7, pp. 4761–4767, 2010, doi: 10.1016/j.eswa.2009.11.003.

[7] H. Xie et al., “Improving K-means clustering with enhanced Firefly Algorithms,” Appl. Soft Comput. J., vol. 84, p. 105763, 2019, doi: 10.1016/j.asoc.2019.105763.

[8] Y. Li, K. Zhao, X. Chu, and J. Liu, “Speeding up k-Means algorithm by GPUs,” J. Comput. Syst. Sci., vol. 79, no. 2, pp. 216–229, 2013, doi: 10.1016/j.jcss.2012.05.004.

[9] H. Xue, Q. Yang, and S. Chen, “SVM: Support Vector Machines,” in The Top Ten Algorithms in Data Mining, 2009. doi: 10.1007/s10115-007-0114-2.

[10] R. T. Aldahdooh and W. Ashour, “DIMK-means ‘Distance-based Initialization Method for K-means Clustering Algorithm,’” Int. J. Intell. Syst. Appl., vol. 5, no. 2, pp. 41–51, 2013, doi: 10.5815/ijisa.2013.02.05.

[11] J. MacQueen, “Some Methods for Classification and Analysis of Multivariate Observations,” Proc. Fifth Berkeley Symp. Math. Stat. Probab., vol. 1, pp. 281–297, 1967.

[12] A. Likas, N. Vlassis, and J. J. Verbeek, “The global k-means clustering algorithm,” Pattern Recognit., vol. 36, no. 2, pp. 451–461, 2003, doi: 10.1016/S0031-3203(02)00060-2.

[13] X. Wang and Y. Bai, “The global Minmax k-means algorithm,” Springerplus, vol. 5, no. 1, 2016, doi: 10.1186/s40064-016-3329-4.

[14] M. Goyal and S. Kumar, “Improving the Initial Centroids of k-means Clustering Algorithm to Generalize its Applicability,” J. Inst. Eng. Ser. B, vol. 95, no. 4, pp. 345–350, 2014, doi: 10.1007/s40031-014-0106-z.

[15] J. Chen, X. Qi, L. Chen, F. Chen, and G. Cheng, “Quantum-inspired ant lion optimized hybrid k-means for cluster analysis and intrusion detection,” Knowledge-Based Syst., vol. 203, p. 106167, 2020, doi: 10.1016/j.knosys.2020.106167.

[16] J. Liu, Y. Guo, D. Li, Z. Wang, and Y. Xu, “Kernel-based MinMax clustering methods with kernelization of the metric and auto-tuning hyper-parameters,” Neurocomputing, vol. 359, pp. 173–184, 2019, doi: 10.1016/j.neucom.2019.05.056.

[17] Y. Li, J. Cai, H. Yang, J. Zhang, and X. Zhao, “A Novel Algorithm for Initial Cluster Center Selection,” IEEE Access, vol. 7, pp. 74683–74693, 2019, doi: 10.1109/ACCESS.2019.2921320.

[18] N. Han, S. Qiao, G. Yuan, P. Huang, D. Liu, and K. Yue, “A novel Chinese herbal medicine clustering algorithm via artificial bee colony optimization,” Artif. Intell. Med., vol. 101, p. 101760, 2019, doi: 10.1016/j.artmed.2019.101760.

[19] A. Ilham, D. Ibrahim, L. Assaffat, and A. Solichan, “Tackling Initial Centroid of K-Means with Distance Part (DP-KMeans),” Proceeding - 2018 Int. Symp. Adv. Intell. Informatics Revolutionize Intell. Informatics Spectr. Humanit. SAIN 2018, pp. 185–189, 2019, doi: 10.1109/SAIN.2018.8673364.

[20] Srividya, S. Mohanavalli, N. Sripriya, and S. Poornima, “Outlier Detection using Clustering Techniques,” Int. J. Eng. Technol., vol. 7, no. 3.12, p. 813, 2018, doi: 10.14419/ijet.v7i3.12.16508.

[21] X. Huang, X. Yang, J. Zhao, L. Xiong, and Y. Ye, “A new weighting k-means type clustering framework with an l2-norm regularization,” Knowledge-Based Syst., vol. 151, pp. 165–179, 2018, doi: 10.1016/j.knosys.2018.03.028.

[22] G. Wang, Y. Wei, and P. Tse, “Clustering by defining and merging candidates of cluster centers via independence and affinity,” Neurocomputing, vol. 315, pp. 486–495, 2018, doi: 10.1016/j.neucom.2018.07.043.

[23] S. K. Majhi and S. Biswal, “Optimal cluster analysis using hybrid K-Means and Ant Lion Optimizer,” Karbala Int. J. Mod. Sci., vol. 4, no. 4, pp. 347–360, 2018, doi: 10.1016/j.kijoms.2018.09.001.

[24] S. A. Sajidha, S. P. Chodnekar, and K. Desikan, “Initial seed selection for K-modes clustering – A distance and density based approach,” J. King Saud Univ. - Comput. Inf. Sci., 2018, doi: 10.1016/j.jksuci.2018.04.013.

[25] D. Yu, G. Liu, M. Guo, and X. Liu, “An improved K-medoids algorithm based on step increasing and optimizing medoids,” Expert Syst. Appl., vol. 92, pp. 464–473, 2018, doi: 10.1016/j.eswa.2017.09.052.

[26] M. A. Masud, J. Z. Huang, C. Wei, J. Wang, I. Khan, and M. Zhong, “I-nice: A new approach for identifying the number of clusters and initial cluster centres,” Inf. Sci. (Ny)., vol. 466, pp. 129–151, 2018, doi: 10.1016/j.ins.2018.07.034.

[27] E. Zhu and R. Ma, “An effective partitional clustering algorithm based on new clustering validity index,” Appl. Soft Comput. J., vol. 71, pp. 608–621, 2018, doi: 10.1016/j.asoc.2018.07.026.

[28] H. Yan, L. Wang, and Y. Lu, “Identifying cluster centroids from decision graph automatically using a statistical outlier detection method,” Neurocomputing, 2018, doi: 10.1016/j.neucom.2018.10.067.

[29] W. Cai, “A dimension reduction algorithm preserving both global and local clustering structure,” Knowledge-Based Syst., vol. 118, pp. 191–203, 2017, doi: 10.1016/j.knosys.2016.11.020.

[30] W. Xue, R. L. Yang, X. Y. Hong, and N. Zhao, “A novel k-Means based on spatial density similarity measurement.”

[31] A. Mahiruna, E. H. Rachmawanto, and D. Istiawan, “Analysis of Time Optimization for Watermark Image Quality Using Run Length Encoding Compression,” J. Intell. Comput. Heal. Informatics, vol. 4, no. 2, p. 35, 2023, doi: 10.26714/jichi.v4i2.12058.

Published

2025-10-08

How to Cite

[1]
A. Mahiruna, N. Ngatimin, R. Destriana, E. H. Rachmawanto, H. Yuliansyah, and M. T. Hidayat, “Enhancing Clustering Accuracy Using K-Means with Seeds Optimization”, JAIC, vol. 9, no. 5, pp. 2426–2433, Oct. 2025.
