Clustering of the Air Pollution Standard Index (ISPU) in the Province of DKI Jakarta Using the CLARANS Algorithm
DOI: https://doi.org/10.30871/jaic.v9i4.9783
Keywords: Air Pollution Index, Clustering, CLARANS, Jakarta, Silhouette Score
Abstract
Air pollution has become a serious global issue. According to IQAir's 2024 report, DKI Jakarta ranked 10th among cities with the worst air quality worldwide, indicating that air pollution there has reached a concerning level. This research uses the CLARANS algorithm to cluster daily air quality in DKI Jakarta based on pollution parameters. CLARANS was chosen for its efficiency on large datasets, its resistance to outliers, and its medoid search capability. The novelty of this research lies in applying CLARANS to overcome the limitations of the clustering algorithms used in previous research. The research comprises several stages: data understanding, data preprocessing, building the CLARANS model, and evaluation using the silhouette score. With the best-performing parameter combination and k = 3, CLARANS produces well-separated clusters, with an overall average silhouette score of 0.6398 across all regions and years. The results indicate that air pollution in DKI Jakarta tended to worsen in 2024. Jakarta Barat and Jakarta Pusat are predominantly affected by PM10, CO, and O₃ pollution, whereas Jakarta Selatan and Jakarta Utara are more influenced by SO₂ and NO₂ pollution. Jakarta Timur (East Jakarta), by contrast, shows a balanced dominance of both pollutant groups.
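The model-building and evaluation stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic two-feature data, the Euclidean distance, the seed values, and the settings `numlocal=4` and `maxneighbor=50` are all assumptions chosen for illustration, and the cost is recomputed in full for each candidate swap for clarity rather than updated incrementally. Only `k = 3` and the use of the silhouette score come from the abstract.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def total_cost(points, medoid_idx):
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoid_idx) for p in points)


def clarans(points, k, numlocal=4, maxneighbor=50, seed=0):
    """Randomized medoid search in the spirit of CLARANS.

    numlocal: number of random restarts.
    maxneighbor: consecutive non-improving swaps tolerated before the
    current medoid set is accepted as a local minimum.
    """
    rng = random.Random(seed)
    n = len(points)
    best_medoids, best_cost = None, math.inf
    for _ in range(numlocal):
        current = rng.sample(range(n), k)
        current_cost = total_cost(points, current)
        failures = 0
        while failures < maxneighbor:
            # A neighbor differs from the current set in exactly one medoid.
            slot = rng.randrange(k)
            candidate = rng.choice([i for i in range(n) if i not in current])
            neighbor = current[:slot] + [candidate] + current[slot + 1:]
            neighbor_cost = total_cost(points, neighbor)
            if neighbor_cost < current_cost:
                current, current_cost = neighbor, neighbor_cost
                failures = 0
            else:
                failures += 1
        if current_cost < best_cost:
            best_medoids, best_cost = current, current_cost
    # Assign every point to the cluster of its nearest medoid.
    labels = [
        min(range(k), key=lambda c: dist(p, points[best_medoids[c]]))
        for p in points
    ]
    return best_medoids, labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient; needs at least two clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # singleton clusters contribute 0 by convention
            scores.append(0.0)
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


def make_demo_data(seed=42):
    """Hypothetical stand-in for scaled pollutant features: three blobs."""
    rng = random.Random(seed)
    centers = [(0.1, 0.1), (0.5, 0.9), (0.9, 0.2)]
    return [
        (rng.gauss(cx, 0.03), rng.gauss(cy, 0.03))
        for cx, cy in centers
        for _ in range(30)
    ]


points = make_demo_data()
medoids, labels = clarans(points, k=3)
score = silhouette_score(points, labels)
print(f"medoid indices: {sorted(medoids)}, silhouette: {score:.4f}")
```

A silhouette score close to 1 indicates compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest misassigned points. Because the medoids are always actual observations, a single extreme reading pulls the cluster representative less than it would pull a mean-based centroid.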
License
Copyright (c) 2025 Adelia Ramadhina Azzahra, Nasywa Azzah Nabila, Mohammad Idhom, Trimono Trimono

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.