Evaluation of Telecommunication Customer Churn Classification with SMOTE Using Random Forest and XGBoost Algorithms
Abstract
Competition in the telecommunications industry, particularly among Internet Service Providers (ISPs), drives customer churn, which reduces revenue, profitability, and business sustainability. Identifying potential churners early lets companies apply targeted retention measures. Prediction is difficult, however, because relatively little data is available on churned customers. This study predicts which customers are likely to discontinue their subscriptions and addresses the resulting class imbalance with the Synthetic Minority Over-Sampling Technique (SMOTE). The dataset, sourced from Kaggle, comprises 21 attributes and 7,034 entries. Pre-processing consists of data cleaning and feature encoding; the data are then balanced with SMOTE, and Random Forest and XGBoost classifiers are trained on the balanced data. XGBoost achieves a prediction accuracy of 82%, outperforming Random Forest at 81%. The key factors influencing churn are Contract, TotalCharges, and tenure. The study concludes by emphasizing the importance of contract flexibility and the need to prioritize customers with high total charges or long subscription periods in order to reduce churn. Future research is encouraged to investigate alternative methods for handling data imbalance and to explore more advanced machine learning algorithms to further improve prediction accuracy and the effectiveness of customer retention strategies.
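The balancing step named in the abstract can be illustrated with a minimal, from-scratch sketch of the SMOTE idea: each synthetic minority row is placed at a random point on the line segment between an existing minority row and one of its k nearest minority neighbours. This is not the authors' implementation; the function name `smote`, its parameters, and the toy data are hypothetical and chosen only for illustration. In practice a library implementation such as `SMOTE` from imbalanced-learn would normally be used, and oversampling is commonly applied to the training split only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class neighbours.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority row as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (Euclidean), excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        # Interpolate at a random position on the segment base -> other.
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Made-up toy data: 6 "stayed" rows and 3 "churned" rows, two numeric features.
stayed = [[1.0, 20.0], [2.0, 25.0], [3.0, 30.0],
          [4.0, 35.0], [5.0, 40.0], [6.0, 45.0]]
churned = [[60.0, 90.0], [62.0, 95.0], [64.0, 100.0]]

# Generate just enough synthetic churn rows to match the majority class size.
new_rows = smote(churned, n_new=len(stayed) - len(churned), k=2)
balanced_churned = churned + new_rows
print(len(stayed), len(balanced_churned))
```

Because every synthetic row is a convex combination of two real minority rows, it stays inside the region already occupied by the minority class. The sketch handles numeric features only, so categorical attributes would need to be encoded first.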
Copyright (c) 2025 Lisa Nusrotul Wakhidah, Akhmad Khanif Zyen, Buang Budi Wahono
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.