Knowledge Discovery on E-Commerce Customer Churn Using Interpretable Machine Learning: A Comparative Study of SHAP-Based Classifiers
DOI:
https://doi.org/10.30871/jaic.v9i5.10811
Keywords:
Customer Churn, E-Commerce, Machine Learning, SHapley Additive exPlanations
Abstract
Customer churn remains one of the most pressing issues in the e-commerce sector, as it directly erodes revenue and reduces customer lifetime value. This study proposes an interpretable machine learning approach designed not only to predict churn but also to uncover practical insights that can inform retention strategies. The analysis draws on a publicly available dataset containing customer behavior and transaction records. Data preparation involved handling missing values, applying label encoding, and addressing class imbalance with SMOTE (Synthetic Minority Over-sampling Technique). Five classification models (Logistic Regression, Random Forest, XGBoost, Support Vector Machine, and Gradient Boosting) were trained on an 80:20 stratified split, with performance assessed through accuracy, precision, recall, F1-score, and AUC. Among these, XGBoost delivered the most consistent results, achieving 96% accuracy, 95% precision, 92% recall, and a near-perfect AUC of 0.999, followed closely by Random Forest. Logistic Regression produced the lowest AUC at 0.886. To make the models' decisions transparent, SHAP (SHapley Additive exPlanations) was applied, revealing Tenure, Complain, and CashbackAmount as the most influential predictors. Longer customer relationships were linked to reduced churn risk, while frequent complaints and higher cashback usage indicated a greater likelihood of leaving. These findings combine robust predictive performance with interpretability, enabling e-commerce businesses to design more targeted and proactive customer retention measures.
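To make the attribution idea behind SHAP concrete, the sketch below computes exact Shapley values for a single customer using only the Python standard library. It is an illustration under stated assumptions, not the study's pipeline: the scoring function `toy_churn_score`, its coefficients, and the two customer records are all hypothetical, and a single reference customer stands in for the background dataset that practical SHAP explainers typically average over. Only the three feature names come from the abstract.

```python
from itertools import combinations
from math import factorial

# Feature names taken from the abstract; everything else below is hypothetical.
FEATURES = ["Tenure", "Complain", "CashbackAmount"]


def toy_churn_score(x):
    """Made-up churn score (NOT a fitted model): lower with tenure,
    higher with complaints and cashback, plus one interaction term."""
    score = 0.40 - 0.02 * x["Tenure"] + 0.25 * x["Complain"] + 0.001 * x["CashbackAmount"]
    # Hypothetical interaction: a complaint weighs more for customers under 6 months.
    if x["Tenure"] < 6:
        score += 0.10 * x["Complain"]
    return score


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    A coalition S uses the customer's own values for features in S and the
    baseline's values for the rest. Each feature's attribution is its
    marginal contribution averaged over all coalitions of the other features.
    """
    n = len(FEATURES)
    phi = {}
    for feat in FEATURES:
        others = [f for f in FEATURES if f != feat]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k: k! (n - k - 1)! / n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
                with_feat = dict(without)
                with_feat[feat] = x[feat]
                total += weight * (model(with_feat) - model(without))
        phi[feat] = total
    return phi


# Hypothetical records: a short-tenure customer who complained, and a reference customer.
customer = {"Tenure": 2, "Complain": 1, "CashbackAmount": 180}
baseline = {"Tenure": 12, "Complain": 0, "CashbackAmount": 150}

phi = shapley_values(toy_churn_score, customer, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the customer's score and the baseline's score, which is the additivity property that gives SHAP its name. Exact enumeration grows exponentially with the number of features, so it is only practical for small examples like this one.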
License
Copyright (c) 2025 Dhita Amanda Ardhani, Ken Ditha Tania

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).








