Optimization of the Decision Tree Method using Pruning on Liver Disease Classification

  • Anindya Khrisna Wardhani, Politeknik Rukun Abdi Luhur
  • Ega Nugraha, Politeknik Rukun Abdi Luhur
  • Qonita Ulfiana, Politeknik Rukun Abdi Luhur
Keywords: Data Mining, Decision Tree, Liver, Pruning

Abstract

The large volume of available data on liver disease can be turned into useful information with the decision tree data mining method. However, decision trees have a known weakness, overfitting: the resulting tree can fit the training data well but usually performs poorly when applied to unseen data. The experiments used the Indian Liver Patient Dataset (ILPD) from the UCI Machine Learning Repository, which contains 583 clinical records with 10 attributes and a target class of 416 liver-positive and 167 liver-negative cases. The decision tree algorithm was tested both with and without pruning, and pruning increased accuracy. Measured from the confusion matrix, the accuracy of the decision tree without pruning was 73.58%, while the accuracy with pruning was 73.76%. Although the difference is small (0.18 percentage points), it indicates that pruning improves the accuracy of the decision tree method. In addition, the models and rules generated by the decision tree can serve as the basis for developing a prototype application for liver disease classification.
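
As an illustration of how pruning can raise accuracy on unseen data, the following is a minimal sketch of reduced-error pruning in plain Python. The abstract does not say which pruning strategy was used, so reduced-error pruning is an assumption chosen for brevity. The `Node` class, the hand-built tree, its thresholds, and the six validation rows are all hypothetical and are not taken from the ILPD dataset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Binary decision-tree node (hypothetical structure); a leaf has no children."""
    label: int                      # majority class of the training rows at this node
    feature: Optional[int] = None   # index of the tested feature (None for a leaf)
    threshold: float = 0.0
    left: Optional["Node"] = None   # branch taken when x[feature] <= threshold
    right: Optional["Node"] = None  # branch taken when x[feature] > threshold

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def predict(node, x):
    """Follow the splits until a leaf is reached and return its label."""
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def accuracy(tree, rows):
    """Fraction of (features, label) rows that the tree classifies correctly."""
    return sum(predict(tree, x) == y for x, y in rows) / len(rows)


def prune(node, rows):
    """Reduced-error pruning, bottom-up: replace a subtree with a leaf when the
    leaf is at least as accurate on the validation rows reaching that subtree."""
    if node.is_leaf():
        return node
    left_rows = [(x, y) for x, y in rows if x[node.feature] <= node.threshold]
    right_rows = [(x, y) for x, y in rows if x[node.feature] > node.threshold]
    node.left = prune(node.left, left_rows)
    node.right = prune(node.right, right_rows)
    subtree_correct = sum(predict(node, x) == y for x, y in rows)
    leaf_correct = sum(node.label == y for _, y in rows)
    if leaf_correct >= subtree_correct:
        return Node(label=node.label)  # collapse the subtree into a single leaf
    return node


# Made-up tree: the split on feature 1 stands in for an overfitted branch.
tree = Node(label=1, feature=0, threshold=5.0,
            left=Node(label=0),
            right=Node(label=1, feature=1, threshold=2.0,
                       left=Node(label=0), right=Node(label=1)))

# Made-up validation rows: ([feature 0, feature 1], class).
validation = [([3, 1], 0), ([4, 3], 0), ([7, 1], 1),
              ([8, 3], 1), ([6, 1], 1), ([9, 2], 0)]

before = accuracy(tree, validation)  # measured before prune() mutates the tree
pruned = prune(tree, validation)
after = accuracy(pruned, validation)
print(f"accuracy before pruning: {before:.2%}, after pruning: {after:.2%}")
```

In this toy case the overfitted branch is collapsed into a leaf while the root split is kept, because collapsing the root would lower validation accuracy. The gain here is far larger than the 0.18 percentage points reported above only because the data were constructed to make the effect visible.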

Published
2022-11-21
How to Cite
[1]
A. Wardhani, E. Nugraha, and Q. Ulfiana, “Optimization of the Decision Tree Method using Pruning on Liver Disease Classification”, JAIC, vol. 6, no. 2, pp. 136-140, Nov. 2022.
Section
Articles