Optimization of the Decision Tree Method using Pruning on Liver Disease Classification
Abstract
The large volume of data available on liver disease can be turned into useful information with the decision tree data mining method. However, decision trees have a known weakness: overfitting. An overfitted tree performs well on the training data but usually generalizes poorly to unseen data. The experiments in this study use the ILPD (Indian Liver Patient Dataset) from the UCI Machine Learning Repository, which contains 583 clinical records with 10 attributes and a target output of 416 positive (liver disease) and 167 negative (no liver disease) cases. The decision tree algorithm was tested with and without pruning, and pruning increased accuracy. Based on the confusion matrix, the decision tree without pruning achieved an accuracy of 73.58%, while the decision tree with pruning achieved 73.76%. Although the difference is small, it indicates that pruning improves the accuracy of the decision tree method. In addition, the models and rules generated by the decision tree can serve as the basis for developing a prototype application for liver disease classification.
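To put the reported figures in context, the following minimal sketch computes accuracy from confusion-matrix counts and compares the two reported accuracies with a majority-class baseline derived from the class counts stated above. The function name `accuracy` and the baseline comparison are illustrative assumptions, not part of the authors' method; the abstract does not describe the evaluation protocol (for example, a train/test split or cross-validation), so the baseline is computed on the full dataset and is only a rough reference point.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy = correctly classified records / all records."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Class counts stated in the abstract (ILPD dataset).
N_POSITIVE = 416
N_NEGATIVE = 167

# Hypothetical baseline (an assumption for illustration): a classifier
# that labels every record as positive, evaluated on the full dataset.
baseline = accuracy(tp=N_POSITIVE, tn=0, fp=N_NEGATIVE, fn=0)

# Accuracies reported in the abstract, expressed as fractions.
reported = {"without pruning": 0.7358, "with pruning": 0.7376}

# Gain from pruning, in percentage points.
gain_points = (reported["with pruning"] - reported["without pruning"]) * 100

print(f"majority-class baseline: {baseline:.2%}")
for name, acc in reported.items():
    diff = (acc - baseline) * 100
    print(f"{name}: {acc:.2%} ({diff:+.2f} points vs. baseline)")
print(f"gain from pruning: {gain_points:.2f} points")
```

Because roughly 71% of the records belong to the positive class, a comparison like this helps show how much of the reported accuracy comes from the model rather than from class imbalance.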
Copyright (c) 2022 Anindya Khrisna Wardhani, Ega Nugraha, Qonita Ulfiana
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.