MUHAMMAD BAGUS FADLI, NPM 2108100060 (2026) Comparison of the C4.5, Naive Bayes, K-Nearest Neighbor, and Random Forest Algorithms for Predicting the Causal Factors of Diabetes [original title: KOMPARASI PERBANDINGAN ALGORITMA C4.5, NAIVE BAYES, K-NEAREST NEIGHBOR, RANDOM FOREST UNTUK PREDIKSI FAKTOR PENYEBAB PENYAKIT DIABETES]. Final Project (Article). Building of Informatics, Technology and Science (BITS), 7 (3). pp. 2118-2126. ISSN 2685-3310 (e-ISSN), 2685-3310 (p-ISSN)
Abstract
Diabetes is a chronic metabolic disease characterized by elevated blood glucose levels. It can cause serious complications and contributes to high mortality worldwide. A central problem in managing diabetes is classifying patient status accurately from laboratory test data so that appropriate treatment can be given. This study compares the performance of the C4.5, Naive Bayes, K-Nearest Neighbor (KNN), and Random Forest algorithms in classifying diabetes patient data. The dataset was drawn from Electronic Health Records (EHRs) of patients at Rantauprapat Regional General Hospital (Rumah Sakit Umum Daerah Rantauprapat) and comprises 10,000 records with eight attributes and one class attribute: 859 records of diabetic patients and 9,141 records of non-diabetic patients. The data were divided into training and testing sets at ratios of 90:10, 80:20, and 70:30. Model performance was evaluated using accuracy and the Receiver Operating Characteristic (ROC) curve, summarized by the Area Under the Curve (AUC). The results show that C4.5 and Random Forest achieved higher accuracy than Naive Bayes and KNN, especially at the 90:10 and 70:30 splits. In the ROC evaluation, Random Forest obtained the highest AUC values: 0.972 at the 70:30 split and 0.970 at the 80:20 split. On the basis of these tests, the study concludes that C4.5 and Random Forest perform relatively better than the other two algorithms, and nearly on par with each other, in classifying diabetes as measured by accuracy and AUC.
Keywords: Diabetes; Decision Tree; C4.5; Naive Bayes; K-Nearest Neighbor; Random Forest
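The evaluation setup described in the abstract can be illustrated with a minimal Python sketch that uses only the reported class counts (859 diabetic, 9,141 non-diabetic). The abstract does not say whether the splits were stratified or which software was used, so stratified random splitting is an assumption here, and the function names are hypothetical. The "always non-diabetic" baseline is a reference point added for illustration, not a model from the study. It shows why AUC is a useful complement to accuracy on a dataset this imbalanced.

```python
import random

# Class counts reported in the abstract: 859 diabetic (1), 9,141 non-diabetic (0).
labels = [1] * 859 + [0] * 9141


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in both parts.

    Stratification is an assumption; the abstract does not describe it.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


for train_pct, test_pct in [(90, 10), (80, 20), (70, 30)]:
    train_idx, test_idx = stratified_split(labels, test_pct / 100)
    y_test = [labels[i] for i in test_idx]
    # Hypothetical baseline: always predict "non-diabetic" with a constant score.
    y_pred = [0] * len(y_test)
    scores = [0.0] * len(y_test)
    print(
        f"{train_pct}:{test_pct}  test={len(y_test)}  diabetic={sum(y_test)}  "
        f"accuracy={accuracy(y_test, y_pred):.3f}  AUC={auc(y_test, scores):.3f}"
    )
```

Under these assumptions the baseline reaches an accuracy of about 0.914 on every split while its AUC stays at 0.5, because a constant score cannot rank diabetic patients above non-diabetic ones. Any trained classifier's scores can be passed to `auc` in the same way.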
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Diabetes; Decision Tree; C4.5; Naive Bayes; K-Nearest Neighbor; Random Forest |
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; T Technology > T Technology (General); Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4450 Databases |
| Divisions: | Faculty of Science and Technology (Fakultas Sains Dan Teknologi) > Information Technology (Teknologi Informasi) |
| Depositing User: | Unnamed user with email repository@ulb.ac.id |
| Date Deposited: | 19 May 2026 04:10 |
| Last Modified: | 19 May 2026 04:10 |
| URI: | http://repository.ulb.ac.id/id/eprint/2332 |
