Comparison of the Naïve Bayes and K-Nearest Neighbor Algorithms on an Imbalanced-Class Diabetes Dataset
DOI: https://doi.org/10.30865/klik.v4i3.1486
Keywords: Diabetes; K-NN; Naïve Bayes; Disease; Comparison
Abstract
Diabetes is a global health concern because of its significant impact on public health, and managing the disease is crucial to preventing serious complications. Advances in machine learning have opened new avenues for identifying diabetes. This study compares the performance of the Naive Bayes and K-Nearest Neighbor (KNN) classification algorithms on an imbalanced diabetes dataset, with the primary aim of evaluating how well each predicts diabetes when class imbalance is taken into account. Both classifiers were applied to previously collected data, with and without the SMOTE oversampling technique. Naive Bayes with SMOTE gave the best overall performance even though its accuracy was the lowest of the four configurations (71.66%), compared with 76.03% for Naive Bayes without SMOTE, 80.47% for KNN with SMOTE, and 83.02% for KNN without SMOTE. Although KNN without SMOTE had the highest accuracy, Naive Bayes with SMOTE showed a better balance between accuracy, precision, and recall. Applying SMOTE improved the precision and recall of Naive Bayes, which indicates that it can help address class imbalance in the diabetes dataset. The study offers guidance on choosing algorithms and imbalance-handling techniques for predicting diabetes on imbalanced datasets.
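The two ideas the abstract relies on, SMOTE oversampling and the gap between accuracy and recall on imbalanced data, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names smote_sketch and metrics, the sample points, and the confusion counts are all hypothetical and are not taken from the study. The sketch assumes the standard SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random
from math import dist


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolation.

    Each new point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical minority-class samples (two made-up features per patient).
minority = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0), (4.0, 4.0)]
new_points = smote_sketch(minority, n_new=6)

# Hypothetical confusion counts for 100 test cases, 10 of them positive.
model_a = metrics(tp=2, fp=1, fn=8, tn=89)   # rarely predicts the positive class
model_b = metrics(tp=8, fp=12, fn=2, tn=78)  # finds most positive cases

for name, (acc, prec, rec) in (("A", model_a), ("B", model_b)):
    print(f"model {name}: accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

In this made-up example, model A has the higher accuracy (0.91 against 0.86) but detects only 2 of the 10 positive cases, while model B detects 8. This is why accuracy alone can be a misleading basis for comparing classifiers on an imbalanced dataset.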
Copyright (c) 2023 Muhammad Rousydi Hunafa, Arief Hermawan

This work is licensed under a Creative Commons Attribution 4.0 International License.