Preparing Dual Data Normalization for KNN Classification in Prediction of Heart Failure


Authors

  • Alya Masitha, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
  • Muhammad Kunta Biddinika, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
  • Herman, Universitas Ahmad Dahlan, Yogyakarta, Indonesia

DOI:

https://doi.org/10.30865/klik.v4i3.1382

Keywords:

Heart Failure; Min-Max; Simple Feature Scale; K-NN; Classification; Normalization; Preprocessing

Abstract

Heart failure is a serious condition that significantly affects a person's health and quality of life, so it is important to develop classification methods that can help detect it. In this research, the data are preprocessed before being used to classify heart failure with a machine learning model such as K-NN. Data preprocessing simplifies data analysis, improves the quality of the data used, and is an essential step toward accurate results. The dataset used in this research is raw data that has not yet been preprocessed. It consists of 918 records with a target attribute of 0 or 1, where 0 indicates a normal condition and 1 indicates potential heart failure. Preprocessing includes data cleaning, data transformation, and data normalization. The main objective of this research is to carry out the preprocessing stage on a heart failure dataset. A comparison of two normalization methods, Min-Max and Simple Feature Scale, shows that Simple Feature Scale performs better, with an accuracy of 85%, while Min-Max reaches 84%.
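
The two normalization methods compared above can be illustrated with a minimal sketch. It assumes the common definitions: Min-Max rescales a column as (x - min) / (max - min), and Simple Feature Scale divides each value by the column maximum. The function names, the toy rows, and the choice of k = 3 are hypothetical and chosen only for illustration; this is not the authors' implementation, and the values are not from the 918-record dataset.

```python
from collections import Counter
from math import dist


def min_max_scale(column):
    """Min-Max: map values to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def simple_feature_scale(column):
    """Simple Feature Scale: divide each value by the column maximum."""
    hi = max(column)
    if hi == 0:  # all-zero column: avoid division by zero
        return [0.0 for _ in column]
    return [x / hi for x in column]


def scale_rows(rows, scaler):
    """Apply a column-wise scaler to a list of feature rows."""
    columns = [scaler(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up toy rows (age, cholesterol); NOT taken from the paper's dataset.
rows = [[40, 180], [45, 200], [50, 190], [60, 300], [65, 320], [70, 310]]
labels = [0, 0, 0, 1, 1, 1]

for name, scaler in [("min-max", min_max_scale), ("simple", simple_feature_scale)]:
    # For brevity the scaler sees all rows; a real evaluation would fit it
    # on the training split only.
    scaled = scale_rows(rows, scaler)
    # Classify the last row using the other rows as training data.
    prediction = knn_predict(scaled[:-1], labels[:-1], scaled[-1], k=3)
    print(name, prediction)
```

Because K-NN relies on distances, an unscaled feature with a large range (cholesterol here) would dominate the vote, which is why the choice of normalization can shift accuracy.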


ARTICLE HISTORY


Published: 2023-12-06