Application of the Exponential Function to the Weighting of the Euclidean Distance Function in the K-Nearest Neighbor Algorithm

Muhammad Jauhar Vikri; Roihatur Rohmah

doi:10.29407/gj.v6i2.18070

Authors

Muhammad Jauhar Vikri Universitas Nahdlatul Ulama Sunan Giri
Roihatur Rohmah Universitas Nahdlatul Ulama Sunan Giri


Keywords:

Classification, k-NN, Attribute Weighting, Exponential Function, Euclidean Distance

Abstract

k-Nearest Neighbor (k-NN) is a popular classification algorithm that is widely used to solve classification problems, because it is simple, easy to explain, and easy to implement. However, its classification results are strongly influenced by the scale of the input data, and the Euclidean distance treats all attributes equally rather than according to the relevance of each attribute. This lowers the quality of the classification results. One way to improve the classification accuracy of k-NN is to weight the features when measuring the Euclidean distance. In this study, an exponential function is applied to weight the features in the Euclidean distance measurement used by k-NN. The improvement is evaluated experimentally using a data mining method, and the performance of the proposed method is compared with the original k-NN method and with previously published weighted k-NN methods. The number of nearest neighbors used for the decision is set to k=5. In the experiment, the proposed algorithm was compared with the k-NN, Wk-NN, and DWk-NN algorithms. Overall, the comparison gave average values of 85.87% for k-NN, 86.98% for Wk-NN, and 88.19% for DWk-NN, while the k-NN algorithm with exponential-function weighting obtained 90.17%.
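The general idea of feature weighting in the Euclidean distance can be illustrated with a minimal sketch. The abstract does not state the exact form of the exponential weighting, so the function `exponential_weights` below, its normalized form w_j = exp(r_j) / sum_i exp(r_i), the relevance scores, and the toy data are all assumptions made for illustration; this is not the authors' implementation. Only the value k=5 is taken from the abstract.

```python
import math
from collections import Counter


def exponential_weights(relevance):
    """Hypothetical weighting: w_j = exp(r_j) / sum_i exp(r_i).

    `relevance` holds one assumed relevance score per attribute;
    the paper's actual weighting formula is not given in the abstract.
    """
    exps = [math.exp(r) for r in relevance]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_euclidean(a, b, weights):
    """Euclidean distance in which each squared difference is scaled by a weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def knn_predict(train_X, train_y, query, weights, k=5):
    """Majority vote among the k training points closest to `query`."""
    distances = sorted(
        (weighted_euclidean(x, query, weights), label)
        for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data (invented): two attributes, two classes.
train_X = [
    [0.10, 0.90], [0.20, 0.10], [0.15, 0.50], [0.30, 0.30],
    [0.90, 0.80], [0.80, 0.20], [0.85, 0.60],
]
train_y = ["a", "a", "a", "a", "b", "b", "b"]

# Assume the first attribute is far more relevant than the second.
weights = exponential_weights([2.0, 0.0])
prediction = knn_predict(train_X, train_y, [0.25, 0.70], weights, k=5)
print(prediction)
```

With equal relevance scores the weights are uniform, so the neighbor ranking is the same as in unweighted k-NN; larger scores make differences in the corresponding attribute count more toward the distance.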

