• Title/Summary/Keyword: k-NN Method

Formation of Nearest Neighbors Set Based on Similarity Threshold (유사도 임계치에 근거한 최근접 이웃 집합의 구성)

  • Lee, Jae-Sik;Lee, Jin-Chun
    • Journal of Intelligence and Information Systems / v.13 no.2 / pp.1-14 / 2007
  • Case-based reasoning (CBR) is one of the most widely applied data mining techniques and has proven effective in various domains. Because CBR is essentially based on the k-nearest neighbors (k-NN) method, the value of k directly affects the performance of a CBR model. Once the value of k is set, it stays fixed for the lifetime of the CBR model. If it is set larger or smaller than the optimal value, however, the performance of the model deteriorates. In this research, we propose a new method, which we call the s-NN method, that composes the nearest-neighbor set from the similarity scores themselves rather than from a fixed value of k. With the s-NN method, a different number of nearest neighbors can be selected for each new case. A performance evaluation using data from the UCI Machine Learning Repository shows that the CBR model adopting the s-NN method outperforms the CBR model adopting the traditional k-NN method.
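
The idea of choosing neighbors by a similarity threshold instead of a fixed k can be sketched as follows. This is a minimal illustration, not the authors' implementation: the similarity function `1 / (1 + Euclidean distance)`, the fallback to the single best case when nothing clears the threshold, the majority vote, and all names and data are assumptions.

```python
import math

def similarity(a, b):
    """Similarity in (0, 1]; 1 / (1 + Euclidean distance) is an assumed choice."""
    return 1.0 / (1.0 + math.dist(a, b))

def s_nn(query, cases, threshold):
    """Return every stored case whose similarity to `query` meets `threshold`."""
    scored = [(similarity(query, features), label) for features, label in cases]
    neighbors = [(s, label) for s, label in scored if s >= threshold]
    # Assumed fallback: if no case clears the threshold, keep the single best one.
    if not neighbors:
        neighbors = [max(scored)]
    return sorted(neighbors, reverse=True)

def predict(query, cases, threshold):
    """Majority vote over the threshold-selected neighbors."""
    votes = {}
    for _, label in s_nn(query, cases, threshold):
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical case base: (feature vector, class label).
cases = [
    ((0.0, 0.0), "A"), ((0.1, 0.0), "A"), ((0.2, 0.1), "A"),
    ((3.0, 3.0), "B"), ((3.1, 3.0), "B"),
]

near_a = s_nn((0.05, 0.0), cases, threshold=0.8)  # dense region: more neighbors
near_b = s_nn((3.05, 3.0), cases, threshold=0.8)  # sparser region: fewer neighbors
```

With the same threshold, the two queries end up with neighbor sets of different sizes, which is the behavior a fixed k cannot provide.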

Imputation of Missing Data Based on Hot Deck Method Using K-nn (K-nn을 이용한 Hot Deck 기반의 결측치 대체)

  • Kwon, Soonchang
    • Journal of Information Technology Services / v.13 no.4 / pp.359-375 / 2014
  • Researchers cannot avoid missing data when collecting data, because some respondents, arbitrarily or non-arbitrarily, do not answer questions in studies and experiments. Missing data not only inflate and distort standard deviations but also make parameter estimation less convenient and reduce the reliability of research results. Although hot deck imputation is widely used, researchers have shown little interest in it, because it handles missing data in ambiguous ways. Hot deck can be complemented with k-NN, a machine learning method that can form donor groups closest to the properties of the record with missing data. This study therefore imputes missing data with a hot deck method based on k-NN. The k-NN hot deck imputation was compared with listwise deletion and with mean, mode, linear regression, and SVM imputation on nominal and ratio data types, and it produced values reasonably close to the original ones. Simulations with different numbers of neighbors and different distance measures were carried out, and k-NN achieved better performance. The study thus rediscovers hot deck imputation, which had failed to attract the attention of researchers. Its results should help in selecting non-parametric methods that are less likely to be affected by the structure of missing data and its causes.
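
A k-NN hot deck of the kind described above might look like the sketch below. It is an illustration under stated assumptions, not the study's procedure: the function name `knn_hot_deck`, Euclidean distance, numeric and fully observed auxiliary features, random donor selection among the k nearest, and the example records are all hypothetical.

```python
import math
import random

def knn_hot_deck(records, target, k=3, rng=None):
    """Fill missing `target` values (None) by copying from one of the k nearest donors."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    donors = [r for r in records if r[target] is not None]
    features = [key for key in records[0] if key != target]

    def vector(rec):
        return [rec[f] for f in features]

    imputed = []
    for rec in records:
        if rec[target] is not None:
            imputed.append(dict(rec))
            continue
        # Rank donors by Euclidean distance over the observed (non-target) features.
        ranked = sorted(donors, key=lambda d: math.dist(vector(rec), vector(d)))
        donor = rng.choice(ranked[:k])  # hot deck: copy an observed value, do not average
        imputed.append({**rec, target: donor[target]})
    return imputed

# Hypothetical survey records; the last respondent did not answer "group".
records = [
    {"age": 25, "income": 30, "group": "low"},
    {"age": 27, "income": 32, "group": "low"},
    {"age": 26, "income": 31, "group": "low"},
    {"age": 60, "income": 90, "group": "high"},
    {"age": 62, "income": 95, "group": "high"},
    {"age": 24, "income": 29, "group": None},
]
filled = knn_hot_deck(records, "group", k=3)
```

Because the imputed value is copied from a real donor, the approach works for nominal variables such as `group` as well as for ratio-scale ones.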

Comparison with Finger Print Method and NN as PD Classification (PD 분류에 있어서 핑거프린트법과 신경망의 비교)

  • Park, Sung-Hee;Park, Jae-Yeol;Lee, Kang-Won;Kang, Seong-Hwa;Lim, Kee-Joe
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2003.07b / pp.1163-1167 / 2003
  • Statistical distribution parameters have been used for PD classification for several decades. Recently, these parameters have been used with the fingerprint method, neural networks (NN), and other techniques. In this paper, we studied the fingerprint method and an NN with the back-propagation (BP) learning algorithm, both using the statistical distribution parameters, and compared the two as classification methods. The comparison shows that the NN gives better results than the fingerprint method in terms of calculation speed, visible effect, and simplicity. The NN therefore has more advantages as a tool for PD classification.

Neural network based model for seismic assessment of existing RC buildings

  • Caglar, Naci;Garip, Zehra Sule
    • Computers and Concrete / v.12 no.2 / pp.229-241 / 2013
  • The objective of this study is to show that neural networks (NN) are adequate as a more secure, quicker, more robust, and more reliable method for the seismic assessment of existing reinforced concrete (RC) buildings. The NN-based approach is applied as an alternative method to determine the seismic performance of each existing RC building in terms of damage level. In the application of the NN, a multilayer perceptron (MLP) with a back-propagation (BP) algorithm is employed using a scaled conjugate gradient. The NN-based model was developed, trained, and tested with a MATLAB-based program. The database of this model was developed using a statistical procedure called the P25 method. The NN-based model was also verified with a verification set consisting of real RC buildings exposed to the 2003 Bingol earthquake. The results demonstrate that the NN-based approach is highly successful and can be used as an alternative method to determine the seismic performance of each existing RC building.

Regional Extension of the Neural Network Model for Storm Surge Prediction Using Cluster Analysis (군집분석을 이용한 국지해일모델 지역확장)

  • Lee, Da-Un;Seo, Jang-Won;Youn, Yong-Hoon
    • Atmosphere / v.16 no.4 / pp.259-267 / 2006
  • In the present study, a neural network (NN) model combined with cluster analysis was developed to predict storm surge over all Korean coastal regions, with a special focus on regional extension. The model used in this study builds one NN model per cluster (CL-NN) from the cluster analysis. To find the optimal clustering of the stations, an agglomerative method, one of the hierarchical clustering methods, was used. Stations were clustered according to the centroid-linkage criterion, and the cluster analysis stops when the distance between merged groups exceeds a given criterion. The CL-NN can then be constructed to predict storm surge in each cluster region. To validate the model results, the sea level predicted by the CL-NN model was compared with that of conventional harmonic analysis (HA) and of the NN model in each region. The forecasts from the NN and CL-NN models agree more closely with the observed data than those of HA. In particular, statistics such as RMSE and the correlation coefficient show little difference between the CL-NN and NN model results. These results show that cluster analysis and the CL-NN model can be applied to regional storm surge prediction and developed into a forecast system.
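
The clustering step described above (agglomerative merging under centroid linkage, stopped by a distance criterion) can be sketched in plain Python. This is a generic illustration, not the paper's code: the function names, the threshold value, and the two-dimensional station feature vectors are assumptions.

```python
import math

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def centroid_linkage(points, max_distance):
    """Merge the two clusters with the closest centroids until that gap exceeds `max_distance`."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:  # stopping criterion: the next merge would be too far apart
            break
        merged = clusters.pop(j)  # j > i, so index i stays valid after the pop
        clusters[i].extend(merged)
    return clusters

# Hypothetical feature vectors for five tide stations.
stations = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
groups = centroid_linkage(stations, max_distance=3.0)
```

Each resulting group would then get its own NN model, in the spirit of the per-cluster CL-NN described in the abstract.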

An Improvement Of Spatial Partitioning Method For Flocking Behaviors By Using Previous k-Nearest Neighbors (이전 k 개의 가장 가까운 이웃을 이용한 무리 짓기에 대한 공간분할 방법의 개선)

  • Lee, Jae-Moon
    • Journal of Korea Game Society / v.9 no.2 / pp.115-123 / 2009
  • This paper proposes an algorithm to improve the performance of the spatial partitioning method for flocking behaviors. The core idea exploits the fact that even though a moving entity in a flock (a boid) continuously changes its direction and position, its k-nearest neighbors (kNN), which affect the decision of its next direction, do not change frequently. The paper proposes a method that uses the previous kNN to check whether the new kNN has changed, and proves the correctness of the method with two theorems. The proposed algorithm was implemented and its performance was compared with the conventional spatial partitioning method. The comparison shows that the proposed algorithm outperforms the conventional one by about 30% in frames per second.

The Method of Continuous Nearest Neighbor Search on Trajectory of Moving Objects

  • Park, Bo-Yoon;Kim, Sang-Ho;Nam, Kwang-Woo;Ryo, Keun-Ho
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.467-470 / 2003
  • When a user wants to find the objects nearest to his or her position, a nearest neighbor (NN) query is used. GIS applications such as navigation systems and traffic control systems require processing NN queries for moving objects (MOs). MOs have trajectories along which their positions change over time. Therefore, when processing an NN query for MOs, we should be able to find the NN object, which changes continuously over the whole query time, as well as objects moving nearby on the query trajectory. However, no previous work considers trajectory information between objects. We therefore propose a continuous NN query method for the trajectories of MOs, which we call the CTNN (continuous trajectory NN) technique. It can find a constantly valid NN object over the whole query time by considering trajectory information.

Plurality Rule-based Density and Correlation Coefficient-based Clustering for K-NN

  • Aung, Swe Swe;Nagayama, Itaru;Tamaki, Shiro
    • IEIE Transactions on Smart Processing and Computing / v.6 no.3 / pp.183-192 / 2017
  • k-nearest neighbor (K-NN) is a well-known classification algorithm in machine learning that classifies a sample from the nearest training examples in the feature space. However, K-NN is a lazy learning method. Therefore, if a K-NN-based system depends heavily on a huge amount of historical data to achieve an accurate prediction for a particular task, its processing time gradually degrades. We have noticed that many researchers consider only classification accuracy, but estimation speed also plays an essential role in real-time prediction systems. To compensate for this weakness, this paper proposes correlation coefficient-based clustering (CCC), aimed at improving the processing time of K-NN, and plurality rule-based density (PRD), aimed at improving estimation accuracy. For experiments, we used real datasets (breast cancer, breast tissue, heart, and iris) from the University of California, Irvine (UCI) machine learning repository. Real traffic data collected from Ojana Junction, Route 58, Okinawa, Japan, was also used to demonstrate the efficiency of this method. Using these datasets, we showed better processing-time performance with the new approach than with classical K-NN. In addition, through experiments on real-world datasets, we compared the prediction accuracy of our approach with density peaks clustering based on K-NN and principal component analysis (DPC-KNN-PCA).

An Improvement Of Efficiency For kNN By Using A Heuristic (휴리스틱을 이용한 kNN의 효율성 개선)

  • Lee, Jae-Moon
    • The KIPS Transactions:PartB / v.10B no.6 / pp.719-724 / 2003
  • This paper proposes a heuristic to enhance the speed of kNN without loss of accuracy. The heuristic minimizes the computation of the similarity between two documents, which is the dominant cost in kNN. To do this, the paper proposes a method to calculate an upper limit of the similarity and to sort the training documents. The heuristic was implemented on an existing text categorization framework, AI::Categorizer, and compared with conventional kNN on the well-known Reuters-21578 data set. The comparisons show that the proposed heuristic outperforms kNN by about 30 to 40% in execution time.
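
The abstract does not say which upper limit the paper uses, so the sketch below substitutes a simple one to show the general pattern of "bound, sort, stop early". It assumes dot-product similarity over sparse term-weight vectors and uses the Cauchy-Schwarz bound `dot(q, d) <= |q| * |d|`; the function names and the toy documents are hypothetical.

```python
import heapq
import math

def norm(vec):
    """Euclidean length of a sparse term-weight vector (dict of term -> weight)."""
    return math.sqrt(sum(w * w for w in vec.values()))

def dot(a, b):
    """Dot product of two sparse vectors, iterating over the shorter one."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def knn_with_bound(query, docs, k):
    """Return the k most similar docs and how many similarities were actually computed."""
    q_norm = norm(query)
    # Sort once by the document-dependent factor of the bound, largest first.
    ranked = sorted(docs, key=lambda d: norm(d[1]), reverse=True)
    heap, computed = [], 0  # min-heap of (similarity, doc_id)
    for doc_id, vec in ranked:
        upper = q_norm * norm(vec)  # assumed bound, not the paper's
        if len(heap) == k and upper <= heap[0][0]:
            break  # no remaining document can beat the current k-th best
        sim = dot(query, vec)
        computed += 1
        if len(heap) < k:
            heapq.heappush(heap, (sim, doc_id))
        elif sim > heap[0][0]:
            heapq.heapreplace(heap, (sim, doc_id))
    return sorted(heap, reverse=True), computed

# Hypothetical training documents as (id, term weights).
docs = [
    ("d1", {"knn": 3.0, "text": 4.0}),
    ("d2", {"knn": 1.0}),
    ("d3", {"text": 0.5}),
    ("d4", {"graph": 0.2}),
]
query = {"knn": 1.0, "text": 1.0}
top, computed = knn_with_bound(query, docs, k=2)
```

In a real system the document norms would be computed once at training time, so the sort and the bound cost almost nothing per query; the saving comes from the similarities that are never evaluated.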

A Density-Based K-Nearest Neighbors Search Method

  • Jang I. S.;Min K.W.;Choi W.S
    • Proceedings of the KSRS Conference / 2004.10a / pp.260-262 / 2004
  • Spatial database systems provide many query types, and most of them require frequent disk I/O and much CPU time. A k-NN search finds the k objects closest to the query point, and several k-NN search methods have been proposed so far. Among these, the MINMAX distance method aims to avoid visiting unnecessary nodes by applying a pruning technique. However, this method accesses the disk more than necessary while pruning unnecessary nodes. In this paper, we propose a new k-NN search algorithm based on the density of objects. With this method, we predict the radius expected to contain the k-NN objects using the density of the data set, search for objects within this radius, and adjust the radius if the search fails. Experimental results show that this method outperforms the previous MINMAX distance method: it performs fewer disk accesses than the MINMAX method, by a maximum of 22% and an average of 6%.
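
The predict, search, and adjust loop described above can be sketched as follows. This is only an illustration under assumptions: points are two-dimensional and roughly uniformly distributed, the radius comes from solving `density * pi * r^2 = k`, a failed search doubles the radius, and a linear scan stands in for the index-based range query a spatial database would use. All names and data are hypothetical.

```python
import math

def density_knn(points, query, k, area):
    """Find the k nearest 2-D points by searching a density-predicted radius, growing it on failure."""
    if k > len(points):
        raise ValueError("k exceeds the number of points")
    density = len(points) / area                 # assumes roughly uniform density
    radius = math.sqrt(k / (math.pi * density))  # circle expected to hold k points
    rounds = 0
    while True:
        rounds += 1
        # Stand-in for a range query against a spatial index.
        inside = [p for p in points if math.dist(p, query) <= radius]
        if len(inside) >= k:
            inside.sort(key=lambda p: math.dist(p, query))
            return inside[:k], rounds
        radius *= 2  # assumed adjustment rule: double the radius and retry

# Hypothetical data set: a 10 x 10 grid of points covering an area of 100 units.
points = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]
center_nn, center_rounds = density_knn(points, (5.0, 5.0), k=4, area=100.0)
corner_nn, corner_rounds = density_knn(points, (0.0, 0.0), k=4, area=100.0)
```

Any circle that contains at least k points also contains the true k nearest ones, so the result is exact; the density estimate only affects how many range searches are needed. The corner query illustrates the failure case, where the first radius covers too few points and has to be enlarged.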
