• Title/Abstract/Keywords: k-NN Method


Formation of Nearest Neighbors Set Based on Similarity Threshold

  • 이재식;이진천
    • 지능정보연구 / Vol. 13, No. 2 / pp. 1-14 / 2007
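  • Case-based reasoning (CBR) is one of the data mining techniques used successfully in a variety of prediction problems. The predictive performance of a CBR system depends on how the nearest neighbor set used for prediction is formed. In forming this set, most prior studies have adopted the k-NN method, which includes a fixed number k of cases. However, a CBR system that adopts k-NN suffers degraded predictive performance when k is set too large or too small. To address this problem, this study proposes the s-NN method, which forms the nearest neighbor set using the similarity threshold itself. In experiments on data from the UCI Machine Learning Repository, the CBR model using s-NN outperformed the CBR model using k-NN.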
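The contrast between a fixed-size neighbor set and a threshold-based one can be sketched as follows. This is a minimal illustration, not the authors' implementation: the similarity function (1 / (1 + Euclidean distance)), the majority-vote prediction, the function names, and the toy cases are all assumptions made for the example.

```python
import math
from collections import Counter


def similarity(a, b):
    """Illustrative similarity in (0, 1]; the paper's measure is not given here."""
    return 1.0 / (1.0 + math.dist(a, b))


def k_nn(query, cases, k):
    """Fixed-size neighbor set: the k most similar cases."""
    ranked = sorted(cases, key=lambda c: similarity(query, c[0]), reverse=True)
    return ranked[:k]


def s_nn(query, cases, threshold):
    """Threshold-based neighbor set: every case with similarity >= threshold."""
    return [c for c in cases if similarity(query, c[0]) >= threshold]


def predict(neighbors):
    """Majority vote over neighbor labels; None when the set is empty."""
    if not neighbors:
        return None
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


# Hypothetical toy cases: (feature vector, label).
CASES = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((5.0, 5.0), "b"),
    ((5.1, 5.0), "b"),
    ((5.0, 5.2), "b"),
]
```

With the query `(0.0, 0.1)`, `k_nn(..., k=5)` pulls in the three distant "b" cases and the vote goes to "b", while `s_nn(..., threshold=0.5)` keeps only the two nearby "a" cases. The size of the s-NN set varies with the query, which is the property the abstract relies on.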


Imputation of Missing Data Based on Hot Deck Method Using K-nn

  • 권순창
    • 한국IT서비스학회지 / Vol. 13, No. 4 / pp. 359-375 / 2014
  • Missing data are unavoidable in data collection, because some respondents, intentionally or unintentionally, leave questions unanswered in studies and experiments. Missing data not only inflate and distort standard deviations but also make parameter estimation harder and reduce the reliability of research results. Although the hot deck method is widely used, researchers have paid it little attention because it handles missing data in ambiguous ways. Hot deck can be complemented with k-NN, a machine learning method that forms donor groups closest to the properties of the records with missing data. This study therefore imputes missing data with a hot deck method based on k-NN. The approach was compared with listwise deletion and with mean, mode, linear regression, and SVM imputation on nominal and ratio data types, and it yielded values closest to the originals. Simulations with different numbers of neighbors and different distance measures were carried out, and k-NN achieved better performance. The study thus rediscovers hot deck imputation, which has failed to attract researchers' attention, and should help in selecting non-parametric methods that are less affected by the structure and causes of missing data.
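The idea of using k-NN to build the donor group for hot deck imputation can be sketched as below. This is a hedged illustration only: the function name `knn_hot_deck`, the Euclidean distance over numeric features, and the random draw from the k nearest donors are assumptions, and the study's actual distance measures and donor selection rule may differ.

```python
import math
import random


def knn_hot_deck(records, target, features, k=3, rng=None):
    """Return copies of records with missing `target` values (None) filled in.

    Each missing value is copied from one donor drawn at random from the
    k complete records nearest to the incomplete one (assumed rule).
    """
    rng = rng or random.Random(0)
    donors = [r for r in records if r[target] is not None]
    if not donors:
        raise ValueError("no complete records available as donors")

    imputed = []
    for record in records:
        if record[target] is not None:
            imputed.append(dict(record))
            continue
        point = [record[f] for f in features]
        # Donor group: the k complete records closest on the observed features.
        nearest = sorted(
            donors, key=lambda d: math.dist(point, [d[f] for f in features])
        )[:k]
        donor = rng.choice(nearest)
        imputed.append({**record, target: donor[target]})
    return imputed
```

Because the imputed value is always an observed value taken from a real donor, the approach works for nominal fields as well as numeric ones.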

Comparison with Finger Print Method and NN as PD Classification

  • 박성희;박재열;이강원;강성화;임기조
    • 한국전기전자재료학회: Conference Proceedings / Proceedings of the 한국전기전자재료학회 2003 Summer Conference, Vol. 4, No. 2 / pp. 1163-1167 / 2003
  • Statistical distribution parameters have been used for PD classification for several decades. Recently, these parameters have been applied in the fingerprint method, neural networks (NN), and other techniques. In this paper, we studied the fingerprint method and an NN with the back-propagation (BP) learning algorithm, both using statistical distribution parameters, and compared the two as classification methods. The comparison shows that the NN gives better results than the fingerprint method in terms of calculation speed, visual effect, and simplicity. The NN therefore has more advantages as a tool for PD classification.


Neural network based model for seismic assessment of existing RC buildings

  • Caglar, Naci;Garip, Zehra Sule
    • Computers and Concrete / Vol. 12, No. 2 / pp. 229-241 / 2013
  • The objective of this study is to show that neural networks (NN) are sufficient as a safer, quicker, more robust, and more reliable method for the seismic assessment of existing reinforced concrete (RC) buildings. The NN-based approach is applied as an alternative method to determine the seismic performance of each existing RC building in terms of damage level. A multilayer perceptron (MLP) with a back-propagation (BP) algorithm is employed, using a scaled conjugate gradient. The NN-based model was developed, trained, and tested with a MATLAB-based program. The database for the model was built using a statistical procedure called the P25 method. The model was also verified on a set of real existing RC buildings exposed to the 2003 Bingol earthquake. The results demonstrate that the NN-based approach is highly successful and can be used as an alternative method to determine the seismic performance of existing RC buildings.

Regional Extension of the Neural Network Model for Storm Surge Prediction Using Cluster Analysis

  • 이다운;서장원;윤용훈
    • 대기 / Vol. 16, No. 4 / pp. 259-267 / 2006
  • In the present study, a neural network (NN) model combined with cluster analysis was developed to predict storm surge along the entire Korean coast, with a special focus on regional extension. The model builds one NN per cluster (CL-NN). To find the optimal clustering of the stations, an agglomerative method, one of the hierarchical clustering methods, was used. Stations were clustered according to the centroid-linkage criterion, and the clustering stops when the distance between the groups to be merged exceeds a given criterion. The CL-NN can then be constructed to predict storm surge in each cluster region. To validate the model, sea levels predicted by the CL-NN model were compared with those from conventional harmonic analysis (HA) and from the NN model in each region. The forecasts from the NN and CL-NN models agree with the observed data more closely than those from HA. In particular, statistics such as RMSE and the correlation coefficient show little difference between the CL-NN and NN results. These results show that cluster analysis and the CL-NN model can be applied to regional storm surge prediction and to the development of a forecast system.
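The clustering step described in this abstract (agglomerative merging under centroid linkage, stopped by a distance criterion) can be sketched in a few lines. This is an assumption-laden illustration: the station features are taken to be plain 2-D coordinates, and `max_distance` is a hypothetical name for the stopping criterion; the abstract does not specify either.

```python
import math


def centroid(cluster):
    """Mean position of the points in one cluster."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))


def centroid_linkage_clusters(points, max_distance):
    """Merge the two clusters with the closest centroids until the closest
    pair is farther apart than max_distance (assumed stopping rule)."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Closest pair of clusters by centroid-to-centroid distance.
        distance, i, j = min(
            (math.dist(centroid(clusters[i]), centroid(clusters[j])), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if distance > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Each resulting cluster would then get its own model, which is the role the CL-NN plays in the abstract. The pairwise search here is quadratic per merge and is meant for a small number of stations.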

An Improvement Of Spatial Partitioning Method For Flocking Behaviors By Using Previous k-Nearest Neighbors

  • 이재문
    • 한국게임학회 논문지 / Vol. 9, No. 2 / pp. 115-123 / 2009
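  • This paper proposes an algorithm that improves the performance of the spatial partitioning method for flocking. The key idea is to exploit the fact that a boid, an entity moving within a flock, continually changes its direction and position, while its kNN, the k nearest neighbors that influence the choice of its next direction, does not change often. The paper proposes a method that uses the previous kNN to determine whether the new kNN has changed, and the validity of the method is proved by a theorem. The proposed method was implemented and its performance was compared with that of the existing spatial partitioning method. The comparison shows that the proposed algorithm improves on the existing algorithm by about 30% in terms of frames per second.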


The Method of Continuous Nearest Neighbor Search on Trajectory of Moving Objects

  • Park, Bo-Yoon;Kim, Sang-Ho;Nam, Kwang-Woo;Ryo, Keun-Ho
    • 한국지능시스템학회: Conference Proceedings / 한국퍼지및지능시스템학회 2003, ISIS 2003 / pp. 467-470 / 2003
  • When a user wants to find the objects nearest to their position, a nearest neighbor (NN) query is used. GIS applications such as navigation and traffic control systems require NN query processing for moving objects (MOs). MOs have trajectories along which their positions change over time. Therefore, when processing an NN query for MOs, we should be able to find the NN object, which changes continuously over the whole query time, as well as objects moving nearby along the query trajectory. However, none of the previous works consider trajectory information between objects. We therefore propose a continuous NN query method for the trajectories of MOs, which we call the CTNN (continuous trajectory NN) technique. It can find a constantly valid NN object over the whole query time by considering trajectory information.


Plurality Rule-based Density and Correlation Coefficient-based Clustering for K-NN

  • Aung, Swe Swe;Nagayama, Itaru;Tamaki, Shiro
    • IEIE Transactions on Smart Processing and Computing / Vol. 6, No. 3 / pp. 183-192 / 2017
  • k-nearest neighbor (K-NN) is a well-known classification algorithm in machine learning that classifies a point based on the nearest training examples in feature space. However, K-NN is a lazy learning method. Therefore, if a K-NN-based system depends heavily on a huge amount of history data to achieve an accurate prediction for a particular task, it gradually faces a processing-time performance-degradation problem. Many researchers consider only classification accuracy, but estimation speed also plays an essential role in real-time prediction systems. To compensate for this weakness, this paper proposes correlation coefficient-based clustering (CCC), aimed at improving the processing-time performance of K-NN, and plurality rule-based density (PRD), aimed at improving estimation accuracy. For experiments, we used real datasets (on breast cancer, breast tissue, heart, and the iris) from the University of California, Irvine (UCI) machine learning repository. Real traffic data collected from Ojana Junction, Route 58, Okinawa, Japan, was also used to demonstrate the efficiency of the method. On these datasets, the new approach showed better processing-time performance than classical K-NN. In addition, in experiments on real-world datasets, we compared the prediction accuracy of our approach with that of density peaks clustering based on K-NN and principal component analysis (DPC-KNN-PCA).

An Improvement Of Efficiency For kNN By Using A Heuristic

  • 이재문
    • 정보처리학회논문지B / Vol. 10B, No. 6 / pp. 719-724 / 2003
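  • This paper proposes a heuristic that improves the efficiency of kNN without any loss of accuracy. The heuristic minimizes the number of document-to-document similarity computations, which dominate kNN execution time. To this end, the paper proposes a method for computing an upper bound on the similarity and a method for sorting the training documents. The heuristic was implemented on AI::Categorizer, a document classification framework, and compared with the conventional kNN on the well-known Reuters-21578 data. The performance comparison shows that the method using the proposed heuristic improves execution speed by about 30 to 40% over the conventional kNN.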

A Density-Based K-Nearest Neighbors Search Method

  • Jang I. S.;Min K.W.;Choi W.S
    • 대한원격탐사학회: Conference Proceedings / 대한원격탐사학회 2004, Proceedings of ISRS 2004 / pp. 260-262 / 2004
  • Spatial database systems provide many query types, most of which require frequent disk I/O and much CPU time. A k-NN search finds the k objects closest to the query point, and several k-NN search methods have been proposed so far. Among these, the MINMAX distance method aims to avoid visiting unnecessary nodes by applying a pruning technique. However, this method accesses the disk more than necessary while pruning unnecessary nodes. In this paper, we propose a new k-NN search algorithm based on the density of objects. With this method, we use the density of the data set to predict the radius expected to contain the k nearest objects, search for objects within this radius, and adjust the radius if the search fails. Experimental results show that this method outperforms the previous MINMAX distance method: it performs fewer disk accesses than the MINMAX method, by up to 22% and by 6% on average.
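The predict-search-adjust loop described in this abstract can be sketched as follows. Several details are assumptions made for illustration, not taken from the paper: points are 2-D, the density is estimated as the point count divided by the bounding-box area, the initial radius solves density × π × r² = k, a failed search enlarges the radius by a fixed `growth` factor, and the range search is a plain linear scan standing in for a spatial index.

```python
import math


def density_knn(points, query, k, growth=1.5):
    """Find the k nearest points by searching a density-predicted radius."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Bounding-box area; fall back to 1.0 for degenerate (zero-area) data.
    area = (max(xs) - min(xs)) * (max(ys) - min(ys)) or 1.0
    density = len(points) / area

    # Radius of a circle expected to hold k points at uniform density.
    radius = math.sqrt(k / (math.pi * density))
    while True:
        candidates = [p for p in points if math.dist(p, query) <= radius]
        if len(candidates) >= k:
            break
        radius *= growth  # search failed: enlarge the radius and retry

    # Every point outside the circle is farther than all candidates, so the
    # k nearest candidates are the true k nearest neighbors.
    return sorted(candidates, key=lambda p: math.dist(p, query))[:k]
```

The benefit in a disk-based setting would come from the range search touching only index nodes that intersect the circle; the linear scan above keeps the sketch self-contained and does not reproduce that saving.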
