• Title/Summary/Keyword: nearest neighbor method (최근접이웃법)

On the use of weighted adaptive nearest neighbors for missing value imputation (가중 적응 최근접 이웃을 이용한 결측치 대치)

  • Yum, Yunjin;Kim, Dongjae
    • The Korean Journal of Applied Statistics / v.31 no.4 / pp.507-516 / 2018
  • k-nearest neighbors (KNN) imputation is widely used among single imputation methods because it is robust even when a parametric assumption such as multivariate normality is not satisfied. We propose a weighted adaptive nearest neighbors imputation method that combines the adaptive nearest neighbors imputation method, which accounts for local features of the data within KNN imputation, with the weighted k-nearest neighbors method, which is less sensitive to extreme values or outliers among the k nearest neighbors. We conducted a Monte Carlo simulation study to compare the performance of the proposed imputation method with previous imputation methods.

On the Use of Sequential Adaptive Nearest Neighbors for Missing Value Imputation (순차 적응 최근접 이웃을 활용한 결측값 대치법)

  • Park, So-Hyun;Bang, Sung-Wan;Jhun, Myoung-Shic
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1249-1257 / 2011
  • In this paper, we propose a Sequential Adaptive Nearest Neighbor (SANN) imputation method that combines the Adaptive Nearest Neighbor (ANN) method and the Sequential k-Nearest Neighbor (SKNN) method. When choosing the nearest neighbors of missing observations, the proposed SANN method takes the local features of the missing observations into account and reuses the imputed observations sequentially. Using a Monte Carlo study and a real data example, we demonstrate the characteristics of the SANN method and its potential performance.

Performance Comparison of Classification Algorithms in Music Recognition using Violin and Cello Sound Files (바이올린과 첼로 연주 데이터를 이용한 분류 알고리즘의 성능 비교)

  • Kim Jae Chun;Kwak Kyung sup
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.5C / pp.305-312 / 2005
  • Three classification algorithms are tested on musical instrument sounds. Several classification algorithms are introduced, and among them the performance of the Bayes rule, NN, and k-NN is evaluated. Feature vectors of zero-crossing rate (ZCR), mean, variance, and average peak level are extracted from instrument sample files and used as the data set for the classification system. The instruments used are violin, baroque violin, and baroque cello. Experimental results show that the NN algorithm outperforms the other algorithms in musical instrument classification.

A Prediction Algorithm for Renewable Energy Production (신재생 에너지 생산량 예측 알고리즘)

  • Kim, Ji-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2012.11a / pp.389-392 / 2012
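  • Energy management support services aim to manage power generation and power allocation within a factory effectively, using data analysis techniques. In particular, energy management systems that use eco-friendly energy sources such as solar and wind power matter not only for cost reduction but also for environmental protection. To make proper use of these eco-friendly energy sources, their generation output must be predicted accurately, yet the current system uses the nearest neighbor method, the most basic prediction method. Because the nearest neighbor method is sensitive to noise and outliers, a more sophisticated prediction method that can cope with these situations is needed.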

A Missing Data Imputation by Combining K Nearest Neighbor with Maximum Likelihood Estimation for Numerical Software Project Data (K-NN과 최대 우도 추정법을 결합한 소프트웨어 프로젝트 수치 데이터용 결측값 대치법)

  • Lee, Dong-Ho;Yoon, Kyung-A;Bae, Doo-Hwan
    • Journal of KIISE:Software and Applications / v.36 no.4 / pp.273-282 / 2009
  • Missing data is a common problem when building analysis or prediction models from software project data. For small software project data sets, imputation methods are known to handle missing data more effectively than deletion methods. While K nearest neighbor imputation is a suitable imputation method for software project data, it cannot use the non-missing information of incomplete project instances. In this paper, we propose an approach to missing data imputation for numerical software project data that combines K nearest neighbor and maximum likelihood estimation; we also extend the average absolute error measure by normalization for accurate evaluation. Our approach overcomes this limitation of K nearest neighbor imputation and outperforms it on our real data sets.

On the Use of Weighted k-Nearest Neighbors for Missing Value Imputation (Weighted k-Nearest Neighbors를 이용한 결측치 대치)

  • Lim, Chanhui;Kim, Dongjae
    • The Korean Journal of Applied Statistics / v.28 no.1 / pp.23-31 / 2015
  • The k-Nearest Neighbor (KNN) method is commonly used as a simple imputation method for the missing value problem in statistical analysis. When one of the k nearest neighbors is an extreme value or outlier, the KNN method can introduce bias. In this paper, we propose a Weighted k-Nearest Neighbors (WKNN) imputation method that compensates for this weakness of KNN. A Monte Carlo simulation study using a real data set is also conducted to compare the WKNN method and the KNN method.
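
The contrast between plain and weighted KNN imputation described in this abstract can be illustrated with a minimal sketch. The abstract does not state the paper's weighting scheme, so the inverse-distance weights, the function name `knn_impute`, and the toy data below are all assumptions made for illustration only.

```python
import math


def knn_impute(rows, target, col, k=3, weighted=False):
    """Impute rows[target][col] from the k nearest rows that have a value in `col`.

    Distance is Euclidean over all columns other than `col`, which are
    assumed to be fully observed. With weighted=True each neighbor is
    weighted by 1/distance (an assumed scheme, not taken from the paper).
    """
    others = [j for j in range(len(rows[target])) if j != col]
    candidates = []
    for i, row in enumerate(rows):
        if i == target or row[col] is None:
            continue  # skip the target itself and rows that cannot donate a value
        dist = math.sqrt(sum((row[j] - rows[target][j]) ** 2 for j in others))
        candidates.append((dist, row[col]))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    if not weighted:
        return sum(value for _, value in nearest) / len(nearest)
    eps = 1e-9  # avoids division by zero when a neighbor coincides with the target
    weights = [1.0 / (dist + eps) for dist, _ in nearest]
    total = sum(w * value for w, (_, value) in zip(weights, nearest))
    return total / sum(weights)


# Toy data: the third row is a distant neighbor carrying an extreme value.
data = [
    [1.0, 10.0],
    [1.1, 11.0],
    [3.0, 50.0],
    [1.05, None],  # value to impute
]
plain = knn_impute(data, target=3, col=1, k=3)
weighted_est = knn_impute(data, target=3, col=1, k=3, weighted=True)
print(plain, weighted_est)
```

In this toy case the plain KNN mean is pulled toward the extreme value, while the distance-weighted estimate stays close to the two nearby rows.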

Missing values imputation for time course gene expression data using the pattern consistency index adaptive nearest neighbors (시간경로 유전자 발현자료에서 패턴일치지수와 적응 최근접 이웃을 활용한 결측값 대치법)

  • Shin, Heyseo;Kim, Dongjae
    • The Korean Journal of Applied Statistics / v.33 no.3 / pp.269-280 / 2020
  • Time course gene expression data are large data sets observed over time in microarray experiments. Such data can also identify gene expression levels simultaneously. However, the experimental process is complex, and missing values occur frequently for various reasons. In this paper, we propose pattern consistency index adaptive nearest neighbors as a method of missing value imputation. This method combines the adaptive nearest neighbors (ANN) method, which reflects local characteristics, with the pattern consistency index, which considers the degree of consistency in gene expression between observations across time points. We conducted a Monte Carlo simulation study to evaluate the usefulness of the proposed pattern consistency index adaptive nearest neighbors (PANN) method on two yeast time course data sets.

Status Diagnosis of Pump and Motor Applying K-Nearest Neighbors (K-최근접 이웃 알고리즘을 적용한 펌프와 모터의 상태 진단)

  • Kim, Nam-Jin;Bae, Young-Chul
    • The Journal of the Korea institute of electronic communication sciences / v.13 no.6 / pp.1249-1256 / 2018
  • Recently, research on artificial intelligence has been actively progressing in the fields of diagnosis and prediction. In this paper, we acquire electrical current, revolutions per minute (RPM), and vibration data generated by motors and pumps installed at industrial sites. We train on the acquired data using k-nearest neighbors. We also propose a status diagnosis method that judges the normal and abnormal status of motors and pumps using the trained data. The results confirm that normal and abnormal status are judged well.

A Study on Interpolation for Enlarged Still Image (정지영상 확대시 보간법에 관한 연구)

  • 강길봉;양영수;김장형
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2001.05a / pp.643-648 / 2001
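  • This paper proposes a new interpolation method, in addition to existing ones, as an image processing technique for obtaining high resolution when a still image is enlarged. We study a hybrid interpolation method that combines nearest neighbor pixel interpolation and bilinear interpolation, the two methods most commonly used in image processing, so as to keep the strengths and offset the weaknesses of each and obtain enlarged images of improved quality.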
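
The two base methods named in this abstract can be sketched as follows. The abstract does not say how the paper combines them, so this sketch shows only nearest neighbor and bilinear sampling side by side; the function names, the pixel-center coordinate mapping, and the 2x2 test image are assumptions for illustration.

```python
import math


def sample_nearest(img, y, x):
    """Return the source pixel closest to the fractional position (y, x)."""
    h, w = len(img), len(img[0])
    yi = min(h - 1, max(0, math.floor(y + 0.5)))
    xi = min(w - 1, max(0, math.floor(x + 0.5)))
    return img[yi][xi]


def sample_bilinear(img, y, x):
    """Blend the four surrounding source pixels by their fractional distances."""
    h, w = len(img), len(img[0])
    y = min(h - 1.0, max(0.0, y))  # clamp so border pixels are repeated
    x = min(w - 1.0, max(0.0, x))
    y0, x0 = math.floor(y), math.floor(x)
    y1, x1 = min(h - 1, y0 + 1), min(w - 1, x0 + 1)
    fy, fx = y - y0, x - x0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def enlarge(img, scale, sampler):
    """Enlarge a grayscale image (list of rows) by an integer scale factor."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h * scale):
        src_y = (r + 0.5) / scale - 0.5  # pixel-center mapping (assumed)
        row = []
        for c in range(w * scale):
            src_x = (c + 0.5) / scale - 0.5
            row.append(sampler(img, src_y, src_x))
        out.append(row)
    return out


small = [[0, 100], [100, 200]]
nn_big = enlarge(small, 2, sample_nearest)
bl_big = enlarge(small, 2, sample_bilinear)
for row in nn_big:
    print(row)
for row in bl_big:
    print(row)
```

Nearest neighbor sampling replicates source pixels, which preserves hard edges but looks blocky, whereas bilinear sampling produces smooth gradients at the cost of blurring edges. A hybrid method would choose or blend between the two per pixel.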

Machine Learning Methods to Predict Vehicle Fuel Consumption

  • Ko, Kwangho
    • Journal of the Korea Society of Computer and Information / v.27 no.9 / pp.13-20 / 2022
  • This study proposes and analyzes machine learning (ML) models to predict vehicle fuel consumption (FC) in real time. A test drive was conducted to measure vehicle speed, acceleration, road gradient, and FC for the training dataset. Various ML models were trained with speed, acceleration, and road gradient as features and FC as the target. Two kinds of ML models are studied: regression models (linear regression and k-nearest neighbors regression) and classification models (k-nearest neighbors classifier, logistic regression, decision tree, random forest, and gradient boosting). The prediction accuracy for real-time FC is low, in the range of 0.5 to 0.6, and the classification models are more accurate than the regression models. The prediction error for total FC is very low, about 0.2 to 2.0%, and the regression models are more accurate than the classification models. This is attributed to the coefficient of determination (R2) accuracy score: predicted values are distributed around the mean of the targets as the coefficient decreases. Therefore, regression models are suitable for total FC prediction and classification models for real-time FC prediction.