On the Use of Sequential Adaptive Nearest Neighbors for Missing Value Imputation

Korean title: Missing Value Imputation Using Sequential Adaptive Nearest Neighbors

  • Received : 2011.09
  • Accepted : 2011.10
  • Published : 2011.12.31

Abstract

In this paper, we propose a Sequential Adaptive Nearest Neighbor (SANN) imputation method that combines the Adaptive Nearest Neighbor (ANN) method and the Sequential k-Nearest Neighbor (SKNN) method. When choosing the nearest neighbors of an observation with missing values, the proposed SANN method takes the local features of that observation into account and also reuses previously imputed observations in a sequential manner. Using a Monte Carlo study and a real data example, we demonstrate the characteristics of the SANN method and its potential performance.

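We propose the Sequential Adaptive Nearest Neighbor (SANN) imputation method, which combines the strengths of two refinements of k-Nearest Neighbors (KNN) imputation, a nonparametric method for imputing missing values: Adaptive Nearest Neighbor (ANN) imputation and Sequential k-Nearest Neighbor (SKNN) imputation. The method reflects the local features of the data, which is the strength of ANN imputation, and, like SKNN imputation, uses observations whose missing values have already been imputed when imputing the next missing value. We therefore expect it to improve efficiency.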
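The two ideas named in the abstract, sequential reuse of imputed observations and a locally adaptive choice of neighbors, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give the adaptive rule, so the rule used here (a donor counts as a neighbor when the target lies within that donor's own k-th nearest-neighbor distance) is an assumption made only for illustration. The function name `sann_impute`, the parameter `k`, the Euclidean distance, the fewest-missing-first ordering, and the mean-of-neighbors estimate are likewise hypothetical choices.

```python
import math


def _dist(a, b, cols):
    """Euclidean distance between rows a and b over the given columns."""
    return math.sqrt(sum((a[j] - b[j]) ** 2 for j in cols))


def sann_impute(rows, k=3):
    """Illustrative sequential, adaptive nearest-neighbor imputation.

    rows: list of lists of floats, with None marking a missing value.
    k:    rank used for the assumed adaptive radius (hypothetical parameter).
    Returns a new list of rows; the input is not modified.
    """
    data = [list(r) for r in rows]
    pool = [r for r in data if None not in r]  # complete rows act as donors
    if not pool:
        raise ValueError("at least one complete row is required")

    # Sequential part: process rows with the fewest missing values first.
    incomplete = [r for r in data if None in r]
    incomplete.sort(key=lambda r: sum(v is None for v in r))

    for row in incomplete:
        obs = [j for j, v in enumerate(row) if v is not None]
        miss = [j for j, v in enumerate(row) if v is None]

        # Adaptive part (ASSUMED rule, not taken from the paper): donor y is
        # a neighbor of `row` if `row` is no farther from y than y's k-th
        # nearest other donor, measured on the columns observed in `row`.
        neighbors = []
        for y in pool:
            others = sorted(_dist(y, z, obs) for z in pool if z is not y)
            radius = others[min(k, len(others)) - 1] if others else math.inf
            if _dist(row, y, obs) <= radius:
                neighbors.append(y)
        if not neighbors:
            # Fallback: use the single closest donor.
            neighbors = [min(pool, key=lambda y: _dist(row, y, obs))]

        for j in miss:
            row[j] = sum(y[j] for y in neighbors) / len(neighbors)

        pool.append(row)  # reuse the imputed row for later imputations

    return data
```

With this rule the number of neighbors is not fixed in advance: a target in a dense region can pick up several donors, while a target far from every donor falls back to its single closest one. Appending each imputed row to `pool` is what makes the procedure sequential.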
