초모집단 모형의 오차가 이분산일 때 무시할 수 없는 무응답에서 편향수정 무응답 대체

Bias-corrected imputation method for non-ignorable nonresponse with heteroscedasticity in super-population model

  • 이유진 (한국외국어대학교 통계학과) ;
  • 신기일 (한국외국어대학교 통계학과)
  • Yujin Lee (Department of Statistics, Hankuk University of Foreign Studies) ;
  • Key-Il Shin (Department of Statistics, Hankuk University of Foreign Studies)
  • Received : 2023.11.25
  • Accepted : 2024.01.30
  • Published : 2024.06.30

Abstract

Many methods have been studied for handling nonresponse properly, and a number of nonresponse imputation methods have recently been developed and put into practical use. Most of these methods assume MCAR (missing completely at random) or MAR (missing at random). In contrast, there are relatively few studies on imputation under MNAR (missing not at random), also called NN (nonignorable nonresponse), in which the nonresponse is affected by the study variable. MNAR causes bias and reduces the accuracy of imputation whenever the response probability is not properly estimated. Lee and Shin (2022) proposed a bias-corrected nonresponse imputation method that can be applied to nonignorable nonresponse assuming homoscedasticity in the super-population model. In this paper, we propose a generalized version of the imputation method of Lee and Shin (2022) that improves the accuracy of estimation by removing the bias caused by MNAR when the errors of the super-population model are heteroscedastic. The superiority of the proposed method is confirmed through simulation studies.
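The bias that nonignorable nonresponse introduces can be illustrated with a small simulation. The sketch below is not the method proposed in the paper; it only shows why the respondent mean is biased when the response probability depends on the study variable and the model errors are heteroscedastic. The linear model, its coefficients, the logistic response function, and the name `simulate_mnar_bias` are all assumptions chosen for illustration.

```python
import math
import random


def simulate_mnar_bias(n=20000, seed=0):
    """Compare the full-sample mean with the respondent mean under MNAR.

    Assumed (hypothetical) super-population model:
        y = 2 + 1.5 * x + e,  e ~ N(0, x)   (error variance grows with x)
    Assumed (hypothetical) response mechanism:
        P(respond | y) = logistic(-4 + 0.6 * y)
    """
    rng = random.Random(seed)
    all_y = []
    observed_y = []
    for _ in range(n):
        x = rng.uniform(1.0, 5.0)
        # Heteroscedastic error: standard deviation is sqrt(x).
        y = 2.0 + 1.5 * x + rng.gauss(0.0, math.sqrt(x))
        # Response probability depends on y itself, so nonresponse is MNAR.
        p_respond = 1.0 / (1.0 + math.exp(-(-4.0 + 0.6 * y)))
        all_y.append(y)
        if rng.random() < p_respond:
            observed_y.append(y)
    full_mean = sum(all_y) / len(all_y)
    respondent_mean = sum(observed_y) / len(observed_y)
    return {
        "full_mean": full_mean,
        "respondent_mean": respondent_mean,
        "bias": respondent_mean - full_mean,
        "response_rate": len(observed_y) / n,
    }


if __name__ == "__main__":
    result = simulate_mnar_bias()
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

Because units with larger values of the study variable respond more often in this setup, the respondent mean overstates the population mean. A bias-corrected imputation method aims to remove this gap.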

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (NRF-2021R1F1A1045602).

References

  1. Berg E, Kim J-K, and Skinner C (2016). Imputation under informative sampling, Journal of Survey Statistics and Methodology, 4, 436-462. https://doi.org/10.1093/jssam/smw032
  2. Bethlehem J (2020). Working with response probabilities, Journal of Official Statistics, 36, 647-674. https://doi.org/10.2478/jos-2020-0033
  3. Chung HY and Shin K-I (2017). Estimation using informative sampling technique when response probability follows exponential function of variable of interest, The Korean Journal of Applied Statistics, 30, 993-1004. https://doi.org/10.5351/KJAS.2017.30.6.993
  4. Chung HY and Shin K-I (2022). A response probability estimation for non-ignorable non-response, Communications for Statistical Applications and Methods, 29, 263-275. https://doi.org/10.29220/CSAM.2022.29.2.263
  5. Da Silva DN and Opsomer JD (2006). A kernel smoothing method of adjusting for unit non-response in sample surveys, The Canadian Journal of Statistics, 34, 563-579. https://doi.org/10.1002/cjs.5550340402
  6. Da Silva DN and Opsomer JD (2009). Nonparametric propensity weighting for survey nonresponse through local polynomial regression, Survey Methodology, 35, 165-176.
  7. Hong S and Lynn HS (2020). Accuracy of random-forest-based imputation of missing data in the presence of non-normality, non-linearity and interaction, BMC Medical Research Methodology, 20, 1-12. https://doi.org/10.1186/s12874-019-0863-0
  8. Kim JK and Yu CL (2011). A semiparametric estimation of mean functionals with nonignorable missing data, Journal of the American Statistical Association, 106, 157-165. https://doi.org/10.1198/jasa.2011.tm10104
  9. Lee M and Shin K-I (2022). Bias corrected imputation method for non-ignorable non-response, The Korean Journal of Applied Statistics, 35, 485-499. https://doi.org/10.5351/KJAS.2022.35.4.485
  10. Morikawa K, Kim JK, and Kano Y (2017). Semiparametric maximum likelihood estimation with data missing not at random, The Canadian Journal of Statistics, 45, 393-409. https://doi.org/10.1002/cjs.11340
  11. Morikawa K and Kim JK (2018). A note on the equivalence of two semiparametric estimation methods for nonignorable nonresponse, Statistics and Probability Letters, 140, 1-6. https://doi.org/10.1016/j.spl.2018.03.020
  12. Pfeffermann D, Krieger AM, and Rinott Y (1998). Parametric distributions of complex survey data under informative probability sampling, Statistica Sinica, 8, 1087-1114.
  13. Sim J and Shin K-I (2021). Bias corrected non-response estimation using nonparametric function estimation of super population model, The Korean Journal of Applied Statistics, 34, 923-936. https://doi.org/10.5351/KJAS.2021.34.6.923