A study to improve the accuracy of the naive propensity score adjusted estimator using double post-stratification method

  • Leesu Yeo (Department of Statistics, Hankuk University of Foreign Studies) ;
  • Key-Il Shin (Department of Statistics, Hankuk University of Foreign Studies)
  • Received : 2023.04.28
  • Accepted : 2023.07.10
  • Published : 2023.12.31

Abstract

Proper handling of nonresponse in sample surveys improves the accuracy of parameter estimation. Various methods have been studied for handling nonresponse when the missing mechanism is MCAR (missing completely at random) or MAR (missing at random). When nonresponse occurs, the PSA (propensity score adjusted) estimator is commonly used as a mean estimator. Under MAR or MCAR nonresponse, known sample weights and properly estimated response probabilities are available, so the PSA estimator is unbiased. Under MNAR (missing not at random) nonresponse, however, the response depends on the value of the study variable, accurate response probabilities are difficult to obtain, and the PSA estimator may be biased. Chung and Shin (2017, 2022) proposed a single post-stratification method to improve the accuracy of mean estimation when MNAR nonresponse occurs under a non-informative sample design. In this study, we propose a double post-stratification method to improve the accuracy of the naive PSA estimator when MNAR nonresponse occurs under an informative sample design. In addition, simulation studies confirm the superiority of the proposed method.
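
To make the role of the response probabilities concrete, the following is a minimal sketch of a PSA mean estimator. It assumes the ratio (Hájek-type) form, in which each respondent's sample weight is divided by an estimated response probability. The function name psa_mean, the simulated population, and the logistic response model are hypothetical choices for illustration only. The sketch does not implement the double post-stratification method proposed in the paper, whose construction is not described in the abstract.

```python
import numpy as np


def psa_mean(y, w, p_hat):
    """Ratio-type PSA mean: sum(w*y/p_hat) / sum(w/p_hat) over respondents.

    y      : study variable observed for respondents
    w      : sample weights of the respondents
    p_hat  : estimated response probabilities of the respondents
    (All names are illustrative assumptions, not taken from the paper.)
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    if np.any(p_hat <= 0) or np.any(p_hat > 1):
        raise ValueError("response probabilities must lie in (0, 1]")
    adj = w / p_hat  # nonresponse-adjusted weights
    return float(np.sum(adj * y) / np.sum(adj))


# Hypothetical simulation: response depends on y itself (MNAR).
rng = np.random.default_rng(0)
y_all = rng.normal(50.0, 10.0, size=5000)
w_all = np.ones_like(y_all)                           # equal weights for simplicity
p_true = 1.0 / (1.0 + np.exp(-(y_all - 50.0) / 10.0))  # assumed response model
responded = rng.random(y_all.size) < p_true

y_r, w_r = y_all[responded], w_all[responded]

# PSA mean using the (normally unknown) true response probabilities.
est_true = psa_mean(y_r, w_r, p_true[responded])

# PSA mean using a constant response rate, which ignores the dependence on y.
est_const = psa_mean(y_r, w_r, np.full(y_r.size, responded.mean()))

print("full-sample mean       :", y_all.mean())
print("PSA, true probabilities:", est_true)
print("PSA, constant rate     :", est_const)
```

In this setup, units with larger values of the study variable respond more often, so an adjustment that uses a single overall response rate reduces to the respondent mean and tends to overestimate the population mean. This is the kind of bias under MNAR nonresponse that motivates the post-stratification approaches discussed above.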

Acknowledgement

This research was supported by the Hankuk University of Foreign Studies Research Fund of 2023.

References

  1. Bethlehem J (2020). Working with response probabilities, Journal of Official Statistics, 36, 647-674. https://doi.org/10.2478/jos-2020-0033
  2. Chung HY and Shin KI (2017). Estimation using informative sampling technique when response rate follows exponential function of variable of interest, Korean Journal of Applied Statistics, 30, 993-1004. https://doi.org/10.5351/KJAS.2017.30.6.993
  3. Chung HY and Shin KI (2019). Bias adjusted estimation in a sample survey with linear response rate, Korean Journal of Applied Statistics, 32, 631-642. https://doi.org/10.5351/KJAS.2019.32.4.631
  4. Chung HY and Shin KI (2020). A study on non-response bias adjusted estimation in business survey, Korean Journal of Applied Statistics, 33, 11-23. https://doi.org/10.5351/KJAS.2020.33.1.011
  5. Chung HY and Shin KI (2022). A response probability estimation for non-ignorable non-response, Communications for Statistical Applications and Methods, 29, 263-275. https://doi.org/10.29220/CSAM.2022.29.2.263
  6. Kim JK and Riddles MK (2012). Some theory for propensity-score-adjustment estimators in survey sampling, Survey Methodology, 38, 157-165.
  7. Lee MH and Shin KI (2022). Bias corrected imputation method for non-ignorable non-response, Korean Journal of Applied Statistics, 35, 485-499. https://doi.org/10.5351/KJAS.2022.35.4.485
  8. Min JW and Shin KI (2018). A study on the determination of substrata using the information of exponential response rate by simulation studies, Korean Journal of Applied Statistics, 31, 621-636. https://doi.org/10.5351/KJAS.2018.31.5.621
  9. Pfeffermann D, Krieger AM, and Rinott Y (1998). Parametric distributions of complex survey data under informative probability sampling, Statistica Sinica, 8, 1087-1114.
  10. Pfeffermann D, Moura F, and Silva PN (2006). Multi-level modelling under informative sampling, Biometrika, 93, 943-959. https://doi.org/10.1093/biomet/93.4.943
  11. Riddles MK, Kim JK, and Im J (2016). A propensity-score-adjustment method for nonignorable nonresponse, Journal of Survey Statistics and Methodology, 4, 215-245. https://doi.org/10.1093/jssam/smv047
  12. Sim JY and Shin KI (2021). Bias corrected non-response estimation using nonparametric function estimation of super population model, Korean Journal of Applied Statistics, 34, 923-936. https://doi.org/10.5351/KJAS.2021.34.6.923