• Title/Summary/Keyword: Non-response imputation

Search Result 17, Processing Time 0.023 seconds

Non-Response Imputation for Panel Data (패널자료의 무응답 대체법)

  • Pak, Gi-Deok;Shin, Key-Il
    • Communications for Statistical Applications and Methods
    • /
    • v.17 no.6
    • /
    • pp.899-907
    • /
    • 2010
  • Several non-response imputation methods are suggested, however, mainly cross-sectional imputations are studied and applied to this analysis. A simple and common imputation method for panel data is the cross-wave regression imputation or carry-over imputation as a special case of cross-wave regression imputation. This study suggests a multiple imputation method combined time series analysis and cross-sectional multiple imputation method. We compare this method and the cross-wave regression imputation method using MSE, MAE, and Bias. The 2008 monthly labor survey data is used for this study.

Bias corrected imputation method for non-ignorable non-response (무시할 수 없는 무응답에서 편향 보정을 이용한 무응답 대체)

  • Lee, Min-Ha;Shin, Key-Il
    • The Korean Journal of Applied Statistics
    • /
    • v.35 no.4
    • /
    • pp.485-499
    • /
    • 2022
  • Controlling the total survey error including sampling error and non-sampling error is very important in sampling design. Non-sampling error caused by non-response accounts for a large proportion of the total survey error. Many studies have been conducted to handle non-response properly. Recently, a lot of non-response imputation methods using machine learning technique and traditional statistical methods have been studied and practically used. Most imputation methods assume MCAR(missing completely at random) or MAR(missing at random) and few studies have been conducted focusing on MNAR (missing not at random) or NN(non-ignorable non-response) which cause bias and reduce the accuracy of imputation. In this study, we propose a non-response imputation method that can be applied to non-ignorable non-response. That is, we propose an imputation method to improve the accuracy of estimation by removing the bias caused by NN. In addition, the superiority of the proposed method is confirmed through small simulation studies.

A Comparison of BLS Non-Response Adjustment and Cross-Wave Regression Imputation Methods (BLS 무응답 보정법을 이용한 대체법과 이월대체법에 관한 연구)

  • Lee, Sang-Eun;Shin, Key-Il
    • The Korean Journal of Applied Statistics
    • /
    • v.23 no.5
    • /
    • pp.909-921
    • /
    • 2010
  • Cross-wave regression imputation and carry-over imputation method are generally used in the analysis of panel data with missing values. Recently it is known that the BLS non-response adjust method has good statistical properties. In this paper we show that the BLS method can be considered as an imputation method with a similar formula of a ratio-estimator. In addition, we show that the carry-over imputation and BLS imputation are approximately the same under the assumption that data follow a non-stationary process with drift. Small simulation studies and real data analysis are performed. For the real data analysis, a monthly labor statistic (2007) is used.

Imputation Methods for the Population and Housing Census 2000 in Korea

  • Kim, Young-Won;Ryu, Jeabok;Park, Jinwoo;Lee, Jaewon
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.2
    • /
    • pp.575-583
    • /
    • 2003
  • We proposed imputation strategies for the Population and Housing Census 2000 in Korea. The total area of floor space and marital status which have relatively high non-response rates in the Census are considered to develope the effective missing value imputation procedures. The Classification and Regression Tree(CART) is employed to construct the imputation cells for hot-deck imputation, as well as to predict missing value by model-based approach. We compare three imputation methods which include CART model-based imputation, hot-deck imputation based on CART and logical hot-deck imputation proposed by The Korea National Statistical Office. The results suggest that the proposed hot-deck imputation based on CART is very efficient and strongly recommendable.

Two-stage imputation method to handle missing data for categorical response variable

  • Jong-Min Kim;Kee-Jae Lee;Seung-Joo Lee
    • Communications for Statistical Applications and Methods
    • /
    • v.30 no.6
    • /
    • pp.577-587
    • /
    • 2023
  • Conventional categorical data imputation techniques, such as mode imputation, often encounter issues related to overestimation. If the variable has too many categories, multinomial logistic regression imputation method may be impossible due to computational limitations. To rectify these limitations, we propose a two-stage imputation method. During the first stage, we utilize the Boruta variable selection method on the complete dataset to identify significant variables for the target categorical variable. Then, in the second stage, we use the important variables for the target categorical variable for logistic regression to impute missing data in binary variables, polytomous regression to impute missing data in categorical variables, and predictive mean matching to impute missing data in quantitative variables. Through analysis of both asymmetric and non-normal simulated and real data, we demonstrate that the two-stage imputation method outperforms imputation methods lacking variable selection, as evidenced by accuracy measures. During the analysis of real survey data, we also demonstrate that our suggested two-stage imputation method surpasses the current imputation approach in terms of accuracy.

Modified BLS Weight Adjustment (수정된 BLS 가중치보정법)

  • Park, Jung-Joon;Cho, Ki-Jong;Lee, Sang-Eun;Shin, Key-Il
    • Communications for Statistical Applications and Methods
    • /
    • v.18 no.3
    • /
    • pp.367-376
    • /
    • 2011
  • BLS weight adjustment is a widely used method for business surveys with non-responses and outliers. Recent surveys show that the non-response weight adjustment of the BLS method is the same as the ratio imputation method. In this paper, we suggested a modified BLS weight adjustment method by imputing missing values instead of using weight adjustment for non-response. Monthly labor survey data is used for a small Monte-Carlo simulation and we conclude that the suggested method is superior to the original BLS weight adjustment method.

An estimation method for non-response model using Monte-Carlo expectation-maximization algorithm (Monte-Carlo expectation-maximaization 방법을 이용한 무응답 모형 추정방법)

  • Choi, Boseung;You, Hyeon Sang;Yoon, Yong Hwa
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.3
    • /
    • pp.587-598
    • /
    • 2016
  • In predicting an outcome of election using a variety of methods ahead of the election, non-response is one of the major issues. Therefore, to address the non-response issue, a variety of methods of non-response imputation may be employed, but the result of forecasting tend to vary according to methods. In this study, in order to improve electoral forecasts, we studied a model based method of non-response imputation attempting to apply the Monte Carlo Expectation Maximization (MCEM) algorithm, introduced by Wei and Tanner (1990). The MCEM algorithm using maximum likelihood estimates (MLEs) is applied to solve the boundary solution problem under the non-ignorable non-response mechanism. We performed the simulation studies to compare estimation performance among MCEM, maximum likelihood estimation, and Bayesian estimation method. The results of simulation studies showed that MCEM method can be a reasonable candidate for non-response model estimation. We also applied MCEM method to the Korean presidential election exit poll data of 2012 and investigated prediction performance using modified within precinct error (MWPE) criterion (Bautista et al., 2007).

A study on multiple imputation modeling for Korean EAPS (경제활동인구조사 자료를 위한 다중대체 방식 연구)

  • Park, Min-Jeong;Bae, Yoonjong;Kim, Joungyoun
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.5
    • /
    • pp.685-696
    • /
    • 2021
  • The Korean Economically Active Population Survey (KEAPS) is a national survey that produces employment-related statistics. The main purpose of the survey is to find out the economic activity status (employed/ unemployed/ non-employed) of the people. KEAPS has a unique characteristics caused by the survey method. In this study, through understanding of structural non-response and utilization of past data, we would like to present an improved imputation model. The performance of the proposed model is compared with the existing model through simulation. The performance of the imputation models is evaluated based on the degree of mathing/nonmatching rates. For this, we employ the KEAPS data in November 2019. For the randomly selected ones among the total 59,996 respondents, the six explanatory variables, which are critical in determining the economic activity states, are treated as non-response. The proposed model includes industry variable and job status variable in addition to the explanatory variables used in the precedent research. This is based on the linkage and utilization of past data. The simulation results confirm that the proposed model with additional variables outperforms the existing model in the precedent research. In addition, we consider various scenarios for the number of non-responders by the economic activity status.

Missing Value Imputation Method Using CART : For Marital Status in the Population and Housing Census (CART를 활용한 결측값 대체방법 : 인구주택총조사 혼인상태 항목을 중심으로)

  • 김영원;이주원
    • Survey Research
    • /
    • v.4 no.2
    • /
    • pp.1-21
    • /
    • 2003
  • We proposed imputation strategies for marital status in the Population and Housing Census 2000 in Korea to illustrate the effective missing value imputation methods for social survey. The marital status which have relatively high non-response rates in the Census are considered to develope the effective missing value imputation procedures. The Classification and Regression Tree(CART)is employed to construct the imputation cells for hot-deck imputation, as well as to predict the missing value by model-based approach. We compare to imputation methods which include the CART model-based imputation and the sequential hot-deck imputation based on CART. Also we check whether different modeling for each region provides the more improved results. The results suggest that the proposed hot-deck imputation based on CART is very efficient and strongly recommendable. And the results show that different modeling for each region is not necessary.

  • PDF

Missing Imputation Methods Using the Spatial Variable in Sample Survey (표본조사에서 공간 변수(SPATIAL VARIABLE)를 이용한 결측 대체(MISSING IMPUTATION)의 효율성 비교)

  • Lee Jin-Hee;Kim Jin;Lee Kee-Jae
    • The Korean Journal of Applied Statistics
    • /
    • v.19 no.1
    • /
    • pp.57-67
    • /
    • 2006
  • In sampling survey, nonresponse tend to occur inevitably. If we use information from respondents only, the estimates will be baised. To overcome this, various non-response imputation methods have been studied. If there are few auxiliary variables for replacing missing imputation or spatial autocorrelation exists between respondents and nonrespondents, spatial autocorrelation can be used for missing imputation. In this paper, we apply several nonresponse imputation methods including spatial imputation for the analysis of farm household economy data of the Gangwon-Do in 2002 as an example. We show that spatial imputation is more efficient than other methods through the numerical simulations.