• Title/Summary/Keyword: 대체방법 (imputation methods)

Comparison of GEE Estimators Using Imputation Methods (대체방법별 GEE추정량 비교)

  • 김동욱;노영화
    • The Korean Journal of Applied Statistics
    • /
    • v.16 no.2
    • /
    • pp.407-426
    • /
    • 2003
  • We consider the problem of missing covariates in generalized estimating equations (GEE) models. If a covariate is partially missing, the GEE estimator cannot be calculated. In this paper, we study the performance of seven imputation methods for handling missing covariates in GEE models, and we investigate the properties of the GEE estimators obtained after the missing covariates are imputed for ordinal repeated-measures data. The seven imputation methods are i) naive deletion, ii) sample average imputation, iii) row average imputation, iv) cross-wave regression imputation, v) carry-over imputation, vi) the Bayesian bootstrap, and vii) the approximate Bayesian bootstrap. A Monte Carlo simulation is used to compare the performance of these methods. For the mechanism generating the missing data, we assume ignorable nonresponse. Furthermore, we generate missing covariates both with and without considering wave nonresponse patterns.
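Two of the simpler methods named in this abstract can be sketched in a few lines. This is a minimal illustration and not the authors' implementation. It assumes that "sample average imputation" fills a gap with the mean of the observed values of the same covariate across subjects, and that "carry-over imputation" reuses the same subject's most recent earlier observation. The function names and the use of `None` for a missing value are hypothetical.

```python
from statistics import mean


def sample_average_impute(values):
    """Fill each missing entry (None) with the mean of the observed entries.

    `values` holds one covariate at one wave, across subjects (an assumption).
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def carry_over_impute(waves):
    """Fill each missing entry with the subject's most recent earlier value.

    `waves` holds one subject's covariate in wave order. A gap before the
    first observed value has nothing to carry over and stays None.
    """
    imputed, last_seen = [], None
    for value in waves:
        if value is not None:
            last_seen = value
        imputed.append(last_seen)
    return imputed


print(sample_average_impute([1.0, None, 3.0]))
print(carry_over_impute([None, 2, None, None, 5]))
```

Both functions return a new list and leave the input untouched, so the same incomplete data can be passed to several imputation methods for comparison.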

Comparison of Data Reconstruction Methods for Missing Value Imputation (결측값 대체를 위한 데이터 재현 기법 비교)

  • Cheongho Kim;Kee-Hoon Kang
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.1
    • /
    • pp.603-608
    • /
    • 2024
  • Nonresponse and missing values are caused by sample dropout and by respondents avoiding survey questions. In such cases, information may be lost and inference may be biased, so missing values need to be replaced with appropriate values. In this paper, we compare several methods for imputing missing values: mean, linear regression, random forest, K-nearest neighbor, autoencoder, and denoising autoencoder imputation, the last two based on deep learning. We explain these methods and compare them using simulated continuous data and real data. The comparison confirms that, in most cases, random forest imputation and denoising autoencoder imputation perform better than the others.

Imputation Methods for Nonresponse and Their Effect (무응답 대체 방법과 대체 효과)

  • 김규성
    • Proceedings of the Korean Association for Survey Research Conference
    • /
    • 2000.06a
    • /
    • pp.1-14
    • /
    • 2000
  • We consider statistical methods for the nonresponse problem in social and economic sample surveys. Imputation methods are generally used to create a complete data set, that is, one with no item nonresponse. In this paper, we introduce some imputation methods and compare them with one another. We also consider some problems that occur when an imputed data set is treated as if it were a set of actual responses. Because of the imputed values, the true variance of the estimator after imputation is increased by the imputation variance. However, since the usual naive variance estimator constructed from the imputed data set does not estimate the imputation variance, the true variance of the estimator after imputation tends to be underestimated. The theoretical reason is investigated, and the serious consequences are illustrated through a simulation study. Finally, some adjusted variance estimation methods that compensate for the underestimation are presented and discussed.

Imputation Methods for Nonresponse and Their Effect (무응답 대체 방법과 대체 효과)

  • Kim, Kyu-Seong
    • Survey Research
    • /
    • v.1 no.2
    • /
    • pp.1-14
    • /
    • 2000
  • We consider statistical methods for the nonresponse problem in social and economic sample surveys. Imputation methods are generally used to create a complete data set, that is, one with no item nonresponse. In this paper, we introduce some imputation methods and compare them with one another. We also consider some problems that occur when an imputed data set is treated as if it were a set of actual responses. Because of the imputed values, the true variance of the estimator after imputation is increased by the imputation variance. However, since the usual naive variance estimator constructed from the imputed data set does not estimate the imputation variance, the true variance of the estimator after imputation tends to be underestimated. The theoretical reason is investigated, and the serious consequences are illustrated through a simulation study. Finally, some adjusted variance estimation methods that compensate for the underestimation are presented and discussed.
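The underestimation described in this abstract can be seen in a small numeric sketch. The setting is an assumption chosen for simplicity and is not taken from the paper: simple random sampling, mean imputation, no finite population correction, and respondents treated as a random subsample. In that setting the imputed-data mean equals the respondent mean, so a reasonable variance estimate is the respondent variance divided by the number of respondents, while the naive estimate divides a shrunken variance by the full sample size. The function names are hypothetical.

```python
from statistics import mean, variance


def naive_variance_of_mean(observed, n_total):
    """Variance estimate of the sample mean that treats imputed values as real.

    Missing entries are mean-imputed, then the usual s^2 / n formula is
    applied to the completed data set of size n_total.
    """
    fill = mean(observed)
    completed = list(observed) + [fill] * (n_total - len(observed))
    return variance(completed) / n_total


def respondent_variance_of_mean(observed):
    """Variance estimate of the mean based on the respondents only (s_r^2 / r)."""
    return variance(observed) / len(observed)


observed = [2.0, 4.0, 6.0, 8.0]  # 4 respondents out of a sample of 8
naive = naive_variance_of_mean(observed, n_total=8)
respondent_based = respondent_variance_of_mean(observed)
print(naive, respondent_based)
```

The naive figure is smaller for two reasons: the imputed values add no spread to the numerator, and the divisor counts them as if they were independent responses.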

Comparison of imputation methods for item nonresponses in a panel study (패널자료에서의 항목무응답 대체 방법 비교)

  • Lee, Hyejung;Song, Juwon
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.3
    • /
    • pp.377-390
    • /
    • 2017
  • When conducting a survey, item nonresponse occurs if the respondent does not respond to some items. Since analysis based only on completely observed data may cause biased results, imputation is often conducted to analyze data in its complete form. The panel study is a survey method that examines changes of responses over time. In panel studies, there has been a preference for using information from response values of previous waves when the imputation of item nonresponses is performed; however, limited research has been conducted to support this preference. Therefore, this study compares the performance of imputation methods according to whether or not information from previous waves is utilized in the panel study. Among imputation methods that utilize information from previous responses, we consider ratio imputation, imputation based on the linear mixed model, and imputation based on the Bayesian linear mixed model approach. We compare the results from these methods against the results of methods that do not use information from previous responses, such as mean imputation and hot deck imputation. Simulation results show that imputation based on the Bayesian linear mixed model performs best and yields small biases and high coverage rates of the 95% confidence interval even at higher nonresponse rates.
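Ratio imputation, the simplest of the previous-wave methods named in this abstract, can be sketched as follows. This is one common form and not necessarily the variant used in the paper: the ratio is the sum of current-wave responses over the sum of previous-wave responses among units observed at the current wave, and every previous-wave value is assumed to be observed. The function name is hypothetical.

```python
def ratio_impute(previous, current):
    """Impute missing current-wave values as previous-wave value times a ratio.

    The ratio is estimated from units whose current-wave value is observed.
    Missing values are marked with None; `previous` is assumed complete.
    """
    donors = [(p, c) for p, c in zip(previous, current) if c is not None]
    ratio = sum(c for _, c in donors) / sum(p for p, _ in donors)
    return [p * ratio if c is None else c for p, c in zip(previous, current)]


previous_wave = [10.0, 20.0, 30.0]
current_wave = [12.0, None, 36.0]
print(ratio_impute(previous_wave, current_wave))
```

Mean imputation would instead fill the gap with the average of the observed current-wave values, ignoring the unit's own history. That difference is what the comparison in the abstract is about.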

Missing Value Imputation Method Using CART : For Marital Status in the Population and Housing Census (CART를 활용한 결측값 대체방법 : 인구주택총조사 혼인상태 항목을 중심으로)

  • 김영원;이주원
    • Survey Research
    • /
    • v.4 no.2
    • /
    • pp.1-21
    • /
    • 2003
  • We propose imputation strategies for marital status in the 2000 Population and Housing Census in Korea to illustrate effective missing value imputation methods for social surveys. Marital status, which has a relatively high nonresponse rate in the Census, is used to develop effective missing value imputation procedures. The Classification and Regression Tree (CART) is employed to construct the imputation cells for hot-deck imputation, as well as to predict the missing values with a model-based approach. We compare two imputation methods: CART model-based imputation and sequential hot-deck imputation based on CART. We also check whether fitting a different model for each region improves the results. The results suggest that the proposed hot-deck imputation based on CART is very efficient and can be strongly recommended, and that separate modeling for each region is not necessary.
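The sequential hot-deck step can be sketched once the imputation cells are given. In the sketch below the cell labels are assumed to be assigned already, for example by the leaves of a fitted tree. The tree fitting itself is not shown, and the field names `cell` and `status` are hypothetical. Within each cell, a missing value takes the value of the most recent preceding respondent in file order.

```python
def sequential_hot_deck(records, cell_key, target_key):
    """Fill missing target values from the latest donor in the same cell.

    Records are processed in file order. A missing value (None) with no
    earlier donor in its cell is left as None.
    """
    latest_donor = {}  # cell label -> most recent observed target value
    completed = []
    for record in records:
        record = dict(record)  # copy so the input list is not modified
        cell = record[cell_key]
        if record[target_key] is None:
            record[target_key] = latest_donor.get(cell)
        else:
            latest_donor[cell] = record[target_key]
        completed.append(record)
    return completed


census_rows = [
    {"cell": "A", "status": "married"},
    {"cell": "B", "status": "single"},
    {"cell": "A", "status": None},
    {"cell": "B", "status": None},
]
print(sequential_hot_deck(census_rows, "cell", "status"))
```

Because the donor is always an actual response from the same cell, the imputed value is a category that occurs in the data, which suits a categorical item such as marital status.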

Problems Arising after Nonresponse Imputation and Their Solutions (무응답 대체 후 발생하는 문제점과 해결 방안)

  • 김규성
    • Proceedings of the Korean Association for Survey Research Conference
    • /
    • 2000.06a
    • /
    • pp.105-112
    • /
    • 2000
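  • Imputation has recently come into wide use as a way of handling the nonresponse that commonly occurs in most statistical surveys. In this paper, we introduce several imputation methods and compare and explain the strengths and weaknesses of each. We then point out the problems that arise when imputed data are used as if they were response data, and we examine the extent of these problems through a simulation study. In addition, we introduce several ways of resolving the problems raised.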

Comparison of binary data imputation methods in clinical trials (임상시험에서 이분형 결측치 처리방법의 비교연구)

  • An, Koosung;Kim, Dongjae
    • The Korean Journal of Applied Statistics
    • /
    • v.29 no.3
    • /
    • pp.539-547
    • /
    • 2016
  • We discuss how to handle missing binary data in clinical trials. We describe the patterns in which missing data occur and introduce imputation methods for missing binary data, including a modified method. A simulation is performed by modifying actual data for each method. The simulation conditions are controlled by the response rate and the missing value rate. We present the simulation results for each method and discuss them at the end of the paper.

A comparison of imputation methods using nonlinear models (비선형 모델을 이용한 결측 대체 방법 비교)

  • Kim, Hyein;Song, Juwon
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.4
    • /
    • pp.543-559
    • /
    • 2019
  • Data often include missing values for various reasons. If the missing data mechanism is not MCAR, analysis based only on fully observed cases may cause estimation bias and reduce the precision of the estimates, since partially observed cases are excluded. Missing values cause more serious problems when the data include many variables. Many imputation techniques have been suggested to overcome this difficulty. However, imputation methods using parametric models may not fit well with real data that do not satisfy the model assumptions. In this study, we review imputation methods using nonlinear models, such as kernel, resampling, and spline methods, which are robust to model assumptions. In addition, we suggest utilizing imputation classes to improve imputation accuracy, or adding random errors to correctly estimate the variance of the estimates in nonlinear imputation models. The performance of imputation methods using nonlinear models is compared under various simulated data settings. Simulation results indicate that the performance of the imputation methods differs as data settings change. However, imputation based on kernel regression or the penalized spline performs better in most situations. Utilizing imputation classes or adding random errors improves the performance of imputation methods using nonlinear models.
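Kernel regression imputation with an optional random error can be sketched as follows. This is an illustration under stated assumptions and not the procedure evaluated in the paper: a Nadaraya-Watson estimator with a Gaussian kernel, a fixed bandwidth chosen by hand, a single covariate, and random errors drawn by resampling the observed residuals. All function and parameter names are hypothetical.

```python
import math
import random


def nw_predict(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate at x0: a Gaussian-kernel weighted mean of ys."""
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def kernel_impute(x, y, bandwidth=1.0, add_error=False, seed=0):
    """Impute missing y values (None) from x by kernel regression.

    With add_error=True, a residual resampled from the observed cases is
    added to each prediction so the imputed values keep some spread.
    """
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    xs = [xi for xi, _ in observed]
    ys = [yi for _, yi in observed]
    residuals = [yi - nw_predict(xi, xs, ys, bandwidth) for xi, yi in observed]

    imputed = []
    for xi, yi in zip(x, y):
        if yi is None:
            yi = nw_predict(xi, xs, ys, bandwidth)
            if add_error:
                yi += rng.choice(residuals)
        imputed.append(yi)
    return imputed


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.1, None, 2.9, 4.2]
print(kernel_impute(x, y))
print(kernel_impute(x, y, add_error=True, seed=1))
```

Without the added error, every imputed value sits on the fitted curve, which understates the variability of the completed data. The resampled residual is one simple way to put that variability back.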

Imputation Methods for Item Nonresponse in Sample Surveys (표본조사에서 항목 무응답 대체 방법)

  • 김영원;조선경
    • Communications for Statistical Applications and Methods
    • /
    • v.3 no.3
    • /
    • pp.145-159
    • /
    • 1996
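  • Item nonresponse has been identified as an important source of nonsampling error in sample surveys. In this paper, we organize the various item nonresponse imputation methods that have so far been proposed on intuitive grounds in the analysis of statistical surveys. We then compare and analyze the strengths and weaknesses of these methods, as well as the effect of imputation under different nonresponse patterns, through a simulation study using real social survey data.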