• Title/Summary/Keyword: missing mechanism

Search Result 72, Processing Time 0.02 seconds

Analysis of Incomplete Data with Nonignorable Missing Values

  • Kim, Hyun-Jeong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.13 no.2
    • /
    • pp.167-174
    • /
    • 2002
  • In the case of "nonignorable missing data", it is necessary to assume a model dealing with the missing on each situations. In this article, for example, we sometimes meet situations where data set are income amounts in a survey of individuals and assume a model as the values are the larger, a missing data probability is the higher. The method is to maximize using the EM(Expectation and Maximization) algorithm based on the (missing data) mechanism that creates missing data of the case of exponential distribution. The method started from any initial values, and converged in a few iterations. We changed the missing data probability and the artificial data size to show the estimated accuracy. Then we discuss the properties of estimates.

  • PDF

On statistical Computing via EM Algorithm in Logistic Linear Models Involving Non-ignorable Missing data

  • Jun, Yu-Na;Qian, Guoqi;Park, Jeong-Soo
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2005.11a
    • /
    • pp.181-186
    • /
    • 2005
  • Many data sets obtained from surveys or medical trials often include missing observations. When these data sets are analyzed, it is general to use only complete cases. However, it is possible to have big biases or involve inefficiency. In this paper, we consider a method for estimating parameters in logistic linear models involving non-ignorable missing data mechanism. A binomial response and normal exploratory model for the missing data are used. We fit the model using the EM algorithm. The E-step is derived by Metropolis-hastings algorithm to generate a sample for missing data and Monte-carlo technique, and the M-step is by Newton-Raphson to maximize likelihood function. Asymptotic variances of the MLE's are derived and the standard error and estimates of parameters are compared.

  • PDF

Pattern-Mixture Model of the Cox Proportional Hazards Model with Missing Binary Covariates (결측이 있는 이산형 공변량에 대한 Cox비례위험모형의 패턴-혼합 모델)

  • Youk, Tae-Mi;Song, Ju-Won
    • The Korean Journal of Applied Statistics
    • /
    • v.25 no.2
    • /
    • pp.279-291
    • /
    • 2012
  • When fitting a Cox proportional hazards model with missing covariates, it is inefficient to exclude observations with missing values in the analysis. Furthermore, if the missing-data mechanism is not Missing Completely At Random(MCAR), it may lead to biased parameter estimation. Many approaches have been suggested to handle the Cox proportional hazards model when covariates are sometimes missing, but they are based on the selection model. This paper suggest an approach to handle Cox proportional hazards model with missing covariates by using the pattern-mixture model (Little, 1993). The pattern-mixture model is expressed by the joint distribution of survival time and the missing-data mechanism. In the pattern-mixture model, many models can be considered by setting up various restrictions, and different results under various restrictions indicate the sensitivity of the model due to missing covariates. A simulation study was conducted to show the sensitivity of parameter estimation under different restrictions in a pattern-mixture model. The proposed approach was also applied to mouse leukemia data.

Comparison of GEE Estimators Using Imputation Methods (대체방법별 GEE추정량 비교)

  • 김동욱;노영화
    • The Korean Journal of Applied Statistics
    • /
    • v.16 no.2
    • /
    • pp.407-426
    • /
    • 2003
  • We consider the missing covariates problem in generalized estimating equations(GEE) model. If the covariate is partially missing, GEE can not be calculated. In this paper, we study the performance of 7 imputation methods to handle missing covariates in GEE models, and the properties of GEE estimators are investigated after missing covariates are imputed for ordinal data of repeated measurements. The 7 imputation methods include i) Naive Deletion ii) Sample Average Imputation iii) Row Average Imputation iv) Cross-wave Regression Imputation v) Carry-over Imputation vi) Bayesian Bootstrap vii) Approximate Bayesian Bootstrap. A Monte-Carlo simulation is used to compare the performance of these methods. For the missing mechanism generating the missing data, we assume ignorable nonresponse. Furthermore, we generate missing covariates with or without considering wave nonresp onse patterns.

Analysis of Missing Data Using an Empirical Bayesian Method (경험적 베이지안 방법을 이용한 결측자료 연구)

  • Yoon, Yong Hwa;Choi, Boseung
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.6
    • /
    • pp.1003-1016
    • /
    • 2014
  • Proper missing data imputation is an important procedure to obtain superior results for data analysis based on survey data. This paper deals with both a model based imputation method and model estimation method. We utilized a Bayesian method to solve a boundary solution problem in which we applied a maximum likelihood estimation method. We also deal with a missing mechanism model selection problem using forecasting results and a comparison between model accuracies. We utilized MWPE(modified within precinct error) (Bautista et al., 2007) to measure prediction correctness. We applied proposed ML and Bayesian methods to the Korean presidential election exit poll data of 2012. Based on the analysis, the results under the missing at random mechanism showed superior prediction results than under the missing not at random mechanism.

Sensitivity analysis of missing mechanisms for the 19th Korean presidential election poll survey (19대 대선 여론조사에서 무응답 메카니즘의 민감도 분석)

  • Kim, Seongyong;Kwak, Dongho
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.1
    • /
    • pp.29-40
    • /
    • 2019
  • Categorical data with non-responses are frequently observed in election poll surveys, and can be represented by incomplete contingency tables. To estimate supporting rates of candidates, the identification of the missing mechanism should be pre-determined because the estimates of non-responses can be changed depending on the assumed missing mechanism. However, it has been shown that it is not possible to identify the missing mechanism when using observed data. To overcome this problem, sensitivity analysis has been suggested. The previously proposed sensitivity analysis can be applicable only to two-way incomplete contingency tables with binary variables. The previous sensitivity analysis is inappropriate to use since more than two of the factors such as region, gender, and age are usually considered in election poll surveys. In this paper, sensitivity analysis suitable to an multi-dimensional incomplete contingency table is devised, and also applied to the 19th Korean presidential election poll survey data. As a result, the intervals of estimates from the sensitivity analysis include actual results as well as estimates from various missing mechanisms. In addition, the properties of the missing mechanism that produce estimates nearest to actual election results are investigated.

The Interpolation Method for the missing AIS Data of Ship

  • Nguyen, Van-Suong;Im, Nam-kyun;Lee, Sang-min
    • Journal of Navigation and Port Research
    • /
    • v.39 no.5
    • /
    • pp.377-384
    • /
    • 2015
  • The interpolation of missing AIS data can be used for recovering the lost data of a ship's state which is then able to produce useful information for VTS stations or other ships. Previous research has introduced some interpolating methods however there are some problems with regard to missing AIS data. This paper proposes one new method which includes linear interpolation, cubic Hermit interpolation and an identification mechanism to overcome some of those limitations, first AIS data regarding ship position, COG, SOG and HDG is divided into separate time series, then the characteristic of the missing data is investigated into through using an identification mechanism, an appropriate interpolation is selected to fit all the time series which matches the characteristics. Numerical experiments are carried out using real AIS data to validate the algorithm of this approach and the results are compared with the previous method, after which the actual missing area is suggested to be interpolated by the proposed method. The interpolation results show this approach can be applied well in practice.

Comparison of missing data methods in clustered survival data using Bayesian adaptive B-Spline estimation

  • Yoo, Hanna;Lee, Jae Won
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.2
    • /
    • pp.159-172
    • /
    • 2018
  • In many epidemiological studies, missing values in the outcome arise due to censoring. Such censoring is what makes survival analysis special and differentiated from other analytical methods. There are many methods that deal with censored data in survival analysis. However, few studies have dealt with missing covariates in survival data. Furthermore, studies dealing with missing covariates are rare when data are clustered. In this paper, we conducted a simulation study to compare results of several missing data methods when data had clustered multi-structured type with missing covariates. In this study, we modeled unknown baseline hazard and frailty with Bayesian B-Spline to obtain more smooth and accurate estimates. We also used prior information to achieve more accurate results. We assumed the missing mechanism as MAR. We compared the performance of five different missing data techniques and compared these results through simulation studies. We also presented results from a Multi-Center study of Korean IBD patients with Crohn's disease(Lee et al., Journal of the Korean Society of Coloproctology, 28, 188-194, 2012).

Imputation method for missing data based on measure of property (특성도를 이용한 결측치 대체방법)

  • Kim, Hyungju;Kim, Dongjae
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.3
    • /
    • pp.463-473
    • /
    • 2017
  • How to handle missing data is a main issue in clinical trials. We impute missing data based on missing data that follows a mechanism according to the intention-to-treat rule. However, using the right imputation method for missing data is very important because this supposition is unclear. We suggest a new imputation method for missing data using agreement and maintenance introduced by Kang and Kim (1997). We give an example and adapt a Monte Carlo simulation to compare the performance between the established method and the suggested method.

Reject Inference of Incomplete Data Using a Normal Mixture Model

  • Song, Ju-Won
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.2
    • /
    • pp.425-433
    • /
    • 2011
  • Reject inference in credit scoring is a statistical approach to adjust for nonrandom sample bias due to rejected applicants. Function estimation approaches are based on the assumption that rejected applicants are not necessary to be included in the estimation, when the missing data mechanism is missing at random. On the other hand, the density estimation approach by using mixture models indicates that reject inference should include rejected applicants in the model. When mixture models are chosen for reject inference, it is often assumed that data follow a normal distribution. If data include missing values, an application of the normal mixture model to fully observed cases may cause another sample bias due to missing values. We extend reject inference by a multivariate normal mixture model to handle incomplete characteristic variables. A simulation study shows that inclusion of incomplete characteristic variables outperforms the function estimation approaches.