• Title/Abstract/Keywords: censored dependent variable

7 search results (processing time 0.018 s)

CENSORED FUZZY REGRESSION MODEL

  • Choi, Seung-Hoe;Kim, Kyung-Joong
    • Journal of the Korean Mathematical Society (대한수학회지) / Vol. 43, No. 3 / pp. 623-634 / 2006
  • Various methods have been studied for constructing a fuzzy regression model that represents a fuzzy relation between a dependent variable and an independent variable. However, in fuzzy regression analysis the center point of the estimated fuzzy output may be either greater than the right endpoint or smaller than the left endpoint. In that case, the fuzzy output cannot be predicted properly. This paper presents sufficient conditions for constructing the fuzzy regression model using several methods investigated by other authors, and then introduces the censored fuzzy regression model, which uses censored samples to handle the problem of the center and the endpoints of the estimated fuzzy number crossing. Examples show that the censored fuzzy regression model is an extension of the fuzzy regression model and that it mitigates the crossing problem.
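
The crossing problem described in this abstract can be illustrated with a minimal sketch. It assumes a triangular fuzzy output whose left endpoint, center, and right endpoint are each predicted by a separately fitted straight line; the class name, function name, and coefficient values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    left: float
    center: float
    right: float

    def is_crossed(self) -> bool:
        # A well-formed triangular fuzzy number needs left <= center <= right.
        return not (self.left <= self.center <= self.right)


def predict(x, left_coef, center_coef, right_coef):
    """Evaluate three separately fitted lines, each given as (intercept, slope)."""
    def line(coef):
        return coef[0] + coef[1] * x
    return TriangularFuzzyNumber(line(left_coef), line(center_coef), line(right_coef))


# Hypothetical coefficients, chosen only to make the lines cross.
LEFT, CENTER, RIGHT = (1.0, 1.0), (0.0, 2.0), (3.0, 1.0)

inside = predict(2.0, LEFT, CENTER, RIGHT)     # left=3, center=4, right=5
too_high = predict(4.0, LEFT, CENTER, RIGHT)   # left=5, center=8, right=7
too_low = predict(0.5, LEFT, CENTER, RIGHT)    # left=1.5, center=1, right=3.5
```

With these made-up lines the prediction is well formed at x = 2, but the center exceeds the right endpoint at x = 4 and falls below the left endpoint at x = 0.5, which are the two failure cases the abstract mentions.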

The restricted maximum likelihood estimation of a censored regression model

  • Lee, Seung-Chun
    • Communications for Statistical Applications and Methods / Vol. 24, No. 3 / pp. 291-301 / 2017
  • It is well known that, in small samples, the maximum likelihood (ML) approach for variance components in the general linear model yields estimates that are biased downward. In particular, the ML estimate of residual variance tends to be downwardly biased. This underestimation of residual variance, which has implications for the estimation of marginal effects and asymptotic standard errors, seems to be more serious in some limited dependent variable models, as shown by some researchers. An alternative frequentist approach may be restricted or residual maximum likelihood (REML), which accounts for the loss in degrees of freedom and gives an unbiased estimate of residual variance. Here, the REML estimator is derived for a censored regression model. In small samples, REML is shown to provide proper inference on regression coefficients.
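
The degrees-of-freedom point can be seen in the ordinary (uncensored) linear model, where the ML estimate of residual variance divides the residual sum of squares by n and the REML estimate divides it by n - p. The sketch below covers only that uncensored case with one regressor plus an intercept (p = 2). It is not the censored-model estimator derived in the paper, and the data values are invented for illustration.

```python
def fit_simple_ols(xs, ys):
    """Least-squares intercept and slope for one regressor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residual_variances(xs, ys):
    """Return (ML, REML) residual variance estimates for the uncensored model."""
    n, p = len(xs), 2  # p counts the intercept and the slope
    intercept, slope = fit_simple_ols(xs, ys)
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return rss / n, rss / (n - p)


# Invented data, n = 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1]
ml_var, reml_var = residual_variances(xs, ys)
```

Because both estimates share the same residual sum of squares, the ML value is always the REML value scaled by (n - p)/n, so the gap is largest when n is small relative to p.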

Restricted maximum likelihood estimation of a censored random effects panel regression model

  • Lee, Minah;Lee, Seung-Chun
    • Communications for Statistical Applications and Methods / Vol. 26, No. 4 / pp. 371-383 / 2019
  • Panel data sets have been developed in various areas, and many recent studies have analyzed panel, or longitudinal, data sets. Maximum likelihood (ML) may be the most common statistical method for analyzing panel data models; however, inference based on the ML estimate will have an inflated Type I error because the ML method tends to give a downwardly biased estimate of variance components when the sample size is small. The underestimation could be severe when the data are incomplete. This paper proposes the restricted maximum likelihood (REML) method for a random effects panel data model with a censored dependent variable. Note that the likelihood function of the model is complex in that it includes a multidimensional integral. Many authors have proposed integral approximation methods for computing the likelihood function; however, it is well known that integral approximation methods are inadequate for high-dimensional integrals in practice. This paper proposes using the moments of a truncated multivariate normal random vector to calculate the multidimensional integral. In addition, a proper asymptotic standard error of the REML estimate is given.
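
As a one-dimensional illustration of what a truncated normal moment looks like, the sketch below computes the mean of a normal variable truncated from below, E[X | X > a] = mu + sigma * phi(alpha) / (1 - Phi(alpha)) with alpha = (a - mu) / sigma. The paper works with the multivariate case. This univariate version and its function names are assumptions made here for illustration only.

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_mean(mu, sigma, lower):
    """Mean of N(mu, sigma^2) conditional on the value exceeding `lower`."""
    alpha = (lower - mu) / sigma
    # Inverse Mills ratio for truncation from below.
    hazard = std_normal_pdf(alpha) / (1.0 - std_normal_cdf(alpha))
    return mu + sigma * hazard


half_normal_mean = truncated_normal_mean(0.0, 1.0, 0.0)
```

For a standard normal truncated at zero the result is the half-normal mean, sqrt(2/pi), which gives a quick sanity check on the formula.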

Bayesian Inference for Censored Panel Regression Model

  • Lee, Seung-Chun;Choi, Byongsu
    • Communications for Statistical Applications and Methods / Vol. 21, No. 2 / pp. 193-200 / 2014
  • Some researchers have recognized that the disturbance variance in a censored regression model is frequently underestimated by the maximum likelihood method. This underestimation has implications for the estimation of marginal effects and asymptotic standard errors. For instance, the actual coverage probability of the confidence interval based on a maximum likelihood estimate can be significantly smaller than the nominal confidence level; consequently, Bayesian estimation is considered as a way to overcome this difficulty. The behaviors of the maximum likelihood and Bayesian estimators of disturbance variance are examined in a fixed effects panel regression model with a limited dependent variable, which is known to have the incidental parameter problem. Behavior under the random effects assumption is also investigated.

Estimation on Modified Proportional Hazards Model

  • Lee, Kwang-Ho;Lee, Mi-Sook
    • Journal of the Korean Data and Information Science Society / Vol. 5, No. 1 / pp. 59-66 / 1994
  • Heller and Simonoff (1990) compared several methods of estimating the regression coefficient in a modified proportional hazards model when the response variable is subject to censoring. We give another method of estimating the parameters in the model, which also allows the dependent variable to be censored and the error distribution to be unspecified. The proposed method differs from that of Miller (1976) and that of Buckley and James (1979). We also obtain the variance estimator of the coefficient estimator and compare it with the Buckley-James variance estimator studied by Hillis (1993).

Machine learning in survival analysis

  • 백재욱
    • Industry Promotion Research (산업진흥연구) / Vol. 7, No. 1 / pp. 1-8 / 2022
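  • This paper examines machine learning methods that can be applied to survival data containing censored observations. First, exploratory data analysis revealed the distribution of each feature, the relationships among the features, and their importance ranking. Next, the relationship between the features serving as independent variables and the feature serving as the dependent variable (death or survival) was treated as a classification problem, and machine learning methods such as logistic regression and K nearest neighbor were applied. Although the data set was small, random forest outperformed logistic regression, as is typical of machine learning results. However, methods that have recently been reported to perform well, such as artificial neural networks and gradient boosting, did not perform markedly better, which is judged to be because the given data are not big data. Finally, conventional survival analysis methods such as Kaplan-Meier and Cox's proportional hazards model were applied to identify which independent variables have a decisive effect on the dependent variable (t_i, δ_i), and random forest, a machine learning method, was also applied to the survival data with censored observations to evaluate its performance.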
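
For readers unfamiliar with how a (t_i, δ_i) response is used, the following is a minimal sketch of the Kaplan-Meier product-limit estimator mentioned in the abstract. It assumes δ_i = 1 marks an observed death and δ_i = 0 a censored observation. The function name and the five-observation data set are invented for illustration and do not come from the paper.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    events[i] is 1 for an observed death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose recorded time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set after t.
        at_risk -= removed
    return curve


# Invented data: five subjects, two of them censored (at t = 3 and t = 8).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored subjects never trigger a drop in the curve, but they do shrink the risk set, which is why the step at t = 5 is taken over two subjects instead of three.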

The determinants of the youth employment rate using panel tobit model

  • 박성익;류장수;김종한;조장식
    • Journal of the Korean Data and Information Science Society / Vol. 28, No. 4 / pp. 853-862 / 2017
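  • This study analyzes the determinants of the youth employment rate in public institutions and local public enterprises using four data sources: the Ministry of Employment and Labor's survey of youth employment in public institutions and local public enterprises, the National Assembly's survey of employment in public institutions, ALIO (www.alio.go.kr), and Clean Eye (www.cleaneye.go.kr). The dependent variable, the youth employment rate of each institution, carries two pieces of information: whether any youths were hired and the size of the youth employment rate. That is, the dependent variable is censored, being observed only within a certain range, so ordinary least squares estimation is biased and also fails to provide a consistent estimator. To overcome this problem, this study uses a pooled tobit model and a panel tobit model. The results are summarized as follows. First, the panel tobit model was statistically significant compared with the pooled tobit model, and the youth employment rate increased in 2014 and 2015 relative to 2011. Second, the youth employment rate was significantly higher in public enterprises than in local public institutions, and it was significantly lower as the average wage increased. Finally, the youth employment rate increased significantly as the average wage of new employees increased and as the ratio of regular employees to the authorized workforce increased.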
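
To make the tobit idea concrete, here is a minimal sketch of the log-likelihood for a pooled tobit model with one regressor. It assumes the response is left-censored at zero, which is a guess at the kind of censoring the abstract describes. It does not include the random effects of the panel tobit model, and the function and parameter names are hypothetical.

```python
import math


def normal_logpdf(z):
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(intercept, slope, sigma, xs, ys, lower=0.0):
    """Log-likelihood of a pooled tobit model, left-censored at `lower`."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu = intercept + slope * x
        if y <= lower:
            # Censored: only the event "latent value <= lower" is observed.
            total += math.log(normal_cdf((lower - mu) / sigma))
        else:
            # Uncensored: the usual normal density contribution.
            total += normal_logpdf((y - mu) / sigma) - math.log(sigma)
    return total


censored_only = tobit_loglik(0.0, 0.0, 1.0, [1.0], [0.0])
observed_only = tobit_loglik(0.0, 0.0, 1.0, [1.0], [1.0])
```

Ordinary least squares treats the censored zeros as if they were exact values. The tobit likelihood instead credits them only with the probability mass at or below the limit, which is the source of the consistency gain the abstract refers to.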