On Confidence Intervals of Robust Regression Estimators

Construction of Confidence Intervals Based on Robust Regression Estimation

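  • 이동희 (Department of Statistics, Korea University) ;
  • 박유성 (Department of Statistics, Korea University) ;
  • 김기환 (Department of Informational Statistics, College of Natural Sciences, Korea University)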
  • Published : 2006.03.01

Abstract

Since it is well established that even high-quality data tend to contain outliers, one would expect far greater reliance on robust regression techniques than is actually observed. However, most robust regression estimators suffer from computational difficulties and from lower efficiency than least squares under the normal error model. The weighted self-tuning estimator (WSTE), recently proposed by Lee (2004), avoids these computational difficulties and has both asymptotic normality and a high breakdown point. Although it has better properties than other robust estimators, the WSTE does not attain full efficiency under the normal error model when combined with the widely used weighted least squares method. This paper introduces a new approach, called the reweighted WSTE (RWSTE), whose scale estimator is adaptively estimated by the self-tuning constant. A Monte Carlo study shows that the new approach behaves better than the general weighted least squares method under the normal model and in large samples.

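Most data are contaminated by outliers arising from a variety of causes, and obtaining reliable estimators and carrying out statistical inference on them in such situations is an important problem. However, the robust regression estimators proposed so far suffer from computational difficulties and from lower efficiency than the least squares estimator under the normal error model, so the accuracy of the resulting inference could not be assured. The weighted self-tuning estimator (WSTE) recently proposed by Lee (2004) offers a more exact computational procedure than other robust regression estimators, together with asymptotic normality and a high breakdown point. Nevertheless, the weighted least squares estimator built on a robust estimator, which has been widely used for statistical inference, does not provide the same level of efficiency as the least squares estimator under the normal error model, even when it is based on the WSTE. This paper proposes another inference method based on the WSTE and shows, through Monte Carlo simulation, that it yields more accurate results under the normal error model and in large samples.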
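
The general idea of weighted least squares built on a robust initial estimator, which the abstract uses as its point of comparison, can be illustrated with a minimal sketch. This is not the WSTE or the RWSTE, whose definitions are not given on this page. As stand-ins, the sketch assumes an approximate least median of squares fit (Rousseeuw, 1984) found by random elemental subsets, a MAD scale estimate, a hard-rejection cutoff of 2.5, and normal-quantile intervals. All function names, the cutoff, and the simulated data are hypothetical choices made for illustration.

```python
import numpy as np

def lms_initial_fit(X, y, n_subsets=500, rng=None):
    """Approximate least-median-of-squares fit via random elemental subsets.

    Stand-in for a high-breakdown initial estimator (assumption, not WSTE).
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    best_beta, best_crit = None, np.inf
    for _ in range(n_subsets):
        idx = rng.choice(n, size=p, replace=False)
        try:
            beta = np.linalg.solve(X[idx], y[idx])  # exact fit through p points
        except np.linalg.LinAlgError:
            continue  # degenerate subset, skip it
        crit = np.median((y - X @ beta) ** 2)
        if crit < best_crit:
            best_beta, best_crit = beta, crit
    return best_beta

def reweighted_ls(X, y, beta0, cutoff=2.5):
    """One-step reweighted least squares with hard-rejection (0/1) weights."""
    r = y - X @ beta0
    scale = 1.4826 * np.median(np.abs(r - np.median(r)))  # MAD scale estimate
    keep = np.abs(r) <= cutoff * scale                    # weight 1 if kept, else 0
    Xk, yk = X[keep], y[keep]
    beta, *_ = np.linalg.lstsq(Xk, yk, rcond=None)
    resid = yk - Xk @ beta
    dof = keep.sum() - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xk.T @ Xk)))
    z = 1.959964  # 97.5% standard normal quantile, for approximate 95% intervals
    ci = np.column_stack([beta - z * se, beta + z * se])
    return beta, ci, keep

# Simulated data (hypothetical): y = 1 + 2x + N(0, 1) noise, 10% shifted outliers
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0, 1, n)
outlier_idx = rng.choice(n, size=20, replace=False)
y[outlier_idx] += 20.0

beta0 = lms_initial_fit(X, y, rng=1)
beta_hat, ci, keep = reweighted_ls(X, y, beta0)
print("estimate:", beta_hat)
print("approximate 95% intervals:\n", ci)
```

Hard rejection also trims a few clean observations in the tails, so the residual variance above is computed from a truncated sample and the intervals are only approximate.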

References

  1. Chang, W. H., McKean, J. W., Naranjo, J. D. and Sheather, S. J. (1999). High-breakdown rank regression. Journal of the American Statistical Association, 94, 205-219 https://doi.org/10.2307/2669695
  2. Coakley, C. W. and Hettmansperger, T. P. (1993). A bounded influence, high breakdown, efficient regression estimator. Journal of the American Statistical Association, 88, 872-880 https://doi.org/10.2307/2290776
  3. Croux, C., Rousseeuw, P. J. and Hössjer, O. (1994). Generalized S-estimators. Journal of the American Statistical Association, 89, 1271-1281 https://doi.org/10.2307/2290990
  4. Davies, P. L. (1990). The asymptotics of S-estimators in the linear regression model. The Annals of Statistics, 18, 1651-1675 https://doi.org/10.1214/aos/1176347871
  5. Donoho, D. L. and Huber, P. J. (1983). The notion of breakdown point. In A Festschrift for Erich L. Lehmann (Bickel, P. J., Doksum, K. A. and Hodges, J. L., eds.) 157-184. Wadsworth, Belmont, California
  6. Gervini, D. and Yohai, V. J. (2002). A class of robust and fully efficient regression estimators. The Annals of Statistics, 30, 583-616 https://doi.org/10.1214/aos/1021379866
  7. Hawkins, D. M. and Olive, D. J. (2002). Inconsistency of resampling algorithms for high breakdown regression estimators and a new algorithm. Journal of the American Statistical Association, 97, 136-148 https://doi.org/10.1198/016214502753479293
  8. He, X. and Portnoy, S. (1992). Reweighted LS estimators converge at the same rate as the initial estimator. The Annals of Statistics, 20, 2161-2167 https://doi.org/10.1214/aos/1176348910
  9. Hössjer, O. (1994). Rank-based estimates in the linear model with high breakdown point
  10. Lee, Dong-Hee (2004). Self-tuning Robust Regression Estimator. Ph.D. Thesis, Korea University, Seoul
  11. Rousseeuw, P. J. (1984). Least median of squares. Journal of the American Statistical Association, 79, 871-880 https://doi.org/10.2307/2288718
  12. Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection. Wiley, New York
  13. Rousseeuw, P. J. and Van Driessen, K. (2002). Computing LTS regression for large data sets. Estadistica, 54, 163-190
  14. Rousseeuw, P. J. and Yohai, V. J. (1984). Robust regression by means of S-estimators. In Robust and Nonlinear Time Series Analysis (Franke, J., Härdle, W. and Martin, R. D., eds.) Springer-Verlag, New York
  15. Salibian-Barrera, M. and Zamar, R. H. (2004). Uniform asymptotics for robust location estimates when the scale is unknown. The Annals of Statistics, 32, 1434-1447 https://doi.org/10.1214/009053604000000544
  16. Simpson, D. G., Ruppert, D. and Carroll, R. J. (1992). On one-step GM estimates and stability of inferences in linear regression. Journal of the American Statistical Association, 87, 439-450 https://doi.org/10.2307/2290275
  17. Yohai, V. J. (1987). High breakdown-point and high efficiency robust estimates for regression. The Annals of Statistics, 15, 642-656 https://doi.org/10.1214/aos/1176350366
  18. Yohai, V. J. and Zamar, R. H. (1988). High breakdown-point estimates of regression by means of the minimization of an efficient scale. Journal of the American Statistical Association, 83, 406-414 https://doi.org/10.2307/2288856