Robust Response Transformation Using Outlier Detection in Regression Model

Korean title (translated): A Robust Variable Transformation Method Using Outlier Detection in Regression Models

  • Received : 2011.10
  • Accepted : 2011.11
  • Published : 2012.02.29

Abstract

Transforming the response variable is a common way to adapt data to a linear regression model. However, it is well known that response transformations in linear regression are very sensitive to one or a few outliers. Many methods have been proposed to develop transformations that are not influenced by potential outliers. Recently, Cheng (2005) suggested using a trimmed likelihood estimator based on the idea of the least trimmed squares (LTS) estimator. However, that method requires the number of outliers to be set in advance and is computationally intensive. We propose a new method that addresses these problems and improves the robustness of the estimates. The method uses the stepwise procedure suggested by Hadi and Simonoff (1993) to detect outliers that influence the response transformation.
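To make the sensitivity issue concrete, the following is a minimal sketch, not the procedure of this paper. It estimates the Box-Cox parameter (Box and Cox, 1964) by maximizing the profile log-likelihood over a grid, first on all observations and then on a subset that excludes flagged cases. The function names, the simulated data, the grid, and the `keep` mask are all illustrative assumptions. In particular, the mask simply stands in for the output of an outlier detection step such as that of Hadi and Simonoff (1993), which is not implemented here.

```python
import numpy as np


def boxcox_normalized(y, lam):
    """Normalized Box-Cox transform of a positive response y."""
    gm = np.exp(np.mean(np.log(y)))  # geometric mean of y
    if abs(lam) < 1e-12:  # limiting case lambda -> 0
        return gm * np.log(y)
    return (y**lam - 1.0) / (lam * gm ** (lam - 1.0))


def profile_loglik(y, X, lam):
    """Profile log-likelihood of lambda (up to a constant) for z(lambda) = X beta + e."""
    z = boxcox_normalized(y, lam)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)  # ordinary least squares fit
    rss = np.sum((z - X @ beta) ** 2)
    return -0.5 * len(y) * np.log(rss / len(y))


def estimate_lambda(y, X, grid=None):
    """Grid-search maximizer of the profile log-likelihood."""
    if grid is None:
        grid = np.linspace(-2.0, 2.0, 81)  # assumed search range, step 0.05
    loglik = [profile_loglik(y, X, lam) for lam in grid]
    return float(grid[int(np.argmax(loglik))])


# Simulated data (assumption): log(y) is linear in x, so the true lambda is 0.
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0.0, 2.0, n)
X = np.column_stack([np.ones(n), x])
y = np.exp(1.0 + x + 0.1 * rng.standard_normal(n))

# Contaminate the first five responses to mimic a few outliers.
y_bad = y.copy()
y_bad[:5] *= 20.0

# Hypothetical mask: True for cases that an outlier detection step retained.
keep = np.ones(n, dtype=bool)
keep[:5] = False

print("lambda, clean data:       ", estimate_lambda(y, X))
print("lambda, contaminated data:", estimate_lambda(y_bad, X))
print("lambda, flagged removed:  ", estimate_lambda(y_bad[keep], X[keep]))
```

Comparing the three printed estimates shows how far a handful of contaminated responses can move the estimated transformation, and how re-estimating on the retained cases recovers a value near the one obtained from clean data. Unlike a trimmed likelihood approach, this sketch does not fix the number of trimmed cases in advance; it takes whatever the detection step flags.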

Keywords

Box-Cox transformation; variable transformation; outlier; least trimmed squares estimator; regression model

References

  1. Atkinson, A. C. (1985). Plots, Transformations and Regression: An Introduction to Graphical Methods of Diagnostic Regression Analysis, Oxford University Press, Oxford.
  2. Atkinson, A. C. (1986). Aspects of diagnostic regression analysis (discussion of influential observations, high leverage points, and outliers in linear regression), Statistical Science, 1, 397-402. https://doi.org/10.1214/ss/1177013624
  3. Atkinson, A. C. (1988). Transformations unmasked, Technometrics, 30, 311-318. https://doi.org/10.2307/1270085
  4. Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations (with discussion), Journal of the Royal Statistical Society. Series B (Methodological), 26, 211-246.
  5. Cheng, T.-C. (2005). Robust regression diagnostics with data transformations, Computational Statistics & Data Analysis, 49, 875-891. https://doi.org/10.1016/j.csda.2004.06.010
  6. Cook, R. D. and Wang, P. C. (1983). Transformations and influential cases in regression, Technometrics, 25, 337-343. https://doi.org/10.2307/1267855
  7. Gentleman, J. F. and Wilk, M. B. (1975). Detecting outliers. II. Supplementing the direct analysis of residuals, Biometrics, 31, 387-410. https://doi.org/10.2307/2529428
  8. Hadi, A. S. and Luceño, A. (1997). Maximum trimmed likelihood estimators: A unified approach, examples, and algorithms, Computational Statistics & Data Analysis, 25, 251-272. https://doi.org/10.1016/S0167-9473(97)00011-X
  9. Hadi, A. S. and Simonoff, J. S. (1993). Procedures for the identification of multiple outliers in linear models, Journal of the American Statistical Association, 88, 1264-1272. https://doi.org/10.2307/2291266
  10. Hinkley, D. V. and Wang, S. (1988). More about transformations and influential cases in regression, Technometrics, 30, 435-440. https://doi.org/10.2307/1269807
  11. Kianifard, F. and Swallow, W. H. (1989). Using recursive residuals, calculated on adaptively-ordered observations, to identify outliers in linear regression, Biometrics, 45, 571-585. https://doi.org/10.2307/2531498
  12. Marasinghe, M. G. (1985). A multistage procedure for detecting several outliers in linear regression, Technometrics, 27, 395-399. https://doi.org/10.2307/1270206
  13. Paul, S. R. and Fung, K. Y. (1991). A generalized extreme studentized residual multiple-outlier-detection procedure in linear regression, Technometrics, 33, 339-348. https://doi.org/10.2307/1268785
  14. Rousseeuw, P. J. (1984). Least median of squares regression, Journal of the American Statistical Association, 79, 871-880. https://doi.org/10.2307/2288718
  15. Rousseeuw, P. J. and Van Driessen, K. (2006). Computing LTS regression for large data sets, Data Mining and Knowledge Discovery, 12, 29-45. https://doi.org/10.1007/s10618-005-0024-4
  16. Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection, John Wiley, New York.
  17. Tsai, C. L. and Wu, X. (1990). Diagnostics in transformation and weighted regression, Technometrics, 32, 315-322. https://doi.org/10.2307/1269108