Robust Response Transformation Using Outlier Detection in Regression Model

Korean title (translated): A Robust Variable Transformation Method Using Outlier Detection in Regression Models

  • Received : 2011.10
  • Accepted : 2011.11
  • Published : 2012.02.29


Transforming the response variable is a common way to adapt data to a linear regression model. However, it is well known that response transformations in linear regression are very sensitive to one or a few outliers. Many methods have been suggested for obtaining transformations that are not influenced by potential outliers. Recently, Cheng (2005) suggested using a trimmed likelihood estimator based on the idea of the least trimmed squares (LTS) estimator. However, that method requires the number of outliers to be set in advance and is computationally expensive. A new method is proposed that addresses these problems and improves the robustness of the estimates. The method uses the stepwise procedure suggested by Hadi and Simonoff (1993) to detect the outliers that determine the response transformation.
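To make the setting concrete, the following is a minimal sketch of the classical (non-robust) Box-Cox estimate of the transformation parameter, obtained by maximizing the profile log-likelihood of Box and Cox (1964) over a grid. It is not the method proposed in the paper. It assumes a single predictor with an intercept, and the function names (`boxcox`, `profile_loglik`, `estimate_lambda`), the grid limits, and the toy data are illustrative choices only.

```python
import math

def boxcox(y, lam):
    """Box-Cox power transform of a positive value y."""
    if abs(lam) < 1e-12:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def profile_loglik(lam, x, y):
    """Profile log-likelihood of lam (up to an additive constant) for a
    simple linear regression of the transformed response on x."""
    n = len(y)
    z = [boxcox(v, lam) for v in y]
    xbar = sum(x) / n
    zbar = sum(z) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxz = sum((a - xbar) * (b - zbar) for a, b in zip(x, z))
    szz = sum((b - zbar) ** 2 for b in z)
    rss = szz - sxz ** 2 / sxx  # residual sum of squares of the fit
    # The second term is the log-Jacobian of the transformation.
    return -0.5 * n * math.log(rss / n) + (lam - 1.0) * sum(math.log(v) for v in y)

def estimate_lambda(x, y, lo=-2.0, hi=2.0, steps=81):
    """Grid-search maximizer of the profile log-likelihood (assumed grid)."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda lam: profile_loglik(lam, x, y))

# Toy data (hypothetical): log(y) is linear in x, so lam near 0 is expected.
x = list(range(1, 21))
y = [math.exp(1.0 + 0.2 * xi + 0.05 * (-1) ** xi) for xi in x]
lam_clean = estimate_lambda(x, y)

# Replace one response with a gross error and re-estimate to see how far
# a single outlier can move the estimate.
y_contaminated = list(y)
y_contaminated[-1] = y[-1] / 50.0
lam_contaminated = estimate_lambda(x, y_contaminated)
print(lam_clean, lam_contaminated)
```

Comparing the two printed estimates illustrates the sensitivity described above. A robust approach would estimate the parameter from a subset of observations judged to be free of outliers rather than from the full sample.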


Box-Cox transformation; variable transformation; outlier; least trimmed squares estimator; regression model


  1. Atkinson, A. C. (1985). Plots, Transformations, and Regression: An Introduction to Graphical Methods of Diagnostic Regression Analysis, Oxford University Press, Oxford.
  2. Atkinson, A. C. (1986). Aspects of diagnostic regression analysis (discussion of influential observations, high leverage points, and outliers in linear regression), Statistical Science, 1, 397-402.
  3. Atkinson, A. C. (1988). Transformations unmasked, Technometrics, 30, 311-318.
  4. Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations (with discussion), Journal of the Royal Statistical Society. Series B (Methodological), 26, 211-246.
  5. Cheng, T.-C. (2005). Robust regression diagnostics with data transformations, Computational Statistics & Data Analysis, 49, 875-891.
  6. Cook, R. D. and Wang, P. C. (1983). Transformations and influential cases in regression, Technometrics, 25, 337-343.
  7. Gentleman, J. F. and Wilk, M. B. (1975). Detecting outliers. II. Supplementing the direct analysis of residuals, Biometrics, 31, 387-410.
  8. Hadi, A. S. and Luceño, A. (1997). Maximum trimmed likelihood estimators: A unified approach, examples, and algorithms, Computational Statistics & Data Analysis, 25, 251-272.
  9. Hadi, A. S. and Simonoff, J. S. (1993). Procedures for the identification of multiple outliers in linear models, Journal of the American Statistical Association, 88, 1264-1272.
  10. Hinkley, D. V. and Wang, S. (1988). More about transformations and influential cases in regression, Technometrics, 30, 435-440.
  11. Kianifard, F. and Swallow, W. H. (1989). Using recursive residuals, calculated on adaptively-ordered observations, to identify outliers in linear regression, Biometrics, 45, 571-585.
  12. Marasinghe, M. G. (1985). A multistage procedure for detecting several outliers in linear regression, Technometrics, 27, 395-399.
  13. Paul, S. R. and Fung, K. Y. (1991). A generalized extreme studentized residual multiple-outlier-detection procedure in linear regression, Technometrics, 33, 339-348.
  14. Rousseeuw, P. J. (1984). Least median of squares regression, Journal of the American Statistical Association, 79, 871-880.
  15. Rousseeuw, P. J. and Van Driessen, K. (2006). Computing LTS regression for large data sets, Data Mining and Knowledge Discovery, 12, 29-45.
  16. Rousseeuw, P. J. and Leroy, A. M. (1987). Robust Regression and Outlier Detection, John Wiley, New York.
  17. Tsai, C. L. and Wu, X. (1990). Diagnostics in transformation and weighted regression, Technometrics, 32, 315-322.