An Empirical Study on Dimension Reduction

  • Suh, Changhee (Department of Applied Statistics, Yonsei University)
  • Lee, Hakbae (Department of Applied Statistics, Yonsei University)
  • Received : 2018.11.20
  • Accepted : 2018.12.20
  • Published : 2018.12.31

Abstract

SIR and SAVE, the two inverse regression methods for estimating the central subspace, are computationally simple and widely used. However, they may perform poorly in finite samples and require strong assumptions on the predictors (the linearity and/or constant covariance conditions). The two non-parametric methods, MAVE and dMAVE, perform considerably better in finite samples and impose no strong requirements on the predictors or on the response. MAVE estimates the central mean subspace, whereas dMAVE estimates the central subspace. This paper reviews the algorithms of these four methods and compares them empirically. A simulation study shows that MAVE and dMAVE outperform SIR and SAVE across different models and different distributional assumptions on the predictors. In a real data example with a binary response, however, SAVE performs better than the other methods.
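To make the inverse regression side of the comparison concrete, the sketch below implements the SIR and SAVE kernel matrices in NumPy. It is a minimal illustration, not the code used in the study (the cited dr and MAVE R packages provide reference implementations); the function names, the number of slices, and the toy model at the end are illustrative assumptions. MAVE and dMAVE, which alternate weighted local linear fits with direction updates, are omitted for brevity.

```python
import numpy as np

def _whiten(X):
    """Center X and whiten it to identity covariance.
    Assumes the sample covariance matrix is positive definite."""
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(Sigma)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T  # Sigma^{-1/2}
    return (X - mu) @ inv_sqrt, inv_sqrt

def _slices(y, n_slices):
    """Split observation indices into n_slices nearly equal groups, ordered by y."""
    return np.array_split(np.argsort(y), n_slices)

def sir_directions(X, y, n_slices=10, d=2):
    """SIR (Li, 1991): leading eigenvectors of the weighted covariance of the
    within-slice means of the whitened predictors, mapped back to the X scale."""
    n, p = X.shape
    Z, inv_sqrt = _whiten(X)
    M = np.zeros((p, p))
    for idx in _slices(y, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    w, V = np.linalg.eigh(M)
    return inv_sqrt @ V[:, np.argsort(w)[::-1][:d]]

def save_directions(X, y, n_slices=10, d=2):
    """SAVE (Cook and Weisberg, 1991): kernel matrix built from (I - V_h)^2,
    where V_h is the within-slice covariance of the whitened predictors."""
    n, p = X.shape
    Z, inv_sqrt = _whiten(X)
    M = np.zeros((p, p))
    for idx in _slices(y, n_slices):
        A = np.eye(p) - np.cov(Z[idx], rowvar=False)
        M += (len(idx) / n) * A @ A
    w, V = np.linalg.eigh(M)
    return inv_sqrt @ V[:, np.argsort(w)[::-1][:d]]

# Toy usage on a two-direction model y = x1 + x2^2 + noise (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = X[:, 0] + X[:, 1] ** 2 + 0.2 * rng.normal(size=500)
B_sir = sir_directions(X, y)    # columns estimate basis directions
B_save = save_directions(X, y)
```

On this toy model, SIR reliably finds the linear direction along x1 but tends to miss the symmetric quadratic direction along x2, where the slice means carry no signal, while SAVE recovers both; this is the classic limitation of SIR that motivated SAVE (Cook and Weisberg, 1991).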

References

  1. Choe, J. (2016). A study on the relationship between democracy and government effectiveness using nonparametric smoothing, Journal of the Korean Data Analysis Society, 18(5A), 2355-2365.
  2. Choi, P., Lee, H., Lee, H. (2012). Estimating structural dimensionality using contour regression, Journal of the Korean Data Analysis Society, 14(2A), 577-586.
  3. Cook, R. D. (1994a). On the interpretation of regression plots, Journal of the American Statistical Association, 89(425), 177-189. https://doi.org/10.1080/01621459.1994.10476459
  4. Cook, R. D. (1994b). Using dimension-reduction subspaces to identify important inputs in models of physical systems, In Proceedings of the Section on Physical and Engineering Sciences, American Statistical Association, Alexandria, VA, 18-25.
  5. Cook, R. D. (1996). Graphics for regressions with a binary response, Journal of the American Statistical Association, 91(435), 983-992. https://doi.org/10.1080/01621459.1996.10476968
  6. Cook, R. D. (2000). SAVE: a method for dimension reduction and graphics in regression, Communications in Statistics-Theory and Methods, 29(9-10), 2109-2121. https://doi.org/10.1080/03610920008832598
  7. Cook, R. D. (2004). Testing predictor contributions in sufficient dimension reduction, The Annals of Statistics, 32(3), 1062-1092. https://doi.org/10.1214/009053604000000292
  8. Cook, R. D. (2009). Regression graphics: ideas for studying regressions through graphics, John Wiley & Sons.
  9. Cook, R. D., Lee, H. (1999). Dimension reduction in binary response regression, Journal of the American Statistical Association, 94(448), 1187-1200. https://doi.org/10.1080/01621459.1999.10473873
  10. Cook, R. D., Li, B. (2002). Dimension reduction for conditional mean in regression, The Annals of Statistics, 30(2), 455-474. https://doi.org/10.1214/aos/1021379861
  11. Cook, R. D., Weisberg, S. (1991). Comment on "Sliced inverse regression for dimension reduction" by K. C. Li, Journal of the American Statistical Association, 86(414), 328-332. https://doi.org/10.1080/01621459.1991.10475036
  12. Eaton, M. L. (1986). A characterization of spherical distributions, Journal of Multivariate Analysis, 20(2), 272-276.
  13. Fan, J., Yao, Q., Tong, H. (1996). Estimation of conditional densities and sensitivity measures in nonlinear dynamical systems, Biometrika, 83(1), 189-206. https://doi.org/10.1093/biomet/83.1.189
  14. Flury, B., Riedwyl, H. (1988). Multivariate statistics: a practical approach, Cambridge: Cambridge University Press.
  15. Friedman, J., Hastie, T., Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical LASSO, Biostatistics, 9(3), 432-441. https://doi.org/10.1093/biostatistics/kxm045
  16. Hang, W., Xia, Y. (2018). The MAVE package, https://cran.r-project.org/web/packages/MAVE/MAVE.pdf.
  17. Härdle, W., Stoker, T. M. (1989). Investigating smooth multiple regression by the method of average derivatives, Journal of the American Statistical Association, 84(408), 986-995.
  18. Jung, K. M. (2016). Local influence in LASSO regression, Journal of the Korean Data Analysis Society, 18(6A), 2913-2924.
  19. Li, K. C. (1991). Sliced inverse regression for dimension reduction, Journal of the American Statistical Association, 86(414), 316-327. https://doi.org/10.1080/01621459.1991.10475035
  20. Li, B., Zha, H., Chiaromonte, F. (2005). Contour regression: a general approach to dimension reduction, The Annals of Statistics, 33(4), 1580-1616. https://doi.org/10.1214/009053605000000192
  21. Oh, E. J., Lee, H. (2013). Dimension reduction and prediction for high-dimensional regression models using the graphical LASSO, Journal of the Korean Data Analysis Society, 15(5A), 2321-2332.
  22. Weisberg, S. (2015). The dr package, https://cran.r-project.org/web/packages/dr/vignettes/overview.pdf.
  23. Xia, Y. (2007). A constructive approach to the estimation of dimension reduction directions, The Annals of Statistics, 35(6), 2654-2690. https://doi.org/10.1214/009053607000000352
  24. Xia, Y., Tong, H., Li, W., Zhu, L. X. (2002). An adaptive estimation of dimension reduction space, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(3), 363-410.