Sufficient conditions for the oracle property in penalized linear regression

  • Kwon, Sunghoon (Department of Applied Statistics, Konkuk University) ;
  • Moon, Hyeseong (Department of Applied Statistics, Konkuk University) ;
  • Chang, Jaeho (Department of Applied Statistics, Konkuk University) ;
  • Lee, Sangin (Department of Information and Statistics, Chungnam National University)
  • Received : 2021.03.16
  • Accepted : 2021.03.30
  • Published : 2021.04.30

Abstract

In this paper, we show how to construct sufficient conditions for the oracle property in penalized linear regression models. We formally define the oracle estimator, the penalized estimator, the oracle penalized estimator, and the oracle property. Based on these definitions, we present a unified way of constructing optimality conditions for the oracle property, and sufficient conditions for those optimality conditions, that covers most existing penalties. In addition, we present an illustrative example and results from a numerical study.
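
As a rough illustration of the kind of optimality condition the abstract refers to, the sketch below computes an oracle estimator (least squares restricted to the true signal set) and checks whether it is a first-order stationary point of a penalized least squares criterion. This page does not give the paper's own formulas, so everything here is an assumption: the criterion ||y - Xb||^2 / (2n) plus a SCAD penalty with parameters `lam` and `a`, the function names `oracle_estimator` and `scad_oracle_conditions`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def oracle_estimator(X, y, support):
    """Least squares on the columns in `support`; zero for all other coefficients."""
    beta = np.zeros(X.shape[1])
    coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
    beta[support] = coef
    return beta


def scad_oracle_conditions(X, y, support, lam, a=3.7):
    """Check first-order conditions for the oracle estimator under a SCAD penalty.

    Assumed criterion (not taken from the paper): ||y - X b||^2 / (2n) + sum_j p(|b_j|),
    where the SCAD derivative p'(t) is zero for t >= a * lam and the
    subdifferential of p at zero is [-lam, lam].
    """
    n, p = X.shape
    beta = oracle_estimator(X, y, support)
    # Gradient of the least squares part; it vanishes on `support` by the normal equations.
    grad = X.T @ (y - X @ beta) / n
    noise = np.setdiff1d(np.arange(p), support)
    # Signal coefficients must sit where the penalty is flat.
    signal_ok = bool(np.all(np.abs(beta[support]) >= a * lam))
    # Noise correlations must stay inside the subdifferential at zero.
    noise_ok = bool(np.all(np.abs(grad[noise]) <= lam))
    return beta, signal_ok, noise_ok


if __name__ == "__main__":
    # Hypothetical simulated data: three signal variables, seven noise variables.
    rng = np.random.default_rng(0)
    n, p = 200, 10
    support = [0, 1, 2]
    beta_true = np.zeros(p)
    beta_true[support] = [3.0, -2.0, 1.5]
    X = rng.standard_normal((n, p))
    y = X @ beta_true + 0.5 * rng.standard_normal(n)

    beta_hat, signal_ok, noise_ok = scad_oracle_conditions(X, y, support, lam=0.2)
    print("oracle estimate:", np.round(beta_hat, 3))
    print("signal condition holds:", signal_ok)
    print("noise condition holds:", noise_ok)
```

When both checks return `True`, the oracle estimator satisfies the first-order conditions of the assumed SCAD criterion. Sufficient conditions of the kind the abstract describes would then bound the probability that both checks hold as the sample size grows.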

Acknowledgement

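This paper was supported by the 2019 Konkuk University program for fostering outstanding research personnel and by the National Research Foundation of Korea (No. 2020R1I1A3071646).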
