A comparison study of Bayesian high-dimensional linear regression models

  • Received : 2021.05.05
  • Accepted : 2021.06.04
  • Published : 2021.06.30

Abstract

We consider linear regression models in high-dimensional settings (p ≫ n) and compare several classes of priors. The spike and slab prior is one of the most widely used priors for Bayesian regression models, but its model space is vast, which can lead to poor performance in finite samples. As an alternative, various continuous shrinkage priors, including the horseshoe prior and its variants, have been proposed. Although each of these priors has been investigated separately, exhaustive comparative studies of their performance have rarely been conducted. In this study, we compare the spike and slab prior, the horseshoe prior, and its variants in various simulation settings. The performance of each method is evaluated in terms of regression coefficient estimation and variable selection. Finally, we give some remarks and suggestions based on comprehensive simulation studies.
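
For orientation, the two prior families named in the abstract are commonly written as follows. This is a sketch in textbook notation, not necessarily the exact specification used in the paper: the hyperparameters w and v, the point-mass form of the spike, and the scaling of the prior variances by σ² are all assumptions.

```latex
% Sparse linear model with p >> n
y = X\beta + \varepsilon, \qquad
\varepsilon \sim N_n(0, \sigma^2 I_n), \qquad
\beta \in \mathbb{R}^p .

% Spike and slab prior (point-mass spike; w and v are assumed hyperparameters).
% The indicator vector \gamma ranges over \{0,1\}^p, i.e. 2^p candidate models.
\beta_j \mid \gamma_j \sim (1-\gamma_j)\,\delta_0 + \gamma_j\, N(0, \sigma^2 v), \qquad
\gamma_j \sim \mathrm{Bernoulli}(w), \qquad j = 1, \dots, p .

% Horseshoe prior: a continuous scale mixture of normals with
% half-Cauchy local scales \lambda_j and global scale \tau.
\beta_j \mid \lambda_j, \tau \sim N(0, \sigma^2 \tau^2 \lambda_j^2), \qquad
\lambda_j \sim C^{+}(0, 1), \qquad
\tau \sim C^{+}(0, 1) .
```

The 2^p count makes the phrase "its model space is vast" concrete: with p = 100 predictors there are already about 1.3 × 10^30 candidate models, whereas the horseshoe replaces the discrete indicators with continuous scales.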

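In this study, we consider regression models in high-dimensional settings (p ≫ n) and compare various Bayesian regression methods. The spike and slab prior is one of the most widely used priors in high-dimensional Bayesian regression, but because the model space to be explored is so large, it can perform poorly in finite samples. As an alternative, various continuous shrinkage priors, including the horseshoe prior, have been proposed and are in use. Although each of these priors has been studied extensively on its own, comprehensive comparative studies of them remain very rare. We therefore compare the spike and slab prior with a range of continuous shrinkage priors across a variety of settings. The performance of each method is compared separately in terms of regression coefficient estimation and variable selection. Finally, based on the simulation studies, we offer several cautions and suggestions for practical use.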
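
The two evaluation criteria mentioned in the abstract, coefficient estimation and variable selection, can be illustrated with a minimal Python sketch. The specific metrics (mean squared error, true and false positive rates), the function names, the toy vectors, and the 0.5 magnitude threshold used to turn an estimate into a selection are all assumptions made for illustration; the paper may use different criteria and selection rules.

```python
import numpy as np


def estimation_mse(beta_true, beta_hat):
    """Mean squared error between true and estimated coefficients."""
    beta_true = np.asarray(beta_true, dtype=float)
    beta_hat = np.asarray(beta_hat, dtype=float)
    return float(np.mean((beta_hat - beta_true) ** 2))


def selection_rates(beta_true, selected):
    """True positive rate and false positive rate of a boolean selection."""
    truth = np.asarray(beta_true) != 0          # truly active coefficients
    selected = np.asarray(selected, dtype=bool)
    tp = np.sum(truth & selected)               # active and selected
    fp = np.sum(~truth & selected)              # inactive but selected
    tpr = tp / max(truth.sum(), 1)              # guard against division by zero
    fpr = fp / max((~truth).sum(), 1)
    return float(tpr), float(fpr)


# Hypothetical toy values, not results from the paper.
beta_true = np.array([3.0, 0.0, 0.0, -2.0, 0.0])
beta_hat = np.array([2.5, 0.1, 0.0, -1.0, 0.0])   # e.g. a posterior mean

# Assumed selection rule: keep coefficients whose magnitude exceeds 0.5.
selected = np.abs(beta_hat) > 0.5

mse = estimation_mse(beta_true, beta_hat)
tpr, fpr = selection_rates(beta_true, selected)
print(f"MSE={mse:.3f}, TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Continuous shrinkage priors such as the horseshoe do not set coefficients exactly to zero, so some explicit rule like the threshold above is needed before selection can be scored, whereas a spike and slab posterior yields inclusion indicators directly.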

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1F1A1059483).
