Comparing MCMC algorithms for the horseshoe prior

  • Miru Ma (Department of Statistics, Sungkyunkwan University) ;
  • Mingi Kang (Department of Statistics, Sungkyunkwan University) ;
  • Kyoungjae Lee (Department of Statistics, Sungkyunkwan University)
  • Received : 2023.08.07
  • Accepted : 2023.11.27
  • Published : 2024.02.29

Abstract

The horseshoe prior is one of the most popular priors for sparse regression models, in which only a small fraction of the coefficients are nonzero. Its parameter space is much smaller than that of the spike and slab prior, so the posterior can be explored efficiently even in high dimensions. On the other hand, each iteration of the Gibbs sampler under the horseshoe prior is computationally expensive, which limits its application to high-dimensional data analysis. To overcome this issue, various MCMC algorithms for the horseshoe prior have been proposed to reduce the computational burden. In particular, Johndrow et al. (2020) recently proposed an approximate algorithm that significantly improves both the mixing and the speed of the MCMC algorithm. In this paper, we compare (1) the traditional MCMC algorithm, (2) the approximate MCMC algorithm of Johndrow et al. (2020), and (3) a variant of the approximate algorithm through various simulation studies. Performance is compared in terms of computing time, coefficient estimation, and variable selection. For variable selection, we adopt the sequential clustering-based method for the horseshoe prior suggested by Li and Pati (2017).
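For reference, the horseshoe prior of Carvalho et al. (2010) (reference 4 below) can be written as the following global-local scale mixture of Gaussians; this is one standard specification, with half-Cauchy priors on both the local scales and the global scale:

\begin{aligned}
\beta_j \mid \lambda_j, \tau &\sim \mathcal{N}(0,\, \lambda_j^2 \tau^2), \qquad j = 1, \dots, p, \\
\lambda_j &\sim \mathrm{C}^{+}(0, 1), \qquad \tau \sim \mathrm{C}^{+}(0, 1).
\end{aligned}

The heavy Cauchy tails leave large signals essentially unshrunk, while the implied marginal density of each coefficient has an infinite spike at the origin, shrinking noise coefficients strongly toward zero. Only the p local scales and one global scale need to be explored, rather than the 2^p model space induced by the spike and slab prior.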

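The per-iteration bottleneck mentioned above is the draw from the Gaussian full conditional of the regression coefficients, which naively requires factorizing a p x p matrix at O(p^3) cost. The sketch below (not the authors' code; the function and variable names are ours, and unit noise variance is assumed for simplicity) illustrates the exact O(n^2 p) sampler of Bhattacharya et al. (2016) (reference 2 below) on which the fast algorithms build:

import numpy as np

def sample_coefficients(Phi, alpha, d, rng=None):
    """One exact draw from N(A^{-1} Phi^T alpha, A^{-1}), where
    A = Phi^T Phi + D^{-1} and D = diag(d), following Bhattacharya
    et al. (2016).

    Phi   : (n, p) design matrix
    alpha : (n,) working response (here simply y, assuming unit noise variance)
    d     : (p,) prior variances, e.g. tau^2 * lambda_j^2 under the horseshoe
    """
    rng = np.random.default_rng() if rng is None else rng
    n, p = Phi.shape
    u = np.sqrt(d) * rng.standard_normal(p)   # u ~ N(0, D)
    delta = rng.standard_normal(n)            # delta ~ N(0, I_n)
    v = Phi @ u + delta                       # v ~ N(0, Phi D Phi^T + I_n)
    M = (Phi * d) @ Phi.T + np.eye(n)         # n x n system instead of p x p
    w = np.linalg.solve(M, alpha - v)         # w = M^{-1} (alpha - v)
    return u + d * (Phi.T @ w)                # exact draw from the target law

Roughly speaking, the approximate algorithm of Johndrow et al. (2020) accelerates the construction of the n x n matrix above by retaining only the columns of the design matrix whose prior variances tau^2 * lambda_j^2 exceed a small threshold, so the per-iteration cost scales with the number of currently active coefficients rather than with p.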
Keywords

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. RS-2023-00211974).

References

  1. Bhattacharya A and Dunson DB (2011). Sparse Bayesian infinite factor models, Biometrika, 98, 291-306. https://doi.org/10.1093/biomet/asr013
  2. Bhattacharya A, Chakraborty A, and Mallick BK (2016). Fast sampling with Gaussian scale mixture priors in high-dimensional regression, Biometrika, 103, 985-991. https://doi.org/10.1093/biomet/asw042
  3. Carvalho CM, Polson NG, and Scott JG (2009). Handling sparsity via the horseshoe, In Proceedings of the 12th International Conference on Artificial Intelligence and Statistics, Clearwater Beach, Florida, USA, 73-80.
  4. Carvalho CM, Polson NG, and Scott JG (2010). The horseshoe estimator for sparse signals, Biometrika, 97, 465-480. https://doi.org/10.1093/biomet/asq017
  5. George EI and McCulloch RE (1993). Variable selection via Gibbs sampling, Journal of the American Statistical Association, 88, 881-889. https://doi.org/10.1080/01621459.1993.10476353
  6. Ishwaran H and Rao JS (2005). Spike and slab variable selection: Frequentist and Bayesian strategies, The Annals of Statistics, 33, 730-773. https://doi.org/10.1214/009053604000001147
  7. Johndrow J, Orenstein P, and Bhattacharya A (2020). Scalable approximate MCMC algorithms for the horseshoe prior, Journal of Machine Learning Research, 21, 1-61.
  8. Li H and Pati D (2017). Variable selection using shrinkage priors, Computational Statistics & Data Analysis, 107, 107-119. https://doi.org/10.1016/j.csda.2016.10.008
  9. Polson NG, Scott JG, and Windle J (2014). The Bayesian bridge, Journal of the Royal Statistical Society Series B: Statistical Methodology, 76, 713-733. https://doi.org/10.1111/rssb.12042
  10. Rue H (2001). Fast sampling of Gaussian Markov random fields, Journal of the Royal Statistical Society Series B: Statistical Methodology, 63, 325-338. https://doi.org/10.1111/1467-9868.00288
  11. Tibshirani R (1996). Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society Series B: Statistical Methodology, 58, 267-288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
  12. Zou H and Hastie T (2005). Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society Series B: Statistical Methodology, 67, 301-320. https://doi.org/10.1111/j.1467-9868.2005.00503.x