Nonparametric Bayesian methods: a gentle introduction and overview

  • MacEachern, Steven N. (Department of Statistics, The Ohio State University)
  • Received: 2016.10.22
  • Accepted: 2016.11.08
  • Published: 2016.11.30

Abstract

Nonparametric Bayesian methods have seen rapid and sustained growth over the past 25 years. We present a gentle introduction to these methods, motivating them through the twin perspectives of consistency and false consistency. We then step through the various constructions of the Dirichlet process, outline a number of its basic properties, and move on to the mixture of Dirichlet processes model, including a brief discussion of the computational methods used to fit it. We touch on the main philosophies for nonparametric Bayesian data analysis and then reanalyze a famous data set. The reanalysis illustrates the concept of admissibility through a novel perturbation of the problem and data, showing the benefit of shrinkage estimation and the much greater benefit of nonparametric Bayesian modelling. We conclude with a too-brief survey of fancier nonparametric Bayesian methods.
