Bayesian analysis of random partition models with Laplace distribution

  • Kyung, Minjung (Department of Statistics, Duksung Women's University)
  • Received : 2017.04.25
  • Accepted : 2017.08.21
  • Published : 2017.09.30

Abstract

We develop a random partition procedure based on a Dirichlet process prior with a Laplace distribution. Gibbs sampling of a Laplace mixture of linear mixed regressions with a Dirichlet process is implemented as a random partition model when the number of clusters is unknown. Unlike its counterparts, our approach provides simultaneous partitioning and parameter estimation together with the computation of classification probabilities. A full Gibbs-sampling algorithm is developed for efficient Markov chain Monte Carlo posterior computation. The proposed method is illustrated with simulated data and a real data set on energy efficiency from Tsanas and Xifara (Energy and Buildings, 49, 560-567, 2012).
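The abstract does not spell out the sampler, so the following is only a minimal sketch of the random partition prior that a Dirichlet process induces, drawn sequentially through the Pólya urn (Chinese restaurant process) scheme of Blackwell and MacQueen (1973). It is not the paper's Gibbs sampler: there is no likelihood, no Laplace component, and no regression here. The function name `sample_crp_partition` and the concentration parameter name `alpha` are assumptions chosen for illustration.

```python
import random


def sample_crp_partition(n, alpha, seed=None):
    """Draw a partition of n items from a Chinese restaurant process.

    Hypothetical helper for illustration; alpha > 0 is the concentration
    parameter of the Dirichlet process.
    """
    rng = random.Random(seed)
    labels = []  # labels[i] is the cluster index assigned to item i
    sizes = []   # sizes[k] is the current number of items in cluster k
    for _ in range(n):
        # An existing cluster k is chosen with weight sizes[k];
        # a new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        labels.append(k)
    return labels, sizes


labels, sizes = sample_crp_partition(n=50, alpha=1.0, seed=0)
print(len(sizes), "clusters with sizes", sizes)
```

The number of clusters is not fixed in advance: it grows with the data at a rate governed by `alpha`, which is what lets a model of this kind treat the number of clusters as unknown.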

References

  1. Airoldi EM, Costa T, Bassetti F, Leisen F, and Guindani M (2014). Generalized species sampling priors with latent Beta reinforcements. Journal of the American Statistical Association, 109, 1466-1480. https://doi.org/10.1080/01621459.2014.950735
  2. Andrews DF and Mallows CL (1974). Scale mixtures of normal distributions. Journal of the Royal Statistical Society Series B (Methodological), 36, 99-102.
  3. Antoniak CE (1974). Mixture of Dirichlet processes with applications to Bayesian nonparametric problems. The Annals of Statistics, 2, 1152-1174. https://doi.org/10.1214/aos/1176342871
  4. Argiento R, Cremaschi A, and Guglielmi A (2014). A "density-based" algorithm for cluster analysis using species sampling Gaussian mixture models. Journal of Computational and Graphical Statistics, 23, 1126-1142. https://doi.org/10.1080/10618600.2013.856796
  5. Banfield JD and Raftery AE (1993). Model-based Gaussian and non-Gaussian clustering. Biometrics, 49, 803-821. https://doi.org/10.2307/2532201
  6. Barry D and Hartigan JA (1992). Product partition models for change point problems. The Annals of Statistics, 20, 260-279. https://doi.org/10.1214/aos/1176348521
  7. Blackwell D and MacQueen JB (1973). Ferguson distributions via Pólya urn schemes. The Annals of Statistics, 1, 353-355. https://doi.org/10.1214/aos/1176342372
  8. Booth JG, Casella G, and Hobert JP (2008). Clustering using objective functions and stochastic search. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 70, 119-139. https://doi.org/10.1111/j.1467-9868.2007.00629.x
  9. Bush CA and MacEachern SN (1996). A semiparametric Bayesian model for randomized block design. Biometrika, 83, 275-285. https://doi.org/10.1093/biomet/83.2.275
  10. Crowley EM (1997). Product partition models for normal means. Journal of the American Statistical Association, 92, 192-198. https://doi.org/10.1080/01621459.1997.10473616
  11. Dempster AP, Laird NM, and Rubin DB (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39, 1-38.
  12. Dielman TE (1984). Least absolute value estimation in regression models: an annotated bibliography. Communications in Statistics Theory and Methods, 4, 513-541.
  13. Dielman TE (2005). Least absolute value regression: recent contributions. Journal of Statistical Computation and Simulation, 75, 263-286. https://doi.org/10.1080/0094965042000223680
  14. Dunson DB, Pillai N, and Park JH (2007). Bayesian density regression. Journal of the Royal Statistical Society Series B (Statistical Methodology), 69, 163-183. https://doi.org/10.1111/j.1467-9868.2007.00582.x
  15. Escobar MD and West M (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90, 577-588. https://doi.org/10.1080/01621459.1995.10476550
  16. Ferguson TS (1973). A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1, 209-230. https://doi.org/10.1214/aos/1176342360
  17. Fritsch A and Ickstadt K (2009). Improved criteria for clustering based on the posterior similarity matrix. Bayesian Analysis, 4, 367-392. https://doi.org/10.1214/09-BA414
  18. Fraley C and Raftery AE (2002). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association, 97, 611-631. https://doi.org/10.1198/016214502760047131
  19. Fraley C and Raftery AE (2007). Bayesian regularization for normal mixture estimation and model-based clustering. Journal of Classification, 24, 155-181. https://doi.org/10.1007/s00357-007-0004-5
  20. Fraley C, Raftery AE, Murphy TB, and Scrucca L (2012). mclust version 4 for R: Normal mixture modeling for model-based clustering, classification, and density estimation, University of Washington, Department of Statistics.
  21. Hartigan JA (1990). Partition models. Communications in Statistics Theory and Methods, 19, 2745-2756. https://doi.org/10.1080/03610929008830345
  22. Ishwaran H and James LF (2001). Gibbs sampling methods for stick-breaking priors. Journal of the American Statistical Association, 96, 161-173. https://doi.org/10.1198/016214501750332758
  23. Ishwaran H and James LF (2003). Some further developments for stick-breaking priors: finite and infinite clustering and classification. Sankhya: The Indian Journal of Statistics, 65, 577-592.
  24. Ishwaran H and Zarepour M (2000). Markov chain Monte Carlo in approximate Dirichlet and beta two-parameter process hierarchical models. Biometrika, 87, 371-390. https://doi.org/10.1093/biomet/87.2.371
  25. Jordan C, Livingstone V, and Barry D (2007). Statistical modelling using product partition models. Statistical Modelling, 7, 275-295. https://doi.org/10.1177/1471082X0700700304
  26. Kyung M, Gill J, Ghosh M, and Casella G (2010). Penalized regression, standard errors, and Bayesian lassos. Bayesian Analysis, 5, 369-412. https://doi.org/10.1214/10-BA607
  27. MacEachern SN (1999). Dependent nonparametric processes. In ASA Proceedings of the Section on Bayesian Statistical Science, Alexandria, VA.
  28. MacEachern SN (2000). Dependent Dirichlet processes. Department of Statistics, The Ohio State University, Columbus, OH.
  29. MacEachern SN and Müller P (1998). Estimating mixture of Dirichlet process models. Journal of Computational and Graphical Statistics, 7, 223-238.
  30. McCullagh P and Yang J (2007). Stochastic classification models. In Proceedings of the International Congress of Mathematicians (Madrid, 2006), Madrid, 669-686.
  31. McLachlan GJ and Peel D (2000). Finite Mixture Models, John Wiley & Sons, New York.
  32. Müller P, Quintana F, Jara A, and Hanson T (2015). Bayesian Nonparametric Data Analysis, Springer, Cham.
  33. Müller P, Quintana F, and Rosner GL (2011). A product partition model with regression on covariates. Journal of Computational and Graphical Statistics, 20, 260-278. https://doi.org/10.1198/jcgs.2011.09066
  34. Murua A and Quintana FA (2017). Semiparametric Bayesian regression via Potts model. Journal of Computational and Graphical Statistics, 26, 265-274. https://doi.org/10.1080/10618600.2016.1172015
  35. Neal RM (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9, 249-265.
  36. Park JH and Dunson DB (2010). Bayesian generalized product partition model. Statistica Sinica, 20, 1203-1226.
  37. Pitman J (1996). Some developments of the Blackwell-MacQueen urn scheme, Statistics, Probability and Game Theory, 245-267, IMS Lecture Notes Monograph Series, 30, Institute of Mathematical Statistics, Hayward, CA.
  38. Richardson S and Green PJ (1997). On Bayesian analysis of mixtures with an unknown number of components. Journal of the Royal Statistical Society, Series B, 59, 731-792. https://doi.org/10.1111/1467-9868.00095
  39. Sethuraman J (1994). A constructive definition of Dirichlet priors. Statistica Sinica, 4, 639-650.
  40. Song W, Yao W, and Xing Y (2014). Robust mixture regression model fitting by Laplace distribution. Computational Statistics and Data Analysis, 71, 128-137. https://doi.org/10.1016/j.csda.2013.06.022
  41. Stephens M (2000). Dealing with label switching in mixture models. Journal of the Royal Statistical Society, Series B, 62, 795-809. https://doi.org/10.1111/1467-9868.00265
  42. Tokdar ST, Zhu YM, and Ghosh JK (2010). Bayesian density regression with logistic Gaussian process and subspace projection. Bayesian Analysis, 5, 319-344. https://doi.org/10.1214/10-BA605
  43. Tsanas A and Xifara A (2012). Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy and Buildings, 49, 560-567. https://doi.org/10.1016/j.enbuild.2012.03.003
  44. Quintana FA and Iglesias PL (2003). Bayesian clustering and product partition models. Journal of the Royal Statistical Society Series B (Statistical Methodology), 65, 557-574. https://doi.org/10.1111/1467-9868.00402
  45. Wolfe JH (1970). Pattern clustering by multivariate mixture analysis. Multivariate Behavioral Research, 5, 329-350. https://doi.org/10.1207/s15327906mbr0503_6