Bayesian Conway-Maxwell-Poisson (CMP) regression for longitudinal count data

  • Morshed Alam (Department of Biostatistics, University of Nebraska Medical Center)
  • Yeongjin Gwon (Department of Biostatistics, University of Nebraska Medical Center)
  • Jane Meza (Department of Biostatistics, University of Nebraska Medical Center)
  • Received : 2022.10.03
  • Accepted : 2023.03.22
  • Published : 2023.05.31

Abstract

Longitudinal count data are widely collected in biomedical research, public health, and clinical trials. Because these measurements are taken repeatedly over time on the same subjects, the analysis needs to account for the dependence among them. Poisson regression is the usual first choice for modeling the expected count of interest; however, it may not be appropriate when the data exhibit over-dispersion or under-dispersion. Recently, the Conway-Maxwell-Poisson (CMP) distribution has become popular because it offers the flexibility to capture a wide range of dispersion in the data. In this article, we propose a Bayesian CMP regression model that accommodates both over- and under-dispersion in longitudinal count data. Specifically, we develop a regression model with a random intercept and a random slope to capture subject heterogeneity and to allow covariate effects to differ across subjects. We implement Bayesian computation via a Hamiltonian MCMC (HMCMC) algorithm for posterior sampling. We then compute Bayesian model assessment measures for model comparison. Simulation studies are conducted to assess the accuracy and effectiveness of our methodology. The usefulness of the proposed methodology is demonstrated with a well-known epilepsy data set.
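To make the dispersion claim concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the commonly used (λ, ν) parameterization of the CMP distribution, P(Y = y) ∝ λ^y / (y!)^ν, which the abstract does not spell out, and it approximates the infinite normalizing sum by truncation. The function names `cmp_pmf` and `cmp_mean_var` and the `max_terms` cutoff are hypothetical choices made for illustration.

```python
import math

def cmp_pmf(lam, nu, max_terms=200):
    """Approximate CMP(lam, nu) probabilities for y = 0, ..., max_terms - 1.

    Assumes P(Y = y) is proportional to lam**y / (y!)**nu and truncates
    the normalizing sum at max_terms (an illustrative choice).
    """
    # Work on the log scale so large factorials do not overflow.
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_terms)]
    shift = max(log_w)
    w = [math.exp(v - shift) for v in log_w]
    z = sum(w)  # truncated normalizing constant (up to the shift)
    return [v / z for v in w]

def cmp_mean_var(lam, nu, max_terms=200):
    """Mean and variance computed from the truncated probabilities."""
    probs = cmp_pmf(lam, nu, max_terms)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

if __name__ == "__main__":
    # nu < 1 gives over-dispersion, nu = 1 recovers the Poisson,
    # and nu > 1 gives under-dispersion.
    for nu in (0.5, 1.0, 2.0):
        mean, var = cmp_mean_var(3.0, nu)
        print(f"nu={nu}: mean={mean:.3f}, var={var:.3f}, var/mean={var / mean:.3f}")
```

Under this parameterization, a variance-to-mean ratio above 1 indicates over-dispersion and a ratio below 1 indicates under-dispersion, so a single extra parameter ν covers both cases that a Poisson model cannot. In a regression setting, λ (or a mean-type reparameterization) would be linked to covariates and subject-level random effects; that step is not shown here.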
