• Title/Summary/Keyword: Count model

Bayesian Conway-Maxwell-Poisson (CMP) regression for longitudinal count data

  • Morshed Alam ;Yeongjin Gwon ;Jane Meza
    • Communications for Statistical Applications and Methods / v.30 no.3 / pp.291-309 / 2023
  • Longitudinal count data are widely collected in biomedical research, public health, and clinical trials. Because these repeated measurements are taken over time on the same subjects, the analysis must account for the dependence among them. The Poisson regression model is the first choice for modeling the expected count of interest; however, it may not be appropriate when the data exhibit over-dispersion or under-dispersion. Recently, the Conway-Maxwell-Poisson (CMP) distribution has become popular because it offers the flexibility to capture a wide range of dispersion in the data. In this article, we propose a Bayesian CMP regression model to accommodate over- and under-dispersion in modeling longitudinal count data. Specifically, we develop a regression model with a random intercept and slope to capture subject heterogeneity and to allow covariate effects to differ across subjects. We implement Bayesian computation via a Hamiltonian MCMC (HMCMC) algorithm for posterior sampling. We then compute Bayesian model assessment measures for model comparison. Simulation studies are conducted to assess the accuracy and effectiveness of our methodology. The usefulness of the proposed methodology is demonstrated on a well-known epilepsy data set.
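
As a rough illustration of why the CMP distribution can cover both kinds of dispersion, the sketch below evaluates its probability mass function in the commonly stated form P(Y = y) proportional to lambda^y / (y!)^nu. The function names, the truncation point `max_y`, and the parameter values are assumptions made for this example only; the abstract does not describe the authors' implementation, and this is not their regression model.

```python
import math

def cmp_pmf(lam, nu, max_y=200):
    """CMP probabilities for y = 0..max_y (normalising sum truncated at max_y)."""
    # Unnormalised log-weights: y*log(lam) - nu*log(y!)
    logw = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_y + 1)]
    m = max(logw)                       # subtract the max for numerical stability
    w = [math.exp(v - m) for v in logw]
    z = sum(w)                          # truncated normalising constant
    return [v / z for v in w]

def mean_var(pmf):
    """Mean and variance of a pmf given as a list indexed by y."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var

# nu = 1 recovers the Poisson; nu > 1 gives variance < mean; nu < 1 gives variance > mean.
for nu in (0.5, 1.0, 2.0):
    m, v = mean_var(cmp_pmf(2.0, nu))
    print(f"nu={nu}: mean={m:.3f}, variance={v:.3f}")
```

The truncation at `max_y` is adequate only when the tail mass beyond it is negligible, which should be checked for small `nu` or large `lam`.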

Count Data Model for The Estimation of Bus Ridership (Focusing on Commuters and Students in Seoul) (가산자료모형(Count Data Model)을 이용한 버스이용횟수추정에 관한 연구 (서울시 통근.통학자를 대상으로))

  • 문진수;김순관;임강원
    • Journal of Korean Society of Transportation / v.17 no.5 / pp.123-135 / 1999
  • The rapid increase in passenger cars, caused by the discomfort of public transit and the preference for automobiles, is the major factor behind increasing traffic congestion in Seoul. On the view that shifting motorists to public transit may be the most important policy for easing this congestion, this study focuses on the behavior of company employees and university students. By estimating bus ridership with count models, it investigates the factors that influence bus ridership and draws policy suggestions for leading motorists to public transit. The purpose of this study is to apply an appropriate count data model. Count data models have been widely applied in economics since the mid-1980s and in transportation, mainly in other countries, since the latter half of the 1980s. Although a few studies in Korea have applied count data models to count data, all of them used Poisson regression models without suitable tests of the model specification. Statistical testing showed that the negative binomial regression model, which is suitable for overdispersed data, was appropriate for the weekly bus ridership data. To emphasize the importance of model specification, both the Poisson and the negative binomial regression models were estimated and the results were compared.
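
The specification issue described here (Poisson assumes variance equal to the mean, while the negative binomial allows variance = mean + alpha * mean^2) can be illustrated with a simple descriptive check. This is only a sketch: the abstract does not say which formal test the authors used, the function name is hypothetical, and `toy_counts` is made-up data, not the Seoul ridership sample.

```python
import statistics

def dispersion_summary(counts):
    """Descriptive over-dispersion check for a sample of counts."""
    mean = statistics.mean(counts)
    var = statistics.variance(counts)   # sample variance (n - 1 denominator)
    ratio = var / mean                  # close to 1 is consistent with Poisson
    # Method-of-moments estimate of the NB2 dispersion parameter alpha,
    # from variance = mean + alpha * mean**2 (floored at 0).
    alpha = max(0.0, (var - mean) / mean ** 2)
    return {"mean": mean, "variance": var, "ratio": ratio, "alpha_mom": alpha}

# Made-up illustrative counts (NOT data from the study).
toy_counts = [0, 0, 1, 1, 2, 3, 5, 8, 14, 20]
print(dispersion_summary(toy_counts))
```

A variance-to-mean ratio well above 1 suggests comparing a negative binomial fit against the Poisson fit, which is the comparison the abstract reports.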

Application of Bootstrap Method to Primary Model of Microbial Food Quality Change

  • Lee, Dong-Sun;Park, Jin-Pyo
    • Food Science and Biotechnology / v.17 no.6 / pp.1352-1356 / 2008
  • The bootstrap method, a computer-intensive statistical technique for estimating the distribution of a statistic, was applied to handle the uncertainty and variability of experimental data in stochastic predictive modeling of microbial growth on a chill-stored food. Three bootstrapping methods for fitting curves to the microbial count data were compared in determining the parameters of the Baranyi and Roberts growth model: nonlinear regression of the static version of the function, with residuals resampled onto all the experimental microbial count data; regression of the static version onto mean counts at the sampling times; and fitting of the dynamic version (differential equations) to the bootstrapped mean counts. All methods gave nearly the same mean parameter values but differed in the distributions of those values. Parameter search with the dynamic form of the differential equations gave the widest distribution of the model parameters but produced a confidence interval for the predicted microbial count close to that of nonlinear regression with the static equation.

A Bayesian zero-inflated Poisson regression model with random effects with application to smoking behavior (랜덤효과를 포함한 영과잉 포아송 회귀모형에 대한 베이지안 추론: 흡연 자료에의 적용)

  • Kim, Yeon Kyoung;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.31 no.2 / pp.287-301 / 2018
  • It is common to encounter count data with excess zeros in research fields such as the social sciences, natural sciences, medical science, and engineering. Such count data have been modeled mainly with the zero-inflated Poisson model and its extensions. Zero-inflated count data are also often correlated or clustered, in which case random effects should be included in the model. Frequentist approaches have commonly been used to fit such data. However, a Bayesian approach has the advantages of incorporating prior information, avoiding asymptotic approximations, and allowing practical estimation of functions of the parameters. We consider a Bayesian zero-inflated Poisson regression model with random effects for correlated zero-inflated count data. We conducted simulation studies to check the performance of the proposed model. We also applied the proposed model to smoking behavior data from the Regional Health Survey (2015) of the Korea Centers for Disease Control and Prevention.
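
For readers unfamiliar with the zero-inflated Poisson (ZIP) model, the sketch below evaluates its standard probability mass function: with probability pi the count is a structural zero, otherwise it is Poisson with mean lambda. The function name and parameter values are assumptions for illustration; the random effects and Bayesian fitting described in the abstract are not shown.

```python
import math

def zip_pmf(y, lam, pi):
    """P(Y = y) under a zero-inflated Poisson with rate lam and zero-inflation pi."""
    # Poisson probability computed on the log scale for stability.
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    if y == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

lam, pi = 2.0, 0.3   # illustrative values only
probs = [zip_pmf(y, lam, pi) for y in range(101)]
mean = sum(y * p for y, p in enumerate(probs))
print(f"P(Y=0)={probs[0]:.4f}, mean={mean:.4f}")  # mean equals (1 - pi) * lam
```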

A joint modeling of longitudinal zero-inflated count data and time to event data (경시적 영과잉 가산자료와 생존자료의 결합모형)

  • Kim, Donguk;Chun, Jihun
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1459-1473 / 2016
  • In longitudinal studies, longitudinal data and survival data are often collected simultaneously over time. If missingness in the longitudinal data is non-ignorable because it is correlated with the survival data, an analysis of the longitudinal data alone, which ignores the relation between the two, gives biased estimates of the effects of the independent variables. Joint models of longitudinal and survival data have been studied as a solution to this problem: they obtain unbiased results by including a survival model for the cause of missingness. In this paper, we study a joint model of longitudinal zero-inflated count data and survival data, in which the longitudinal part is replaced by zero-inflated count data. A hurdle model was used for the longitudinal zero-inflated count data and a proportional hazards model for the survival data; the two sub-models were linked by assuming that their random effects follow a multivariate normal distribution. We used the EM algorithm to obtain the maximum likelihood estimators of the parameters, and their standard errors were calculated using the profile likelihood method. In simulations, the joint model performed better than the separate models in terms of bias and coverage probability.

The Economic and Social Implication of Count Regression Models for Married Women's Completed Fertility in Korea (우리나라 가구의 자녀수 결정요인에 관한 Count 모형 분석 및 경제적 함의)

  • Kim, Hyun-Sook
    • Korea journal of population studies / v.30 no.3 / pp.107-135 / 2007
  • This paper uses a static Gamma count model, a traditional hurdle model, and an endogenous switching Poisson model to study the determinants of married women's completed fertility in Korea. It analyzes the impact of household income, women's wages and education, and women's labor market participation on the number of children of married women above age 40 and on the expected number of children of women below age 40. The paper shows that household income significantly increases the number of children, at least for women above age 40; however, this income effect is disappearing for the younger generation. The empirical model suggests that, among women aged 39 and below, those with a job tend to have fewer children, and it finds an endogeneity problem between childbirth and labor force participation. The education level of married women has a positive effect on the decision to have a child but a negative effect on the number of children. Based on the empirical results, the paper concludes that Becker's quantity-quality theory holds for Korea as well.

Estimating the Economic Value of Recreation Sea Fishing in the Yellow Sea: An Application of Count Data Model (가산자료모형을 이용한 서해 태안군 유어객의 편익추정)

  • Choi, Jong Du
    • Environmental and Resource Economics Review / v.23 no.2 / pp.331-347 / 2014
  • The purpose of this study is to estimate the economic value of recreational sea fishing in the Yellow Sea using count data models. To estimate consumer surplus, we used several count data models of travel-cost recreation demand: a Poisson model (PM), a negative binomial model (NBM), a truncated Poisson model (TPM), and a truncated negative binomial model (TNBM). The results show no over-dispersion problem, and the NBM was statistically more suitable than the other models. All estimated parameters are statistically significant and theoretically valid. The NBM was applied to estimate travel demand and consumer surplus. Consumer surplus per trip was estimated at 254,453 won, and total consumer surplus per person per year at 1,536,896 won.

An Analysis of Panel Count Data from Multiple random processes

  • Park, You-Sung;Kim, Hee-Young
    • Proceedings of the Korean Statistical Society Conference / 2002.11a / pp.265-272 / 2002
  • An integer-valued autoregressive integrated (INARI) model is introduced to eliminate stochastic trend and seasonality from time series of count data. The INARI model extends the earlier integer-valued ARMA model. We show that it is stationary and ergodic in order to establish asymptotic normality of the conditional least squares estimator. Optimal estimating equations are used so that estimation reflects the categorical and serial correlations arising from panel count data and the variation arising from the three random processes that generate the observations. Under regularity conditions for a martingale sequence, we show asymptotic normality of the estimators obtained from the estimating equations. Using cancer mortality data provided by the U.S. National Center for Health Statistics (NCHS), we apply our results to estimate the probabilities of cells classified by 4 causes of death and 6 age groups and to forecast the death count of each cell. We also investigate the impact of the three random processes on estimation.

Application of discrete Weibull regression model with multiple imputation

  • Yoo, Hanna
    • Communications for Statistical Applications and Methods / v.26 no.3 / pp.325-336 / 2019
  • In this article, we extend the discrete Weibull regression model to the presence of missing data. Discrete Weibull regression models can be adapted to data with various types of dispersion; however, they are not widely used. Recently, Yoo (Journal of the Korean Data and Information Science Society, 30, 11-22, 2019) adapted the discrete Weibull regression model using single imputation. We extend that study by using multiple imputation under several settings and compare the results. The purpose of this study is to show the merit of multiple imputation when discrete count data contain missing values. We analyzed data from the seventh Korean National Health and Nutrition Examination Survey (KNHANES VII), from 2016, to assess the factors influencing the variable 1-month hospital stay, and we compared the results of the discrete Weibull regression model with those of the Poisson, negative binomial, and zero-inflated Poisson regression models, which are widely used in count data analyses. The results showed that the discrete Weibull regression model with multiple imputation provided the best fit. We also performed simulation studies to show the accuracy of discrete Weibull regression with multiple imputation under both under- and over-dispersed distributions, as well as varying missing rates and sample sizes. Sensitivity analysis showed the influence of mis-specification and the robustness of the discrete Weibull model. Using imputation with discrete Weibull regression to analyze discrete data will increase explanatory power and is widely applicable to various types of dispersion with a unified model.
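
The claim that one discrete Weibull model can handle both under- and over-dispersion can be illustrated numerically. The sketch below assumes the widely used type I parameterization, P(Y = y) = q^(y^beta) - q^((y+1)^beta) with 0 < q < 1 and beta > 0; the abstract does not state which parameterization the paper uses, and the function names, truncation point, and parameter values are chosen for this example only. No regression or imputation step is shown.

```python
def dw_pmf(y, q, beta):
    """Discrete Weibull (type I) probability of the count y = 0, 1, 2, ..."""
    return q ** (y ** beta) - q ** ((y + 1) ** beta)

def dw_mean_var(q, beta, max_y=500):
    """Mean and variance by direct summation, truncated at max_y."""
    probs = [dw_pmf(y, q, beta) for y in range(max_y + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# beta = 1 is the geometric distribution (variance > mean);
# a larger beta concentrates mass near zero and one (variance < mean).
for beta in (1.0, 3.0):
    m, v = dw_mean_var(0.5, beta)
    print(f"beta={beta}: mean={m:.4f}, variance={v:.4f}")
```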

Automatic order selection procedure for count time series models (계수형 시계열 모형을 위한 자동화 차수 선택 알고리즘)

  • Ji, Yunmi;Seong, Byeongchan
    • The Korean Journal of Applied Statistics / v.33 no.2 / pp.147-160 / 2020
  • In this paper, we study an algorithm that automatically determines the orders of the past observations and conditional mean values, which play an important role in count time series models. Starting from the orders of the ARIMA model, the algorithm constructs a group of candidate orders for time series generalized linear models and selects the final model from the combinations in that group using an information criterion. To evaluate the proposed algorithm, we perform small simulations and an empirical analysis across underlying models and time series, and we compare forecasting performance with that of the ARIMA model. The comparison confirms that the time series generalized linear model performs better than the ARIMA model for count time series analysis. In addition, the empirical analysis shows better performance than the ARIMA model in mid- and long-term forecasting.