• Title/Summary/Keyword: Zero-Inflated Poisson Model

Search Result 43, Processing Time 0.028 seconds

Hurdle Model for Longitudinal Zero-Inflated Count Data Analysis (영과잉 경시적 가산자료 분석을 위한 허들모형)

  • Jin, Iktae;Lee, Keunbaik
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.6
    • /
    • pp.923-932
    • /
    • 2014
  • The Hurdle model can to analyze zero-inflated count data. This model is a mixed model of the logit model for a binary component and a truncated Poisson model of a truncated count component. We propose a new hurdle model with a general heterogeneous random effects covariance matrix to analyze longitudinal zero-inflated count data using modified Cholesky decomposition. This decomposition factors the random effects covariance matrix into generalized autoregressive parameters and innovation variance. The parameters are modeled using (generalized) linear models and estimated with a Bayesian method. We use these methods to carefully analyze a real dataset.

Prediction of K-league soccer scores using bivariate Poisson distributions (이변량 포아송분포를 이용한 K-리그 골 점수의 예측)

  • Lee, Jang Taek
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.6
    • /
    • pp.1221-1229
    • /
    • 2014
  • In this paper we choose the best model among several bivariate Poisson models on Korean soccer data. The models considered allow for correlation between the number of goals of two competing teams. We use an R package called bivpois for bivariate Poisson regression models and the data of K-league for season 1983-2012. Finally we conclude that the best fitted model supported by the AIC and BIC is the bivariate Poisson model with constant covariance. The zero and diagonal inflated models did not improve the model fit. The model can be used to examine home-away effect, goodness of fit, attack and defense parameters.

Bayesian Analysis for the Zero-inflated Regression Models (영과잉 회귀모형에 대한 베이지안 분석)

  • Jang, Hak-Jin;Kang, Yun-Hee;Lee, S.;Kim, Seong-W.
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.4
    • /
    • pp.603-613
    • /
    • 2008
  • We often encounter the situation that discrete count data have a large portion of zeros. In this case, it is not appropriate to analyze the data based on standard regression models such as the poisson or negative binomial regression models. In this article, we consider Bayesian analysis for two commonly used models. They are zero-inflated poisson and negative binomial regression models. We use the Bayes factor as a model selection tool and computation is proceeded via Markov chain Monte Carlo methods. Crash count data are analyzed to support theoretical results.

An Analysis on the Determinants of Employed Labour Quantity in the Fishing Industry (어가의 고용량 결정요인 분석)

  • Kim, Tae-Hyun;Park, Cheol-Hyung;Nam, Jongoh
    • Environmental and Resource Economics Review
    • /
    • v.27 no.3
    • /
    • pp.545-567
    • /
    • 2018
  • This study applied and compared Poisson model, negative binomial model, zero inflated Poisson model, and zero inflated negative binomial model to estimate determinants of employed labour quantity. To estimate each of models, this study used fisheries census data which were obtained at microdata integrated service running by Statistics Korea. The study selected zero inflated negative binomial model according to the Vuong test and Likelihood-ratio test. In addition, the study estimated fishing village's practical changes on employed labour quantity as analyzing changes from 2010 to 2015. The results showed that the household with fishing vessels and high selling price had a significant effect on decrease of the labour quantities. Meanwhile, the longer work experience of the household, the more significant the increase in the labour quantities. In conclusion, this study presented that capitalized fishing household and the acceleration of aging had a significant impact on the change in the labour quantities.

Sample size calculations for clustered count data based on zero-inflated discrete Weibull regression models

  • Hanna Yoo
    • Communications for Statistical Applications and Methods
    • /
    • v.31 no.1
    • /
    • pp.55-64
    • /
    • 2024
  • In this study, we consider the sample size determination problem for clustered count data with many zeros. In general, zero-inflated Poisson and binomial models are commonly used for zero-inflated data; however, in real data the assumptions that should be satisfied when using each model might be violated. We calculate the required sample size based on a discrete Weibull regression model that can handle both underdispersed and overdispersed data types. We use the Monte Carlo simulation to compute the required sample size. With our proposed method, a unified model with a low failure risk can be used to cope with the dispersed data type and handle data with many zeros, which appear in groups or clusters sharing a common variation source. A simulation study shows that our proposed method provides accurate results, revealing that the sample size is affected by the distribution skewness, covariance structure of covariates, and amount of zeros. We apply our method to the pancreas disorder length of the stay data collected from Western Australia.

Traffic Crash Prediction Models for Expressway Ramps (고속도로 연결로의 교통사고예측모형 개발)

  • Choi, Yoon-Hwan;Oh, Young-Tae;Choi, Kee-Choo;Lee, Choul-Ki;Yun, Il-Soo
    • International Journal of Highway Engineering
    • /
    • v.14 no.5
    • /
    • pp.133-143
    • /
    • 2012
  • PURPOSES: Using the collected data for crash, traffic volume, and design elements on ramps between 2007 and 2009, this research effort was initiated to develop traffic crash prediction models for expressway ramps. METHODS: Three negative binomial regression models and three zero-inflated negative binomial regression models were developed for individual ramp types, including direct, semi-direct and loop, respectively. For validating the developed models, authors compared the estimated crash frequencies with actual crash frequencies of twelve randomly selected interchanges, the ramps of which have not been used for model developing. RESULTS: The results show that the negative binomial regression models for direct, semi-direct and loop ramps showed 60.3%, 63.8% and 48.7% error rates on average whereas the zero-inflated negative binomial regression models showed 82.1%, 120.4% and 57.3%, respectively. CONCLUSIONS: Conclusively, the negative binomial regression models worked better in traffic crash prediction than the zero-inflated negative binomial regression models for estimating the frequency of traffic accidents on expressway ramps.

Fit of the number of insurance solicitor's turnovers using zero-inflated negative binomial regression (영과잉 음이항회귀 모형을 이용한 보험설계사들의 이직횟수 적합)

  • Chun, Heuiju
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.5
    • /
    • pp.1087-1097
    • /
    • 2017
  • This study aims to find the best model to fit the number of insurance solicitor's turnovers of life insurance companies using count data regression models such as poisson regression, negative binomial regression, zero-inflated poisson regression, or zero-inflated negative binomial regression. Out of the four models, zero-inflated negative binomial model has been selected based on AIC and SBC criteria, which is due to over-dispersion and high proportion of zero-counts. The significant factors to affect insurance solicitor's turnover found to be a work period in current company, a total work period as financial planner, an affiliated corporation, and channel management satisfaction. We also have found that as the job satisfaction or the channel management satisfaction gets lower as channel management satisfaction, the number of insurance solicitor's turnovers increases. In addition, the total work period as financial planner has positive relationship with the number of insurance solicitor's turnovers, but the work period in current company has negative relationship with it.

Likelihood Ratio Test for the Epidemic Alternatives on the Zero-Inflated Poisson Model (변화시점이 있는 영과잉-포아송모형에서 돌출대립가설에 대한 우도비검정)

  • Kim, Kyung-Moo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.9 no.2
    • /
    • pp.247-253
    • /
    • 1998
  • In ease of the epidemic Zero-Inflated Poisson model, likelihood ratio test was used for testing epidemic alternatives. Epidemic changepoints were estimated by the method of least squares. It were used for starting points to estimate the maximum likelihood estimators. And several parameters were compared through the Monte Carlo simulations. As a result, maximum likelihood estimators for the epidemic chaagepoints and several parameters are better than the least squares and moment estimators.

  • PDF

Development of the U-turn Accident Model at 4-Legged Signalized Intersections in Urban Areas (도시부 4지 신호교차로 유턴 사고모형 개발)

  • Kang, JongHo;Kim, KyungWhan;Ha, ManBok;Kim, SeongMun
    • International Journal of Highway Engineering
    • /
    • v.16 no.2
    • /
    • pp.119-129
    • /
    • 2014
  • PURPOSES : The purpose of this study is to develop the U-turn accident model at 4-legged signalized intersections in urban areas. METHODS : In order to analyze the characteristics of the accidents which are associated with U-turn operation at 4-legged signalized intersections in urban areas and develop an U-turn accident model by regression analysis, the tests of overdispersion and zero-inflation are conducted about the dependent variables of number of accidents and EPDO (Equivalent Property Damage Only). RESULTS : As their results, the Poisson model fits best for number of accident and the ZIP (Zero Inflated Poisson) fits best for EPOD, the variables of conflict traffic, width of opposing road, traffic passing speed are adopted as independent variable for both models. The variables of number of bus berths and rate of U-turn signal time at which the U-turn is permitted are adopted as independent variable only for EPDO. CONCLUSIONS : These study results suggest that U-turn would be permitted at the intersection where the width of opposing road is wider than 11.9 meters, the passing vehicle speed is not high and U-turn operation is not hindered by the buses stopping at bus stops.

Bayesian Inference for the Zero In ated Negative Binomial Regression Model (제로팽창 음이항 회귀모형에 대한 베이지안 추론)

  • Shim, Jung-Suk;Lee, Dong-Hee;Jun, Byoung-Cheol
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.5
    • /
    • pp.951-961
    • /
    • 2011
  • In this paper, we propose a Bayesian inference using the Markov Chain Monte Carlo(MCMC) method for the zero inflated negative binomial(ZINB) regression model. The proposed model allows the regression model for zero inflation probability as well as the regression model for the mean of the dependent variable. This extends the work of Jang et al. (2010) to the fully defiend ZINB regression model. In addition, we apply the proposed method to a real data example, and compare the efficiency with the zero inflated Poisson model using the DIC. Since the DIC of the ZINB is smaller than that of the ZIP, the ZINB model shows superior performance over the ZIP model in zero inflated count data with overdispersion.