• Title/Summary/Keyword: poisson regression models

Prediction of K-league soccer scores using bivariate Poisson distributions (이변량 포아송분포를 이용한 K-리그 골 점수의 예측)

  • Lee, Jang Taek
    • Journal of the Korean Data and Information Science Society / v.25 no.6 / pp.1221-1229 / 2014
  • In this paper we select the best model among several bivariate Poisson models for Korean soccer data. The models considered allow for correlation between the numbers of goals scored by the two competing teams. We fit bivariate Poisson regression models with the R package bivpois, using K-league data from the 1983-2012 seasons. We conclude that the best-fitting model according to both AIC and BIC is the bivariate Poisson model with constant covariance. The zero-inflated and diagonal-inflated models did not improve the fit. The model can be used to examine the home-away effect, goodness of fit, and attack and defense parameters.
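As a rough illustration of how a bivariate Poisson model lets the two teams' goal counts be correlated, the sketch below assumes the common trivariate-reduction construction, X = W1 + W3 and Y = W2 + W3 with independent Poisson components, so that Cov(X, Y) = lam3. The function name and parameter values are hypothetical and are not taken from the paper or from the bivpois package.

```python
import math

def bivariate_poisson_pmf(x, y, lam1, lam2, lam3):
    """Joint pmf P(X = x, Y = y) under trivariate reduction (an assumed construction).

    X = W1 + W3 and Y = W2 + W3, where W1, W2, W3 are independent Poisson
    variables with means lam1, lam2, lam3. The shared term W3 gives
    Cov(X, Y) = lam3; lam3 = 0 reduces to two independent Poissons.
    """
    total = 0.0
    # Sum over the possible values k of the shared component W3.
    for k in range(min(x, y) + 1):
        total += (lam1 ** (x - k) / math.factorial(x - k)
                  * lam2 ** (y - k) / math.factorial(y - k)
                  * lam3 ** k / math.factorial(k))
    return math.exp(-(lam1 + lam2 + lam3)) * total

# Hypothetical rates: home goals X, away goals Y, constant covariance 0.3.
p_draw_1_1 = bivariate_poisson_pmf(1, 1, 1.2, 0.9, 0.3)
```

A "constant covariance" model in this sense would keep lam3 fixed across matches while lam1 and lam2 vary with team and home-away covariates.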

Developing the Pedestrian Accident Models of Intersections using Tobit Model (토빗모형을 이용한 교차로 보행자 사고모형 개발)

  • Lee, Seung Ju;Lim, Jin Kang;Park, Byung Ho
    • Journal of the Korean Society of Safety / v.29 no.5 / pp.154-159 / 2014
  • This study deals with pedestrian accidents at intersections in Cheongju. The objective is to develop pedestrian accident models using the Tobit regression model. To this end, pedestrian accident data from 2007 to 2011 were collected from the TAAS data set of the Road Traffic Authority. Poisson, negative binomial and Tobit regression models were used to analyze the accidents. The dependent variable was the number of accidents per intersection. The independent variables were traffic volume, intersection geometric structure and transportation facilities. The main results are as follows. First, the Tobit model was judged to be more appropriate than the other models, and the models were found to be statistically significant. Second, the main accident-related variables adopted in the model were traffic volume, pedestrian volume, number of traffic islands, crossing length and pedestrian countdown signal systems.

Bayesian Analysis for the Zero-inflated Regression Models (영과잉 회귀모형에 대한 베이지안 분석)

  • Jang, Hak-Jin;Kang, Yun-Hee;Lee, S.;Kim, Seong-W.
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.603-613 / 2008
  • We often encounter situations in which discrete count data contain a large proportion of zeros. In such cases it is not appropriate to analyze the data with standard regression models such as the Poisson or negative binomial regression models. In this article, we consider Bayesian analysis for two commonly used models: the zero-inflated Poisson and zero-inflated negative binomial regression models. We use the Bayes factor as a model selection tool, and computation is carried out via Markov chain Monte Carlo methods. Crash count data are analyzed to support the theoretical results.

Developing Rear-End Collision Models of Roundabouts in Korea (국내 회전교차로의 추돌사고 모형 개발)

  • Park, Byung Ho;Beak, Tae Hun
    • Journal of the Korean Society of Safety / v.29 no.6 / pp.151-157 / 2014
  • This study deals with rear-end collisions at roundabouts. The purpose of the study is to develop accident models of rear-end collisions in Korea. To this end, the study gives particular attention to developing appropriate models using Poisson, negative binomial, ZAM, multiple linear and nonlinear regression models, and statistical analysis tools. The main results are as follows. First, the Vuong statistics and overdispersion parameters indicate that ZIP is the most appropriate of the count data models. Second, RMSE, MPB, MAD and correlation coefficient tests show that the multiple nonlinear model is the most suitable for the rear-end collision data. Finally, the independent variables adopted in the optimal model are traffic volume, ratio of heavy vehicles, number of circulatory roadway lanes, number of crosswalks and stop line.

Bayesian Analysis of a Zero-inflated Poisson Regression Model: An Application to Korean Oral Hygienic Data (영과잉 포아송 회귀모형에 대한 베이지안 추론: 구강위생 자료에의 적용)

  • Lim, Ah-Kyoung;Oh, Man-Suk
    • The Korean Journal of Applied Statistics / v.19 no.3 / pp.505-519 / 2006
  • We consider zero-inflated count data, that is, discrete count data with too many zeros compared to the Poisson distribution. Zero-inflated data can be found in various areas. Despite their increasing importance in practice, appropriate statistical inference for zero-inflated data is limited. Classical inference based on large-sample theory does not fit well unless the sample size is very large, and the regular Poisson model shows lack of fit because of the many zeros. To handle these difficulties, a mixture of distributions is considered for the zero-inflated data. Specifically, a mixture of a point mass at zero and a Poisson distribution is employed. In addition, when there are meaningful covariates related to the response variable, a log-linear link is used between the mean of the response and the covariates in the Poisson part. We propose Bayesian inference for the zero-inflated Poisson regression model using a Markov chain Monte Carlo method. We applied the proposed method to Korean oral hygiene data and compared the inference results with those of other models. We found that the proposed method is superior in that it gives smaller parameter estimation error and more accurate predictions.
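The mixture described in this abstract can be sketched in a few lines. This is a minimal illustration of a zero-inflated Poisson probability mass function with a log link for the Poisson mean, not the authors' Bayesian procedure; the names `zip_pmf`, `log_link_mean`, `pi`, `lam`, `x` and `beta` are assumptions introduced here.

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability P(Y = k) with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: point mass at zero (weight pi) mixed with Poisson(lam)."""
    poisson_part = (1.0 - pi) * poisson_pmf(k, lam)
    # Only k = 0 receives the extra point mass.
    return pi + poisson_part if k == 0 else poisson_part

def log_link_mean(x, beta):
    """Log-linear link: log(lam) = x . beta, so lam = exp(x . beta)."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))

# Hypothetical covariates and coefficients, for illustration only.
lam = log_link_mean([1.0, 2.0], [0.5, 0.25])
p_zero = zip_pmf(0, 0.3, lam)
```

With `pi = 0` the model reduces to the regular Poisson, which is why the ordinary Poisson underestimates P(Y = 0) when the data contain structural zeros.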

Application of discrete Weibull regression model with multiple imputation

  • Yoo, Hanna
    • Communications for Statistical Applications and Methods / v.26 no.3 / pp.325-336 / 2019
  • In this article we extend the discrete Weibull regression model to the presence of missing data. Discrete Weibull regression models can be adapted to various types of dispersion data; however, they are not widely used. Recently Yoo (Journal of the Korean Data and Information Science Society, 30, 11-22, 2019) adapted the discrete Weibull regression model using single imputation. We extend that study by using multiple imputation under several settings and compare the results. The purpose of this study is to show the merit of using multiple imputation when data are missing in discrete count data. We analyzed the seventh Korean National Health and Nutrition Examination Survey (KNHANES VII) from 2016 to assess the factors influencing the variable 1-month hospital stay, and we compared the results of the discrete Weibull regression model with those of the Poisson, negative binomial and zero-inflated Poisson regression models, which are widely used in count data analyses. The results showed that the discrete Weibull regression model using multiple imputation provided the best fit. We also performed simulation studies to show the accuracy of the discrete Weibull regression using multiple imputation under both under- and over-dispersed distributions, as well as varying missing rates and sample sizes. Sensitivity analysis showed the influence of mis-specification and the robustness of the discrete Weibull model. Using imputation with discrete Weibull regression to analyze discrete data will increase explanatory power and is widely applicable to various types of dispersion data with a unified model.

Regression models generated by gamma random variables with long-term survivors

  • Ortega, Edwin M.M.;Cordeiro, Gauss M.;Hashimoto, Elizabeth M.;Suzuki, Adriano K.
    • Communications for Statistical Applications and Methods / v.24 no.1 / pp.43-65 / 2017
  • We propose a flexible cure rate survival model by assuming that the number of competing causes of the event of interest has the Poisson distribution and the time for the event follows the gamma-G family of distributions. The extended family of gamma-G failure-time models with long-term survivors is flexible enough to include many commonly used failure-time distributions as special cases. We consider a frequentist analysis for parameter estimation and derive appropriate matrices to assess local influence on the parameters. Further, various simulations are performed for different parameter settings, sample sizes and censoring percentages. We illustrate the performance of the proposed regression model by means of a data set from the medical area (gastric cancer).

A Causation Study for car crashes at Rural 4-legged Signalized Intersections Using Nonlinear Regression and Structural Equation Methods (비선형 회귀분석과 구조방정식을 이용한 지방부 4지 신호교차로의 사고요인분석)

  • Oh, Ju Taek;Kweon, Ihl;Hwang, Jeong Won
    • Journal of Korean Society of Transportation / v.31 no.1 / pp.65-76 / 2013
  • Traffic accidents at signalized intersections have increased annually, so their causes need to be examined in order to reduce them. However, existing accident models were developed mainly with non-linear regression models such as Poisson methods. These non-linear regression methods fail to reveal the complicated causation of traffic accidents, although they are the right choice for studying the randomness and non-linearity of accidents. Therefore, another statistical method is needed to make up for this shortcoming. This study developed accident prediction models for 4-legged signalized intersections with Poisson methods and compared them with structural equation models. Structural equation methods were used to reveal the complicated causation of traffic accidents because they can explain more causal factors for accidents than other methods.

Developing the Accident Models of Cheongju Arterial Link Sections Using ZAM Model (ZAM 모형을 이용한 청주시 간선가로 구간의 사고모형 개발)

  • Park, Byung-Ho;Kim, Jun-Yong
    • International Journal of Highway Engineering / v.12 no.2 / pp.43-49 / 2010
  • This study deals with traffic accidents on the Cheongju arterial link sections. The purpose of the study is to develop a traffic accident model. To this end, the study gives particular attention to developing the ZAM (zero-altered model) using accident data from arterial roads divided into 322 small link sections. The main results, analyzed with ZIP (zero-inflated Poisson model) and ZINB (zero-inflated negative binomial model), which are both ZAM methods, are as follows. First, the evaluation of the developed models by the Vuong statistic and the t statistic for the overdispersion parameter $\alpha$ shows that ZINB is optimal among the Poisson, NB, ZIP and ZINB regression models. Second, ZINB is evaluated to be statistically significant in view of its t, $\rho$ and $\rho^2$ (0.63) values compared to the other models. Finally, the accident factors in the ZINB model are traffic volume (ADT), number of entries/exits and length of median. Traffic volume (ADT) and the number of entries/exits are positive ('+') factors, and the length of median is a negative ('-') factor for accidents.

Semiparametric Kernel Poisson Regression for Longitudinal Count Data

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Communications for Statistical Applications and Methods / v.15 no.6 / pp.1003-1011 / 2008
  • Mixed-effect Poisson regression models are widely used for the analysis of correlated count data such as those found in longitudinal studies. In this paper, we consider kernel extensions with semiparametric fixed effects and parametric random effects. Estimation is carried out by the penalized likelihood method based on the kernel trick, and our focus is on efficient computation and effective hyperparameter selection. Cross-validation techniques are employed to select the hyperparameters. Examples illustrating the usage and features of the proposed method are provided.