• Title/Summary/Keyword: count data model

Model Checking for Time-Series Count Data

  • Lee, Sung-Im
    • Communications for Statistical Applications and Methods / v.12 no.2 / pp.359-364 / 2005
  • This paper considers a specification test of the conditional Poisson regression model for time-series count data. Although conditional models for count data have received attention and have been proposed in several forms, few studies have focused on checking their adequacy. Motivated by tests of the martingale difference assumption, a specification test based on the Ljung-Box statistic is proposed for the conditional model of time-series count data. Simulation results are provided to illustrate the performance of the Ljung-Box test.
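As a rough illustration of the kind of check this abstract describes, the sketch below computes the Ljung-Box statistic on Pearson residuals from a fitted conditional Poisson mean. It is not the paper's procedure: the function names `pearson_residuals` and `ljung_box_q`, the toy series, and the constant fitted mean are all assumptions made for illustration.

```python
import math

def pearson_residuals(counts, fitted_means):
    """Pearson residuals (y - mu) / sqrt(mu) under a Poisson conditional model."""
    return [(y - mu) / math.sqrt(mu) for y, mu in zip(counts, fitted_means)]

def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1..m} r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k.
        r_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total

# Hypothetical toy data: observed counts and a constant fitted mean.
counts = [2, 0, 3, 1, 4, 2, 1, 0, 3, 2, 5, 1]
fitted = [2.0] * len(counts)
q_stat = ljung_box_q(pearson_residuals(counts, fitted), max_lag=3)
```

In the textbook setting, a large value of `q_stat` relative to a chi-square distribution with `max_lag` degrees of freedom suggests remaining serial correlation; the reference distribution used in the paper itself may differ.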

Modeling clustered count data with discrete Weibull regression model

  • Yoo, Hanna
    • Communications for Statistical Applications and Methods / v.29 no.4 / pp.413-420 / 2022
  • In this study we adapt the discrete Weibull regression model to clustered count data. The discrete Weibull regression model has the attractive feature that it can handle both under- and over-dispersed data. We analyzed the eighth Korean National Health and Nutrition Examination Survey (KNHANES VIII) from 2019 to assess the factors influencing the 1-month outpatient stay in 17 different regions. We compared the results of the clustered discrete Weibull regression model with those of the Poisson, negative binomial, generalized Poisson, and Conway-Maxwell-Poisson regression models, which are widely used in count data analyses. The results show that the clustered discrete Weibull regression model with a random intercept gives the best fit. A simulation study is also conducted to investigate the performance of the clustered discrete Weibull model under various dispersion settings and zero-inflation probabilities. This paper shows that using a random effect with discrete Weibull regression can flexibly model count data with various levels of dispersion without the risk of making wrong assumptions about the data dispersion.
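To make the dispersion claim concrete, the sketch below evaluates the (type I) discrete Weibull probability mass function, P(Y = y) = q^(y^β) − q^((y+1)^β), and compares mean and variance for two shape values. The parameter names `q` and `beta`, the truncation point `upper`, and the chosen values are assumptions for illustration; the abstract does not state how the paper links covariates or random effects to these parameters.

```python
def discrete_weibull_pmf(y, q, beta):
    """P(Y = y) = q**(y**beta) - q**((y + 1)**beta) for y = 0, 1, 2, ..."""
    return q ** (float(y) ** beta) - q ** (float(y + 1) ** beta)

def mean_and_variance(q, beta, upper=500):
    """Approximate mean and variance by summing the pmf over 0..upper-1."""
    probs = [discrete_weibull_pmf(y, q, beta) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    second_moment = sum(y * y * p for y, p in enumerate(probs))
    return mean, second_moment - mean ** 2

# beta = 1 reduces to the geometric distribution, which is over-dispersed.
mean_geo, var_geo = mean_and_variance(q=0.5, beta=1.0)
# A larger shape value concentrates the mass and gives under-dispersion.
mean_tight, var_tight = mean_and_variance(q=0.9, beta=3.0)
```

With these illustrative values, the first case has variance above the mean and the second has variance below it, which is the flexibility the abstract refers to.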

Estimating the Economic Value of the Songjeong Beach Using A Count Data Model: - Off-season Estimating Value of the Beach - (가산자료모형을 이용한 송정 해수욕장의 경제적 가치추정: - 비수기 해수욕장의 가치추정 -)

  • Heo, Yun-Jeong;Lee, Seung-Lae
    • The Journal of Fisheries Business Administration / v.38 no.2 / pp.79-101 / 2007
  • The purpose of this study is to estimate the economic value of Songjeong Beach in the off-season using an Individual Travel Cost Model (ITCM). Songjeong Beach is located in Busan but far from the city center. Recently, however, trend analysis reflects increased traffic inflow to the beach and the adoption of the five-day working week. People's values have also changed. For these reasons, the number of off-season visitors to the beach is increasing. Like the Contingent Valuation Method and the Hedonic Price Model, the ITCM is used to estimate the non-market value of environmental goods. The ITCM is based on count data models (i.e., Poisson and negative binomial models), so this paper compares Poisson and negative binomial count data models to measure tourism demand. The data were collected from visitors at Songjeong Beach from November 1 through November 23, 2006. Interviewers were instructed to interview only individuals, yielding a sample of 113. A dependent variable that is defined on the non-negative integers and subject to sampling truncation is the result of a truncated count data process. This paper analyzes the effects of determinants on visitors' demand using a class of maximum-likelihood regression estimators for count data from truncated samples. The count data and truncated models are used primarily to account for the non-negative integer and truncation properties of tourist trips, as suggested by the economic valuation literature. The results suggest that the truncated negative binomial model mitigates the overdispersion problem and is preferred to the other models in the study. This paper differs from earlier studies in two respects. First, it estimates the value of the beach in the off-season. Second, it emphasizes that 'travel cost' includes not only monetary cost but also the opportunity cost of 'travel time'. According to the truncated negative binomial model, the Consumer Surplus (CS) per trip is estimated at about 199,754 Korean won, and the total economic value was estimated to be 1,288,680 Korean won.
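The two ingredients mentioned in this abstract, a zero-truncated count distribution and a per-trip consumer surplus, can be sketched as follows. For brevity the sketch uses a zero-truncated Poisson rather than the truncated negative binomial the paper prefers, and it relies on the standard result for semi-log count demand models that consumer surplus per trip equals −1/β, where β is the travel-cost coefficient. The function names and the coefficient value are hypothetical and are not taken from the paper.

```python
import math

def zero_truncated_poisson_pmf(y, lam):
    """P(Y = y | Y > 0) for a Poisson count observed only when positive."""
    if y < 1:
        return 0.0
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return poisson / (1.0 - math.exp(-lam))

def consumer_surplus_per_trip(travel_cost_coef):
    """Per-trip consumer surplus -1/beta in a semi-log count demand model."""
    if travel_cost_coef >= 0:
        raise ValueError("the travel-cost coefficient must be negative")
    return -1.0 / travel_cost_coef

# Hypothetical coefficient (per Korean won), chosen only for illustration.
cs_example = consumer_surplus_per_trip(-0.00001)
# On-site samples contain no zero-trip respondents, hence the truncation.
total_prob = sum(zero_truncated_poisson_pmf(y, 2.5) for y in range(1, 60))
```

The truncation matters because on-site surveys observe only people who made at least one trip, so an untruncated likelihood would misstate the demand curve.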

Developing the Pedestrian Accident Models Using Tobit Model (토빗모형을 이용한 가로구간 보행자 사고모형 개발)

  • Lee, Seung Ju;Kim, Yun Hwan;Park, Byung Ho
    • International Journal of Highway Engineering / v.16 no.3 / pp.101-107 / 2014
  • PURPOSES : This study deals with pedestrian accidents in Cheongju. The goal is to develop a pedestrian accident model. METHODS : To analyze the accidents, count data models, truncated count data models, and Tobit regression models are used. The dependent variable is the number of accidents. The independent variables are traffic volume, intersection geometric structure, and transportation facilities. RESULTS : The main results are as follows. First, the Tobit model was judged to be more appropriate than the other models, and the models were found to be statistically significant. Second, the main accident-related variables, namely traffic volume, pedestrian volume, number of entries/exits, number of crosswalks, and bus stops, were adopted in the model. CONCLUSIONS : The Tobit model is evaluated to be the optimal model for pedestrian accidents.
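For readers unfamiliar with the Tobit model named in this abstract, the sketch below writes out its log-likelihood for a response that is left-censored at zero: censored observations contribute a normal CDF term and uncensored ones a normal density term. The function names, the censoring point `lower`, and the idea of passing precomputed linear predictors as `means` are assumptions for illustration; the abstract does not give the paper's exact specification.

```python
import math

def normal_log_pdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_log_likelihood(y_values, means, sigma, lower=0.0):
    """Log-likelihood of a Tobit model left-censored at `lower`."""
    total = 0.0
    for y, mu in zip(y_values, means):
        if y <= lower:
            # Censored: only the event "latent value <= lower" is observed.
            total += math.log(normal_cdf((lower - mu) / sigma))
        else:
            # Uncensored: ordinary normal density of the observed value.
            total += normal_log_pdf((y - mu) / sigma) - math.log(sigma)
    return total

# Hypothetical accident counts per segment and linear predictors.
loglik = tobit_log_likelihood([0.0, 2.0, 5.0], [0.5, 1.8, 4.2], sigma=1.5)
```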

A Zero-Inflated Model for Insurance Data (제로팽창 모형을 이용한 보험데이터 분석)

  • Choi, Jong-Hoo;Ko, In-Mi;Cheon, Soo-Young
    • The Korean Journal of Applied Statistics / v.24 no.3 / pp.485-494 / 2011
  • Observations that can take only non-negative integer values, such as the number of car accidents, earthquakes, or insurance coverages, are called count data. In general, the Poisson regression model has been used to model such data; however, this model has the weakness that it is restricted by the equality of the mean and the variance. In practice, count data are often too dispersed to allow the use of the Poisson model because the variance of the data is significantly larger than the mean due to heterogeneity within groups. When overdispersion is not taken into account, the resulting parameter estimates or standard errors are expected to be inefficient. Since coverage is the main issue in insurance, some accidents may not be covered, and the number covered by insurance may be zero. This paper considers the zero-inflated model for count data that include many zeros. The performance of this model has been investigated using real data with overdispersion and many zeros. The results indicate that the Zero-Inflated Negative Binomial Regression Model performs best in model evaluation.
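A minimal sketch of the zero-inflated negative binomial (ZINB) distribution shows how it departs from the Poisson restriction that the mean equals the variance. The mean/size parameterization of the negative binomial, the function names, and the parameter values are assumptions for illustration; the abstract does not specify the paper's parameterization or estimates.

```python
import math

def negative_binomial_pmf(y, mu, size):
    """Negative binomial pmf with mean `mu` and dispersion parameter `size`."""
    log_p = (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
             + size * math.log(size / (size + mu))
             + y * math.log(mu / (size + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, size, zero_prob):
    """Mixture of a point mass at zero (weight zero_prob) and a negative binomial."""
    base = (1.0 - zero_prob) * negative_binomial_pmf(y, mu, size)
    return zero_prob + base if y == 0 else base

# Hypothetical parameter values; sum far enough into the tail to be accurate.
probs = [zinb_pmf(y, mu=2.0, size=1.5, zero_prob=0.3) for y in range(400)]
zinb_mean = sum(y * p for y, p in enumerate(probs))
zinb_var = sum(y * y * p for y, p in enumerate(probs)) - zinb_mean ** 2
```

Here the variance is well above the mean, and the probability of zero exceeds what the negative binomial component alone would give, which are the two features of the data the abstract highlights.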

A Bayesian joint model for continuous and zero-inflated count data in developmental toxicity studies

  • Hwang, Beom Seuk
    • Communications for Statistical Applications and Methods / v.29 no.2 / pp.239-250 / 2022
  • In many applications, we encounter multiple correlated outcomes measured on the same subject. Joint modeling of such outcomes can improve the efficiency of inference compared to independent modeling. For instance, in developmental toxicity studies, fetal weight and the number of malformed pups are measured on pregnant dams exposed to different levels of a toxic substance, and the association between these outcomes should be taken into account in the model. The number of malformations may have many zeros, which should be analyzed via zero-inflated count models. Motivated by applications in developmental toxicity studies, we propose a Bayesian joint modeling framework for continuous and count outcomes with excess zeros. In our model, a zero-inflated Poisson (ZIP) regression model describes the count data, and subject-specific random effects account for the correlation across the two outcomes. We implement a Bayesian approach using an MCMC procedure with data augmentation and adaptive rejection sampling. We apply the proposed model to dose-response analysis in a developmental toxicity study to estimate the benchmark dose in a risk assessment.

Modelling Count Responses with Overdispersion

  • Jeong, Kwang Mo
    • Communications for Statistical Applications and Methods / v.19 no.6 / pp.761-770 / 2012
  • We frequently encounter count outcomes that have extra variation. This paper considers several alternative models for overdispersed count responses, such as a quasi-Poisson model, a zero-inflated Poisson model, and a negative binomial model, with a special focus on a generalized linear mixed model. We also explain various goodness-of-fit criteria, discussing when each is applicable and cautioning against misuse depending on the pattern of response categories. The overdispersion models for count data are explained through two examples with different response patterns.
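One common way to quantify the extra variation mentioned here is the Pearson dispersion estimate used with quasi-Poisson models: the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below is a generic illustration with hypothetical data and a hypothetical function name, not a reproduction of the paper's examples.

```python
def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square divided by residual degrees of freedom (n - p)."""
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi_square / (len(counts) - n_params)

# Hypothetical counts and fitted Poisson means from an intercept-only model.
counts = [0, 7, 1, 9, 0, 6, 2, 7]
fitted = [4.0] * len(counts)
phi_hat = pearson_dispersion(counts, fitted, n_params=1)
```

A value near 1 is consistent with the Poisson variance assumption, while a value well above 1, as with these toy counts, points to overdispersion.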

Analysis on the Effects of the Informatization Level on SMEs through Count Data Model (Count Data Model을 이용한 중소기업의 정보화 효과 분석)

  • Hwang, Soon Hwan
    • Journal of Information Technology Services / v.3 no.1 / pp.5-20 / 2004
  • It is generally known that investment in extending the ability to use IT applications enhances the productivity effects of IT on firms by reducing costs, increasing returns, and speeding up operations. Nevertheless, it has been complex and difficult to evaluate concretely the effect of informatization on firms, SMEs (small and medium-sized enterprises) in particular. In this study, I point out the weaknesses of SMEs and analyze the effects of informatization through the count data model. For this analysis, I separate the effects into two parts: the organizational effect and the personal effect. I conclude that the organizational effect is larger than the personal effect and that the ability to make practical use of IT systems is the most effective item related to the informatization level. It will therefore be important to concentrate on raising this ability in order to heighten the competitiveness of SMEs.

Bayesian Conway-Maxwell-Poisson (CMP) regression for longitudinal count data

  • Morshed Alam;Yeongjin Gwon;Jane Meza
    • Communications for Statistical Applications and Methods / v.30 no.3 / pp.291-309 / 2023
  • Longitudinal count data have been widely collected in biomedical research, public health, and clinical trials. Models for these repeated measurements over time on the same subjects need to account for the dependency among them. The Poisson regression model is the first choice for modeling the expected count of interest; however, it may not be appropriate when the data exhibit over-dispersion or under-dispersion. Recently, the Conway-Maxwell-Poisson (CMP) distribution has become popular because it offers the flexibility to capture a wide range of dispersion in the data. In this article, we propose a Bayesian CMP regression model to accommodate over- and under-dispersion in modeling longitudinal count data. Specifically, we develop a regression model with a random intercept and slope to capture subject heterogeneity and to allow covariate effects to differ across subjects. We implement Bayesian computation via the Hamiltonian MCMC (HMCMC) algorithm for posterior sampling. We then compute Bayesian model assessment measures for model comparison. Simulation studies are conducted to assess the accuracy and effectiveness of our methodology. The usefulness of the proposed methodology is demonstrated with a well-known epilepsy data example.
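The flexibility attributed to the CMP distribution can be seen directly from its probability mass function, P(Y = y) ∝ λ^y / (y!)^ν, where ν = 1 recovers the Poisson, ν < 1 gives over-dispersion, and ν > 1 gives under-dispersion. The sketch below approximates the normalizing constant by a truncated sum; the function names, the truncation length `terms`, and the parameter values are assumptions for illustration and say nothing about the paper's regression structure or sampler.

```python
import math

def cmp_distribution(lam, nu, terms=200):
    """CMP probabilities for y = 0..terms-1, normalized by a truncated sum."""
    log_weights = [y * math.log(lam) - nu * math.lgamma(y + 1)
                   for y in range(terms)]
    weights = [math.exp(w) for w in log_weights]
    normalizer = sum(weights)
    return [w / normalizer for w in weights]

def mean_and_variance(probs):
    """Mean and variance of a distribution given as a list of probabilities."""
    mean = sum(y * p for y, p in enumerate(probs))
    return mean, sum(y * y * p for y, p in enumerate(probs)) - mean ** 2

mean_pois, var_pois = mean_and_variance(cmp_distribution(2.0, 1.0))    # Poisson case
mean_over, var_over = mean_and_variance(cmp_distribution(2.0, 0.5))    # over-dispersed
mean_under, var_under = mean_and_variance(cmp_distribution(2.0, 2.0))  # under-dispersed
```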

A joint modeling of longitudinal zero-inflated count data and time to event data (경시적 영과잉 가산자료와 생존자료의 결합모형)

  • Kim, Donguk;Chun, Jihun
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1459-1473 / 2016
  • Longitudinal data and survival data are often collected simultaneously in longitudinal studies, in which subjects are observed over time. In this case, if the analysis relies on the longitudinal data alone and ignores the relation between the two kinds of data, the estimated effect of the independent variable becomes biased when the missingness in the longitudinal data is non-ignorable because it is correlated with the survival data. Joint models of longitudinal data and survival data have been studied as a solution to this problem, obtaining unbiased results by using the survival model to account for the cause of missingness. In this paper, a joint model of longitudinal zero-inflated count data and survival data is studied by replacing the longitudinal part with zero-inflated count data. A hurdle model and a proportional hazards model were used for the longitudinal zero-inflated count data and the survival data, respectively; the two sub-models were linked under the assumption that their random effects follow a multivariate normal distribution. We used the EM algorithm to obtain the maximum likelihood estimators of the parameters, and the estimated standard errors of the parameters were calculated using the profile likelihood method. In simulation, we observed better performance of the joint model in terms of bias and coverage probability compared to the separate model.
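The hurdle model mentioned in this abstract splits a count into two parts: whether it is zero, and, if it is positive, how large it is. The sketch below uses a Poisson for the positive part; this choice, the function name, and the parameter values are assumptions for illustration, since the abstract does not say which count distribution or covariates the paper uses.

```python
import math

def hurdle_poisson_pmf(y, zero_prob, lam):
    """Hurdle model: P(Y = 0) = zero_prob; positives follow a zero-truncated Poisson."""
    if y == 0:
        return zero_prob
    truncated = (math.exp(-lam) * lam ** y / math.factorial(y)
                 / (1.0 - math.exp(-lam)))
    return (1.0 - zero_prob) * truncated

# Hypothetical parameters: 40% structural zeros, rate 3 for positive counts.
hurdle_probs = [hurdle_poisson_pmf(y, zero_prob=0.4, lam=3.0) for y in range(80)]
```

Unlike a zero-inflated mixture, every zero in a hurdle model comes from the first part, so the zero probability and the positive counts can be modeled with separate sets of parameters.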