• Title/Summary/Keyword: Count model (Count 모형)

Hurdle Model for Longitudinal Zero-Inflated Count Data Analysis (영과잉 경시적 가산자료 분석을 위한 허들모형)

  • Jin, Iktae;Lee, Keunbaik
    • The Korean Journal of Applied Statistics, v.27 no.6, pp.923-932, 2014
  • The hurdle model can be used to analyze zero-inflated count data. It combines a logit model for the binary component (zero versus positive) with a truncated Poisson model for the truncated count component. We propose a new hurdle model with a general heterogeneous random effects covariance matrix to analyze longitudinal zero-inflated count data using the modified Cholesky decomposition. This decomposition factors the random effects covariance matrix into generalized autoregressive parameters and innovation variances. The parameters are modeled using (generalized) linear models and estimated with a Bayesian method. We use these methods to analyze a real dataset.
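The two-part structure described in this abstract can be illustrated with a minimal sketch of a hurdle probability mass function. It covers only a single observation without random effects; the function names and the linear-predictor values (0.4 and 0.9) are hypothetical and are not taken from the paper.

```python
import math

def hurdle_poisson_pmf(y: int, p_zero: float, lam: float) -> float:
    """P(Y = y) under a hurdle model: a binary part decides zero vs. positive,
    and a zero-truncated Poisson(lam) generates the positive counts."""
    if y < 0:
        return 0.0
    if y == 0:
        return p_zero
    # Poisson pmf at y, computed on the log scale for numerical stability
    log_pois = -lam + y * math.log(lam) - math.lgamma(y + 1)
    # Renormalise by P(Poisson > 0) = 1 - exp(-lam)
    return (1.0 - p_zero) * math.exp(log_pois) / (1.0 - math.exp(-lam))

def logistic(eta: float) -> float:
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical linear predictors for one observation (illustration only)
p_zero = logistic(0.4)   # probability of a zero from the logit part
lam = math.exp(0.9)      # rate of the truncated Poisson part

probs = [hurdle_poisson_pmf(y, p_zero, lam) for y in range(60)]
total = sum(probs)       # should be numerically close to 1
```

Because the zero probability is modeled separately, a hurdle model can represent both more and fewer zeros than a plain Poisson model would imply.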

Integer-Valued GARCH Models for Count Time Series: Case Study (계수 시계열을 위한 정수값 GARCH 모델링: 사례분석)

  • Yoon, J.E.;Hwang, S.Y.
    • The Korean Journal of Applied Statistics, v.28 no.1, pp.115-122, 2015
  • This article is concerned with count time series taking values in the non-negative integers. Along with the conditional mean of a count time series, its conditional variance (volatility) has recently received attention, and various integer-valued GARCH (generalized autoregressive conditional heteroscedasticity) models have therefore been suggested in the last decade. We introduce diverse integer-valued GARCH (INGARCH, for short) processes for count time series and illustrate a real data application as a case study. In addition, zero-inflated INGARCH models are discussed to accommodate zero-inflated count time series.
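As an illustration of the kind of process this abstract refers to, the following sketch simulates a Poisson INGARCH(1,1) series, in which the conditional mean follows lambda_t = omega + alpha * y_{t-1} + beta * lambda_{t-1}. The parameter values and function names are hypothetical, and the abstract does not say which INGARCH variants the paper fits.

```python
import math
import random

def poisson_draw(rng: random.Random, lam: float) -> int:
    """Draw from Poisson(lam) with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_ingarch(n: int, omega: float, alpha: float, beta: float, seed: int = 0):
    """Simulate y_t | past ~ Poisson(lambda_t) with
    lambda_t = omega + alpha * y_{t-1} + beta * lambda_{t-1}."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be < 1 for a stationary mean")
    rng = random.Random(seed)
    lam = omega / (1.0 - alpha - beta)  # start at the stationary mean
    counts, intensities = [], []
    for _ in range(n):
        y = poisson_draw(rng, lam)
        counts.append(y)
        intensities.append(lam)
        lam = omega + alpha * y + beta * lam  # update the conditional mean
    return counts, intensities

# Hypothetical parameter values, chosen only for illustration
counts, intensities = simulate_ingarch(n=500, omega=1.0, alpha=0.3, beta=0.4)
```

In this Poisson version the conditional variance equals the conditional mean, so the recursion drives the volatility as well as the level of the series.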

A Zero-Inflated Model for Insurance Data (제로팽창 모형을 이용한 보험데이터 분석)

  • Choi, Jong-Hoo;Ko, In-Mi;Cheon, Soo-Young
    • The Korean Journal of Applied Statistics, v.24 no.3, pp.485-494, 2011
  • Observations that can take only non-negative integer values, such as the numbers of car accidents, earthquakes, or insurance coverage, are called count data. In general, the Poisson regression model has been used to model such data; however, this model has the weakness that it restricts the mean and the variance to be equal. In practice, count data are often too dispersed for the Poisson model because the variance of the data is significantly larger than the mean due to heterogeneity within groups. When overdispersion is not taken into account, the resulting parameter estimates or standard errors are expected to be inefficient. Since coverage is the main issue for insurance, some accidents may not be covered by insurance, and the number covered by insurance may be zero. This paper considers the zero-inflated model for count data with many zeros. The performance of this model is investigated using real data with overdispersion and many zeros. The results indicate that the zero-inflated negative binomial regression model performs best in model evaluation.
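To make the zero-inflated negative binomial idea concrete, here is a minimal sketch of its probability mass function in the mean/size parameterisation, with a numerical check that the variance exceeds the mean. The parameter values are hypothetical, and no regression structure or insurance data from the paper is reproduced.

```python
import math

def nb_pmf(y: int, mu: float, size: float) -> float:
    """Negative binomial pmf with mean mu and dispersion parameter size."""
    log_p = (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
             + size * math.log(size / (size + mu))
             + y * math.log(mu / (size + mu)))
    return math.exp(log_p)

def zinb_pmf(y: int, pi_zero: float, mu: float, size: float) -> float:
    """Zero-inflated NB: with probability pi_zero emit a structural zero,
    otherwise draw from the negative binomial."""
    base = (1.0 - pi_zero) * nb_pmf(y, mu, size)
    return pi_zero + base if y == 0 else base

# Hypothetical parameter values, for illustration only
pi_zero, mu, size = 0.3, 2.0, 1.5
probs = [zinb_pmf(y, pi_zero, mu, size) for y in range(400)]

total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
# var > mean here, unlike a Poisson model, where the two are equal
```

Both the extra zeros and the negative binomial dispersion parameter push the variance above the mean, which is the situation the abstract describes.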

Count Data Model for The Estimation of Bus Ridership (Focusing on Commuters and Students in Seoul) (가산자료모형(Count Data Model)을 이용한 버스이용횟수추정에 관한 연구 (서울시 통근.통학자를 대상으로))

  • 문진수;김순관;임강원
    • Journal of Korean Society of Transportation, v.17 no.5, pp.123-135, 1999
  • The rapid increase in passenger cars, caused by the discomfort of public transit and the preference for automobiles, is the major factor behind increasing traffic congestion in Seoul. On the premise that shifting motorists to public transit may be the most important policy for easing this congestion, this study focuses on the behavioral aspects of company employees and university students and investigates factors influencing bus ridership. In brief, by estimating bus ridership through count models, this study identifies the factors that influence bus ridership and draws policy suggestions for leading motorists to public transit. The purpose of this study is the application of an appropriate count data model. Count data models have been widely applied in economics since the mid-1980s and in transportation, mainly in foreign countries, since the latter half of the 1980s. Although a few studies in this country applied count data models to count data, all of them used Poisson regression models without suitable tests of the model specification. Statistical testing showed that the negative binomial regression model, which is suitable for overdispersed data, was appropriate for the weekly bus ridership data. To emphasize the importance of model specification, both the Poisson regression model and the negative binomial regression model were estimated and the results were compared.
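A quick, informal way to see why a Poisson specification may be questionable is to compare the sample variance with the sample mean. The sketch below computes this dispersion index on made-up weekly ride counts; it is not the statistical test used in the paper, and it ignores covariates.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean; values well above 1
    suggest overdispersion relative to a Poisson model."""
    return variance(counts) / mean(counts)

# Hypothetical weekly bus ride counts (not data from the study)
weekly_rides = [0, 0, 1, 2, 2, 3, 5, 8, 10, 14]
index = dispersion_index(weekly_rides)
```

A formal choice between the Poisson and negative binomial regressions would normally rest on a test of the dispersion parameter after fitting both models.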

Estimating the Economic Value of Recreation Sea Fishing in the Yellow Sea: An Application of Count Data Model (가산자료모형을 이용한 서해 태안군 유어객의 편익추정)

  • Choi, Jong Du
    • Environmental and Resource Economics Review, v.23 no.2, pp.331-347, 2014
  • The purpose of this study is to estimate the economic value of recreational sea fishing in the Yellow Sea using count data models. To estimate consumer surplus, we used several count data models of travel cost recreation demand: a Poisson model (PM), a negative binomial model (NBM), a truncated Poisson model (TPM), and a truncated negative binomial model (TNBM). Model results show that no over-dispersion problem exists and that the NBM was statistically more suitable than the other models. All estimated parameters are statistically significant and theoretically valid. The NBM was applied to estimate travel demand and consumer surplus. The consumer surplus per trip was estimated to be 254,453 won, and the total consumer surplus per person per year to be 1,536,896 won.
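In count-data travel cost models with a log link, where expected trips equal exp(b0 + b_tc * travel_cost + ...), consumer surplus per trip is commonly computed as -1 / b_tc. The sketch below applies that standard formula; the coefficient and the number of trips are hypothetical and are not the paper's estimates.

```python
def consumer_surplus_per_trip(beta_travel_cost: float) -> float:
    """Consumer surplus per trip for a log-link count demand model: -1 / beta."""
    if beta_travel_cost >= 0:
        raise ValueError("the travel cost coefficient must be negative")
    return -1.0 / beta_travel_cost

# Hypothetical values, for illustration only
beta_tc = -0.000004      # travel cost coefficient, per won
trips_per_year = 6.0     # average trips per person per year

cs_trip = consumer_surplus_per_trip(beta_tc)   # won per trip
cs_year = cs_trip * trips_per_year             # won per person per year
```

The annual figure is simply the per-trip surplus scaled by the number of trips, which is why a small change in the estimated cost coefficient moves both values proportionally.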

Automatic order selection procedure for count time series models (계수형 시계열 모형을 위한 자동화 차수 선택 알고리즘)

  • Ji, Yunmi;Seong, Byeongchan
    • The Korean Journal of Applied Statistics, v.33 no.2, pp.147-160, 2020
  • In this paper, we study an algorithm that automatically determines the orders of past observations and conditional mean values, which play an important role in count time series models. Based on the orders of the ARIMA model, the algorithm constructs a group of candidate orders for time series generalized linear models and selects the final model by an information criterion among the combinations of candidate orders. To evaluate the proposed algorithm, we perform small simulations and an empirical analysis for several underlying models and time series, and compare forecasting performance with the ARIMA model. The results of the comparison confirm that the time series generalized linear model offers better performance than the ARIMA model for count time series analysis. In addition, the empirical analysis shows better performance in mid- and long-term forecasting than the ARIMA model.

A joint modeling of longitudinal zero-inflated count data and time to event data (경시적 영과잉 가산자료와 생존자료의 결합모형)

  • Kim, Donguk;Chun, Jihun
    • The Korean Journal of Applied Statistics, v.29 no.7, pp.1459-1473, 2016
  • In longitudinal studies, where subjects are observed over time, longitudinal data and survival data are collected simultaneously. In this case, if the missingness in the longitudinal data is non-ignorable because it is correlated with the survival data, the estimated effect of an independent variable becomes biased when a longitudinal analysis is used alone without accounting for the relation between the two types of data. Joint models of longitudinal and survival data have been studied as a solution to this problem; they yield unbiased results by including a survival model for the cause of missingness. In this paper, a joint model of longitudinal zero-inflated count data and survival data is studied by replacing the longitudinal part with zero-inflated count data. A hurdle model and a proportional hazards model were used for the longitudinal zero-inflated count data and the survival data, respectively; the two sub-models were linked by assuming that their random effects follow a multivariate normal distribution. We used the EM algorithm to obtain the maximum likelihood estimators of the parameters, and their standard errors were calculated using the profile likelihood method. In simulations, the joint model performed better than the separate models in terms of bias and coverage probability.

A Bayesian zero-inflated Poisson regression model with random effects with application to smoking behavior (랜덤효과를 포함한 영과잉 포아송 회귀모형에 대한 베이지안 추론: 흡연 자료에의 적용)

  • Kim, Yeon Kyoung;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics, v.31 no.2, pp.287-301, 2018
  • It is common to encounter count data with excess zeros in various research fields such as the social sciences, natural sciences, medical science, and engineering. Such count data have been explained mainly by the zero-inflated Poisson model and its extensions. Zero-inflated count data are also often correlated or clustered, in which case random effects should be taken into account in the model. Frequentist approaches have commonly been used to fit such data. However, a Bayesian approach has the advantages of incorporating prior information, avoiding asymptotic approximations, and allowing practical estimation of functions of the parameters. We consider a Bayesian zero-inflated Poisson regression model with random effects for correlated zero-inflated count data. We conducted simulation studies to check the performance of the proposed model. We also applied the proposed model to smoking behavior data from the Regional Health Survey (2015) of the Korea Centers for Disease Control and Prevention.

The Economic and Social Implication of Count Regression Models for Married Women's Completed Fertility in Korea (우리나라 가구의 자녀수 결정요인에 관한 Count 모형 분석 및 경제적 함의)

  • Kim, Hyun-Sook
    • Korea journal of population studies, v.30 no.3, pp.107-135, 2007
  • This paper uses a static Gamma count model, a traditional hurdle model, and an endogenous switching Poisson model to examine the determinants of married women's completed fertility in Korea. It analyzes the impact of household income, women's wages and education, and women's labor market participation on the number of children of married women above age 40 and on the expected number of children of women below age 40. The paper shows that household income significantly increases the number of children, at least for women above age 40; however, this income effect is disappearing for the younger generation. The empirical model suggests that women with a job tend to have fewer children in the group aged 39 and below, and it also finds an endogeneity problem between childbirth and labor force participation. The education level of married women has a positive effect on the decision to give birth but a negative effect on the number of children. Based on the empirical results, the paper concludes that Becker's quantity-quality theory also holds for Korea.

Analysis of Failure Count Data Based on NHPP Models (NHPP모형에 기초한 고장 수 자료의 분석)

  • Kim, Seong-Hui;Jeong, Hyang-Suk;Kim, Yeong-Sun;Park, Jung-Yang
    • The Transactions of the Korea Information Processing Society, v.4 no.2, pp.395-400, 1997
  • An important quality characteristic of software is reliability. Software reliability growth models provide the tools to evaluate and monitor the reliability growth behavior of the software during the testing phase. Therefore, failure data collected during the testing phase should be continuously analyzed on the basis of some selected software reliability growth models. For cases where nonhomogeneous Poisson process models are the candidate models, we suggest a Poisson regression model, which expresses the relationship between the expected and actual failure counts in disjoint time intervals, for analyzing the failure count data. The weighted least squares method is then used to estimate the parameters in the model. The resulting estimators are equivalent to the maximum likelihood estimators. The method is illustrated by analyzing the failure count data gathered from a large-scale switching system.
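The expected failure counts in disjoint intervals mentioned in this abstract follow from the mean value function m(t) of a nonhomogeneous Poisson process: the expected count in (t0, t1] is m(t1) - m(t0). The sketch below uses the Goel-Okumoto form m(t) = a(1 - exp(-bt)) purely as an assumed example; the abstract does not name the candidate models, and the parameter values are hypothetical.

```python
import math

def goel_okumoto_mean(t: float, a: float, b: float) -> float:
    """Expected cumulative failures by time t: m(t) = a * (1 - exp(-b * t))."""
    return a * (1.0 - math.exp(-b * t))

def expected_interval_counts(boundaries, a: float, b: float):
    """Expected failures in each disjoint interval (t_{i-1}, t_i]."""
    return [goel_okumoto_mean(hi, a, b) - goel_okumoto_mean(lo, a, b)
            for lo, hi in zip(boundaries, boundaries[1:])]

# Hypothetical testing schedule (weeks) and parameters, for illustration only
boundaries = [0, 1, 2, 3, 4, 5]
expected = expected_interval_counts(boundaries, a=100.0, b=0.3)
```

Comparing these expected counts with the failures actually observed in each interval gives the regression relationship that the abstract describes.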