• Title/Abstract/Keywords: Time-series count data

Model Checking for Time-Series Count Data

  • Lee, Sung-Im
    • Communications for Statistical Applications and Methods / v.12 no.2 / pp.359-364 / 2005
  • This paper considers a specification test for the conditional Poisson regression model for time series count data. Although conditional models for count data have received attention and have been proposed in several forms, few studies have focused on checking their adequacy. Motivated by tests of the martingale difference assumption, a specification test based on the Ljung-Box statistic is proposed for the conditional model of time series count data. Simulation results are provided to illustrate the performance of the Ljung-Box test.
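
As a rough illustration of the statistic named in this abstract (not the paper's exact procedure), the sketch below computes the Ljung-Box statistic $Q = n(n+2)\sum_{k=1}^{m} r_k^2/(n-k)$ for a residual series. The function name `ljung_box` is hypothetical, and feeding it Pearson residuals $(y_t - \hat\lambda_t)/\sqrt{\hat\lambda_t}$ from a fitted conditional Poisson model is an assumption made here for illustration.

```python
def ljung_box(resid, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n-k)."""
    n = len(resid)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must be between 1 and n-1")
    mean = sum(resid) / n
    centered = [x - mean for x in resid]
    denom = sum(c * c for c in centered)  # n times the lag-0 autocovariance
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        r_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q
```

For a series with no serial correlation, Q is usually compared with a chi-square distribution with `max_lag` degrees of freedom; any adjustment for estimated parameters is left out of this sketch.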

Threshold-asymmetric volatility models for integer-valued time series

  • Kim, Deok Ryun;Yoon, Jae Eun;Hwang, Sun Young
    • Communications for Statistical Applications and Methods / v.26 no.3 / pp.295-304 / 2019
  • This article deals with threshold-asymmetric volatility models for over-dispersed and zero-inflated time series of count data. We introduce various threshold integer-valued autoregressive conditional heteroscedasticity (ARCH) models that incorporate over-dispersion and zero-inflation via conditional Poisson and negative binomial distributions. The EM algorithm is used to estimate the parameters. Cholera data from Kolkata, India, from 2006 to 2011 are analyzed as a real application. To construct the threshold variable, both a time-varying local constant mean and the grand mean are adopted. The data application shows that the threshold model, as an asymmetric version, is useful in modelling count time series volatility.
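
The abstract does not state the model equations. One way a first-order threshold model of this kind could be written is sketched below, as an assumption for illustration rather than the paper's specification. The symbols $\omega$, $\alpha_1$, $\alpha_2$ and the threshold $c$ (for example, the grand mean) are hypothetical names.

```latex
\begin{align*}
X_t \mid \mathcal{F}_{t-1} &\sim \mathrm{Poisson}(\lambda_t), \\
\lambda_t &= \omega
  + \alpha_1 X_{t-1}\,\mathbf{1}\{X_{t-1} > c\}
  + \alpha_2 X_{t-1}\,\mathbf{1}\{X_{t-1} \le c\},
  \qquad \omega > 0,\ \alpha_1, \alpha_2 \ge 0 .
\end{align*}
```

In the Poisson case the conditional mean and conditional variance both equal $\lambda_t$, so the response of volatility to the previous count is asymmetric whenever $\alpha_1 \neq \alpha_2$.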

Integer-Valued GARCH Models for Count Time Series: Case Study

  • Yoon, J.E.;Hwang, S.Y.
    • The Korean Journal of Applied Statistics / v.28 no.1 / pp.115-122 / 2015
  • This article is concerned with count time series taking values in the non-negative integers. Along with the first-order mean of the count time series, the conditional variance (volatility) has recently received attention, and various integer-valued GARCH (generalized autoregressive conditional heteroscedasticity) models have therefore been suggested in the last decade. We introduce diverse integer-valued GARCH (INGARCH, for short) processes for count time series and illustrate a real data application as a case study. In addition, zero-inflated INGARCH models are discussed to accommodate zero-inflated count time series.
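
To make the INGARCH idea concrete, here is a minimal simulation sketch of the standard Poisson INGARCH(1,1) recursion, $X_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_t)$ with $\lambda_t = \omega + \alpha X_{t-1} + \beta \lambda_{t-1}$. The function names and parameter values are hypothetical, and nothing here is taken from the article's case study.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def simulate_ingarch(omega, alpha, beta, n, seed=0):
    """Simulate counts xs and intensities lams from a Poisson INGARCH(1,1)."""
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    rng = random.Random(seed)
    lam = omega / (1.0 - alpha - beta)  # start at the stationary mean
    xs, lams = [], []
    for _ in range(n):
        x = poisson_draw(lam, rng)
        xs.append(x)
        lams.append(lam)
        lam = omega + alpha * x + beta * lam  # intensity for the next step
    return xs, lams
```

Knuth's method is adequate for the small intensities typical of such examples; for large intensities a different sampler would be preferable.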

An Analysis of Panel Count Data from Multiple random processes

  • Park, You-Sung;Kim, Hee-Young
    • Proceedings of the Korean Statistical Society Conference / 2002.11a / pp.265-272 / 2002
  • An integer-valued autoregressive integrated (INARI) model is introduced to eliminate stochastic trend and seasonality from time series of count data. The INARI model extends the previous integer-valued ARMA model. We show that it is stationary and ergodic, which establishes asymptotic normality of the conditional least squares estimator. Optimal estimating equations are used to reflect the categorical and serial correlations arising from panel count data, as well as the variations arising from the three random processes that generate the observations. Under regularity conditions for martingale sequences, we show asymptotic normality of the estimators obtained from the estimating equations. Using cancer mortality data provided by the U.S. National Center for Health Statistics (NCHS), we apply our results to estimate the cell probabilities for a classification by 4 causes of death and 6 age groups and to forecast the death count of each cell. We also investigate the impact of the three random processes on estimation.

Zero-Inflated INGARCH Using Conditional Poisson and Negative Binomial: Data Application

  • Yoon, J.E.;Hwang, S.Y.
    • The Korean Journal of Applied Statistics / v.28 no.3 / pp.583-592 / 2015
  • Zero-inflation has recently attracted much attention in integer-valued time series. This article deals with conditional variance (volatility) modeling for zero-inflated count time series. We incorporate the zero-inflation property into integer-valued GARCH (INGARCH) models via conditional Poisson and negative binomial marginals. A cholera frequency time series is analyzed as a data application. Estimation is carried out using the EM algorithm, as suggested by Zhu (2012).
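
For readers unfamiliar with zero-inflation, the sketch below shows the zero-inflated Poisson distribution on its own: with probability `p_zero` the count is a structural zero, otherwise it is Poisson with mean `lam`. The function names are hypothetical, and the sketch covers only this building block, not the INGARCH dynamics or the EM estimation mentioned in the abstract.

```python
import math

def zip_pmf(k, lam, p_zero):
    """P(X = k) for a zero-inflated Poisson with inflation probability p_zero."""
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    # Extra mass at zero, plus the (1 - p_zero)-weighted Poisson part.
    return p_zero * (k == 0) + (1.0 - p_zero) * poisson

def zip_mean_var(lam, p_zero):
    """Mean (1-p)*lam and variance (1-p)*lam*(1 + p*lam)."""
    mean = (1.0 - p_zero) * lam
    var = mean * (1.0 + p_zero * lam)
    return mean, var
```

The variance exceeds the mean whenever `p_zero > 0`, which is why zero-inflation and over-dispersion tend to be discussed together.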

Statistical Analysis of Count Rate Data for On-line Seawater Radioactivity Monitoring

  • Lee, Dong-Myung;Cong, Binh Do;Lee, Jun-Ho;Yeo, In-Young;Kim, Cheol-Su
    • Journal of Radiation Protection and Research / v.44 no.2 / pp.64-71 / 2019
  • Background: It is very difficult to distinguish between a radioactive contamination source and background radiation from natural radionuclides in the marine environment by means of an on-line monitoring system. The objective of this study was to investigate a statistical process for flagging abnormal levels in count rate data measured by our on-line seawater radioactivity monitoring system. Materials and Methods: Count rate time series were collected from 9 monitoring posts. All count rate data were measured every 15 minutes from the region of interest (ROI) for $^{137}\mathrm{Cs}$ ($E_{\gamma} = 661.6$ keV) on the gamma-ray energy spectrum. The Shewhart ($3\sigma$), CUSUM, and Bayesian S-R control chart methods were evaluated, and the methods were compared in terms of the false positive incidence rate. All statistical algorithms were developed by the authors in R. Results and Discussion: The $3\sigma$, CUSUM, and S-R analyses resulted in average false positive incidence rates of $0.164 \pm 0.047\%$, $0.064 \pm 0.0367\%$, and $0.030 \pm 0.018\%$, respectively. The S-R method has a lower rate than the $3\sigma$ and CUSUM methods because the Bayesian S-R method uses the information to evaluate a posterior distribution, even though the CUSUM control chart accumulates information from recent data points. When net count rate and gross count rate, measured as time series over a full year at one monitoring post, were compared using $3\sigma$ control charts, the two methods resulted in false positive incidence rates of 0.142% and 0.219%, respectively. Conclusion: Bayesian S-R and CUSUM control charts are better suited than the $3\sigma$ control chart for on-line seawater radioactivity monitoring with count rate time series. However, a continuously increasing trend is required to differentiate between a false positive and actual radioactive contamination. For the determination of count rate, the net count method is better than the gross count method because of the relatively small variation in the data points.
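
The two classical charts named in this abstract can be sketched as follows. This is a generic textbook version written in Python for illustration; the authors' R implementation is not shown in the abstract, the Bayesian S-R chart is omitted, and the function names, `slack`, and `threshold` are hypothetical.

```python
def shewhart_alarms(xs, mean, sigma, k=3.0):
    """Indices where an observation lies more than k*sigma from the mean."""
    return [i for i, x in enumerate(xs) if abs(x - mean) > k * sigma]

def cusum_upper_alarms(xs, mean, slack, threshold):
    """One-sided upper CUSUM: S_t = max(0, S_{t-1} + x_t - mean - slack).

    An alarm is raised when S_t exceeds threshold; the sum then restarts at 0.
    """
    s, alarms = 0.0, []
    for i, x in enumerate(xs):
        s = max(0.0, s + (x - mean - slack))
        if s > threshold:
            alarms.append(i)
            s = 0.0
    return alarms
```

With an in-control mean of 10 and sigma of 1, the series `[10, 11, 9, 12, 12, 12, 12, 10]` never crosses the 3-sigma limit, while the CUSUM with `slack=0.5` and `threshold=4.0` raises an alarm at index 5, because it accumulates the small sustained upward shift.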

Effects of Overdispersion on Testing for Serial Dependence in the Time Series of Counts Data

  • Kim, Hee-Young;Park, You-Sung
    • Communications for Statistical Applications and Methods / v.17 no.6 / pp.829-843 / 2010
  • To test for serial dependence in time series of count data, Jung and Tremayne (2003) evaluated the size and power of several tests under the class of INARMA models based on binomial thinning operations with Poisson marginal distributions. Overdispersion (i.e., a variance greater than the expectation) is common in real-world data. Overdispersed count data can be modeled by using alternative thinning operations such as random coefficient thinning, iterated thinning, and quasi-binomial thinning. Such thinning operations can lead to count time series models with negative binomial or generalized Poisson marginal distributions. This paper examines whether the test statistics used by Jung and Tremayne (2003) for serial dependence in time series of count data are affected by overdispersion.
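
Binomial thinning, the baseline operation mentioned in this abstract, can be sketched as follows: $\alpha \circ X$ is the number of successes among $X$ independent Bernoulli($\alpha$) trials, and an INAR(1) process sets $X_t = \alpha \circ X_{t-1} + \varepsilon_t$. The example uses Poisson innovations, for which the marginal distribution is Poisson and the variance-to-mean ratio is 1. All function names are hypothetical, and the alternative thinning operations from the paper are not implemented.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def binomial_thin(x, alpha, rng):
    """alpha o x: each of the x units survives independently with prob alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def simulate_inar1(alpha, lam, n, seed=0):
    """Simulate X_t = alpha o X_{t-1} + eps_t with eps_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    x = poisson_draw(lam / (1.0 - alpha), rng)  # draw from the stationary law
    xs = []
    for _ in range(n):
        x = binomial_thin(x, alpha, rng) + poisson_draw(lam, rng)
        xs.append(x)
    return xs

def dispersion_index(xs):
    """Sample variance divided by sample mean; values above 1 suggest overdispersion."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
    return var / mean
```

Comparing `dispersion_index` on simulated and observed series gives a quick, informal check of whether a Poisson marginal is plausible.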

A generalized regime-switching integer-valued GARCH(1, 1) model and its volatility forecasting

  • Lee, Jiyoung;Hwang, Eunju
    • Communications for Statistical Applications and Methods / v.25 no.1 / pp.29-42 / 2018
  • We combine the integer-valued GARCH(1, 1) model with a generalized regime-switching model to propose a dynamic count time series model. Our model, called the generalized regime-switching integer-valued GARCH(1, 1) (GRS-INGARCH(1, 1)) model, adopts Markov chains with time-varying dependent transition probabilities to model dynamic count time series. We derive a recursive formula for the conditional probability of the regime in the Markov chain given the past information, in terms of the transition probabilities of the Markov chain and the Poisson parameters of the INGARCH(1, 1) process. We also study the forecasting of the Poisson parameter as well as the cumulative impulse response function of the model, which is a measure of the persistence of volatility. A Monte Carlo simulation is conducted to examine the performance of volatility forecasting, the behavior of the cumulative impulse response coefficients, and conditional maximum likelihood estimation; a real data application is also given.
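
The abstract does not reproduce the recursion. A generic filtering recursion of the kind it describes (the standard Hamilton-type filter, written here as an assumption and not as the paper's formula) updates the regime probabilities in two steps. The notation $\xi$, $p_{ij,t}$ and $\lambda_t^{(j)}$ is hypothetical.

```latex
\begin{align*}
\xi_{t \mid t-1}(j) &= \sum_{i} p_{ij,t}\, \xi_{t-1 \mid t-1}(i), \\
\xi_{t \mid t}(j) &= \frac{\xi_{t \mid t-1}(j)\, f\!\left(x_t ; \lambda_t^{(j)}\right)}
                         {\sum_{k} \xi_{t \mid t-1}(k)\, f\!\left(x_t ; \lambda_t^{(k)}\right)},
\qquad f(x ; \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!} .
\end{align*}
```

Here $\xi_{t \mid s}(j)$ denotes the probability of regime $j$ at time $t$ given the information up to time $s$, $p_{ij,t}$ is the (possibly time-varying) probability of moving from regime $i$ to regime $j$, and $\lambda_t^{(j)}$ is the Poisson parameter in regime $j$.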

Modeling and Analysis of Wireless LAN Traffic

  • Yamkhin, Dashdorj;Lee, Seong-Jin;Won, You-Jip
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.8B / pp.667-680 / 2008
  • In this work, we present the results of our empirical study of 802.11 wireless LAN network traffic. We collect the packet trace from an existing campus wireless LAN infrastructure. We analyze four different data sets: aggregate traffic, upstream traffic, downstream traffic, and a TCP-only packet trace taken from the aggregate traffic. We analyze the time series aspects of the underlying traffic (byte count process and packet count process), the marginal distribution of the time series, and the packet size distribution. We find that long-range dependence exists in the byte count and packet count processes of all four data sets. The inter-arrival distribution is well fitted by a Pareto distribution. Upstream traffic, i.e., from the user to the Internet, differs significantly from the rest in its packet size distribution. The average packet size of upstream traffic is 151.7 bytes, while the average packet sizes of the other data sets are all greater than 260 bytes. Packets with full data payloads constitute 3% and 10% of the upstream traffic and the downstream traffic, respectively. Despite the significant difference in packet size distribution, all four data sets have similar Hurst values. The Hurst parameter alone does not properly explain the stochastic characteristics of the underlying traffic. We model the underlying traffic using fractional ARIMA (FARIMA) and fractional Gaussian noise (FGN). While the fractional Gaussian noise based method is computationally more efficient, FARIMA exhibits superior performance in accurately modeling the underlying traffic.
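
The abstract does not say how the Hurst values were estimated. One common estimator, sketched here as an assumption, is the aggregated variance method: for a long-range dependent series the variance of block means of size $m$ scales like $m^{2H-2}$, so $H$ can be read off the slope of a log-log regression. The function names and the choice of block sizes are hypothetical.

```python
import math

def block_means(xs, m):
    """Means of consecutive non-overlapping blocks of length m (partial tail dropped)."""
    return [sum(xs[i:i + m]) / m for i in range(0, len(xs) - m + 1, m)]

def ols_slope(us, vs):
    """Least-squares slope of vs regressed on us."""
    n = len(us)
    mu_u, mu_v = sum(us) / n, sum(vs) / n
    sxy = sum((u - mu_u) * (v - mu_v) for u, v in zip(us, vs))
    sxx = sum((u - mu_u) ** 2 for u in us)
    return sxy / sxx

def hurst_aggregated_variance(xs, block_sizes):
    """Estimate H from the slope (2H - 2) of log Var(block mean) on log m."""
    log_m, log_var = [], []
    for m in block_sizes:
        means = block_means(xs, m)
        mu = sum(means) / len(means)
        var = sum((b - mu) ** 2 for b in means) / (len(means) - 1)
        log_m.append(math.log(m))
        log_var.append(math.log(var))
    return 1.0 + ols_slope(log_m, log_var) / 2.0
```

For independent noise the estimate should be near 0.5, while byte or packet count series with long-range dependence would be expected to give values closer to 1. Each block size needs at least two blocks with non-identical means for the logarithm to be defined.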

Forecasting evaluation via parametric bootstrap for threshold-INARCH models

  • Kim, Deok Ryun;Hwang, Sun Young
    • Communications for Statistical Applications and Methods / v.27 no.2 / pp.177-187 / 2020
  • This article is concerned with forecasting and evaluating threshold-asymmetric volatility models for time series of count data. In particular, threshold integer-valued models with conditional Poisson and conditional negative binomial distributions are highlighted. Based on the parametric bootstrap method, some evaluation measures are discussed in terms of one-step-ahead forecasting. A parametric bootstrap procedure is explained, from which a directional measure, a magnitude measure, and the expected cost of misclassification are derived to evaluate competing models. Cholera data from Bangladesh from 1988 to 2016 are analyzed as a real application.
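
The basic simulation step behind such a procedure can be sketched as below: given assumed (already fitted) parameters of a first-order threshold Poisson model, draw many one-step-ahead counts and summarize them. The model form, the parameter names, and the `prob_up` summary are hypothetical illustrations, not the paper's measures, and the re-estimation of parameters on each bootstrap series that a full parametric bootstrap would include is omitted.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def threshold_intensity(x_prev, omega, alpha_low, alpha_high, threshold):
    """lam = omega + alpha * x_prev, with alpha chosen by the threshold regime."""
    alpha = alpha_high if x_prev > threshold else alpha_low
    return omega + alpha * x_prev

def one_step_draws(x_last, params, n_boot=1000, seed=0):
    """Simulate n_boot one-step-ahead counts from the assumed fitted model."""
    rng = random.Random(seed)
    lam = threshold_intensity(x_last, *params)
    return [poisson_draw(lam, rng) for _ in range(n_boot)]

def prob_up(draws, x_last):
    """Share of simulated next counts that exceed the last observed count."""
    return sum(d > x_last for d in draws) / len(draws)
```

A call such as `prob_up(one_step_draws(5, (1.0, 0.2, 0.6, 3)), 5)` gives a simulated probability that the next count rises above the current one, which is one simple way to score the direction of a forecast.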