Forecasting hierarchical time series for foodborne disease outbreaks

Hierarchical time series forecasting for the number of food poisoning occurrences

  • In-Kwon Yeo (Department of Statistics, Sookmyung Women's University)
  • 여인권 (숙명여자대학교 통계학과)
  • Received : 2024.03.07
  • Accepted : 2024.04.06
  • Published : 2024.08.31

Abstract

In this paper, we investigate hierarchical time series forecasting, in which predicted values derived separately from disaggregated and aggregated data are adjusted so that they adhere to the hierarchical structure. The numbers of food poisoning occurrences caused by each specific pathogen are analyzed using zero-inflated Poisson regression models and negative binomial regression models. The numbers of major, miscellaneous, and overall food poisoning occurrences are analyzed using Poisson regression models and negative binomial regression models. For hierarchical time series forecasting, the MinT estimation proposed by Wickramasuriya et al. (2019) is employed. Negative predicted values resulting from the hierarchical adjustment are set to zero, and the remaining lowest-level predictions are multiplied by weights so that the hierarchical structure is satisfied. Empirical analysis showed little difference between hierarchically adjusted and unadjusted predictions at the pathogen level. However, hierarchical adjustment generally yielded better results for the major, miscellaneous, and overall occurrences. Without hierarchical adjustment, the predicted frequency of a lowest-level variable can exceed that of the major or miscellaneous category, whereas the proposed method yields predictions that adhere to the hierarchical structure.
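The reconciliation step described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical seven-series hierarchy (overall, major, miscellaneous, and four pathogens `p1` to `p4`), a diagonal error covariance `W` with made-up values, and a single common rescaling factor for the non-negative bottom-level forecasts. The function names and numbers are illustrative, and the paper's actual weights may be defined differently.

```python
import numpy as np

# Hypothetical 7-series hierarchy (names are illustrative, not from the paper):
# overall = major + misc, major = p1 + p2, misc = p3 + p4.
S = np.array([
    [1, 1, 1, 1],  # overall
    [1, 1, 0, 0],  # major
    [0, 0, 1, 1],  # miscellaneous
    [1, 0, 0, 0],  # p1
    [0, 1, 0, 0],  # p2
    [0, 0, 1, 0],  # p3
    [0, 0, 0, 1],  # p4
], dtype=float)


def mint_bottom(y_hat, S, W):
    """Bottom-level MinT forecasts: (S' W^-1 S)^-1 S' W^-1 y_hat."""
    W_inv = np.linalg.inv(W)
    G = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
    return G @ y_hat


def clip_and_rescale(bottom, total):
    """Set negatives to zero, then scale the rest so they sum to `total`.

    Assumption: one common scaling factor; the paper's weights may differ.
    """
    clipped = np.clip(bottom, 0.0, None)
    if total <= 0 or clipped.sum() == 0:
        return np.zeros_like(clipped)
    return clipped * (total / clipped.sum())


def reconcile_nonnegative(y_hat, S, W):
    """Coherent, non-negative forecasts for every level of the hierarchy."""
    bottom = mint_bottom(y_hat, S, W)
    bottom = clip_and_rescale(bottom, bottom.sum())
    return S @ bottom


# Hypothetical base forecasts and error variances, for illustration only.
y_hat = np.array([12.0, 9.0, 2.0, 6.0, 4.0, 0.1, 2.5])
W = np.diag([4.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0])
y_tilde = reconcile_nonnegative(y_hat, S, W)
```

Because every level is rebuilt from the adjusted bottom-level values through `S`, the output is coherent by construction: the overall forecast equals major plus miscellaneous, and each of those equals the sum of its pathogens.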

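This study examines hierarchical time series forecasting, in which the numbers of food poisoning occurrences split by causative agent and the aggregated numbers are analyzed separately to derive predicted values, which are then adjusted to satisfy the hierarchical structure. The numbers of food poisoning occurrences by causative agent are analyzed with zero-inflated Poisson regression models and negative binomial regression models, and the aggregated numbers of occurrences are analyzed with Poisson regression models and negative binomial regression models. For hierarchical time series forecasting, the MinT estimation of Wickramasuriya et al. (2019), one of the optimal combination approaches, was used. Negative predicted values arising in the hierarchical adjustment are set to zero, and the remaining lowest-level variables are multiplied by weights to satisfy the hierarchical structure. The empirical analysis shows almost no difference between adjusted and unadjusted results for predictions by causative agent, whereas for the major, miscellaneous, and overall predictions the hierarchically adjusted results were generally better. Importantly, without hierarchical adjustment the predicted frequency of a lowest-level variable can exceed that of the major or miscellaneous category, but applying the proposed method yields predictions that form a hierarchical structure.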
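For readers unfamiliar with the zero-inflated Poisson model used for the pathogen-level counts, the sketch below shows its probability mass function and mean in the form given by Lambert (1992). The parameter names `lam` (Poisson rate) and `pi` (probability of a structural zero) are illustrative. In a regression setting both would depend on covariates, which this sketch does not attempt to model.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson count.

    With probability `pi` the count is a structural zero;
    otherwise it is drawn from Poisson(`lam`).
    """
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_mean(lam, pi):
    """Expected count under the zero-inflated Poisson model."""
    return (1.0 - pi) * lam
```

Compared with an ordinary Poisson model with the same rate, the extra mass at zero lowers the mean to `(1 - pi) * lam`, which suits series in which many periods have no reported occurrences.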

References

  1. Agarwal DK, Gelfand AE, and Citron-Pousty S (2002). Zero-inflated models with application to spatial count data, Environmental and Ecological Statistics, 9, 341-355.
  2. Choi K, Kim B, Bae W, Jung W, and Cho Y (2008). Developing the index of foodborne disease occurrence, The Korean Journal of Applied Statistics, 21, 649-658.
  3. Dunn DM, Williams WH, and DeChaine TL (1976). Aggregate versus subaggregate models in local area forecasting, Journal of the American Statistical Association, 71, 68-71.
  4. Fleury M, Charron DF, Holt JD, Allen OB, and Maarouf AR (2006). A time series analysis of the relationship of ambient temperature and common bacterial enteric infections in two Canadian provinces, International Journal of Biometeorology, 50, 385-391.
  5. Hyndman RJ, Ahmed RA, Athanasopoulos G, and Shang HL (2011). Optimal combination forecasts for hierarchical time series, Computational Statistics and Data Analysis, 55, 2579-2589.
  6. Jung HS, Kim BJ, Cho S, and Yeo IK (2012). Analysis of food poisoning via zero inflation models, The Korean Journal of Applied Statistics, 25, 859-864.
  7. Jeong MS and Oh SS (2009). Study on the Impact Analysis and Control System of Foodborne Disease Outbreak due to Climate Change, Korea Food & Drug Administration, Cheongju-si.
  8. Kim S, Seong B, Choi YG, and Yeo IK (2022). A study on time series linkage in the household income and expenditure survey, The Korean Journal of Applied Statistics, 35, 553-568.
  9. Lambert D (1992). Zero-inflated Poisson regression models with an application to defects in manufacturing, Technometrics, 34, 1-14.
  10. Orcutt GH, Watts HW, and Edwards JB (1968). Data aggregation and information loss, The American Economic Review, 58, 773-787.
  11. Shlifer E and Wolff RW (1979). Aggregation and proration in forecasting, Management Science, 25, 594-603.
  12. Wickramasuriya SL, Athanasopoulos G, and Hyndman RJ (2019). Optimal forecast reconciliation for hierarchical and grouped time series through trace minimization, Journal of the American Statistical Association, 114, 804-819.
  13. Yeo IK (2012). Models for forecasting food poisoning occurrences, Journal of the Korean Data & Information Science Society, 23, 1117-1125.
  14. Yeo IK (2013). Prediction of the number of food poisoning occurrences by microbes, The Korean Journal of Applied Statistics, 26, 923-932.
  15. Zhang Y, Bi P, and Hiller JE (2008). Climate variations and salmonellosis transmission in Adelaide, South Australia: A comparison between regression models, International Journal of Biometeorology, 52, 179-187.
  16. Zhang Y, Bi P, Hiller JE, Sun Y, and Ryan P (2007). Climate variations and bacillary dysentery in northern and southern cities of China, The Journal of Infection, 55, 194-200.