Using the corrected Akaike's information criterion for model selection

On the use of the corrected AIC in model selection

  • Song, Eunjung (Department of Statistics, Inha University) ;
  • Won, Sungho (Department of Public Health Science, Seoul National University) ;
  • Lee, Woojoo (Department of Statistics, Inha University)
  • Received : 2016.11.07
  • Accepted : 2017.01.07
  • Published : 2017.02.28

Abstract

Corrected Akaike's information criterion (AICc) is known to have better finite-sample properties than Akaike's information criterion (AIC). However, AIC is still widely used to select an optimal prediction model among several candidate models, owing to a lack of research on the benefits of using AICc. In this paper, we compare the performance of AIC and AICc through numerical simulations and confirm the advantage of using AICc. We also consider the performance of the quasi Akaike's information criterion (QAIC) and the corrected quasi Akaike's information criterion (QAICc) for binomial and Poisson data under overdispersion.

The corrected Akaike's information criterion (AICc) is already known to have better theoretical properties than AIC, yet the information criterion most widely used in practical data analysis to select an optimal prediction model is still Akaike's information criterion (AIC). This is because few studies discuss what kinds of benefits are actually gained by using AICc. In this paper, we compare the performance of AIC and AICc through numerical studies and confirm the advantages that AICc brings. We also use simulation to compare the performance of the quasi Akaike's information criterion (QAIC) and the corrected quasi Akaike's information criterion (QAICc), which are used when overdispersion appears in the analysis of Poisson or binomial data.
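The abstract names four criteria without stating their formulas. The following is a minimal sketch using the commonly cited textbook definitions, not code from the paper: AIC = -2 log L + 2k, AICc = AIC + 2k(k+1)/(n-k-1), and the quasi-likelihood versions, which divide -2 log L by an overdispersion estimate. The function names, the argument `c_hat`, and the log-likelihood values in the usage lines are all hypothetical and chosen only for illustration.

```python
def aic(log_lik: float, k: int) -> float:
    """AIC = -2 log L + 2k, with k the number of estimated parameters."""
    return -2.0 * log_lik + 2.0 * k


def small_sample_penalty(k: int, n: int) -> float:
    """Extra AICc term 2k(k+1)/(n-k-1); it vanishes as n grows."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return 2.0 * k * (k + 1) / (n - k - 1)


def aicc(log_lik: float, k: int, n: int) -> float:
    """Corrected AIC for a sample of size n."""
    return aic(log_lik, k) + small_sample_penalty(k, n)


def qaic(log_lik: float, k: int, c_hat: float) -> float:
    """Quasi-AIC: -2 log L is scaled by the overdispersion estimate c_hat.

    Assumption: k is used exactly as passed in. Some references add 1 to k
    to account for estimating c_hat; do that before calling if desired.
    """
    if c_hat <= 0:
        raise ValueError("c_hat must be positive")
    return -2.0 * log_lik / c_hat + 2.0 * k


def qaicc(log_lik: float, k: int, n: int, c_hat: float) -> float:
    """Corrected quasi-AIC: QAIC plus the same small-sample penalty."""
    return qaic(log_lik, k, c_hat) + small_sample_penalty(k, n)


# Hypothetical fits on n = 20 observations (values invented for illustration).
n = 20
small = {"log_lik": -30.0, "k": 3}
large = {"log_lik": -24.0, "k": 8}

aic_prefers_large = aic(**large) < aic(**small)
aicc_prefers_small = aicc(**small, n=n) < aicc(**large, n=n)
```

With these invented numbers, AIC favors the 8-parameter model while AICc favors the 3-parameter one, because the extra penalty grows quickly when k is large relative to n. This is only an arithmetic illustration of how the two criteria can disagree in small samples, not a reproduction of the paper's simulations.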
