Acknowledgement
All sequence data were retrieved from the avian-flu database (http://avian-flu.org). The source code is available on GitHub at https://github.com/jhshin0714/ML_avian-flu/. This research was supported in part by the Bio & Medical Technology Development Program of the National Research Foundation (NRF), funded by the Korean government (MSIT) (NRF-2018M3A9H4055197).