Microbial Forensic Investigations of Microbial Sources through Single Nucleotide Polymorphism Analysis

  • Seunghyun Lim (Chem-Bio Technology Center, Agency for Defense Development)
  • Hyeongseok Yun (Chem-Bio Technology Center, Agency for Defense Development)
  • Seungho Lee (Chem-Bio Technology Center, Agency for Defense Development)
  • Juhwan Jung (Chem-Bio Technology Center, Agency for Defense Development)
  • Sehun Gu (Chem-Bio Technology Center, Agency for Defense Development)
  • Daesang Lee (Chem-Bio Technology Center, Agency for Defense Development)
  • Donghyun Song (Chem-Bio Technology Center, Agency for Defense Development)
  • Received : 2024.05.17
  • Accepted : 2024.09.10
  • Published : 2024.12.05

Abstract

Bacillus anthracis, a potential biological agent for terrorism, has been actively investigated in the field of microbial forensics for its underlying properties and phylogenetic origin. With advances in next-generation sequencing (NGS) technology, in silico analysis has become feasible at the whole-genome sequence level, reducing time and cost. In this paper, we propose a methodology for identifying unknown samples from the field, which simulate real forensic evidence rather than highly purified samples, using two in silico methods: k-mer analysis and whole-genome single-nucleotide polymorphism (wgSNP) analysis. We performed prefix-based k-mer analysis on 964 NGS raw datasets obtained from the NCBI database, along with the NGS data from the unknown samples, by mapping the reads to the Ames Ancestor reference genome and obtaining consensus sequences. When these were analyzed together with 844 assembled sequences obtained from the NCBI database, the unknown samples were determined to belong to the injectional anthrax group, a group of infections identified among heroin users in Norway in 2000. wgSNP analysis categorized the samples into a discrete low-SNP group-I and a high-SNP group-II, with differences of up to 9 SNPs within each group. We observed 30 SNP positions in group-I, which includes the unknown samples, and confirmed that the SNP profile of A4564 was identical to that of the unknown samples. These results demonstrate that prefix-based k-mer and wgSNP analyses can be effectively used to collect microbial forensic evidence from field samples.
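
The two comparisons described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that "prefix-based k-mer analysis" means collecting the k-mers that directly follow a fixed prefix and comparing the resulting sets by Jaccard distance, and that SNP differences are counted position by position between consensus sequences built against the same reference. The function names, the prefix "ATGAC", and k = 11 are illustrative assumptions, since the abstract does not state the parameters used. Reverse-complement handling is omitted for brevity.

```python
def prefix_kmers(seq: str, prefix: str = "ATGAC", k: int = 11) -> set[str]:
    """Collect the k-mers that immediately follow each occurrence of `prefix`.

    The default prefix and k are illustrative assumptions, not values
    taken from the paper.
    """
    seq = seq.upper()
    found: set[str] = set()
    start = seq.find(prefix)
    while start != -1:
        begin = start + len(prefix)
        kmer = seq[begin:begin + k]
        # Keep only full-length k-mers made of unambiguous bases.
        if len(kmer) == k and set(kmer) <= set("ACGT"):
            found.add(kmer)
        # Step by one so overlapping prefix occurrences are not missed.
        start = seq.find(prefix, start + 1)
    return found


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def snp_distance(cons_a: str, cons_b: str) -> int:
    """Count positions where two reference-anchored consensus sequences differ.

    Both sequences must share the reference coordinate system (same length).
    Positions where either sequence has an ambiguous base or a gap are skipped.
    """
    if len(cons_a) != len(cons_b):
        raise ValueError("consensus sequences must have the same length")
    return sum(
        1
        for x, y in zip(cons_a.upper(), cons_b.upper())
        if x != y and x in "ACGT" and y in "ACGT"
    )


if __name__ == "__main__":
    # Toy sequences only; real inputs would be whole-genome consensus sequences.
    sig_a = prefix_kmers("ACGTTACGTA", prefix="AC", k=3)
    sig_b = prefix_kmers("ACGTTACGTT", prefix="AC", k=3)
    print(jaccard_distance(sig_a, sig_b))
    print(snp_distance("ACGTN-A", "ACCTAGT"))
```

Under these assumptions, a sample whose k-mer set lies close to those of known injectional anthrax genomes would be placed in that group, and a SNP distance of zero to a database strain would correspond to the identical SNP profile reported for A4564.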
