Effect of Next-Generation Exome Sequencing Depth for Discovery of Diagnostic Variants

  • Kim, Kyung (Department of Biomedical Informatics, Ajou University School of Medicine)
  • Seong, Moon-Woo (Department of Laboratory Medicine, Seoul National University Hospital College of Medicine)
  • Chung, Won-Hyong (Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology)
  • Park, Sung Sup (Department of Laboratory Medicine, Seoul National University Hospital College of Medicine)
  • Leem, Sangseob (Department of Biomedical Informatics, Ajou University School of Medicine)
  • Park, Won (Department of Functional Genomics, Korea University of Science and Technology)
  • Kim, Jihyun (Department of Biomedical Informatics, Ajou University School of Medicine)
  • Lee, KiYoung (Department of Biomedical Informatics, Ajou University School of Medicine)
  • Park, Rae Woong (Department of Biomedical Informatics, Ajou University School of Medicine)
  • Kim, Namshin (Department of Functional Genomics, Korea University of Science and Technology)
  • Received : 2015.04.08
  • Accepted : 2015.05.28
  • Published : 2015.06.30

Abstract

Sequencing depth, which is directly related to the cost and time required to generate, process, and maintain next-generation sequencing data, is an important factor in the practical use of such data in clinical settings. However, identifying an exome sequencing depth adequate for clinical use is a challenge that has not been addressed extensively. Here, we investigate the effect of exome sequencing depth on the discovery of sequence variants for clinical use. To this end, we sequenced ten germline blood samples from breast cancer patients on the Illumina GAIIx platform at a high depth of ${\sim}200{\times}$. We observed that most of the diverse function-related variants in human exonic regions could be detected at a sequencing depth of $120{\times}$. Furthermore, analysis of a diagnostic gene set showed that the number of clinical variants identified by exome sequencing reached a plateau at an average sequencing depth of about $120{\times}$. These findings were consistent across the breast cancer samples.
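
The idea of a variant count "reaching a plateau" as depth increases can be made concrete with a small sketch. The function below is not the authors' method: `plateau_depth`, the `rel_gain` threshold of 1%, and the input numbers in the usage note are all hypothetical, chosen only to illustrate one simple way such a plateau could be defined.

```python
from typing import Optional, Sequence


def plateau_depth(depths: Sequence[float],
                  counts: Sequence[int],
                  rel_gain: float = 0.01) -> Optional[float]:
    """Return the smallest depth from which every further increase in depth
    adds at most `rel_gain` (as a fraction) more variants, or None if the
    counts never level off within the sampled depths.

    Assumption: `counts[i]` is the number of variants called at `depths[i]`.
    """
    if len(depths) != len(counts) or len(depths) < 2:
        raise ValueError("need at least two (depth, count) pairs of equal length")

    # Sort by depth so that steps are taken in order of increasing depth.
    pairs = sorted(zip(depths, counts))

    for i in range(len(pairs) - 1):
        # A depth qualifies only if ALL later steps show a small relative gain.
        levelled_off = all(
            pairs[j + 1][1] - pairs[j][1] <= rel_gain * pairs[j][1]
            for j in range(i, len(pairs) - 1)
        )
        if levelled_off:
            return pairs[i][0]
    return None
```

With invented inputs such as `plateau_depth([10, 20, 40, 80], [500, 800, 900, 905])`, the last step is the only one that adds no more than 1% new variants, so the function returns `40`. These numbers are illustrative and are not taken from the study.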

Cited by

  1. Amplicon-based semiconductor sequencing of human exomes: performance evaluation and optimization strategies vol.135, pp.5, 2016, https://doi.org/10.1007/s00439-016-1656-8