Next-generation Genome Technologies and Environmental Biology - Entering the Era of Environmental Genomics

Next-generation Sequencing for Environmental Biology - Full-fledged Environmental Genomics around the Corner

  • Song, Ju Yeon (Department of Systems Biology, Yonsei University) ;
  • Kim, Byung Kwon (Department of Systems Biology, Yonsei University) ;
  • Kwon, Soon-Kyeong (Biosystems and Bioengineering Program, University of Science and Technology) ;
  • Kwak, Min-Jung (Biosystems and Bioengineering Program, University of Science and Technology) ;
  • Kim, Jihyun F. (Department of Systems Biology, Yonsei University)
  • Received : 2012.06.11
  • Reviewed : 2012.06.19
  • Published : 2012.06.30

Abstract

With the advent of the genomics era powered by DNA sequencing technologies, the life sciences are being transformed and biological research and development have accelerated. Environmental biology concerns the relationships between living organisms and their natural environment, which together constitute the global biogeochemical cycle. Because the sustainability of ecosystems depends on biodiversity, it is pivotal to examine the structure and dynamics of their biotic constituents and to fully grasp their genetic and metabolic capabilities. High-speed, high-throughput next-generation sequencing can be applied to barcoding organisms, whether thriving or endangered, and to decoding whole-genome information. Furthermore, the diversity and full gene complement of a microbial community can be elucidated and monitored through metagenomic approaches. With regard to human welfare, the microbiomes of various human habitats, such as the gut, skin, mouth, stomach, and vagina, have been and continue to be scrutinized. To keep pace with the rapid increase in sequencing capacity, various bioinformatic algorithms and software tools, some of which exploit supercomputers and cloud computing, are being developed for processing and storing massive data sets. Environmental genomics will be the major force in understanding the structure and function of ecosystems in nature, as well as in preserving, remediating, and bioprospecting them.
