A Survey of the Brassica rapa Genome by BAC-End Sequence Analysis and Comparison with Arabidopsis thaliana

  • Hong, Chang Pyo (Department of Horticulture, College of Agriculture and Life Science, Chungnam National University)
  • Plaha, Prikshit (Advanced Centre of Hill Bioresources & Biotechnology, HP Agricultural University)
  • Koo, Dal-Hoe (Department of Bioscience, School of Bioscience and Biotechnology, Chungnam National University)
  • Yang, Tae-Jin (Department of Plant Science, College of Agriculture and Life Sciences, Seoul National University)
  • Choi, Su Ryun (Department of Horticulture, College of Agriculture and Life Science, Chungnam National University)
  • Lee, Young Ki (Department of Horticulture, College of Agriculture and Life Science, Chungnam National University)
  • Uhm, Taesik (Department of Horticulture, College of Agriculture and Life Science, Chungnam National University)
  • Bang, Jae-Wook (Department of Bioscience, School of Bioscience and Biotechnology, Chungnam National University)
  • Edwards, David (Primary Industries Research Victoria, Department of Primary Industries, Victorian AgriBioscience Centre)
  • Bancroft, Ian (Department of Crop Genetics, John Innes Centre, Norwich Research Park)
  • Park, Beom-Seok (Brassica Genomics Team, National Institute of Agricultural Biotechnology)
  • Lee, Jungho (Green Plant Institute)
  • Lim, Yong Pyo (Department of Horticulture, College of Agriculture and Life Science, Chungnam National University)
  • Received: 2006.07.29
  • Accepted: 2006.10.27
  • Published: 2006.12.31

Abstract

Brassica rapa ssp. pekinensis (Chinese cabbage) is an economically important crop and a model plant for studies of polyploidization and phenotypic evolution. To gain insight into the structure of the B. rapa genome, we analyzed 12,017 BAC-end sequences for the presence of transposable elements (TEs), simple sequence repeats (SSRs), centromeric satellite repeats, and genes, and for similarity to the closely related genome of Arabidopsis thaliana. TEs were estimated to occupy 14% of the genome, with retrotransposons accounting for 12.3% of the genome. The B. rapa genome was estimated to contain 43,000 genes, 1.6 times as many as the genome of A. thaliana. A number of centromeric satellite sequences, representing variants of a 176-bp consensus sequence, were identified. This sequence has undergone rapid evolution within the B. rapa genome and has diverged among related species of the Brassicaceae. An analysis of SSRs showed a non-random distribution, with greater abundance in predicted intergenic regions. Our results provide an initial characterization of the B. rapa genome and a basis for detailed analysis through whole-genome sequencing.
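
To make the SSR part of the survey concrete, the following Python sketch shows one common way to scan a sequence for perfect simple sequence repeats. It is an illustration only and not the pipeline used in the study: the function name `find_ssrs`, the `MIN_REPEATS` thresholds, and the example sequence are all hypothetical assumptions.

```python
import re

# Hypothetical thresholds (assumption, not taken from the study):
# minimum number of tandem copies required for each motif length.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, motif, copies) for perfect SSRs in seq.

    start is the 0-based position of the repeat tract, motif is the
    repeated unit, and copies is the number of tandem copies found.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated
            # (e.g. "ATAT" is reported as the dinucleotide "AT" instead).
            redundant = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if redundant:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Made-up example sequence containing an (AT)8 and an (AAG)5 tract.
example = "GGC" + "AT" * 8 + "CCGTT" + "AAG" * 5 + "TC"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies)
```

In a survey of this kind, a scan like this would be run over each BAC-end sequence and the hits then grouped by the predicted genic or intergenic context of the sequence; that grouping step is not shown here.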

Acknowledgement

Supported by: Korean Science and Engineering Foundation
