De novo gene set assembly of the transcriptome of diploid, oilseed-crop species Perilla citriodora

  • Received : 2016.08.24
  • Accepted : 2016.09.12
  • Published : 2016.09.30

Abstract

High-quality gene sets are necessary for functional research of genes. Although Perilla is commonly cultivated as an oil and vegetable crop in Southeast Asia, the quality of its available gene set is insufficient. To construct a high-quality gene set, we sequenced mRNAs extracted from different tissues of Perilla citriodora, a wild diploid species (2n = 20) of Perilla. We compared the quality of assemblies produced by two well-known de novo assemblers, Velvet and Trinity, and improved the de novo assembly pipeline by optimizing k-mer sizes and removing redundant sequences. We then selected a representative transcript for each locus according to several criteria. The improved assembly yielded a total of 86,396 transcripts and 38,413 representative transcripts. We evaluated the assembled transcripts by comparing them to 638 homologous Arabidopsis genes involved in the fatty acid and triacylglycerol (TAG) biosynthesis pathways. High proportions of the assembled transcripts were full-length and matched known genes in other species, indicating that the P. citriodora gene set can be applied in future functional studies. Our study provides a reference P. citriodora gene set that will serve as a valuable genetic resource for elucidating the molecular basis of various metabolic pathways.

