Human Transcriptome and Chromatin Modifications: An ENCODE Perspective

  • Shen, Li (Department of Neuroscience, Mount Sinai School of Medicine)
  • Choi, Inchan (Department of Genetics, Institute for Diabetes, Obesity and Metabolism, University of Pennsylvania)
  • Nestler, Eric J. (Department of Neuroscience, Mount Sinai School of Medicine)
  • Won, Kyoung-Jae (Department of Genetics, Institute for Diabetes, Obesity and Metabolism, University of Pennsylvania)

  • Received: 2013.02.14
  • Accepted: 2013.03.13
  • Published: 2013.06.30

Abstract

The Encyclopedia of DNA Elements (ENCODE), a decade-long project led by several international research groups, recently released an unprecedented amount of data. This ambitious project covers transcriptome, cistrome, epigenome, and interactome data from more than 1,600 sets of experiments in humans. To make use of this valuable resource, it is important to understand what information the data represent and which techniques were used to generate them. In this review, we introduce the data that ENCODE generated, summarize the observations from the data analysis, and revisit a computational approach that ENCODE used to predict gene expression, with a focus on the human transcriptome and its association with chromatin modifications.
