Sequence-based 5-mers highly correlated to epigenetic modifications in genes interactions

  • Salimi, Dariush (Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran) ;
  • Moeini, Ali (Department of Algorithms and Computation, School of Engineering Science, College of Engineering, University of Tehran) ;
  • Masoudi-Nejad, Ali (Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran)
  • Received : 2018.01.13
  • Accepted : 2018.08.22
  • Published : 2018.12.31

Abstract

Extracting informative features from DNA sequence to determine gene interactions is a major concern in biology and has received considerable attention from researchers. Epigenetic modifications and their patterns are widely recognized as dominant features affecting gene expression. However, the study of sequence-based features highly correlated with these modifications has remained limited. The main objective of this research was to propose a new feature, highly correlated with epigenetic modifications, that is capable of classifying genes. In this paper, we classified 34 genes of the PPAR signaling pathway associated with muscle fat tissue in humans. We propose that 5-mers highly correlated with epigenetic modifications can correctly categorize genes involved in the same biological pathway or process. The proposed feature consists of 5-mers strongly associated with 17 different epigenetic modifications, and several statistical outlier detection methods were applied to identify the group of highly correlated genes. The results indicated that these 5-mers can appropriately identify correlated genes. In addition, our results agreed with GeneMANIA interaction information, which supports the suggested method. These findings imply that not only epigenetic modifications but also their highly correlated 5-mers can serve as supplementary data for reconstructing gene regulatory networks, as well as for other applications such as physical interaction studies and gene prioritization, which points to a form of data fusion in this analysis.
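The general idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which correlation measure or which outlier detection methods were used, so Pearson correlation and Tukey's fences are assumed here as stand-ins, and all function names, the threshold, and the input format (one sequence and one epigenetic signal value per gene) are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import quantiles


def count_kmers(seq, k=5):
    """Count overlapping k-mers in a DNA sequence, skipping ambiguous bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlated_kmers(sequences, signal, k=5, threshold=0.9):
    """Return k-mers whose per-gene counts correlate with the signal.

    `signal` holds one (hypothetical) epigenetic modification value per gene;
    `threshold` is an assumed cutoff on |r|, not a value from the paper.
    """
    per_gene = [count_kmers(s, k) for s in sequences]
    kmers = set().union(*per_gene)
    selected = {}
    for kmer in kmers:
        r = pearson([c[kmer] for c in per_gene], signal)
        if abs(r) >= threshold:
            selected[kmer] = r
    return selected


def tukey_outliers(values, factor=1.5):
    """Return indices of values outside Tukey's fences (Q1 - f*IQR, Q3 + f*IQR)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]
```

In such a sketch, `correlated_kmers` would be run once per epigenetic modification, and an outlier rule like `tukey_outliers` would then be applied to per-gene scores derived from the selected 5-mers to separate genes that stand out from the rest.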
