• Title/Summary/Keyword: gene-based analysis

NGSEA: Network-Based Gene Set Enrichment Analysis for Interpreting Gene Expression Phenotypes with Functional Gene Sets

  • Han, Heonjong;Lee, Sangyoung;Lee, Insuk
    • Molecules and Cells / v.42 no.8 / pp.579-588 / 2019
  • Gene set enrichment analysis (GSEA) is a popular tool to identify underlying biological processes in clinical samples using their gene expression phenotypes. GSEA measures the enrichment of annotated gene sets that represent biological processes for differentially expressed genes (DEGs) in clinical samples. However, GSEA may be suboptimal for functional gene sets because DEGs from the expression dataset may not be functional genes per se but dysregulated genes perturbed by bona fide functional genes. To overcome this shortcoming, we developed network-based GSEA (NGSEA), which measures the enrichment score of functional gene sets using the expression difference of not only individual genes but also their neighbors in the functional network. We found that NGSEA outperformed GSEA in identifying pathway gene sets for matched gene expression phenotypes. We also observed that NGSEA substantially improved the ability to retrieve known anti-cancer drugs from patient-derived gene expression data using drug-target gene sets compared with another method, Connectivity Map. We also repurposed FDA-approved drugs using NGSEA and experimentally validated budesonide as a chemical with anti-cancer effects for colorectal cancer. We therefore expect that NGSEA will facilitate both pathway interpretation of gene expression phenotypes and anti-cancer drug repositioning. NGSEA is freely available at www.inetbio.org/ngsea.
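The idea of scoring a gene by its own expression difference together with that of its network neighbors can be illustrated with a small Python sketch. The abstract does not give the scoring formula, so the additive combination below (own absolute log fold change plus the mean absolute log fold change of the neighbors), the function name `network_smoothed_scores`, and the toy data are all assumptions made for illustration; this is not the NGSEA implementation.

```python
from statistics import mean


def network_smoothed_scores(log_fold_change, neighbors):
    """Score each gene by |own change| + mean |change| of its network neighbors.

    ASSUMPTION: this additive form is illustrative only; the abstract does not
    specify how NGSEA combines a gene with its neighbors.
    """
    scores = {}
    for gene, lfc in log_fold_change.items():
        # Only neighbors that were actually measured contribute to the score.
        linked = [abs(log_fold_change[n])
                  for n in neighbors.get(gene, [])
                  if n in log_fold_change]
        neighbor_term = mean(linked) if linked else 0.0
        scores[gene] = abs(lfc) + neighbor_term
    return scores


# Hypothetical toy data: gene A barely changes, but its neighbors B and C do.
toy_lfc = {"A": 0.1, "B": 2.0, "C": -1.0, "D": 0.0}
toy_network = {"A": ["B", "C"], "B": ["A"], "C": ["A"], "D": []}
toy_scores = network_smoothed_scores(toy_lfc, toy_network)

# Genes ranked by the smoothed score could then feed a standard enrichment test.
ranked_genes = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

In this toy example, gene A ranks above gene C even though its own expression change is smaller, because its neighbors are strongly perturbed. That is the kind of case the abstract argues plain GSEA may miss.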

Development of Gene Based STS Markers in Wheat

  • Lee, Sang-Kyu;Heo, Hwa-Young;Kwon, Young-Up;Lee, Byung-Moo
    • KOREAN JOURNAL OF CROP SCIENCE / v.57 no.1 / pp.71-77 / 2012
  • The objective of this study was to develop gene-based sequence tagged site (STS) markers in wheat. A euchromatin-enriched genomic (EEG) library was constructed, and STS primer sets were designed from gene-based DNA sequences. The EEG DNA library was constructed using the McrA and McrBC system in DH5α cells. A total of 2,166 EEG colonies were obtained by methylated DNA exclusion. Among these, 606 colonies whose PCR products were between 400 and 1,200 bp were selected for sequencing. To develop the gene-based STS primers, BLAST analysis comparing wheat genetic information with the rice genome sequence was employed. The 227 STS primers mainly matched Triticum aestivum (hexaploid), Triticum turgidum (tetraploid), Aegilops (diploid), and other plants. Polymorphisms were detected in PCR products after digestion with restriction enzymes. From the 227 STS primers, eight STS markers showing 32 polymorphisms in twelve wheat genotypes were developed. The STS primer analysis will be useful for generating informative molecular markers in wheat. The purpose of developing gene-based STS markers is to identify genetic function through cloning of target genes and to find new alleles of target traits.

Meta-analysis of Gene Expression Data Identifies Causal Genes for Prostate Cancer

  • Wang, Xiang-Yang;Hao, Jian-Wei;Zhou, Rui-Jin;Zhang, Xiang-Sheng;Yan, Tian-Zhong;Ding, De-Gang;Shan, Lei
    • Asian Pacific Journal of Cancer Prevention / v.14 no.1 / pp.457-461 / 2013
  • Prostate cancer is a leading cause of death in male populations across the globe. With the advent of gene expression arrays, many microarray studies have been conducted in prostate cancer, but the results have varied across different studies. To better understand the genetic and biologic mechanisms of prostate cancer, we conducted a meta-analysis of two studies on prostate cancer. Eight key genes were identified to be differentially expressed with progression. After gene co-expression analysis based on data from the GEO database, we obtained a co-expressed gene list which included 725 genes. Gene Ontology analysis revealed that these genes are involved in actin filament-based processes, locomotion and cell morphogenesis. Further analysis of the gene list should provide important clues for developing new prognostic markers and therapeutic targets.

Gene Expression Pattern Analysis via Latent Variable Models Coupled with Topographic Clustering

  • Chang, Jeong-Ho;Chi, Sung Wook;Zhang, Byoung Tak
    • Genomics & Informatics / v.1 no.1 / pp.32-39 / 2003
  • We present a latent variable model-based approach to the analysis of gene expression patterns, coupled with topographic clustering. The aspect model, a latent variable model for dyadic data, is applied to extract latent patterns underlying complex variations of gene expression levels. Topographic clustering is then performed to find coherent groups of genes, based on the extracted latent patterns as well as individual gene expression behaviors. Applied to cell cycle-regulated genes of the yeast Saccharomyces cerevisiae, the proposed method discovered biologically meaningful patterns related to characteristic expression behavior in particular cell cycle phases. In addition, displaying the variation in the composition of these latent patterns on the cluster map made the resulting cluster structure easier to interpret. From this, we argue that latent variable models, coupled with topographic clustering, are a promising tool for exploratory analysis of gene expression data.

Application of Crossover Analysis-logistic Regression in the Assessment of Gene-environmental Interactions for Colorectal Cancer

  • Wu, Ya-Zhou;Yang, Huan;Zhang, Ling;Zhang, Yan-Qi;Liu, Ling;Yi, Dong;Cao, Jia
    • Asian Pacific Journal of Cancer Prevention / v.13 no.5 / pp.2031-2037 / 2012
  • Background: Analysis of gene-gene and gene-environment interactions for complex multifactorial human disease faces challenges regarding statistical methodology. One major difficulty stems from the limitations of parametric statistical methods in detecting gene effects that depend solely or partially on interactions with other genes or environmental exposures. In our previous case-control study in Chongqing, China, we found an increased risk of colorectal cancer in individuals carrying a novel homozygous TT genotype at locus rs1329149 and the known homozygous AA genotype at locus rs671. Methods: In this study, we proposed a statistical method, crossover analysis in combination with a logistic regression model, to further analyze our data and assess gene-environmental interactions for colorectal cancer. Results: The crossover analysis showed possible multiplicative interactions of loci rs671 and rs1329149 with alcohol consumption. Multifactorial logistic regression analysis also confirmed that loci rs671 and rs1329149 each exhibited a multiplicative interaction with alcohol consumption. Moreover, the crossover analysis revealed additive interactions between every pair of the four risk factors (gene loci rs671 and rs1329149, age, and alcohol consumption), which were not evident in the logistic regression. Conclusions: The method based on crossover analysis and logistic regression is successful in assessing additive and multiplicative gene-environment interactions, and in revealing synergistic effects of gene loci rs671 and rs1329149 with alcohol consumption in the pathogenesis and development of colorectal cancer.
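The distinction between additive and multiplicative interaction can be made concrete with a short Python sketch. It assumes the usual crossover (2×4 table) layout, in which each combination of a genetic factor and an environmental exposure is compared with the doubly unexposed group. The function names and the counts are hypothetical and are not taken from the study. The two measures are the standard epidemiological ones: the ratio OR11 / (OR10 × OR01) for multiplicative interaction and the relative excess risk due to interaction (RERI = OR11 − OR10 − OR01 + 1) for additive interaction.

```python
def odds_ratio(cases_exposed, controls_exposed, cases_ref, controls_ref):
    """Odds ratio of an exposure stratum relative to the reference stratum."""
    return (cases_exposed * controls_ref) / (controls_exposed * cases_ref)


def interaction_measures(counts):
    """Compute interaction measures from a 2x4 crossover table.

    `counts` maps (gene, environment) flags, each 0 or 1, to a
    (cases, controls) tuple; (0, 0) is the doubly unexposed reference.
    """
    ref_cases, ref_controls = counts[(0, 0)]
    or10 = odds_ratio(*counts[(1, 0)], ref_cases, ref_controls)  # gene only
    or01 = odds_ratio(*counts[(0, 1)], ref_cases, ref_controls)  # environment only
    or11 = odds_ratio(*counts[(1, 1)], ref_cases, ref_controls)  # both
    return {
        "OR10": or10,
        "OR01": or01,
        "OR11": or11,
        # > 1 suggests the joint effect exceeds the product of single effects.
        "multiplicative_ratio": or11 / (or10 * or01),
        # > 0 suggests the joint effect exceeds the sum of single effects.
        "RERI": or11 - or10 - or01 + 1,
    }


# HYPOTHETICAL counts for illustration only (not data from the study).
toy_counts = {
    (0, 0): (20, 80),
    (1, 0): (30, 60),
    (0, 1): (30, 60),
    (1, 1): (60, 20),
}
toy_result = interaction_measures(toy_counts)
```

With these made-up counts each single factor doubles the odds, while the joint stratum has an odds ratio of 12, so both the multiplicative ratio (3) and the RERI (9) point to interaction. A logistic regression with a product term estimates the multiplicative measure, which may be why the abstract reports additive interactions that the regression did not show.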

Gene Set Analyses of Genome-Wide Association Studies on 49 Quantitative Traits Measured in a Single Genetic Epidemiology Dataset

  • Kim, Jihye;Kwon, Ji-Sun;Kim, Sangsoo
    • Genomics & Informatics / v.11 no.3 / pp.135-141 / 2013
  • Gene set analysis is a powerful tool for interpreting genome-wide association study results and is gaining popularity. Comparison of the gene sets obtained for a variety of traits measured in a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP) genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits covering basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analyses based on Gene Ontology (GO) terms using the gene set analysis software GSA-SNP, identifying a set of GO terms significantly associated with each trait ($p_{corr}$ < 0.05). Pairwise comparison of the traits in terms of the semantic similarity of their GO sets revealed surprising cases where phenotypically uncorrelated traits showed high similarity in terms of biological pathways. For example, the pH level was related to 7 other traits that showed low phenotypic correlations with it. A literature survey implies that these traits may be regulated partly by common pathways that involve neuronal or nerve systems.

Pathway and Network Analysis in Glioma with the Partial Least Squares Method

  • Gu, Wen-Tao;Gu, Shi-Xin;Shou, Jia-Jun
    • Asian Pacific Journal of Cancer Prevention / v.15 no.7 / pp.3145-3149 / 2014
  • Gene expression profiling facilitates the understanding of biological characteristics of gliomas. Previous studies mainly used regression/variance analysis without considering various background biological and environmental factors. The aim of this study was to investigate gene expression differences between grade III and IV gliomas through partial least squares (PLS) based analysis. The expression data set was from the Gene Expression Omnibus database. PLS based analysis was performed with the R statistical software. A total of 1,378 differentially expressed genes were identified. Survival analysis identified four pathways, including Prion diseases, colorectal cancer, CAMs, and PI3K-Akt signaling, which may be related to patient prognosis. Network analysis identified two hub genes, ELAVL1 and FN1, which have previously been reported to be related to glioma. Our results provide new understanding of glioma pathogenesis and prognosis and may offer theoretical support for future therapeutic studies.

An XML-Based Analysis Tool for Gene Prediction Results

  • Kim, Jin-Hong;Byun, Sang-Hee;Lee, Myung-Joon;Park, Yang-Su
    • The KIPS Transactions:PartD / v.12D no.5 s.101 / pp.755-764 / 2005
  • Recently, as identifying the function of all unknown genes in living organisms has come to be considered more important, many gene prediction tools have been developed to identify genes in DNA sequences. Unfortunately, most of those tools use their own schemes to represent their results, requiring researchers to make additional efforts to understand the output. It is therefore desirable to provide a standardized method of representing predicted gene information, which makes it possible to automatically produce the predicted results for a given set of gene data. In this paper, we describe an effective XML representation for various predicted gene information and present an XML-based analysis tool for gene prediction results based on this representation. The developed system helps users of gene prediction tools to conveniently analyze the predicted results and to automatically produce statistics on the predictions. To show the usefulness of the tool, we applied our programs to the results generated by GenScan and GeneID, which are widely used gene prediction systems.
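The benefit of a shared representation can be sketched in Python with the standard library's `xml.etree.ElementTree`. The abstract does not describe the schema, so the element and attribute names (`predictions`, `gene`, `exon`, `tool`, `strand`), the helper functions, and the sample records below are all hypothetical. The point is only that once the output of different predictors is mapped to one XML form, a single routine can produce statistics for any of them.

```python
import xml.etree.ElementTree as ET


def predictions_to_xml(tool, genes):
    """Wrap predicted genes in a common XML form (HYPOTHETICAL schema)."""
    root = ET.Element("predictions", tool=tool)
    for gene in genes:
        gene_el = ET.SubElement(root, "gene", id=gene["id"], strand=gene["strand"])
        for start, end in gene["exons"]:
            # XML attribute values must be strings.
            ET.SubElement(gene_el, "exon", start=str(start), end=str(end))
    return root


def summarize(root):
    """Compute simple statistics that work for any tool using the same schema."""
    genes = root.findall("gene")
    exon_lengths = [
        int(exon.get("end")) - int(exon.get("start")) + 1
        for gene in genes
        for exon in gene.findall("exon")
    ]
    return {
        "tool": root.get("tool"),
        "genes": len(genes),
        "exons": len(exon_lengths),
        "total_exon_length": sum(exon_lengths),
    }


# Made-up predictions standing in for parsed output of a gene predictor.
sample_genes = [
    {"id": "g1", "strand": "+", "exons": [(100, 199), (300, 349)]},
    {"id": "g2", "strand": "-", "exons": [(1000, 1099)]},
]
sample_root = predictions_to_xml("example-predictor", sample_genes)
sample_summary = summarize(sample_root)
sample_xml_text = ET.tostring(sample_root, encoding="unicode")
```

A per-tool parser would only need to produce the list of gene dictionaries; everything downstream of `predictions_to_xml` stays the same regardless of which predictor generated the data.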

CaGe: A Web-Based Cancer Gene Annotation System for Cancer Genomics

  • Park, Young-Kyu;Kang, Tae-Wook;Baek, Su-Jin;Kim, Kwon-Il;Kim, Seon-Young;Lee, Do-Heon;Kim, Yong-Sung
    • Genomics & Informatics / v.10 no.1 / pp.33-39 / 2012
  • High-throughput genomic technologies (HGTs), including next-generation DNA sequencing (NGS), microarray, and serial analysis of gene expression (SAGE), have become effective experimental tools for cancer genomics to identify cancer-associated somatic genomic alterations and genes. The main hurdle in cancer genomics is to identify the real causative mutations or genes out of many candidates from an HGT-based cancer genomic analysis. One useful approach is to refer to known cancer genes and associated information. The list of known cancer genes can be used to determine candidates of cancer driver mutations, while cancer gene-related information, including gene expression, protein-protein interaction, and pathways, can be useful for scoring novel candidates. Some cancer gene or mutation databases exist for this purpose, but few specialized tools exist for an automated analysis of a long gene list from an HGT-based cancer genomic analysis. This report presents a new web-accessible bioinformatic tool, called CaGe, a cancer genome annotation system for the assessment of candidates of cancer genes from HGT-based cancer genomics. The tool provides users with information on cancer-related genes, mutations, pathways, and associated annotations through annotation and browsing functions. With this tool, researchers can classify their candidate genes from cancer genome studies into either previously reported or novel categories of cancer genes and gain insight into underlying carcinogenic mechanisms through a pathway analysis. We show the usefulness of CaGe by assessing its performance in annotating somatic mutations from a published small cell lung cancer study.

GSnet: An Integrated Tool for Gene Set Analysis and Visualization

  • Choi, Yoon-Jeong;Woo, Hyun-Goo;Yu, Ung-Sik
    • Genomics & Informatics / v.5 no.3 / pp.133-136 / 2007
  • The Gene Set network viewer (GSnet) visualizes the functional enrichment of a given gene set with a protein interaction network and is implemented as a plug-in for the Cytoscape platform. The functional enrichment of a given gene set is calculated using a hypergeometric test based on the Gene Ontology annotation. The protein interaction network is estimated using public data. Set operations allow a complex protein interaction network to be decomposed into a functionally-enriched module of interest. GSnet provides a new framework for gene set analysis by integrating a priori knowledge of a biological network with functional enrichment analysis.
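A hypergeometric enrichment test of the kind mentioned above can be sketched in a few lines of Python. This is a generic one-sided test, not GSnet's code. The function name and parameter names are illustrative, and the numbers in the example are made up. It asks how likely it is to see at least `overlap` annotated genes when `selected` genes are drawn at random from a background of `population` genes, of which `annotated` carry a given Gene Ontology term.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of background genes
    annotated:  background genes carrying the term of interest
    selected:   size of the query gene set
    overlap:    query genes carrying the term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability of every outcome at least as extreme as the observed one.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up example: 10 background genes, 4 annotated, 3 selected, 2 overlapping.
example_p = hypergeom_enrichment_p(population=10, annotated=4, selected=3, overlap=2)
```

For the made-up example the tail probability is (C(4,2)·C(6,1) + C(4,3)·C(6,0)) / C(10,3) = 40/120, so the overlap would not be considered significant. When many terms are tested at once, a multiple-testing correction would normally follow.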