• Title/Summary/Keyword: gene functional analysis

NGSEA: Network-Based Gene Set Enrichment Analysis for Interpreting Gene Expression Phenotypes with Functional Gene Sets

  • Han, Heonjong;Lee, Sangyoung;Lee, Insuk
    • Molecules and Cells
    • /
    • v.42 no.8
    • /
    • pp.579-588
    • /
    • 2019
  • Gene set enrichment analysis (GSEA) is a popular tool to identify underlying biological processes in clinical samples using their gene expression phenotypes. GSEA measures the enrichment of annotated gene sets that represent biological processes for differentially expressed genes (DEGs) in clinical samples. However, GSEA may be suboptimal for functional gene sets, because DEGs from the expression dataset may not be functional genes per se but dysregulated genes perturbed by bona fide functional genes. To overcome this shortcoming, we developed network-based GSEA (NGSEA), which measures the enrichment score of functional gene sets using the expression difference of not only individual genes but also their neighbors in the functional network. We found that NGSEA outperformed GSEA in identifying pathway gene sets for matched gene expression phenotypes. We also observed that NGSEA substantially improved the ability to retrieve known anti-cancer drugs from patient-derived gene expression data using drug-target gene sets compared with another method, Connectivity Map. We also repurposed FDA-approved drugs using NGSEA and experimentally validated budesonide as a chemical with anti-cancer effects for colorectal cancer. We therefore expect that NGSEA will facilitate both pathway interpretation of gene expression phenotypes and anti-cancer drug repositioning. NGSEA is freely available at www.inetbio.org/ngsea.
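
The idea of scoring a gene by its own expression difference together with that of its network neighbors can be illustrated with a minimal sketch. The combination rule used here (a gene's absolute difference plus the mean absolute difference of its neighbors) is an assumption made for illustration and is not taken from the abstract; the function name, gene labels, and toy network are hypothetical.

```python
from statistics import mean


def network_smoothed_scores(expr_diff, network):
    """Score each gene by |own difference| + mean |difference| of its neighbors.

    This combination rule is an illustrative assumption, not the published
    NGSEA formula.
    """
    scores = {}
    for gene, diff in expr_diff.items():
        # Only neighbors that were actually measured contribute.
        neighbors = [n for n in network.get(gene, ()) if n in expr_diff]
        neighbor_part = mean(abs(expr_diff[n]) for n in neighbors) if neighbors else 0.0
        scores[gene] = abs(diff) + neighbor_part
    return scores


# Hypothetical toy data: gene "A" barely changes itself,
# but both of its network neighbors are strongly perturbed.
expr_diff = {"A": 0.1, "B": 2.0, "C": -1.0, "D": 0.2}
network = {"A": ["B", "C"], "B": ["A"], "C": ["A"], "D": []}

scores = network_smoothed_scores(expr_diff, network)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy case gene "A" moves ahead of gene "C" in the ranking even though its own expression difference is small, which is the kind of effect a neighbor-aware score is meant to capture. The resulting ranked list could then be passed to an ordinary enrichment procedure.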

Current status on plant functional genomics (식물 유전자 연구의 최근 동향)

  • Cho, Yong-Gu;Woo, Hee-Jong;Yoon, Ung-Han;Kim, Hong-Sig;Woo, Sun-Hee
    • Journal of Plant Biotechnology
    • /
    • v.37 no.2
    • /
    • pp.115-124
    • /
    • 2010
  • With the completion of genome sequencing, large collections of expression data, and great efforts in annotating plant genomes, the next challenge is to systematically assign functions to all predicted genes in the genome. Functional genome analysis of plants has entered the high-throughput stage. The generation and collection of mutants at the genome-wide level form the technological platform of functional genomics. However, to identify the exact function of unknown genes, it is necessary to understand each gene's role in the complex orchestration of all gene activities in the plant cell. Gene function analysis therefore necessitates the analysis of temporal and spatial gene expression patterns. The most conclusive information about changes in gene expression levels can be gained from analysis of the qualitative and quantitative changes of messenger RNAs, proteins and metabolites. New technologies have been developed to allow fast and highly parallel measurements of these constituents of the cell that make up gene activity. We have reviewed currently employed technologies to identify unknown functions of predicted genes, including map-based cloning, insertional mutagenesis, reverse genetics, chemical mutagenesis, microarray analysis, the FOX-hunting system, gene silencing mutagenesis, proteomics and chemical genomics. Recent improvements in technologies for functional genomics enable whole-genome functional analysis, and thus open new avenues for studies of the regulation and functions of unknown genes in plants.

GSnet: An Integrated Tool for Gene Set Analysis and Visualization

  • Choi, Yoon-Jeong;Woo, Hyun-Goo;Yu, Ung-Sik
    • Genomics & Informatics
    • /
    • v.5 no.3
    • /
    • pp.133-136
    • /
    • 2007
  • The Gene Set network viewer (GSnet) visualizes the functional enrichment of a given gene set with a protein interaction network and is implemented as a plug-in for the Cytoscape platform. The functional enrichment of a given gene set is calculated using a hypergeometric test based on the Gene Ontology annotation. The protein interaction network is estimated using public data. Set operations allow a complex protein interaction network to be decomposed into a functionally-enriched module of interest. GSnet provides a new framework for gene set analysis by integrating a priori knowledge of a biological network with functional enrichment analysis.
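
A hypergeometric enrichment test of the kind mentioned above can be sketched in a few lines. This is a generic one-sided test, not GSnet's own code; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, hits):
    """One-sided p-value P(X >= hits) for a hypergeometric draw.

    population: number of genes in the background
    annotated:  background genes carrying the GO term
    drawn:      size of the gene set of interest
    hits:       genes in the set that carry the GO term
    """
    upper = min(annotated, drawn)
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favorable / comb(population, drawn)


# Hypothetical counts: 20 background genes, 5 annotated with the term,
# a gene set of 4, of which 2 carry the term.
p_value = hypergeom_enrichment_p(20, 5, 4, 2)
print(round(p_value, 4))
```

A small p-value indicates that the term occurs in the gene set more often than expected by chance. When many GO terms are tested, a multiple-testing correction would normally be applied on top of this.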

An Efficient Functional Analysis Method for Micro-array Data Using Gene Ontology

  • Hong, Dong-Wan;Lee, Jong-Keun;Park, Sung-Soo;Hong, Sang-Kyoon;Yoon, Jee-Hee
    • Journal of Information Processing Systems
    • /
    • v.3 no.1
    • /
    • pp.38-42
    • /
    • 2007
  • Microarray data capture the expression of tens of thousands of genes simultaneously, so they can be effectively used in identifying the phenotypes of diseases. However, the retrieval of functional information from a large corpus of gene expression data is still a time-consuming task. In this paper, we propose an efficient method for identifying functional categories of differentially expressed genes from a micro-array experiment by using Gene Ontology (GO). Our method is as follows: (1) The expression data set is first filtered to include only genes with mean expression values that differ by at least 3-fold between the two groups. (2) The genes are then ranked based on the t-statistics. The 100 most highly ranked genes are selected as informative genes. (3) The t-value of each informative gene is imposed as a score on the associated GO terms. High-scoring GO terms are then listed with their associated genes and represent the functional category information of the micro-array experiment. A system called HMDA (Hallym Micro-array Data analysis) was implemented and validated on publicly available micro-array data sets. Our results were also compared with the original analysis.
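
The three steps listed in this abstract can be sketched as follows. This is not the HMDA implementation: the use of Welch's t-statistic, ranking by absolute t-value, and summing absolute t-values per GO term are assumptions, and the gene names and GO labels are hypothetical placeholders. The sketch also assumes strictly positive expression values with non-zero variance in each group.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two groups of expression values (assumed variant)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def score_go_terms(expr, gene2go, min_fold=3.0, top_n=100):
    # Step 1: keep genes whose group means differ by at least min_fold.
    t_values = {}
    for gene, (group1, group2) in expr.items():
        m1, m2 = mean(group1), mean(group2)
        if max(m1, m2) / min(m1, m2) >= min_fold:
            t_values[gene] = welch_t(group1, group2)
    # Step 2: rank by |t| and keep the top_n informative genes.
    informative = sorted(t_values, key=lambda g: abs(t_values[g]), reverse=True)[:top_n]
    # Step 3: add each informative gene's |t| to its GO terms (assumed scoring rule).
    go_scores = defaultdict(float)
    go_genes = defaultdict(list)
    for gene in informative:
        for term in gene2go.get(gene, ()):
            go_scores[term] += abs(t_values[gene])
            go_genes[term].append(gene)
    ranked_terms = sorted(go_scores, key=go_scores.get, reverse=True)
    return [(term, go_scores[term], go_genes[term]) for term in ranked_terms]


# Hypothetical toy data: gene -> (values in group 1, values in group 2).
expr = {
    "g1": ([10, 12, 11], [40, 42, 41]),
    "g2": ([10, 11, 12], [12, 13, 11]),  # below 3-fold, filtered out in step 1
    "g3": ([90, 100, 110], [20, 30, 25]),
}
gene2go = {"g1": ["GO:A"], "g2": ["GO:A"], "g3": ["GO:A", "GO:B"]}

result = score_go_terms(expr, gene2go)
for term, score, genes in result:
    print(term, round(score, 2), genes)
```

Filtering on fold change before ranking keeps genes with tiny but very consistent differences from dominating the t-statistic ranking, which appears to be the motivation for step (1).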

Gene Expression in the Muscles of Young and Mature Channel Catfish (Ictalurus punctatus) as Analyzed by Expressed Sequence Tags and Gene Filters

  • Soon-Hag Kim
    • Journal of Aquaculture
    • /
    • v.16 no.1
    • /
    • pp.8-14
    • /
    • 2003
  • The aims of this study were to generate expressed sequence tags (ESTs) for genomics research involving genetic linkage analysis, to examine gene expression profiles in muscles of channel catfish using a non-normalized muscle cDNA library, and to compare gene expression in young and mature channel catfish muscles using the EST reagents and gene filters, thereby demonstrating the feasibility of functional genomics research in small laboratories. A total of 102 randomly picked cDNA clones were analyzed from the catfish muscle cDNA library. Of the sequences generated, 90.2% of ESTs were identified as known genes by identity comparisons. These 92 clones of known gene products represent transcriptional products of 24 genes. The 10 clones of unknown gene products represent 8 genes. The major transcripts (70.1% of the analyzed ESTs) in the catfish muscle are from major genes involved in muscle contraction, relaxation, energy metabolism and calcium binding, such as alpha actin, creatine kinase, parvalbumin, myosin, troponins, and tropomyosins. Gene expression of the unique ESTs was comparatively studied in the young and adult catfish muscles. Significant differences were observed for aldolase, myostatin, myosin light chain, parvalbumin, and an unknown gene. While myosin light chain and an unknown gene (CM 192) are down-regulated in the mature fish muscle, aldolase, myostatin, and parvalbumin are significantly up-regulated in the mature fish muscle. Although the physiological significance of the changes in expression levels needs to be further addressed, this research demonstrates the feasibility and power of functional genomics in channel catfish. Channel catfish muscle gene expression profiles provide a valuable molecular muscle physiology blueprint for functional comparative genomics.

Mouse phenogenomics, toolbox for functional annotation of human genome

  • Kim, Il-Yong;Shin, Jae-Hoon;Seong, Je-Kyung
    • BMB Reports
    • /
    • v.43 no.2
    • /
    • pp.79-90
    • /
    • 2010
  • Mouse models are crucial for the functional annotation of the human genome. Gene modification techniques, including gene targeting and gene trapping in mouse, have provided powerful tools in the form of genetically engineered mice (GEM) for understanding the molecular pathogenesis of human diseases. Several international consortia and programs are under way to deliver mutations in every gene in the mouse genome. The information from studying these GEM can be shared through international collaboration. However, their utility is limited because not all human genes have been knocked out in mouse, and the existing mutants have not yet been phenotypically characterized in the standardized ways required for sharing and evaluating data from GEM. Recent improvements in mouse genetics have moved the bottleneck in mouse functional genomics from the production of GEM to the systematic phenotype analysis of GEM. Enhanced, reproducible and comprehensive mouse phenotype analysis has thus emerged as a prerequisite for effectively addressing the phenotyping bottleneck. In this review, current information on systematic mouse phenotype analysis and an issue-oriented perspective are provided.

Identification of key genes and functional enrichment analysis of liver fibrosis in nonalcoholic fatty liver disease through weighted gene co-expression network analysis

  • Yue Hu;Jun Zhou
    • Genomics & Informatics
    • /
    • v.21 no.4
    • /
    • pp.45.1-45.11
    • /
    • 2023
  • Nonalcoholic fatty liver disease (NAFLD) is a common type of chronic liver disease, with severity levels ranging from nonalcoholic fatty liver to nonalcoholic steatohepatitis (NASH). The extent of liver fibrosis indicates the severity of NASH and the risk of liver cancer. However, the mechanism underlying NASH development, which is important for early screening and intervention, remains unclear. Weighted gene co-expression network analysis (WGCNA) is a useful method for identifying hub genes and screening specific targets for diseases. In this study, we utilized an mRNA dataset of the liver tissues of patients with NASH and conducted WGCNA for various stages of liver fibrosis. Subsequently, we employed two additional mRNA datasets for validation purposes. Gene set enrichment analysis (GSEA) was conducted to analyze gene function enrichment. Through WGCNA and subsequent analyses, complemented by validation using two additional datasets, we identified five genes (BICC1, C7, EFEMP1, LUM, and STMN2) as hub genes. GSEA indicated that gene sets associated with liver metabolism and cholesterol homeostasis were uniformly downregulated. BICC1, C7, EFEMP1, LUM, and STMN2 were identified as hub genes of NASH, and were all related to liver metabolism, NAFLD, NASH, and related diseases. These hub genes might serve as potential targets for the early screening and treatment of NASH.

Functional Genomics Approach Using Mice

  • Sung, Young-Hoon;Song, Jae-Whan;Lee, Han-Woong
    • BMB Reports
    • /
    • v.37 no.1
    • /
    • pp.122-132
    • /
    • 2004
  • The rapid development and characterization of the mouse genome sequence, coupled with comparative sequence analysis with the human genome, has been paralleled by a reinforced enthusiasm for mouse functional genomics. The way to uncover the in vivo function of genes is to analyze the phenotypes of mutant animals. From this standpoint, the mouse is a suitable and valuable model organism for studies of functional genomics. Therefore, there have been enormous efforts to expand the list of mutant mice. This trend emphasizes random mutagenesis, including ENU mutagenesis and gene-trap mutagenesis, to obtain a large stock of mutant mice. However, since various mutant alleles are needed to precisely characterize the role of a gene in vivo, mutations should be designed. The simplicity and utility of transgenic technology can satisfy this demand. The combination of RNA interference with transgenic technology will provide more opportunities for researchers. Nevertheless, only gene targeting can define the in vivo function of a gene without doubt. Thus, transgenesis and gene targeting will be the major strategies in the field of functional genomics.

Gene Set and Pathway Analysis of Microarray Data (마이크로어레이 데이터의 유전자 집합 및 대사 경로 분석)

  • Kim Seon-Young
    • KOGO NEWS
    • /
    • v.6 no.1
    • /
    • pp.29-33
    • /
    • 2006
  • Gene set analysis is a new concept and method for analyzing and interpreting microarray gene expression data; it tries to extract biological meaning from gene expression data at the gene set level rather than at the gene level. Compared with methods that select a few tens or hundreds of genes before gene ontology and pathway analysis, gene set analysis identifies important gene ontology terms and pathways more consistently and performs well even in gene expression data sets with minimal or moderate gene expression changes. Moreover, gene set analysis is useful for comparing multiple gene expression data sets dealing with similar biological questions. This review briefly summarizes the rationale behind gene set analysis and introduces several algorithms and tools now available for gene set analysis.
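
Why a set-level statistic can detect modest but coordinated changes can be seen in a small sketch. The z-score below (set mean compared with the background mean, scaled by the square root of the set size) is one simple parametric variant chosen for illustration; the abstract does not say which algorithms the review covers, and all names and values are hypothetical. The sketch assumes the gene set contains at least one measured gene.

```python
from math import sqrt
from statistics import mean, pstdev


def gene_set_z(gene_scores, gene_set):
    """Z-score of a gene set's mean score against all measured genes."""
    members = [gene_scores[g] for g in gene_set if g in gene_scores]
    background = list(gene_scores.values())
    mu, sigma = mean(background), pstdev(background)
    return (mean(members) - mu) * sqrt(len(members)) / sigma


# Hypothetical per-gene scores (for example, log fold changes); none is large.
gene_scores = {"a": 0.5, "b": 0.6, "c": 0.4, "d": -0.5, "e": -0.6, "f": -0.4}

z_coherent = gene_set_z(gene_scores, {"a", "b", "c"})  # all members shift the same way
z_mixed = gene_set_z(gene_scores, {"a", "d"})          # members cancel out
print(round(z_coherent, 2), round(z_mixed, 2))
```

No single gene in this toy data would pass a strict per-gene cutoff, yet the coherent set stands out because its members move in the same direction.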

Integrative Analysis of Microarray Data with Gene Ontology to Select Perturbed Molecular Functions using Gene Ontology Functional Code

  • Kim, Chang-Sik;Choi, Ji-Won;Yoon, Suk-Joon
    • Genomics & Informatics
    • /
    • v.7 no.2
    • /
    • pp.122-130
    • /
    • 2009
  • A systems biology approach for the identification of perturbed molecular functions is required to understand complex progressive diseases such as breast cancer. In this study, we analyze microarray data with Gene Ontology terms of molecular functions to select perturbed molecular functional modules in breast cancer tissues based on the definition of the Gene Ontology Functional Code. The Gene Ontology comprises three structured vocabularies describing genes and their products in terms of their associated biological processes, cellular components and molecular functions. The Gene Ontology is hierarchically classified as a directed acyclic graph. However, it is difficult to visualize the Gene Ontology as a directed tree, since a Gene Ontology term may have more than one parent and therefore multiple paths from the root. Therefore, we applied the definition of Gene Ontology codes by assigning one or more GO code(s) to each GO term to visualize the hierarchical classification of GO terms as a network. The selected molecular functions could be considered as perturbed molecular functional modules that putatively contribute to the progression of disease. We evaluated the method by analyzing a microarray dataset of breast cancer tissues, i.e., normal and invasive breast cancer tissues. Based on the integration approach, we selected several interesting perturbed molecular functions that are implicated in the progression of breast cancers. Moreover, these selected molecular functions include several known breast cancer-related genes. It is concluded from this study that the present strategy is capable of selecting perturbed molecular functions that putatively play roles in the progression of diseases and provides improved interpretability of GO terms based on the definition of Gene Ontology codes.
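
The point that a term with several parents is reached by several paths from the root, and can therefore carry more than one code, can be illustrated with a short sketch. The dotted numbering scheme below is an assumption made for illustration and is not the paper's Gene Ontology Functional Code format; the term names are hypothetical.

```python
def path_codes(children, root):
    """Assign one code per root-to-term path by numbering children at each level.

    A term reachable by several paths receives several codes.
    """
    codes = {}

    def visit(term, code):
        codes.setdefault(term, []).append(code)
        for index, child in enumerate(children.get(term, ()), start=1):
            visit(child, f"{code}.{index}")

    visit(root, "1")
    return codes


# Hypothetical fragment of a molecular-function hierarchy:
# "kinase_binding" has two parents, so it lies on two paths from the root.
children = {
    "root": ["binding", "catalytic"],
    "binding": ["kinase_binding"],
    "catalytic": ["kinase_binding"],
}

codes = path_codes(children, "root")
print(codes["kinase_binding"])
```

Each code identifies exactly one position in an unfolded tree, so a directed acyclic graph can be drawn or filtered level by level without losing track of terms that have multiple parents.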