• Title/Summary/Keyword: gene set analysis

Gene Set and Pathway Analysis of Microarray Data

  • Kim Seon-Yeong
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2006.02a
    • /
    • pp.20-28
    • /
    • 2006
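  • Recent advances in microarray technology have led to the accumulation of ever larger amounts of mRNA expression data. Extracting important biological meaning from the data has now become more important than generating the data. Since microarray technology was first introduced, many algorithms and software tools have been developed to help experimenters extract biological meaning from microarray data. However, earlier data mining methods start, almost without exception, from a list of tens or hundreds of genes selected from the full data set. Because this approach (over-representation analysis, abbreviated ORA) has several limitations, methods that identify meaningful gene sets from the full data set have recently been introduced. This seminar compares and reviews the algorithms and software used in these methods, collectively called gene set analysis.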
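
The over-representation analysis (ORA) described in this abstract is commonly scored with a one-sided hypergeometric test: given a pre-selected gene list, how surprising is its overlap with a gene set? The abstract does not say which statistic the seminar covers, so the following is only a minimal sketch under that assumption. The function name and all numbers are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_universe: genes measured on the array
    n_in_set:   genes in the universe that belong to the gene set
    n_selected: genes in the pre-selected (e.g. differentially expressed) list
    n_overlap:  selected genes that belong to the gene set
    """
    max_overlap = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 10,000 genes, a 100-gene set, 200 selected genes,
# 8 of which fall in the set (about 2 would be expected by chance).
p = ora_p_value(n_universe=10000, n_in_set=100, n_selected=200, n_overlap=8)
print(f"ORA p-value: {p:.4g}")
```

The result depends on where the selection cutoff is drawn, which is one of the limitations that whole-list gene set methods try to avoid.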

Hierarchical Clustering of Gene Expression Data Based on Self Organizing Map (자기 조직화 지도에 기반한 유전자 발현 데이터의 계층적 군집화)

  • Park, Chang-Beom;Lee, Dong-Hwan;Lee, Seong-Whan
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2003.10a
    • /
    • pp.170-177
    • /
    • 2003
  • Gene expression data are quantitative measurements of the expression levels and ratios of numerous genes under different conditions, derived from microarray image analysis. The process of drawing meaningful information about genomic diseases and various biological activities from such data is known as gene expression data analysis. In this paper, we present a hierarchical clustering method for gene expression data based on the self-organizing map, which allows clustering results to be analyzed more efficiently. With the proposed method, we could eliminate the uncertainty of cluster boundaries, an inherent disadvantage of the self-organizing map, while using the visualization capability of hierarchical clustering. We could also process massive data sets by exploiting the speed of the self-organizing map and interpret its clustering results in a more efficient and user-friendly way. To verify the efficiency of the proposed algorithm, we tested it on three data sets: an animal feature data set, a yeast gene expression data set, and a leukemia gene expression data set. The results demonstrated the feasibility and utility of the proposed clustering algorithm.

An improvement on fuzzy seismic fragility analysis using gene expression programming

  • Ebrahimi, Elaheh;Abdollahzadeh, Gholamreza;Jahani, Ehsan
    • Structural Engineering and Mechanics
    • /
    • v.83 no.5
    • /
    • pp.577-591
    • /
    • 2022
  • This paper develops a comparatively time-efficient methodology for performing seismic fragility analysis of reinforced concrete (RC) buildings in the presence of uncertainty sources. It aims to appraise the effect of variation in the material's mechanical properties as epistemic uncertainty, and of record-to-record variation as aleatory uncertainty, on the structural response. To this end, fuzzy set theory, the well-known α-cut approach, and the Genetic Algorithm (GA) are used to assess the median of collapse fragility curves as a fuzzy response. GA is required to search for the maxima and minima of the objective function (here, the median fragility) at each membership degree, α. As this is a complicated and time-consuming process, the authors propose a Gene Expression Programming-based (GEP-based) equation to significantly reduce the computational analysis time for the case study building. The results indicate that the proposed structural analysis algorithm on the derived GEP model is able to compute the fuzzy median fragility about 33.3% faster, with errors of less than 1%.

A Method of Identifying Disease-related Significant Pathways Using Time-Series Microarray Data (시간열 마이크로어레이 데이터를 이용한 질병 관련 유의한 패스웨이 유전자 집합의 검출)

  • Kim, Jae-Young;Shin, Mi-Young
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.47 no.5
    • /
    • pp.17-24
    • /
    • 2010
  • Recently, the identification of biomarkers for disease diagnosis and prognosis has been actively studied. In particular, much attention has been paid to finding pathway gene sets that are differentially expressed in disease patients, rather than individual gene markers. In this paper, we propose a novel method to identify disease-related pathway gene sets based on time-series microarray data. For this purpose, we first compute individual gene scores using maSigPro (microarray Significant Profiles) and then arrange all genes in decreasing order of their scores. The rank of each gene in the entire list is used to evaluate the statistical significance of candidate gene sets with the Wilcoxon rank-sum test. Candidate gene sets were generated from MSigDB (Molecular Signatures Database) pathway information. The experiment was conducted on prostate cancer time-series microarray data, and the results showed the usefulness of the proposed method by correctly identifying 6 out of 7 biological pathways already known to be related to prostate cancer.
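
The rank-based step in this abstract can be illustrated with a small sketch: genes are ordered by score, and the rank sum of a candidate gene set is compared with its expectation under random placement. This is not the authors' implementation. It assumes a normal approximation without tie or continuity correction, a one-sided alternative (set members near the top of the list), and toy gene names; the maSigPro scoring and MSigDB lookup are not reproduced.

```python
from math import erfc, sqrt


def rank_sum_p_value(ranked_genes, gene_set):
    """One-sided Wilcoxon rank-sum p-value (normal approximation).

    ranked_genes: unique gene names, best score first (rank 1 = top).
    gene_set:     candidate gene set to test.
    A small p-value means the set's genes sit nearer the top than expected.
    """
    members = set(gene_set)
    ranks_in_set = [i for i, g in enumerate(ranked_genes, start=1) if g in members]
    n1 = len(ranks_in_set)
    n2 = len(ranked_genes) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    w = sum(ranks_in_set)                         # observed rank sum
    mean_w = n1 * (n1 + n2 + 1) / 2               # expectation under the null
    sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)     # null standard deviation
    z = (w - mean_w) / sd_w
    # Lower tail P(Z <= z): a small rank sum means the set is near the top.
    return 0.5 * erfc(-z / sqrt(2))


# Hypothetical ranked list of 20 genes and two toy gene sets.
ranked = [f"g{i}" for i in range(1, 21)]
p_top = rank_sum_p_value(ranked, {"g1", "g2", "g3", "g4"})
p_bottom = rank_sum_p_value(ranked, {"g17", "g18", "g19", "g20"})
print(f"top set p = {p_top:.4g}, bottom set p = {p_bottom:.4g}")
```

For small gene sets an exact test is usually preferable to the normal approximation used here.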

Analysis of gene expression during odontogenic differentiation of cultured human dental pulp cells

  • Seo, Min-Seock;Hwang, Kyung-Gyun;Kim, Hyong-Bum;Baek, Seung-Ho
    • Restorative Dentistry and Endodontics
    • /
    • v.37 no.3
    • /
    • pp.142-148
    • /
    • 2012
  • Objectives: We analyzed gene expression profiles after 14 days of odontogenic induction of human dental pulp cells (DPCs) using a DNA microarray and sought candidate genes possibly associated with mineralization. Materials and Methods: Induced human dental pulp cells were obtained by culturing DPCs in odontogenic induction medium (OM) for 14 days. Cells exposed to normal culture medium were used as controls. Total RNA was extracted from the cells and analyzed by microarray, and key results were selectively confirmed by reverse-transcriptase polymerase chain reaction (RT-PCR). We also performed a gene set enrichment analysis (GSEA) of the microarray data. Results: Of the 47,320 probes on the BeadChip, 605 genes differed by more than two-fold in the induced cells. Of these, 217 genes were upregulated and 388 were downregulated. GSEA revealed that genes implicated in Apoptosis and Signaling by wingless MMTV integration (Wnt) were significantly upregulated in the induced cells. Conclusions: Genes implicated in Apoptosis and Signaling by Wnt are highly connected to the differentiation of dental pulp cells into odontoblasts.

Correlation Analysis between Regulatory Sequence Motifs and Expression Profiles by Kernel CCA

  • Rhee, Je-Keun;Joung, Je-Gun;Chang, Jeong-Ho;Zhang, Byoung-Tak
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2005.09a
    • /
    • pp.63-68
    • /
    • 2005
  • Transcription factors regulate gene expression by binding to the upstream regions of genes. Each transcription factor has a specific binding site in the promoter region. Analysis of gene upstream sequences is therefore necessary for understanding the regulatory mechanisms of genes, under the plausible assumption that DNA sequence motif profiles are closely related to the expression behavior of the corresponding genes. Here, we present an effective approach to analyzing the relation between gene expression profiles and gene upstream sequences on the basis of kernel canonical correlation analysis (kernel CCA). Kernel CCA is a useful method for finding underlying relationships between two different data sets. In an application to a yeast cell cycle data set, the gene upstream sequence profile is shown to be closely related to gene expression patterns in terms of canonical correlation scores. By further analyzing the contributions, or weights, of sequence motifs in the construction of paired sequence motif profiles and expression profiles, we show that the proposed method can identify significant DNA sequence motifs involved in specific gene expression patterns during the yeast cell cycle, including some well-known motifs and some putative ones.

Genetic Analysis of Wheat for Plant Height by RNA-seq Analysis of Wheat Cultivars 'Keumkang' and 'Komac 5'

  • Moon Seok Kim;Jin Seok Yoon;Yong Weon Seo
    • Proceedings of the Korean Society of Crop Science Conference
    • /
    • 2022.10a
    • /
    • pp.275-275
    • /
    • 2022
  • Wheat, one of the most widely grown food crops in the world, is increasingly prone to lodging because of the heavier rains and stronger winds caused by abnormal climate. During the Green Revolution, shorter wheat cultivars were bred using many Rht genes to increase lodging resistance. However, since only some Rht genes were used for breeding shorter wheat, this may have had a limited impact on wheat breeding and reduced genetic diversity. It is therefore essential to search for genes that have breeding potential and affect dwarfism, in order to increase the genetic diversity of dwarf characteristics in wheat. In this study, we performed RNA-seq on 'Keumkang' and 'Komac 5' (a 'Keumkang' mutant) to analyze the difference in plant height. Differentially expressed gene (DEG) analysis and gene function annotation were performed using 265,365,558 mapped reads. Cluster set analysis was performed to narrow down and select candidate DEGs affecting plant height, stem, and internode. Gene expression analysis was then performed on the genes selected by condensing the DEG results through the cluster set analysis, in order to identify their functions. This analysis of plant height-related genes could help reduce plant height, improve lodging resistance, and increase wheat yield. Its application to wheat breeding may also help increase the genetic diversity of dwarfing traits in wheat.

Comparison of Invariant NKT Cells with Conventional T Cells by Using Gene Set Enrichment Analysis (GSEA)

  • Oh, Sae-Jin;Ahn, Ji-Ye;Chung, Doo-Hyun
    • IMMUNE NETWORK
    • /
    • v.11 no.6
    • /
    • pp.406-411
    • /
    • 2011
  • Background: Invariant natural killer T (iNKT) cells, a distinct subset of CD1d-restricted T cells with invariant $V{\alpha}{\beta}$ TCR, functionally bridge innate and adaptive immunity. While iNKT cells share some functional features with conventional T cells, they simultaneously produce large amounts of Th1 and Th2 cytokines upon T-cell receptor (TCR) ligation. However, the gene expression patterns of the two cell types have not been well characterized. Methods: We performed comparative microarray analyses of gene expression in murine iNKT cells and conventional $CD4^+CD25^-$ ${\gamma}{\delta}TCR^-$ T cells using the Gene Set Enrichment Analysis (GSEA) method. Results: We describe profound differences in gene expression patterns between iNKT cells and conventional $CD4^+CD25^-$ ${\gamma}{\delta}TCR^-$ T cells. Conclusion: Our results provide new insights into the functional competence of iNKT cells and a better understanding of their various roles during immune responses.
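
For readers unfamiliar with GSEA, its core quantity is a running-sum enrichment score computed over the whole ranked gene list. The sketch below shows only the unweighted (Kolmogorov-Smirnov-like) form of that score on toy data. It is an illustration under stated assumptions, not the pipeline used in this study: the weighted variant, permutation-based significance, and multiple-testing correction are omitted, and the function and gene names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    ranked_genes: unique gene names ordered from most up- to most down-regulated.
    gene_set:     genes of interest.
    Walks down the list, stepping up on set members and down on other genes,
    and returns the running-sum value with the largest absolute deviation from 0.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list of ten genes, A (most up-regulated) through J.
ranked = list("ABCDEFGHIJ")
es_top = enrichment_score(ranked, {"A", "B", "C"})      # set clustered at the top
es_bottom = enrichment_score(ranked, {"H", "I", "J"})   # set clustered at the bottom
print(es_top, es_bottom)
```

A positive score indicates that the set's genes concentrate toward the top of the ranking and a negative score toward the bottom; in practice the score is compared against a null distribution obtained by permutation.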

Macroscopic Biclustering of Gene Expression Data (유전자 발현 데이터에 적용한 거시적인 바이클러스터링 기법)

  • Ahn, Jae-Gyoon;Yoon, Young-Mi;Park, Sang-Hyun
    • The KIPS Transactions:PartD
    • /
    • v.16D no.3
    • /
    • pp.327-338
    • /
    • 2009
  • A microarray dataset is a two-dimensional dataset defined by a set of genes and a set of conditions. A bicluster is a subset of genes that show similar behavior within a subset of conditions. Genes that show similar behavior can be considered to have the same cellular functions. Thus, a biclustering algorithm is a useful tool for uncovering groups of genes involved in the same cellular process and the groups of conditions under which this process takes place. We propose a polynomial-time algorithm to identify functionally highly correlated biclusters. Our algorithm 1) identifies gene sets with hidden patterns even when the noise level is high, 2) identifies multiple, possibly overlapping, and diverse gene sets, 3) identifies gene sets with strong functional association, and 4) produces deterministic biclustering results. We validated the level of functional association obtained by our method and compared it with current methods using GO.

Pathway and Network Analysis in Glioma with the Partial Least Squares Method

  • Gu, Wen-Tao;Gu, Shi-Xin;Shou, Jia-Jun
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.15 no.7
    • /
    • pp.3145-3149
    • /
    • 2014
  • Gene expression profiling facilitates the understanding of the biological characteristics of gliomas. Previous studies mainly used regression or variance analysis without considering various background biological and environmental factors. The aim of this study was to investigate gene expression differences between grade III and IV gliomas through partial least squares (PLS) based analysis. The expression data set was obtained from the Gene Expression Omnibus database. PLS-based analysis was performed with the R statistical software. A total of 1,378 differentially expressed genes were identified. Survival analysis identified four pathways, including Prion diseases, colorectal cancer, CAMs, and PI3K-Akt signaling, which may be related to patient prognosis. Network analysis identified two hub genes, ELAVL1 and FN1, which have previously been reported to be related to glioma. Our results provide new understanding of glioma pathogenesis and prognosis and may offer theoretical support for future therapeutic studies.