• 제목/요약/키워드: gene expression data

검색결과 1,288건 처리시간 0.033초

전산생물학을 이용한 마이크로어레이의 유전자 발현 데이터 분석 및 유형 분류 기법 (Analysis and Subclass Classification of Microarray Gene Expression Data Using Computational Biology)

  • 유창규;이민영;김영황;이인범
    • 제어로봇시스템학회논문지
    • /
    • 제11권10호
    • /
    • pp.830-836
    • /
    • 2005
  • Application of microarray technologies which monitor simultaneously the expression pattern of thousands of individual genes in different biological systems results in a tremendous increase of the amount of available gene expression data and have provided new insights into gene expression during drug development, within disease processes, and across species. There is a great need of data mining methods allowing straightforward interpretation, visualization and analysis of the relevant information contained in gene expression profiles. Specially, classifying biological samples into known classes or phenotypes is an important practical application for microarray gene expression profiles. Gene expression profiles obtained from tissue samples of patients thus allowcancer classification. In this research, molecular classification of microarray gene expression data is applied for multi-class cancer using computational biology such gene selection, principal component analysis and fuzzy clustering. The proposed method was applied to microarray data from leukemia patients; specifically, it was used to interpret the gene expression pattern and analyze the leukemia subtype whose expression profiles correlated with four cases of acute leukemia gene expression. A basic understanding of the microarray data analysis is also introduced.

자기 조직화 지도에 기반한 유전자 발현 데이터의 계층적 군집화 (Hierarchical Clustering of Gene Expression Data Based on Self Organizing Map)

  • Park, Chang-Beom;Lee, Dong-Hwan;Lee, Seong-Whan
    • 한국생물정보학회:학술대회논문집
    • /
    • 한국생물정보시스템생물학회 2003년도 제2차 연례학술대회 발표논문집
    • /
    • pp.170-177
    • /
    • 2003
  • Gene expression data are the quantitative measurements of expression levels and ratios of numberous genes in different situations based on microarray image analysis results. The process to draw meaningful information related to genomic diseases and various biological activities from gene expression data is known as gene expression data analysis. In this paper, we present a hierarchical clustering method of gene expression data based on self organizing map which can analyze the clustering result of gene expression data more efficiently. Using our proposed method, we could eliminate the uncertainty of cluster boundary which is the inherited disadvantage of self organizing map and use the visualization function of hierarchical clustering. And, we could process massive data using fast processing speed of self organizing map and interpret the clustering result of self organizing map more efficiently and user-friendly. To verify the efficiency of our proposed algorithm, we performed tests with following 3 data sets, animal feature data set, yeast gene expression data and leukemia gene expression data set. The result demonstrated the feasibility and utility of the proposed clustering algorithm.

  • PDF

진화연산에 기반한 유전자 발현 데이터로부터의 유전자 상호작용 네트워크 구성 (Construction of Gene Interaction Networks from Gene Expression Data Based on Evolutionary Computation)

  • 정성훈;조광현
    • 제어로봇시스템학회논문지
    • /
    • 제10권12호
    • /
    • pp.1189-1195
    • /
    • 2004
  • This paper investigates construction of gene (interaction) networks from gene expression time-series data based on evolutionary computation. To illustrate the proposed approach in a comprehensive way, we first assume an artificial gene network and then compare it with the reconstructed network from the gene expression time-series data generated by the artificial network. Next, we employ real gene expression time-series data (Spellman's yeast data) to construct a gene network by applying the proposed approach. From these experiments, we find that the proposed approach can be used as a useful tool for discovering the structure of a gene network as well as the corresponding relations among genes. The constructed gene network can further provide biologists with information to generate/test new hypotheses and ultimately to unravel the gene functions.

프마이크로어레이 데이터의 유전자 집합 및 대사 경로 분석 (Gene Set and Pathway Analysis of Microarray Data)

  • 김선영
    • 유전체소식지
    • /
    • 제6권1호
    • /
    • pp.29-33
    • /
    • 2006
  • Gene set analysis is a new concept and method. to analyze and interpret microarray gene expression data and tries to extract biological meaning from gene expression data at gene set level rather than at gene level. Compared with methods which select a few tens or hundreds of genes before gene ontology and pathway analysis, gene set analysis identifies important gene ontology terms and pathways more consistently and performs well even in gene expression data sets with minimal or moderate gene expression changes. Moreover, gene set analysis is useful for comparing multiple gene expression data sets dealing with similar biological questions. This review briefly summarizes the rationale behind the gene set analysis and introduces several algorithms and tools now available for gene set analysis.

  • PDF

Enhancing Gene Expression Classification of Support Vector Machines with Generative Adversarial Networks

  • Huynh, Phuoc-Hai;Nguyen, Van Hoa;Do, Thanh-Nghi
    • Journal of information and communication convergence engineering
    • /
    • 제17권1호
    • /
    • pp.14-20
    • /
    • 2019
  • Currently, microarray gene expression data take advantage of the sufficient classification of cancers, which addresses the problems relating to cancer causes and treatment regimens. However, the sample size of gene expression data is often restricted, because the price of microarray technology on studies in humans is high. We propose enhancing the gene expression classification of support vector machines with generative adversarial networks (GAN-SVMs). A GAN that generates new data from original training datasets was implemented. The GAN was used in conjunction with nonlinear SVMs that efficiently classify gene expression data. Numerical test results on 20 low-sample-size and very high-dimensional microarray gene expression datasets from the Kent Ridge Biomedical and Array Expression repositories indicate that the model is more accurate than state-of-the-art classifying models.

Finding Informative Genes From Microarray Gene Expression Data Using FIGER-test

  • Choi, Kyoung-Oak;Chung, Hwan-Mook
    • 한국지능시스템학회논문지
    • /
    • 제17권5호
    • /
    • pp.707-711
    • /
    • 2007
  • Microarray gene expression data is believed to show the functions of living organism through the gene expression values. We have studied a method to get the informative genes from the microarray gene expression data. There are several ways for this. In recent researches to get more sophisticated and detailed results, it has used the intelligence information theory like fuzzy theory. Some methods are to add fudge factors to the significance test for more refined results. In this paper, we suggest a method to get informative genes from microarray gene expression data. We combined the difference of means between two groups and the fuzzy membership degree which reflects the variance of the gene expression data. We have called our significance test the Fuzzy Information method for Gene Expression data(FIGER). The FIGER calculates FIGER variation ratio and FIGER membership degree to show how strongly each object belongs to the each group and then it results in the significance degree of each gene. The FIGER is focused on the variation and distribution of the data set to adjust the significance level. Out simulation shows that the FIGER-test is an effective and useful significance test.

Reverting Gene Expression Pattern of Cancer into Normal-Like Using Cycle-Consistent Adversarial Network

  • Lee, Chan-hee;Ahn, TaeJin
    • International Journal of Advanced Culture Technology
    • /
    • 제6권4호
    • /
    • pp.275-283
    • /
    • 2018
  • Cancer show distinct pattern of gene expression when it is compared to normal. This difference results malignant characteristic of cancer. Many cancer drugs are targeting this difference so that it can selectively kill cancer cells. One of the recent demand for personalized treating cancer is retrieving normal tissue from a patient so that the gene expression difference between cancer and normal be assessed. However, in most clinical situation it is hard to retrieve normal tissue from a patient. This is because biopsy of normal tissues may cause damage to the organ function or a risk of infection or side effect what a patient to take. Thus, there is a challenge to estimate normal cell's gene expression where cancers are originated from without taking additional biopsy. In this paper, we propose in-silico based prediction of normal cell's gene expression from gene expression data of a tumor sample. We call this challenge as reverting the cancer into normal. We divided this challenge into two parts. The first part is making a generator that is able to fool a pretrained discriminator. Pretrained discriminator is from the training of public data (9,601 cancers, 7,240 normals) which shows 0.997 of accuracy to discriminate if a given gene expression pattern is cancer or normal. Deceiving this pretrained discriminator means our method is capable of generating very normal-like gene expression data. The second part of the challenge is to address whether generated normal is similar to true reverse form of the input cancer data. We used, cycle-consistent adversarial networks to approach our challenges, since this network is capable of translating one domain to the other while maintaining original domain's feature and at the same time adding the new domain's feature. We evaluated that, if we put cancer data into a cycle-consistent adversarial network, it could retain most of the information from the input (cancer) and at the same time change the data into normal. We also evaluated if this generated gene expression of normal tissue would be the biological reverse form of the gene expression of cancer used as an input.

A modified partial least squares regression for the analysis of gene expression data with survival information

  • Lee, So-Yoon;Huh, Myung-Hoe;Park, Mira
    • Journal of the Korean Data and Information Science Society
    • /
    • 제25권5호
    • /
    • pp.1151-1160
    • /
    • 2014
  • In DNA microarray studies, the number of genes far exceeds the number of samples and the gene expression measures are highly correlated. Partial least squares regression (PLSR) is one of the popular methods for dimensional reduction and known to be useful for the classifications of microarray data by several studies. In this study, we suggest a modified version of the partial least squares regression to analyze gene expression data with survival information. The method is designed as a new gene selection method using PLSR with an iterative procedure of imputing censored survival time. Mean square error of prediction criterion is used to determine the dimension of the model. To visualize the data, plot for variables superimposed with samples are used. The method is applied to two microarray data sets, both containing survival time. The results show that the proposed method works well for interpreting gene expression microarray data.

Consensus Clustering for Time Course Gene Expression Microarray Data

  • Kim, Seo-Young;Bae, Jong-Sung
    • Communications for Statistical Applications and Methods
    • /
    • 제12권2호
    • /
    • pp.335-348
    • /
    • 2005
  • The rapid development of microarray technologies enabled the monitoring of expression levels of thousands of genes simultaneously. Recently, the time course gene expression data are often measured to study dynamic biological systems and gene regulatory networks. For the data, biologists are attempting to group genes based on the temporal pattern of their expression levels. We apply the consensus clustering algorithm to a time course gene expression data in order to infer statistically meaningful information from the measurements. We evaluate each of consensus clustering and existing clustering methods with various validation measures. In this paper, we consider hierarchical clustering and Diana of existing methods, and consensus clustering with hierarchical clustering, Diana and mixed hierachical and Diana methods and evaluate their performances on a real micro array data set and two simulated data sets.

Considerations on gene chip data analysis

  • Lee, Jae-K.
    • 한국생물정보학회:학술대회논문집
    • /
    • 한국생물정보시스템생물학회 2001년도 제2회 생물정보학 국제심포지엄
    • /
    • pp.77-102
    • /
    • 2001
  • Different high-throughput chip technologies are available for genome-wide gene expression studies. Quality control and prescreening analysis are important for rigorous analysis on each type of gene expression data. Statistical significance evaluation of differential expression patterns is needed. Major genome institutes develop database and analysis systems for information sharing of precious expression data.

  • PDF