• Title/Summary/Keyword: Microarray Data Analysis

Nonstandard Machine Learning Algorithms for Microarray Data Mining

  • Zhang, Byoung-Tak
    • Proceedings of the Korean Society for Bioinformatics Conference / 2001.10a / pp.165-196 / 2001
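  • DNA chips, or microarrays, are a technology in which many genes or gene fragments (typically thousands to tens of thousands) are immobilized on a chip and the expression patterns of those genes are analyzed through DNA hybridization. This high-throughput technology can answer questions in molecular biology that were previously out of reach, and its range of potential applications is very wide, including molecular-level disease diagnosis, drug development, and solutions to environmental pollution problems. For practical use, beyond the hardware/wetware techniques needed to fabricate DNA chips, bioinformatics techniques for extracting as much useful and new knowledge as possible from the resulting data are key. The problem of data mining gene expression patterns can be broadly divided into clustering, classification, and dependency analysis, and these techniques are grounded in statistics and in machine learning from artificial intelligence. Commonly used methods include principal component analysis, hierarchical clustering, k-means, self-organizing maps, decision trees, multilayer perceptron neural networks, and association rules. In this seminar, in addition to these basic machine learning techniques, we introduce the probabilistic graphical model (PGM) as a newer learning technique under recent study and review research that applies it to DNA chip data analysis. A PGM is a machine learning model formed by combining artificial neural networks, graph theory, and probability theory. It is grounded in the memory and learning mechanisms of the human brain, and one of its major differences from other machine learning models is that it is a generative model. That is, once a model has been built, it can generate new data, so the model can be validated and new facts can be inferred from it. This makes PGMs well suited to exploratory analysis aimed at discovering new knowledge, as in biological data mining problems. In addition, unlike conventional neural network models, probabilistic graphical models perform probability-based soft inference rather than deterministic decision making, and they are well suited to analyzing causal relationships or dependencies among related factors from the learned model. As concrete examples of PGMs, we introduce the structure and the learning and inference algorithms of the Bayesian network, nonnegative matrix factorization (NMF), and generative topographic mapping (GTM), and review the results of applying them to the cancer diagnosis and gene-drug dependency analysis problems used in CAMDA-2000 and CAMDA-2001, the DNA chip data analysis evaluation contests.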

Suppression of metastasis-related ERBB2 and PLAU expressions in human breast cancer MCF7 cells by fermented soybean extract

  • Park, Jameon;Kim, Han Bok
    • Korean Journal of Microbiology / v.54 no.4 / pp.320-324 / 2018
  • Chunkookjang, a fermented soybean product, is rich in diverse oligopeptides derived from the cleavage of soybean proteins during fermentation. Microarray data on genes differentially expressed in breast cancer cells treated with fermented soybean extract were combined with well-known breast cancer metastasis markers to construct a new network, which was used to examine interactions between the marker proteins and the differentially expressed genes. Based on the network analysis, PLAU (plasminogen activator, urokinase; uPA) and ERBB2 (epidermal growth factor receptor 2) were chosen as possible metastasis genes. We treated breast cancer MCF7 cells with fermented soybean extract and measured the expression levels of PLAU and ERBB2. Fermented soybean extract markedly suppressed PLAU and ERBB2 expression. In cancer cells treated with fermented soybean extract, production of NO, an inflammation marker, was also reduced. It will be interesting to identify the specific peptides that suppress PLAU and ERBB2 expression in human breast cancer cells.

Bioinformatics for the Korean Functional Genomics Project

  • Kim, Sang-Soo
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.45-52 / 2000
  • Genomic approaches produce massive amounts of data within a short time period. New high-throughput automatic sequencers can generate over a million nucleotides of sequence information overnight. A typical DNA chip experiment produces tens of thousands of expression measurements, not to mention tens of megabytes of image files. These data must be handled automatically by computer and stored in electronic databases. Thus there is a need for a systematic approach to data collection, processing, and analysis. DNA sequence information is translated into amino acid sequence and analyzed for key motifs related to its biological and/or biochemical function. Functional genomics will play a significant role in identifying novel drug targets and diagnostic markers for serious diseases. As an enabling technology for functional genomics, bioinformatics is in great need worldwide. In Korea, a new functional genomics project has recently been launched, and it focuses on identifying genes associated with cancers prevalent in Korea, namely gastric and hepatic cancers. This involves gene discovery by high-throughput sequencing of cancer cDNA libraries, gene expression profiling by DNA microarray and proteomics, and SNP profiling in the Korean patient population. Our bioinformatics team will support all these activities by collecting, processing, and analyzing these data.

Studies on Gene Expression of Imperatorin treated in HL-60 cell line using High-throughput Gene Expression Analysis Techniques

  • Kang Bong-Joo;Cha Min-Ho;Jeon Byung Hun;Yun Yong Gab;Yoon Yoo Sik
    • Journal of Physiology & Pathology in Korean Medicine / v.18 no.4 / pp.1028-1035 / 2004
  • Imperatorin, a biologically active furanocoumarin from the roots of Angelica dahurica (Umbelliferae), was mutagenic and induced transformation of mouse fibroblast cell lines, whereas it inhibited mutagenesis and carcinogenesis induced by various carcinogens. Furthermore, it has been suggested that imperatorin may have potential anticarcinogenic effects when administered orally in the diet. In addition to its anticarcinogenic properties, imperatorin has been shown to possess anticancer activities. We performed a large-scale gene expression analysis of HL-60 cells treated with imperatorin. Cells were treated with imperatorin (10 μM) for 6, 12, 24, 48, and 72 h, and the expression of 10,000 genes was evaluated at each time point using a human cDNA chip. Hierarchical clustering was applied to the genes whose expression changed by more than 2-fold. Three hundred eighty-six genes were grouped into 6 clusters by a hierarchical clustering algorithm. Pathway analysis was performed using Gene MicroArray Pathway Profiler, a computer application designed to visualize gene expression data on screen in terms of biological pathways and groupings of genes.

Expression profiling of cultured podocytes exposed to nephrotic plasma reveals intrinsic molecular signatures of nephrotic syndrome

  • Panigrahi, Stuti;Pardeshi, Varsha Chhotusing;Chandrasekaran, Karthikeyan;Neelakandan, Karthik;PS, Hari;Vasudevan, Anil
    • Clinical and Experimental Pediatrics / v.64 no.7 / pp.355-363 / 2021
  • Background: Nephrotic syndrome (NS) is a common renal disorder in children attributed to podocyte injury. However, children with the same diagnosis have markedly variable treatment responses, clinical courses, and outcomes, suggesting molecular heterogeneity. Purpose: This study aimed to explore the molecular responses of podocytes to nephrotic plasma to identify specific genes and signaling pathways differentiating various clinical NS groups as well as biological processes that drive injury in normal podocytes. Methods: Transcriptome profiles from an immortalized human podocyte cell line exposed to the plasma of 8 subjects (steroid-sensitive nephrotic syndrome [SSNS], n=4; steroid-resistant nephrotic syndrome [SRNS], n=2; and healthy adult individuals [control], n=2) were generated using microarray analysis. Results: Unsupervised hierarchical clustering of global gene expression data was broadly correlated with the clinical classification of NS. Differential gene expression (DGE) analysis of diseased groups (SSNS or SRNS) versus healthy controls identified 105 genes (58 up-regulated, 47 down-regulated) in SSNS and 139 genes (78 up-regulated, 61 down-regulated) in SRNS, with 55 common to SSNS and SRNS, while the rest were unique (50 genes in SSNS, 84 genes in SRNS). Pathway analysis of the significant (P≤0.05, log2 FC ≤ -1 or ≥ 1) differentially expressed genes identified the transforming growth factor-β and Janus kinase-signal transducer and activator of transcription pathways to be involved in both SSNS and SRNS. DGE analysis of SSNS versus SRNS identified 2,350 genes with values of P≤0.05, and a heatmap of corresponding expression values of these genes in each subject showed clear differences between SSNS and SRNS. Conclusion: Our study observations indicate that, although podocyte injury follows similar pathways in different clinical subgroups, the pathways are modulated differently, as evidenced by the heatmap. Such transcriptome profiling with a larger cohort can stratify patients into intrinsic subtypes and provide insight into the molecular mechanisms of podocyte injury.
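
The significance filter and the shared/unique gene counts described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the gene labels, and the toy P values and log2 fold changes are all hypothetical, and only the thresholds (P≤0.05, absolute log2 FC of at least 1) come from the abstract.

```python
def significant_genes(results, p_max=0.05, min_abs_log2fc=1.0):
    """Return the genes passing both the P-value and fold-change cut-offs.

    `results` maps a gene name to a (p_value, log2_fold_change) pair.
    """
    return {
        gene
        for gene, (p_value, log2fc) in results.items()
        if p_value <= p_max and abs(log2fc) >= min_abs_log2fc
    }


def split_overlap(hits_a, hits_b):
    """Split two hit sets into (common, unique to a, unique to b)."""
    return hits_a & hits_b, hits_a - hits_b, hits_b - hits_a


# Hypothetical toy results for two disease-versus-control comparisons.
toy_ssns = {
    "geneA": (0.01, 1.8),
    "geneB": (0.20, 2.5),   # fails the P-value cut-off
    "geneC": (0.03, -1.2),
    "geneD": (0.04, 0.4),   # fails the fold-change cut-off
}
toy_srns = {
    "geneA": (0.02, 1.1),
    "geneC": (0.30, -1.5),  # fails the P-value cut-off
    "geneE": (0.001, -2.0),
}

ssns_hits = significant_genes(toy_ssns)
srns_hits = significant_genes(toy_srns)
common, only_ssns, only_srns = split_overlap(ssns_hits, srns_hits)
```

The same bookkeeping applies to the reported counts: 55 common plus 50 unique genes gives the 105 SSNS genes, and 55 common plus 84 unique genes gives the 139 SRNS genes.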

GSnet: An Integrated Tool for Gene Set Analysis and Visualization

  • Choi, Yoon-Jeong;Woo, Hyun-Goo;Yu, Ung-Sik
    • Genomics & Informatics / v.5 no.3 / pp.133-136 / 2007
  • The Gene Set network viewer (GSnet) visualizes the functional enrichment of a given gene set with a protein interaction network and is implemented as a plug-in for the Cytoscape platform. The functional enrichment of a given gene set is calculated using a hypergeometric test based on the Gene Ontology annotation. The protein interaction network is estimated using public data. Set operations allow a complex protein interaction network to be decomposed into a functionally-enriched module of interest. GSnet provides a new framework for gene set analysis by integrating a priori knowledge of a biological network with functional enrichment analysis.
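
A hypergeometric enrichment test of the kind mentioned in this abstract can be sketched in a few lines of Python. This is a generic illustration, not GSnet's code: the function name, its parameters, and the example counts are assumptions made here.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes carrying the Gene Ontology term
    sample:     number of genes in the gene set of interest
    overlap:    genes in the set that carry the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible terms drop out.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 200 annotated with the term,
# a 100-gene set of which 8 carry the term.
p_value = hypergeom_enrichment_p(20000, 200, 100, 8)
```

A small p-value suggests that the term is over-represented in the gene set relative to the background. In practice a correction for testing many GO terms would also be applied, which this sketch leaves out.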

Gene Expression Profiling in Diethylnitrosamine Treated Mouse Liver: From Pathological Data to Microarray Analysis

  • Kim, Ji-Young;Yoon, Seok-Joo;Park, Han-Jin;Kim, Yong-Bum;Cho, Jae-Woo;Koh, Woo-Suk;Lee, Michael
    • Toxicological Research / v.23 no.1 / pp.55-63 / 2007
  • Diethylnitrosamine (DEN) is a nitrosamine compound that can induce a variety of liver lesions, including hepatic carcinoma, by forming DNA-carcinogen adducts. In the present study, microarray analyses were performed with the Affymetrix Murine Genome 430A Array in order to identify the gene-expression profiles for DEN and to provide valuable information for the evaluation of potential hepatotoxicity. C57BL/6NCrj mice were orally administered a single dose of DEN at 0, 3, 7, or 20 mg/kg. The liver of each animal was removed 2, 4, 8, or 24 h after administration. Histopathological analysis and serum biochemical analysis showed no significant difference between the DEN-treated groups and the control group. In contrast, the principal component analysis (PCA) profiles demonstrated that the normal gene expression profile of the control groups differed clearly from the expression profiles of the DEN-treated groups. Within groups, little variance was found between individuals. Student's t-test was performed on the results obtained from triplicate hybridizations to identify genes with statistically significant changes in expression. Statistical analysis revealed that 11 genes were significantly downregulated and 28 genes were upregulated in all three animals 2 h after treatment at 20 mg/kg. The upregulated group included Gdf15, JunD1, and Mdm2, while genes including Sox6, Shmt2, and Slc6a6 were largely downregulated. Hierarchical clustering of gene expression also allowed the identification of functionally related clusters that encode proteins related to metabolism and the MAPK signaling pathway. Taken together, this study suggests that, if a database containing response patterns to various toxic compounds is established, a match with a toxicant signature can assign a putative mechanism of action to a test compound.
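
For readers unfamiliar with the test named in this abstract, the pooled-variance Student's t statistic for one gene measured in triplicate can be sketched as follows. The function name and the expression values are hypothetical, and the sketch stops at the statistic and its degrees of freedom rather than computing a p-value.

```python
from math import sqrt
from statistics import mean, variance


def student_t(control, treated):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal variances in the
    two groups, as the classical Student's t-test does.
    """
    n1, n2 = len(control), len(treated)
    pooled_var = (
        (n1 - 1) * variance(control) + (n2 - 1) * variance(treated)
    ) / (n1 + n2 - 2)
    t = (mean(treated) - mean(control)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical log2 expression values for one gene, three replicates per group.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

With only three replicates per group the test has just 4 degrees of freedom, which is one reason small-sample microarray studies often combine it with other filters.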

Permutation-Based Test with Small Samples for Detecting Differentially Expressed Genes

  • Lee, Ju-Hyoung;Song, Hae-Hiang
    • The Korean Journal of Applied Statistics / v.22 no.5 / pp.1059-1072 / 2009
  • In the analysis of microarray data with a small number of arrays, the most important task is the detection of differentially expressed genes by a significance test. For this purpose, one needs to construct a null distribution based on a large number of genes, and one of the best ways of constructing the null distribution for a small number of arrays is by means of permutation methods. In this paper we propose simple test statistics and permutation methods that are appropriate for constructing the null distribution. In a simulation study, we compare the null distributions generated by the proposed test statistics and permutation methods with those of previous methods. Using an example microarray dataset, differentially expressed genes are determined by applying these methods.
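
The general idea of a permutation-based null distribution can be shown with a minimal Python sketch. It is not the test statistic proposed in the paper: it uses a plain absolute difference of group means for a single gene and enumerates every relabeling exactly, which is feasible only because the samples are small. The function name and data are hypothetical.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Every way of assigning len(group_a) of the pooled values to group A
    is enumerated, and the p-value is the fraction of assignments whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    extreme = 0
    n_assignments = 0
    for indices in combinations(range(len(pooled)), n_a):
        stat = abs_mean_diff(sum(pooled[i] for i in indices))
        n_assignments += 1
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            extreme += 1
    return extreme / n_assignments


# Hypothetical expression values for one gene, three arrays per condition.
p_value = permutation_p_value([1, 2, 3], [10, 11, 12])
```

With three arrays per group there are only 20 distinct relabelings, so the smallest attainable two-sided p-value for a single gene is 2/20 = 0.1. This granularity is why the abstract emphasizes building the null distribution from a large number of genes rather than from one gene alone.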

Analysis of Putative Downstream Genes of Arabidopsis AtERF71/HRE2 Transcription Factor using a Microarray

  • Seok, Hye-Yeon;Lee, Sun-Young;Woo, Dong-Hyuk;Park, Hee-Yeon;Moon, Yong-Hwan
    • Journal of Life Science / v.22 no.10 / pp.1359-1370 / 2012
  • Arabidopsis AtERF71/HRE2, a transcription activator, is located in the nucleus and is involved in the signal transduction of low oxygen and osmotic stresses. In this study, microarray analysis using AtERF71/HRE2-overexpressing transgenic plants was performed to identify genes downstream of AtERF71/HRE2. A total of 161 different genes as well as AtERF71/HRE2 showed more than a twofold higher expression in AtERF71/HRE2-overexpressing transgenic plants compared with wild-type plants. Among the 161 genes, 24 genes were transcriptional regulators, such as transcription factors and DNA-binding proteins, based on gene ontology annotations, suggesting that AtERF71/HRE2 is an upstream transcription factor that regulates the activities of various downstream genes via these transcription regulators. RT-PCR analysis of 15 genes selected out of the 161 genes showed higher expression in AtERF71/HRE2-overexpressing transgenic plants, validating the microarray data. On the basis of Genevestigator database analysis, 51 genes among the 161 genes were highly expressed under low oxygen and/or osmotic stresses. RT-PCR analysis showed that the expression levels of three genes among the selected 15 genes increased under low oxygen stress and another three genes increased under high salt stress, suggesting that these genes might be downstream genes of AtERF71/HRE2 in low oxygen or high salt stress signal transduction. Microarray analysis results indicated that AtERF71/HRE2 might also be involved in the responses to other abiotic stresses and also in the regulation of plant developmental processes.

BINGO: Biological Interpretation Through Statistically and Graph-theoretically Navigating Gene Ontology™

  • Lee, Sung-Geun;Yang, Jae-Seong;Chung, Il-Kyung;Kim, Yang-Seok
    • Molecular & Cellular Toxicology / v.1 no.4 / pp.281-283 / 2005
  • Extraction of biologically meaningful data and its validation are very important in toxicogenomics studies because they deal with huge amounts of heterogeneous data. BINGO is an annotation mining tool for biological interpretation of gene groups. Several statistical modeling approaches using Gene Ontology (GO) have been employed in many programs for that purpose. The statistical methodologies are useful in investigating the most significant GO attributes in a gene group, but the coherence of the resultant GO attributes over the entire group is rarely assessed. BINGO complements the statistical methods with graph-theoretic measures using the GO directed acyclic graph (DAG) structure. In addition, BINGO visualizes the consistency of a gene group more intuitively with a group-based GO subgraph. The input group can be any list of genes or gene products of interest, regardless of how it was generated, provided the group is built under a functional congruency hypothesis, such as gene clusters from DNA microarray analysis.