• Title/Summary/Keyword: Microarray data analysis

Cluster Analysis of Incomplete Microarray Data with Fuzzy Clustering

  • Kim, Dae-Won
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.3 / pp.397-402 / 2007
  • In this paper, we present a method for clustering incomplete microarray data using alternating optimization, in which no prior imputation step is required. To reduce the influence of imputation during preprocessing, we take an alternating optimization approach that finds better estimates during the iterative clustering process. This method improves the estimates of missing values in each iteration by exploiting cluster information, such as the cluster centroids, together with all available non-missing values. The clustering results of the proposed method are significantly more relevant to the biological gene annotations than those of other methods, indicating its effectiveness and potential for clustering incomplete gene expression data.
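
The alternating scheme described above can be illustrated with a minimal sketch. It assumes a fuzzy c-means objective in which missing entries are treated as extra variables and re-estimated from the membership-weighted centroids after each centroid update. The abstract does not give the paper's exact update rules, so the function name `fuzzy_cmeans_incomplete`, the column-mean starting values, the evenly spaced initial centroids, and the fuzzifier `m=2.0` are all assumptions made for illustration.

```python
def fuzzy_cmeans_incomplete(data, n_clusters, m=2.0, n_iter=50):
    """Fuzzy c-means on rows that may contain None for missing values.

    Hypothetical sketch: missing entries are re-estimated in every iteration
    instead of being imputed once before clustering. Assumes m > 1 and that
    every column has at least one observed value.
    """
    n, d = len(data), len(data[0])
    # Starting estimates only: column means over the observed values.
    col_means = []
    for j in range(d):
        observed = [row[j] for row in data if row[j] is not None]
        col_means.append(sum(observed) / len(observed))
    filled = [[row[j] if row[j] is not None else col_means[j] for j in range(d)]
              for row in data]
    # Deterministic initialization (an assumption): evenly spaced rows.
    centroids = [list(filled[(i * n) // n_clusters]) for i in range(n_clusters)]
    u = []
    for _ in range(n_iter):
        # Step 1: memberships from squared distances to the current centroids.
        u = []
        for x in filled:
            sq = [max(sum((a - b) ** 2 for a, b in zip(x, c)), 1e-12)
                  for c in centroids]
            inv = [s ** (-1.0 / (m - 1.0)) for s in sq]
            total = sum(inv)
            u.append([v / total for v in inv])
        # Step 2: centroids as membership-weighted means of the filled data.
        for i in range(n_clusters):
            w = [u[k][i] ** m for k in range(n)]
            centroids[i] = [sum(w[k] * filled[k][j] for k in range(n)) / sum(w)
                            for j in range(d)]
        # Step 3: re-estimate only the missing entries from the centroids.
        for k in range(n):
            w = [u[k][i] ** m for i in range(n_clusters)]
            for j in range(d):
                if data[k][j] is None:
                    filled[k][j] = sum(w[i] * centroids[i][j]
                                       for i in range(n_clusters)) / sum(w)
    return u, centroids, filled


# Two well-separated groups; the second row has one missing value.
data = [[0.0, 0.1], [0.2, None], [0.1, 0.0],
        [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
memberships, centers, completed = fuzzy_cmeans_incomplete(data, n_clusters=2)
```

In this toy example the missing value starts at the column mean and is then pulled toward the centroid of the cluster that the row's observed value supports, which is the behavior the abstract attributes to using cluster information in each iteration.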

Quantitative analysis using decreasing amounts of genomic DNA to assess the performance of the oligo CGH microarray

  • Song Sunny;Lazar Vladimir;Witte Anniek De;Ilsley Diane
    • Proceedings of the Korean Society for Bioinformatics Conference / 2006.02a / pp.71-76 / 2006
  • Comparative genomic hybridization (CGH) is a technique for studying chromosomal changes in cancer. As cancerous cells multiply, they can undergo dramatic chromosomal changes, including chromosome loss, duplication, and the translocation of DNA from one chromosome to another. Chromosome aberrations have previously been detected using optical imaging of whole chromosomes, a technique with limited sensitivity, resolution, quantification, and throughput. Efforts in recent years to use microarrays to overcome these limitations have been hampered by the inadequate sensitivity, specificity and flexibility of the microarray systems. The oligonucleotide CGH microarray system overcomes several scientific hurdles that have impeded comparative genomic studies of cancer. This new system can reliably detect single-copy deletions in chromosomes. The system includes a whole human genome microarray, reagents for sample preparation, an optimized microarray processing protocol, and software for data analysis and visualization. In this study, we determined the sensitivity, accuracy and reproducibility of the new system. Using this assay, we found that the performance of the complete system was maintained over a range of input genomic DNA from 5 µg down to 0.15 µg.

Cross platform classification of microarrays by rank comparison

  • Lee, Sunho
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.475-486 / 2015
  • Mining the microarray data accumulated in public data repositories can save experimental cost and time and provide valuable biomedical information. Big data analysis that pools multiple data sets increases statistical power, improves the reliability of the results, and reduces the bias specific to an individual study. However, integrating data sets from different studies raises many problems that must be dealt with. In this study, I limited the focus to cross-platform classification, in which the platform of a test sample differs from the platform of the training set, and suggested a simple classification method based on ranks. This method is compared with diagonal linear discriminant analysis, the k-nearest-neighbor method and the support vector machine using real cross-platform example data sets for two cancers.
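
A rank-based classifier can be sketched as follows to show why ranks help across platforms: within-sample ranks do not change under a monotone change of measurement scale. The abstract does not specify the paper's rule, so the nearest-rank-centroid rule and the names `ranks`, `train_rank_centroids` and `classify` are hypothetical stand-ins, and the sketch assumes the same genes appear in the same order on both platforms.

```python
def ranks(values):
    """Return 1-based ranks of the values; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def train_rank_centroids(samples, labels):
    """Average the within-sample rank vectors of each class (assumed rule)."""
    sums, counts = {}, {}
    for sample, label in zip(samples, labels):
        r = ranks(sample)
        if label not in sums:
            sums[label] = [0.0] * len(r)
            counts[label] = 0
        sums[label] = [a + b for a, b in zip(sums[label], r)]
        counts[label] += 1
    return {label: [v / counts[label] for v in total]
            for label, total in sums.items()}


def classify(sample, centroids):
    """Assign the class whose rank centroid is closest in squared distance."""
    r = ranks(sample)
    return min(centroids,
               key=lambda label: sum((a - b) ** 2
                                     for a, b in zip(r, centroids[label])))


# Training samples on one scale, test sample on a different (log-like) scale.
train = [[100, 200, 300, 400], [110, 190, 320, 410],
         [400, 300, 200, 100], [390, 310, 210, 90]]
labels = ["tumor_a", "tumor_a", "tumor_b", "tumor_b"]
centroids = train_rank_centroids(train, labels)
predicted = classify([1.2, 2.5, 3.1, 4.0], centroids)
```

Because only the ordering of genes within a sample is used, no cross-platform normalization of intensities is needed in this sketch.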

Identification of Novel Universal Housekeeping Genes by Statistical Analysis of Microarray Data

  • Lee, Se-Ram;Jo, Min-Joung;Lee, Jung-Eun;Koh, Sang-Seok;Kim, So-Youn
    • BMB Reports / v.40 no.2 / pp.226-231 / 2007
  • Housekeeping genes are widely used as internal controls in a variety of study types, including real-time RT-PCR, microarrays, Northern analysis and RNase protection assays. However, even commonly used housekeeping genes may vary in stability depending on the cell type or disease being studied. Thus, it is necessary to identify additional housekeeping-type genes that show sample-independent stability. Here, we used statistical analysis to examine a large human microarray database, seeking genes that were stably expressed in various tissues, disease states and cell lines. We further selected genes that were expressed at different levels, because reference and target genes should be present in similar copy numbers to achieve reliable quantitative results. Real-time RT-PCR amplification of three newly identified reference genes, CGI-119, CTBP1 and GOLGA1, alongside three well-known housekeeping genes, B2M, GAPD, and TUBB, confirmed that the newly identified genes were more stably expressed in individual samples with similar ranges. These results collectively suggest that statistical analysis of microarray data can be used to identify new candidate housekeeping genes showing consistent expression across tissues and diseases. Our analysis identified three novel candidate housekeeping genes (CGI-119, GOLGA1, and CTBP1) that could prove useful for normalization across a variety of RNA-based techniques.

Predicting Survival of DLBCL Patients in Pathway-Based Microarray Analysis (DLBCL 환자의 대사경로 정보를 이용한 생존예측)

  • Lee, Kwang-Hyun;Lee, Sun-Ho
    • The Korean Journal of Applied Statistics / v.23 no.4 / pp.705-713 / 2010
  • Predicting survival from microarray data is difficult because of the high dimensionality of the data and the presence of censored observations. In addition, the limitations of individual gene analysis have shifted the focus to gene sets of functionally related genes. To develop a survival prediction model based on pathway information, we discuss methods for selecting a supergene for each pathway using principal component analysis and for testing its significance. The performance of gene filtering is also compared.
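
The supergene idea can be sketched as the first principal component score of the genes in one pathway. This is a minimal illustration under assumptions: the function name `supergene` is hypothetical, power iteration on the sample covariance matrix stands in for a full PCA routine, and the significance test of the supergene against survival (which the abstract mentions but does not detail) is not shown.

```python
import math


def supergene(expr, n_iter=200):
    """First principal component of one pathway's expression matrix.

    expr: list of samples, each a list of expression values for the genes
    in the pathway. Returns (loadings, scores), one score per sample.
    Power iteration is used here for brevity; it can fail if the starting
    vector is orthogonal to the leading eigenvector.
    """
    n, p = len(expr), len(expr[0])
    means = [sum(row[j] for row in expr) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in expr]
    # Sample covariance matrix of the pathway's genes.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # The supergene value of a sample is its projection on the loadings.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, scores


# Four samples, three genes; the first two genes vary together.
pathway_expr = [[1.0, 2.0, 0.5], [2.0, 4.0, 0.4],
                [3.0, 6.0, 0.6], [4.0, 8.0, 0.5]]
loadings, supergene_scores = supergene(pathway_expr)
```

The resulting per-sample scores could then be used as a single covariate per pathway in a survival model, which is one plausible reading of the approach summarized above.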

Microarray Data Analysis of Perturbed Pathways in Breast Cancer Tissues

  • Kim, Chang-Sik;Choi, Ji-Won;Yoon, Suk-Joon
    • Genomics & Informatics / v.6 no.4 / pp.210-222 / 2008
  • Due to the polygenic nature of cancer, it is believed that breast cancer is caused by the perturbation of multiple genes and their complex interactions, which contribute to the wide range of disease phenotypes. A systems biology approach for the identification of subnetworks of interconnected genes as functional modules is required to understand the complex nature of diseases such as breast cancer. In this study, we apply a 3-step strategy for the interpretation of microarray data, focusing on identifying significantly perturbed metabolic pathways rather than analyzing a large number of overexpressed and underexpressed individual genes. The selected pathways are considered to be dysregulated functional modules that putatively contribute to the progression of disease. The subnetwork of protein-protein interactions for these dysregulated pathways is constructed for further detailed analysis. We evaluated the method by analyzing microarray datasets of breast cancer tissues, i.e., normal and invasive breast cancer tissues. Using this strategy, we selected several significantly perturbed pathways that are implicated in the regulation of breast cancer progression, including the extracellular matrix-receptor interaction pathway and the focal adhesion pathway. Moreover, these selected pathways include several known breast cancer-related genes. We conclude that the present strategy is capable of selecting interesting perturbed pathways that putatively play a role in the progression of breast cancer and that it improves the interpretability of protein-protein interaction networks.

Analysis of gene expression during odontogenic differentiation of cultured human dental pulp cells

  • Seo, Min-Seock;Hwang, Kyung-Gyun;Kim, Hyong-Bum;Baek, Seung-Ho
    • Restorative Dentistry and Endodontics / v.37 no.3 / pp.142-148 / 2012
  • Objectives: We analyzed gene-expression profiles after 14 days of odontogenic induction of human dental pulp cells (DPCs) using a DNA microarray and sought candidate genes possibly associated with mineralization. Materials and Methods: Induced human dental pulp cells were obtained by culturing DPCs in odontogenic induction medium (OM) for 14 days. Cells exposed to normal culture medium were used as controls. Total RNA was extracted from the cells and analyzed by microarray, and key results were selectively confirmed by reverse-transcriptase polymerase chain reaction (RT-PCR). We also performed a gene set enrichment analysis (GSEA) of the microarray data. Results: Six hundred and five genes among the 47,320 probes on the BeadChip differed by more than two-fold in the induced cells. Of these, 217 genes were upregulated and 388 were downregulated. GSEA revealed that in the induced cells, genes implicated in Apoptosis and Signaling by wingless MMTV integration (Wnt) were significantly upregulated. Conclusions: Genes implicated in Apoptosis and Signaling by Wnt are highly connected to the differentiation of dental pulp cells into odontoblasts.

Gene Discovery Analysis from Mouse Embryonic Stem Cells Based on Time Course Microarray Data

  • Suh, Young Ju;Cho, Sun A;Shim, Jung Hee;Yook, Yeon Joo;Yoo, Kyung Hyun;Kim, Jung Hee;Park, Eun Young;Noh, Ji Yeun;Lee, Seong Ho;Yang, Moon Hee;Jeong, Hyo Seok;Park, Jong Hoon
    • Molecules and Cells / v.26 no.4 / pp.338-343 / 2008
  • Embryonic stem cells are a powerful tool for investigating early development in vitro. The study of embryonic stem cell-mediated neuronal differentiation allows for improved understanding of the mechanisms involved in embryonic neuronal development. We investigated expression profile changes using a time course cDNA microarray to identify clues to the signaling network of neuronal differentiation. For short time course microarray data, pattern analysis based on the quadratic regression method is an effective approach for identifying and classifying a variety of expressed genes that have biological relevance. We studied the mRNA expression patterns of embryonic stem cells at each of 5 stages after neuronal induction, using the quadratic regression method for pattern analysis. As a result, a total of 316 genes (3.1%), including 166 (1.7%) informative genes in 8 possible expression patterns, were identified by pattern analysis. Among the selected genes associated with the neurological system, all three genes showing a linearly increasing pattern over time, and one gene showing a decreasing pattern over time, were verified by RT-PCR. Therefore, a linear increase in gene expression over time may be associated with embryonic development. The genes Tcfap2c, Ttr, Wnt3a, Btg2 and Foxk1, which were detected by pattern analysis and also verified by RT-PCR, may be candidate markers associated with the development of the nervous system. Our study shows that pattern analysis using the quadratic regression method is very useful for investigating time course cDNA microarray data. The pattern analysis used in this study has biological significance for the study of embryonic stem cells.
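
The quadratic regression step can be sketched as a least-squares fit of expression against time, followed by a label based on the fitted coefficients. This is an illustration under assumptions: the names `fit_quadratic` and `describe_pattern` are hypothetical, and a simple tolerance on the coefficients stands in for the significance tests that a real pattern analysis would apply. The 8 patterns mentioned in the abstract are not reproduced here.

```python
def fit_quadratic(times, values):
    """Least-squares fit of values ~ b0 + b1*t + b2*t**2; returns (b0, b1, b2).

    Solves the 3x3 normal equations by Gauss-Jordan elimination and needs at
    least three distinct time points.
    """
    power_sums = [sum(t ** k for t in times) for k in range(5)]
    rhs = [sum(y * t ** k for t, y in zip(times, values)) for k in range(3)]
    # Augmented matrix [X^T X | X^T y] for the design columns 1, t, t^2.
    a = [[power_sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))


def describe_pattern(b1, b2, tol=1e-6):
    """Coarse label from the fitted coefficients (tolerance is an assumption)."""
    if abs(b2) > tol:
        return "quadratic, convex" if b2 > 0 else "quadratic, concave"
    if abs(b1) > tol:
        return "linear increasing" if b1 > 0 else "linear decreasing"
    return "flat"


# Five stages, as in the study design; the expression values are made up.
stages = [0, 1, 2, 3, 4]
b0, b1, b2 = fit_quadratic(stages, [1.0, 3.0, 5.0, 7.0, 9.0])
label = describe_pattern(b1, b2)
```

With only 5 time points per gene, a low-order polynomial of this kind keeps the number of fitted parameters small, which is why such fits suit short time courses.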

Finding Interesting Genes Using Reliability in Various Gene Expression Models

  • Lee, Eun-Kyung;Cook, Dianne;Hoffman, Heike
    • Genomics & Informatics / v.9 no.1 / pp.28-36 / 2011
  • Most statistical methods for finding interesting genes focus on summary values with large fold-changes or large variations. Very few methods consider the probe-level data. We developed a new reliability measure that incorporates the probe-level data. This measure is useful for exploring microarray data without ignoring the probe-level data. It is easy to calculate and can be used alongside other statistical methods as a guideline for finding truly differentially expressed genes. Instead of filtering out genes before the analysis, we use all genes in the analysis and make decisions with the new reliability measure.

Developing a Parametric Method for Testing the Significance of Gene Sets in Microarray Data Analysis (마이크로어레이 자료분석에서 모수적 방법을 이용한 유전자군의 유의성 검정)

  • Lee, Sun-Ho;Lee, Seung-Kyu;Lee, Kwang-Hyun
    • Communications for Statistical Applications and Methods / v.16 no.3 / pp.397-408 / 2009
  • The development of microarray technology makes it possible to analyse many thousands of genes simultaneously. While it is important to test whether each gene shows changes in expression associated with a phenotype, human diseases are thought to occur through the interactions of multiple genes within the same functional category. Recent research aims to directly test the behavior of sets of functionally related genes instead of focusing on single genes. Gene set enrichment analysis (GSEA), significance analysis of microarray to gene-set analysis (SAM-GS) and parametric analysis of gene set enrichment (PAGE) have been widely applied as tools for gene-set analysis. We describe their problems and propose an alternative parametric method that adopts a normal score transformation of gene expression values. The performance of the newly derived method is compared with that of previous methods on three real microarray datasets.
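
A normal score transformation followed by a parametric gene-set test can be sketched as below. This is an illustration, not the paper's procedure: the abstract does not say how the transformed values enter the test, so the sketch pairs a rank-based normal score (quantile `rank / (n + 1)`) with a PAGE-style z statistic. The names `normal_scores` and `gene_set_z` are hypothetical, and ties are broken by input order for simplicity.

```python
import math
from statistics import NormalDist, mean, pstdev


def normal_scores(values):
    """Replace each value by the standard normal quantile of rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    standard_normal = NormalDist()
    for rank, index in enumerate(order, start=1):
        scores[index] = standard_normal.inv_cdf(rank / (n + 1))
    return scores


def gene_set_z(scores, gene_set):
    """PAGE-style z statistic: shift of the set mean relative to all genes."""
    mu, sigma = mean(scores), pstdev(scores)
    set_mean = mean(scores[i] for i in gene_set)
    return (set_mean - mu) * math.sqrt(len(gene_set)) / sigma


def two_sided_p(z):
    """Two-sided p-value of z under the standard normal distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Made-up per-gene values; genes 1, 3 and 6 form the gene set being tested.
gene_values = [0.1, 5.0, 0.3, 4.0, 0.2, 0.4, 6.0, 0.5, 0.6, 0.7]
scores = normal_scores(gene_values)
z = gene_set_z(scores, [1, 3, 6])
p = two_sided_p(z)
```

The transformation makes the scores approximately standard normal regardless of the original distribution, which is what allows a parametric z test to be applied to the gene set.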