• Title/Summary/Keyword: Omics Data


Set Covering-based Feature Selection of Large-scale Omics Data

  • 마정우;안기동;김광수;류홍서
    • 한국경영과학회지 / Vol. 39, No. 4 / pp.75-84 / 2014
  • In this paper, we address the feature selection problem for large-scale, high-dimensional biological data such as omics data. Most previous approaches to this problem use a simple score function to reduce the number of original variables and then select features from the small number of remaining variables. Methods that do not rely on filtering either ignore the interactions between variables or generate approximate solutions to a simplified problem. In contrast, by combining set covering and clustering techniques, we developed a new method that handles the full set of variables and considers the combinatorial effects of variables when selecting features. To demonstrate the efficacy and effectiveness of the method, we downloaded gene expression datasets from TCGA (The Cancer Genome Atlas) and compared our method with other algorithms, including the feature selection algorithms embedded in WEKA. The experimental results show that our method selects high-quality features that yield more accurate classifiers than other feature selection algorithms.
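
To make the set covering idea in the abstract above concrete, here is a minimal sketch of a greedy set-cover feature selector. It is not the authors' method (which also uses clustering and is not specified in the abstract). The coverage rule, the `min_gap` threshold, and the function name `greedy_set_cover_features` are illustrative assumptions: a feature is taken to "cover" a pair of samples from different classes when their values on that feature differ by at least `min_gap`.

```python
from itertools import product


def greedy_set_cover_features(X, y, min_gap=1.0):
    """Greedily pick features until every cross-class sample pair is covered.

    X: list of samples, each a list of numeric feature values.
    y: list of 0/1 class labels, one per sample.
    min_gap: hypothetical threshold (an assumption of this sketch); feature j
             covers a pair when the two samples differ by >= min_gap on j.
    Returns (selected feature indices, pairs left uncovered).
    """
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    pairs = set(product(pos, neg))
    n_features = len(X[0])

    # covers[j] is the set of cross-class pairs that feature j separates.
    covers = [
        {(p, n) for (p, n) in pairs if abs(X[p][j] - X[n][j]) >= min_gap}
        for j in range(n_features)
    ]

    uncovered = set(pairs)
    selected = []
    while uncovered:
        # Greedy step: take the feature covering the most uncovered pairs.
        best = max(range(n_features), key=lambda j: len(covers[j] & uncovered))
        gain = covers[best] & uncovered
        if not gain:
            break  # no feature separates the remaining pairs
        selected.append(best)
        uncovered -= gain
    return selected, uncovered


if __name__ == "__main__":
    # Toy data: feature 0 separates sample 0 from the class-0 sample,
    # feature 1 separates sample 1 from it.
    X = [[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]]
    y = [1, 1, 0]
    print(greedy_set_cover_features(X, y))
```

The greedy rule gives only an approximate cover; an exact formulation would solve the set covering problem as an integer program over the same `covers` sets.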

Classification of Colon Cancer Patients Based on the Methylation Patterns of Promoters

  • Choi, Wonyoung;Lee, Jungwoo;Lee, Jin-Young;Lee, Sun-Min;Kim, Da-Won;Kim, Young-Joon
    • Genomics & Informatics / Vol. 14, No. 2 / pp.46-52 / 2016
  • Diverse somatic mutations have been reported to serve as cancer drivers. Recently, it has also been reported that epigenetic regulation is closely related to cancer development. However, the effect of epigenetic changes on cancer is still elusive. In this study, we analyzed DNA methylation data on colon cancer taken from The Cancer Genome Atlas. We found that several promoters were significantly hypermethylated in colon cancer patients. Through clustering analysis of differentially methylated DNA regions, we were able to define subgroups of patients and observed clinical features associated with each subgroup. In addition, we analyzed the functional ontology of aberrantly methylated genes and identified the G-protein-coupled receptor signaling pathway as one of the major pathways affected epigenetically. In conclusion, our analysis shows the possibility of characterizing the clinical features of colon cancer subgroups based on DNA methylation patterns and provides lists of important genes and pathways possibly involved in colon cancer development.

Advances in Systems Biology Approaches for Autoimmune Diseases

  • Kim, Ho-Youn;Kim, Hae-Rim;Lee, Sang-Heon
    • IMMUNE NETWORK / Vol. 14, No. 2 / pp.73-80 / 2014
  • Because autoimmune diseases (AIDs) result from a complex combination of genetic and epigenetic factors, as well as an altered immune response to endogenous or exogenous antigens, systems biology approaches have been widely applied. The use of multi-omics approaches, including blood transcriptomics, genomics, epigenetics, proteomics, and metabolomics, not only allows for the discovery of a number of biomarkers but will also provide new directions for further translational AID applications. Systems biology approaches rely on high-throughput techniques with data analysis platforms that leverage the assessment of genes, proteins, and metabolites, and the network analysis of complex biological pathways implicated in specific AID conditions. To facilitate the discovery of validated and qualified biomarkers, better-coordinated multi-omics approaches and standardized translational research, in combination with the skills of biologists, clinicians, engineers, and bioinformaticians, are required.

BaSDAS: a web-based pooled CRISPR-Cas9 knockout screening data analysis system

  • Park, Young-Kyu;Yoon, Byoung-Ha;Park, Seung-Jin;Kim, Byung Kwon;Kim, Seon-Young
    • Genomics & Informatics / Vol. 18, No. 4 / pp.46.1-46.4 / 2020
  • We developed BaSDAS (Barcode-Seq Data Analysis System), a GUI-based pooled knockout screening data analysis system, to help researchers with limited bioinformatics skills analyze pooled knockout screen data easily and effectively. BaSDAS supports the analysis of various pooled screening libraries, including yeast, human, and mouse libraries, and provides many useful statistical and visualization functions through a user-friendly web interface. We expect that BaSDAS will be a useful tool for the analysis of genome-wide screening data and will support the development of novel drugs based on functional genomics information.

Prospect and Roles of Molecular Ecogenetic Techniques in the Ecophysiological Study of Cyanobacteria

  • 안치용
    • 생태와환경 / Vol. 51, No. 1 / pp.16-28 / 2018
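  • Although long-standing research on cyanobacteria has revealed a great deal, much remains unknown, and advances in molecular biology-based omics technologies have recently spurred research from new perspectives using new tools. First, much effort is being devoted to elucidating the principles behind physiological traits, such as the regulatory mechanisms of toxin synthesis, by using genome sequencing to compare the genomes of diverse cyanobacteria and to study their gene expression patterns. In addition, as studies of how the diversity of cyanobacterial genotypes and the bacterial communities that closely interact with them change in response to seasonal and environmental factors, and of how these changes affect the ecosystem, are combined with bioinformatic analyses, our understanding of the complex workings of ecosystems is also growing. In particular, the combined application of multiple omics techniques is making it feasible to draw a holistic picture of biological responses at every level within an ecosystem, and expectations are rising that this blueprint will yield new insights for effectively controlling algal blooms and maintaining healthy aquatic ecosystems.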

ZNF204P is a stemness-associated oncogenic long non-coding RNA in hepatocellular carcinoma

  • Hwang, Ji-Hyun;Lee, Jungwoo;Choi, Won-Young;Kim, Min-Jung;Lee, Jiyeon;Chu, Khanh Hoang Bao;Kim, Lark Kyun;Kim, Young-Joon
    • BMB Reports / Vol. 55, No. 6 / pp.281-286 / 2022
  • Hepatocellular carcinoma is a major health burden, and although extensive research has made various treatments available, difficulties in early diagnosis and resistance to chemotherapy-based treatments render several of them ineffective. The cancer stem cell (CSC) model has been used to explain the formation of heterogeneous cell populations within the tumor mass, which is one of the underlying causes of the high recurrence rate and acquired chemoresistance, highlighting the importance of identifying CSCs and understanding the molecular mechanisms of CSC drivers. Extracellular CSC markers such as CD133, CD90, and EpCAM have been used successfully in CSC isolation, but studies have indicated that increasingly complex combinations are required for accurate identification. Pseudogene-derived long non-coding RNAs are useful candidates as intracellular CSC markers (factors that regulate pluripotency and self-renewal), given their cancer-specific expression and versatile regulation across several levels. Here, we present the use of microarray data to identify stemness-associated factors in liver cancer and the selection of the sole pseudogene-derived lncRNA, ZNF204P, for experimental validation. ZNF204P knockdown impairs cell proliferation and migration/invasion. As the cytosolic ZNF204P shares miRNA binding sites with OCT4 and SOX2, well-known drivers of pluripotency and self-renewal, we propose that ZNF204P promotes tumorigenesis through the miRNA-145-5p/OCT4, SOX2 axis.

Perspectives of Integrative Cancer Genomics in Next Generation Sequencing Era

  • Kwon, So-Mee;Cho, Hyun-Woo;Choi, Ji-Hye;Jee, Byul-A;Jo, Yun-A;Woo, Hyun-Goo
    • Genomics & Informatics / Vol. 10, No. 2 / pp.69-73 / 2012
  • The explosive development of genomics technologies, including microarrays and next generation sequencing (NGS), has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of genetic aberrations could reveal candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish a huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of big omics data remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on study design and analysis strategies. This review may help readers understand the current trends and strategies of the rapidly evolving field of cancer genomics research.

Estimation of high-dimensional sparse cross correlation matrix

  • Yin, Cao;Kwangok, Seo;Soohyun, Ahn;Johan, Lim
    • Communications for Statistical Applications and Methods / Vol. 29, No. 6 / pp.655-664 / 2022
  • Motivated by an integrative study of multi-omics data, we are interested in estimating the structure of the sparse cross correlation matrix of two high-dimensional random vectors. We rewrite the problem as a multiple testing problem and propose a new method to estimate the sparse structure of the cross correlation matrix. To do so, we test the correlation coefficients simultaneously and threshold them by controlling the FDR at a predetermined level α. Further, we apply the proposed method and an alternative adaptive thresholding procedure by Cai and Liu (2016) to the integrative analysis of the protein expression data (X) and the mRNA expression data (Y) in the TCGA breast cancer cohort. By varying the FDR level α, we show that the new procedure is consistently more efficient in estimating the sparse structure of the cross correlation matrix than the alternative one.
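
The general idea of thresholding a cross correlation matrix through multiple testing can be sketched as follows. This is not the procedure proposed in the paper, whose test statistic is not given in the abstract. As stand-ins, the sketch assumes a Fisher z normal approximation for each correlation test and the standard Benjamini-Hochberg step-up rule for FDR control at level α. All function names are hypothetical, and the columns are assumed to be non-constant with more than three observations each.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """Two-sided p-value for H0: rho = 0 (Fisher z, normal approximation)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values, alpha):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank  # step-up: remember the largest passing rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


def sparse_cross_correlation(x_cols, y_cols, alpha=0.05):
    """Zero out cross correlations that are not significant under BH."""
    n = len(x_cols[0])
    q = len(y_cols)
    r = [[pearson(x, y) for y in y_cols] for x in x_cols]
    p_flat = [fisher_p_value(r[i][j], n)
              for i in range(len(x_cols)) for j in range(q)]
    keep = benjamini_hochberg(p_flat, alpha)
    return [[r[i][j] if keep[i * q + j] else 0.0 for j in range(q)]
            for i in range(len(x_cols))]


if __name__ == "__main__":
    x = [float(v) for v in range(1, 11)]
    y_strong = [2.0 * v for v in x]                      # strongly correlated
    y_weak = [1.0 if k % 2 == 0 else -1.0 for k in range(10)]  # weakly correlated
    print(sparse_cross_correlation([x], [y_strong, y_weak]))
```

Lowering `alpha` makes the estimated matrix sparser, which mirrors the abstract's comparison across FDR levels.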

Network Analysis in Systems Epidemiology

  • Park, JooYong;Choi, Jaesung;Choi, Ji-Yeob
    • Journal of Preventive Medicine and Public Health / Vol. 54, No. 4 / pp.259-264 / 2021
  • Traditional epidemiological studies have identified a number of risk factors for various diseases using regression-based methods that examine the association between an exposure and an outcome (i.e., one-to-one correspondences). One of the major limitations of this approach is the "black-box" aspect of the analysis, in the sense that this approach cannot fully explain complex relationships such as biological pathways. With high-throughput data in current epidemiology, comprehensive analyses are needed. The network approach can help to integrate multi-omics data, visualize their interactions or relationships, and make inferences in the context of biological mechanisms. This review aims to introduce network analysis for systems epidemiology, its procedures, and how to interpret its findings.

Mutation of the lbp-5 gene alters metabolic output in Caenorhabditis elegans

  • Xu, Mo;Choi, Eun-Young;Paik, Young-Ki
    • BMB Reports / Vol. 47, No. 1 / pp.15-20 / 2014
  • Intracellular lipid-binding proteins (LBPs) impact fatty acid homeostasis in various ways, including fatty acid transport into mitochondria. However, the physiological consequences caused by mutations in genes encoding LBPs remain largely uncharacterized. Here, we explore the metabolic consequences of lbp-5 gene deficiency in terms of energy homeostasis in Caenorhabditis elegans. In addition to increased fat storage, which has previously been reported, deletion of lbp-5 attenuated mitochondrial membrane potential and increased reactive oxygen species levels. Biochemical measurement coupled to proteomic analysis of the lbp-5(tm1618) mutant revealed highly increased rates of glycolysis in this mutant. These differential expression profile data support a novel metabolic adaptation of C. elegans, in which glycolysis is activated to compensate for the energy shortage due to the insufficient mitochondrial β-oxidation of fatty acids in lbp-5 mutant worms. This report marks the first demonstration of a unique metabolic adaptation that is a consequence of LBP-5 deficiency in C. elegans.