• Title/Abstract/Keywords: visualization of genome information

36 search results (processing time: 0.027 s)

Visualizing Live Chromatin Dynamics through CRISPR-Based Imaging Techniques

  • Chaudhary, Narendra;Im, Jae-Kyeong;Nho, Si-Hyeong;Kim, Hajin
    • Molecules and Cells / Vol. 44, No. 9 / pp. 627-636 / 2021
  • The three-dimensional organization of chromatin and its time-dependent changes greatly affect virtually every cellular function, especially DNA replication, genome maintenance, transcription regulation, and cell differentiation. Sequencing-based techniques such as ChIP-seq, ATAC-seq, and Hi-C provide abundant information on how genomic elements are coupled with regulatory proteins and functionally organized into hierarchical domains through their interactions. However, visualizing the time-dependent changes of such organization in individual cells remains challenging. Recent developments of CRISPR systems for site-specific fluorescent labeling of genomic loci have provided promising strategies for visualizing chromatin dynamics in live cells. However, there are several limiting factors, including background signals, off-target binding of CRISPR, and rapid photobleaching of the fluorophores, requiring a large number of target-bound CRISPR complexes to reliably distinguish the target-specific foci from the background. Various modifications have been engineered into the CRISPR system to enhance the signal-to-background ratio and signal longevity to detect target foci more reliably and efficiently, and to reduce the required target size. In this review, we comprehensively compare the performances of recently developed CRISPR designs for improved visualization of genomic loci in terms of the reliability of target detection, the ability to detect small repeat loci, and the allowed time of live tracking. Longer observation of genomic loci allows the detailed identification of the dynamic characteristics of chromatin. The diffusion properties of chromatin found in recent studies are reviewed, which provide suggestions for the underlying biological processes.

Loss of Heterozygosity at the Calcium Regulation Gene Locus on Chromosome 10q in Human Pancreatic Cancer

  • Long, Jin;Zhang, Zhong-Bo;Liu, Zhe;Xu, Yuan-Hong;Ge, Chun-Lin
    • Asian Pacific Journal of Cancer Prevention / Vol. 16, No. 6 / pp. 2489-2493 / 2015
  • Background: Loss of heterozygosity (LOH) in chromosomal regions is crucial in tumor progression, and this study aimed to identify genome-wide LOH in pancreatic cancer. Materials and Methods: Single-nucleotide polymorphism (SNP) profiling data (GSE32682) from human pancreatic samples snap-frozen during surgery were downloaded from the Gene Expression Omnibus database. Genotype console software was used to perform data processing. Candidate genes with LOH were screened based on the genotype calls, the SNP loci of LOH, and the dbSNP database. Gene annotation was performed to identify the functions of candidate genes using the NCBI (National Center for Biotechnology Information) database, followed by Gene Ontology, INTERPRO, PFAM, and SMART annotation and UCSC Genome Browser tracks for the unannotated genes using DAVID (the Database for Annotation, Visualization and Integrated Discovery). Results: The candidate genes with LOH identified in this study were MCU, MICU1, and OIT3 on chromosome 10. MCU was found to encode a calcium transporter, and MICU1 could encode an essential regulator of mitochondrial $Ca^{2+}$ uptake. The annotation analyses suggested that OIT3 is possibly associated with calcium binding and is regulated by a large number of transcription factors, including STAT, SOX9, CREB, NF-κB, PPARG, and p53. Conclusions: Global genomic analysis of SNPs identified MICU1, MCU, and OIT3 with LOH on chromosome 10, implying involvement of these genes in the progression of pancreatic cancer.

SOP (Search of Omics Pathway): A Web-based Tool for Visualization of KEGG Pathway Diagrams of Omics Data

  • Kim, Jun-Sub;Yeom, Hye-Jung;Kim, Seung-Jun;Kim, Ji-Hoon;Park, Hye-Won;Oh, Moon-Ju;Hwang, Seung-Yong
    • Molecular & Cellular Toxicology / Vol. 3, No. 3 / pp. 208-213 / 2007
  • With the development and popularization of microarray technology, which enables us to simultaneously investigate the expression patterns of thousands of genes, toxicogenomics researchers can interpret genome-scale interactions between genes exposed to a toxicant or a toxicant-related environment. The primary goal of toxicogenomics is to identify the functional context among groups of genes that are differentially or similarly coexpressed under a specific toxic substance. In addition, public reference databases with transcriptome, proteome, and biological pathway information are needed for the analysis of these complex omics data. However, due to the heterogeneous and independent nature of these databases, it is hard to analyze large sets of omics annotations and their pathway information individually. Fortunately, several public database web sites provide information linked to other databases. Nevertheless, these links present users with unnecessary information as well as appropriate information. Therefore, a systematically integrated database suited to the needs of experimenters is needed. For these reasons, we propose the SOP (Search of Omics Pathway) database system, an integrated biological database that converts the heterogeneous features of public databases into a combined feature. In addition, SOP offers user-friendly web interfaces that enable users to submit gene queries for biological interpretation of gene lists derived from omics experiments. The outputs of the SOP web interface are an omics annotation table and visualized pathway maps from the KEGG PATHWAY database. We believe that SOP will be a helpful tool for the biological interpretation of genes or proteins traced to omics experiments, leading to new discoveries from pathway analysis and helping to design new hypotheses for subsequent toxicogenomics experiments.

3-layer 2.5D Metabolic pathway layout algorithm

  • 송은하;용승림
    • 한국컴퓨터정보학회논문지 / Vol. 18, No. 6 / pp. 71-79 / 2013
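  • Metabolic pathways, which represent the relationships among compounds as graphs, are inherently complex, so a tool that visualizes them in a way that makes the flow within a pathway clear at a glance is essential. In addition, when studying genome-scale metabolic pathways, reducing the edge crossings that appear in the pathway graph layout is a very important part of visualization. This paper proposes a metabolic pathway layout algorithm that uses three layers to visualize metabolic pathways in biology. To address the problem that edge crossings increase exponentially as the number of nodes grows, the algorithm takes the structural characteristics of metabolic pathways into account: it places highly connected nodes and cyclic components in the middle layer and lays out the remaining subgraphs in the upper and lower layers. Experiments confirm that the number of edge crossings is reduced.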
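
The layer assignment described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the function name `assign_layers`, the `hub_degree` threshold, the use of the graph's 2-core as a stand-in for "cyclic components", and the alternating upper/lower placement of the remaining subgraphs are all hypothetical choices. Edge-crossing counting and the actual 2.5D coordinates are not covered.

```python
from collections import deque


def assign_layers(adj, hub_degree=4):
    """Assign each node of an undirected graph to 'middle', 'upper' or 'lower'.

    adj maps a node to the set of its neighbours. The threshold and the
    2-core approximation of cyclic components are assumptions of this sketch.
    """
    # Highly connected nodes go to the middle layer.
    middle = {n for n, nbrs in adj.items() if len(nbrs) >= hub_degree}

    # Peel nodes of degree <= 1 repeatedly; what survives (the 2-core)
    # contains every cycle, plus any paths that join two cycles.
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    removed = set()
    queue = deque(n for n, d in degree.items() if d <= 1)
    while queue:
        n = queue.popleft()
        if n in removed:
            continue
        removed.add(n)
        for m in adj[n]:
            if m not in removed:
                degree[m] -= 1
                if degree[m] <= 1:
                    queue.append(m)
    middle |= set(adj) - removed

    layers = {n: "middle" for n in middle}

    # Remaining subgraphs alternate between the upper and lower layers.
    sides = ("upper", "lower")
    component_index = 0
    for start in sorted(adj):
        if start in layers:
            continue
        side = sides[component_index % 2]
        component_index += 1
        stack = [start]
        while stack:
            n = stack.pop()
            if n in layers:
                continue
            layers[n] = side
            stack.extend(m for m in adj[n] if m not in layers)
    return layers


# A triangle a-b-c with a tail c-d-e and a single leaf f attached to a.
example = {
    "a": {"b", "c", "f"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
    "f": {"a"},
}
layers = assign_layers(example)
```

In this small example the triangle ends up in the middle layer, while the two tails are split between the upper and lower layers.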

An Analysis System for Whole Genomic Sequence Using String B-Tree

  • 최정현;조환규
    • 정보처리학회논문지A / Vol. 8A, No. 4 / pp. 509-516 / 2001
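  • With advances in the life sciences and the results of many genome projects, the genome sequences of many species are being determined. There are several ways to analyze an organism's sequence, such as global alignment and local alignment, and one of them is k-mer analysis. A k-mer is a contiguous subsequence of length k within the base sequence of a gene, and k-mer analysis explores properties such as the frequency distribution and symmetry of the k-mers in a sequence. However, a genome sequence is a very large text, and when k is large it cannot be processed with existing in-memory algorithms, so efficient data structures and algorithms are needed. The string B-tree is a good data structure that is well suited to pattern matching and supports external memory. In this paper, we improve the string B-tree into a structure that is efficient for k-mer analysis and use it to analyze C. elegans and 30 other genome sequences. To show the frequency distribution and symmetry of the k-mers, we present a visualization system based on CGR (Chaos Game Representation). We define a signature as a part of a sequence that is highly similar to the genome sequence, and we present an algorithm that finds signatures of minimum length with high similarity.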
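
The two ideas named in this abstract, k-mer frequency counting and the CGR plot, can be illustrated with a small in-memory sketch. It does not use a string B-tree and is not the authors' system; the function names and the assignment of bases to the corners of the unit square are assumptions made for illustration (the corner assignment is a convention that varies between tools).

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every length-k window of seq that contains only A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


# Assumed corner assignment for the CGR unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory: each base moves the current point
    halfway towards that base's corner, starting from the centre."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


counts = kmer_counts("ACGTACGT", 2)
points = cgr_points("AG")
```

Because each CGR point encodes the suffix of bases that led to it, binning the points on a 2^k by 2^k grid gives the k-mer frequency distribution as an image. An in-memory counter like this one is exactly what stops scaling for whole genomes and large k, which is the motivation the abstract gives for an external-memory structure.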

High-performance computing for SARS-CoV-2 RNAs clustering: a data science-based genomics approach

  • Oujja, Anas;Abid, Mohamed Riduan;Boumhidi, Jaouad;Bourhnane, Safae;Mourhir, Asmaa;Merchant, Fatima;Benhaddou, Driss
    • Genomics & Informatics / Vol. 19, No. 4 / pp. 49.1-49.11 / 2021
  • Nowadays, genomic data constitutes one of the fastest growing datasets in the world. By 2025, it is expected to become the fourth largest source of Big Data, thus mandating adequate high-performance computing (HPC) platforms for processing. With the latest unprecedented and unpredictable mutations in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the research community is in crucial need of ICT tools to process SARS-CoV-2 RNA data, e.g., by classifying it (i.e., clustering) and thus assisting in tracking virus mutations and predicting future ones. In this paper, we present an HPC-based SARS-CoV-2 RNA clustering tool. We adopt a data science approach, from data collection, through analysis, to visualization. In the analysis step, we present how our clustering approach leverages HPC and the longest common subsequence (LCS) algorithm. The approach uses the Hadoop MapReduce programming paradigm and adapts the LCS algorithm to efficiently compute the length of the LCS for each pair of SARS-CoV-2 RNA sequences. The sequences are extracted from the U.S. National Center for Biotechnology Information (NCBI) Virus repository. The computed LCS lengths are used to measure the dissimilarities between RNA sequences in order to work out existing clusters. In addition, we present a comparative study of the LCS algorithm's performance under variable workloads and different numbers of Hadoop worker nodes.
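
The pairwise step described in this abstract can be illustrated with a single-machine sketch of the standard dynamic-programming LCS length. It does not reproduce the Hadoop MapReduce distribution or the authors' adaptation of the algorithm, and the normalization used in `lcs_dissimilarity` (one minus the LCS length divided by the longer sequence length) is an assumption, since the abstract does not state how LCS lengths are converted to dissimilarities.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b.

    Keeps only two rows of the DP table, so memory is O(min(len(a), len(b))).
    """
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer sequence, store rows for the shorter
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_dissimilarity(a, b):
    """Assumed normalization: 0.0 for identical sequences, 1.0 for no shared symbols."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - lcs_length(a, b) / longest


length = lcs_length("AGGTAB", "GXTXAYB")
same = lcs_dissimilarity("ACGU", "ACGU")
disjoint = lcs_dissimilarity("AAAA", "UUUU")
```

Each pair costs time proportional to the product of the two sequence lengths, and the number of pairs grows quadratically with the number of sequences, which is why distributing the pairs over worker nodes is attractive. The resulting dissimilarity matrix could then be passed to any standard clustering method.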