• Title/Summary/Keyword: copy number variation (유전체 단위반복변이)

Search Result 6

Genome-Wide Association Study between Copy Number Variation and Trans-Gene Expression by Protein-Protein Interaction-Network (단백질 상호작용 네트워크를 통한 유전체 단위반복변이와 트랜스유전자 발현과의 연관성 분석)

  • Park, Chi-Hyun;Ahn, Jae-Gyoon;Yoon, Young-Mi;Park, Sang-Hyun
    • The KIPS Transactions:PartD / v.18D no.2 / pp.89-100 / 2011
  • Copy number variation (CNV), one of the structural variations in the human genome, is closely related to gene function. Genome-wide association studies have mainly examined individuals with genetic diseases, whereas few studies have inferred the genetic function of CNVs in healthy individuals. In this paper, we propose an analysis method that reveals functional relationships between common CNVs and genes without considering their genomic loci. To this end, we propose a data integration method for heterogeneous biological data and a novel measure that calculates the correlation between common CNVs and genes. To verify the significance of the proposed method, we conducted several verification tests using the GO database. The results showed that the novel measure was significant compared with a random test and that the proposed method could systematically produce candidate genetic functions that are strongly correlated with common CNVs.
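
The idea of scoring CNV-gene relationships independently of genomic position can be illustrated with a minimal sketch. The abstract does not define the paper's novel measure or how the protein-protein interaction network enters it, so plain Pearson correlation stands in here; the function names `pearson` and `rank_genes` and the sample data are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def rank_genes(cnv_copies, expression):
    """Rank genes by |r| against one CNV's per-sample copy numbers.

    cnv_copies: copy number of one CNV in each sample.
    expression: dict mapping gene name to per-sample expression values.
    Genomic loci are ignored, so trans relationships are scored too.
    """
    scores = {gene: pearson(cnv_copies, values)
              for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical data: five samples, three genes.
cnv = [1, 2, 2, 3, 4]
expr = {
    "geneA": [10, 20, 20, 30, 40],
    "geneB": [5, 3, 6, 4, 5],
    "geneC": [8, 6, 6, 4, 2],
}
ranked = rank_genes(cnv, expr)
```

Genes at the top of `ranked` would be the candidates to check against functional annotations such as GO terms.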

Highly accurate detection of cancer-specific copy number variations with MapReduce (맵리듀스 기반의 암 특이적 유전자 단위 반복 변이 추출)

  • Shin, Jae-Moon;Hong, Sang-Kyoon;Lee, Un-Joo;Yoon, Jee-Hee
    • Proceedings of the Korean Information Science Society Conference / 2012.06c / pp.19-21 / 2012
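  • All cancer cells carry somatic mutations. Analyzing cancer genome variation can therefore identify cancer-causing genes as well as diagnostic and therapeutic methods. This study proposes a new algorithm that uses next-generation sequencing data to identify types of cancer-specific copy number variation (CNV). The proposed method simultaneously analyzes the normal genome and the cancer genome obtained from a cancer patient's normal cells and cancer cells and extracts candidate CNV regions from each. It then selects cancer-specific candidate CNV regions through statistical significance analysis and, in a post-processing step, corrects for erroneous regions in the reference sequence to extract accurate cancer-specific CNV regions. We also propose a parallel algorithm based on MapReduce for analyzing multiple large-scale genome datasets simultaneously.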
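
The map and reduce phases of such a tumor-versus-normal comparison can be sketched in plain Python. This is an illustration under stated assumptions, not the paper's algorithm: the bin size, the log2-ratio threshold, the pseudocount, and all function names are hypothetical, and neither the paper's statistical significance test nor its reference-sequence correction is reproduced, since the abstract does not specify them.

```python
from collections import defaultdict
from math import log2

BIN_SIZE = 1000  # assumed bin width in base pairs

def map_reads(sample, positions, bin_size=BIN_SIZE):
    """Map phase: emit ((sample, bin_index), 1) for each aligned read start."""
    for pos in positions:
        yield (sample, pos // bin_size), 1

def reduce_counts(pairs):
    """Reduce phase: sum the emitted counts for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def tumor_specific_bins(counts, threshold=1.0, pseudo=1):
    """Return {bin: log2 ratio} for bins where |log2(tumor/normal)| > threshold.

    A pseudocount avoids division by zero in empty bins.
    """
    bins = {b for (_, b) in counts}
    flagged = {}
    for b in sorted(bins):
        t = counts.get(("tumor", b), 0) + pseudo
        n = counts.get(("normal", b), 0) + pseudo
        ratio = log2(t / n)
        if abs(ratio) > threshold:
            flagged[b] = ratio
    return flagged

# Hypothetical read start positions: bin 1 is amplified in the tumor sample.
normal = list(range(0, 1000, 100)) + list(range(1000, 2000, 100))
tumor = list(range(0, 1000, 100)) + list(range(1000, 2000, 25))
pairs = list(map_reads("normal", normal)) + list(map_reads("tumor", tumor))
counts = reduce_counts(pairs)
flagged = tumor_specific_bins(counts)
```

In a real MapReduce deployment the framework would shuffle the emitted pairs by key across workers; here a single in-process dictionary plays that role.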

An Enhanced SW-ARRAY Method for Detecting Copy Number Variations(CNVs) (유전체 단위 반복 변이(CNV) 발견을 위한 개선된 SW-ARRAY)

  • Moon, Myung-Jin;Ahn, Jae-Gyoon;Yoon, Young-Mi;Park, Chi-Hyun;Park, Sang-Hyun
    • Proceedings of the Korean Information Science Society Conference / 2008.06c / pp.208-211 / 2008
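  • The importance of copy number variation (CNV) has recently come to the fore. A CNV is a difference in the amount of DNA that arises when part of the DNA is not copied, or is copied in excess, during replication, and CNVs are known to be closely related to human diseases and traits. Accordingly, CNV research has been actively pursued, and various methods for finding CNVs have emerged. This paper examines SW-ARRAY, one of the representative techniques for detecting CNVs, and proposes a method that addresses the problems of the original SW-ARRAY by applying a penalty value and a score-dependent variable threshold. When applied to real Array-CGH data, the method reduced false positives and produced more accurate results than the original approach.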
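
The local-alignment idea behind this kind of detector can be sketched as a highest-scoring-segment search over probe log ratios. This is a simplified illustration, not the enhanced method: the abstract does not specify the penalty value or the variable threshold, so a single fixed `threshold` stands in, and the function name and sample values are hypothetical.

```python
def best_gain_segment(log_ratios, threshold=0.2):
    """Return (start, end, score) of the highest-scoring run, end exclusive.

    Each probe contributes (log_ratio - threshold). The running score is
    reset to zero whenever it would go negative, as in local alignment.
    """
    best = (0, 0, 0.0)
    score, start = 0.0, 0
    for i, value in enumerate(log_ratios):
        if score == 0.0:
            start = i  # a new candidate segment begins here
        score = max(0.0, score + value - threshold)
        if score > best[2]:
            best = (start, i + 1, score)
    return best

# Hypothetical Array-CGH log2 ratios with a gain at probes 3 to 5.
ratios = [0.0, 0.1, -0.1, 0.8, 0.9, 0.7, 0.0, -0.1]
segment = best_gain_segment(ratios)
```

Losses can be found with the same function by negating the input ratios. Raising `threshold` trades sensitivity for fewer false positives, which is the trade-off the abstract's variable threshold targets.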

CNVDAT: A Copy Number Variation Detection and Analysis Tool for Next-generation Sequencing Data (CNVDAT : 차세대 시퀀싱 데이터를 위한 유전체 단위 반복 변이 검출 및 분석 도구)

  • Kang, Inho;Kong, Jinhwa;Shin, JaeMoon;Lee, UnJoo;Yoon, Jeehee
    • Journal of KIISE:Databases / v.41 no.4 / pp.249-255 / 2014
  • Copy number variations (CNVs) are a recently recognized class of human structural variations and are associated with a variety of human diseases, including cancer. To find important cancer genes, researchers identify novel CNVs in patients with a particular cancer and analyze large amounts of genomic and clinical data. We present a tool called CNVDAT, which detects CNVs from NGS data and systematically analyzes the genomic and clinical data associated with the variations. CNVDAT consists of two modules, CNV Detection Engine and Sequence Analyser. CNV Detection Engine extracts CNVs using the multi-resolution system of scale-space filtering, enabling detection of the types and exact locations of CNVs of all sizes even when the coverage level of the read data is low. Sequence Analyser is a user-friendly program for viewing and comparing variation regions between tumor and matched normal samples. It also provides a complete analysis function for refGene and OMIM data and makes it possible to discover CNV-gene-phenotype relationships. CNVDAT source code is freely available from http://dblab.hallym.ac.kr/CNVDAT/.

CNVR Detection Reflecting the Properties of the Reference Sequence in HLA Region (레퍼런스 시퀀스의 특성을 고려한 HLA 영역에서의 CNVR 탐지)

  • Lee, Jong-Keun;Hong, Dong-Wan;Yoon, Jee-Hee
    • Journal of KIISE:Computing Practices and Letters / v.16 no.6 / pp.712-716 / 2010
  • In this paper, we propose a novel shape-based approach to detect CNV regions (CNVRs) by analyzing the coverage graph obtained by aligning giga-sequencing data onto the human reference sequence. The proposed algorithm proceeds in two steps: a filtering step and a post-processing step. In the filtering step, it takes several shape parameters as input and extracts candidate CNVRs of various depths and widths. In the post-processing step, it revises the candidate regions to compensate for errors potentially present in the reference sequence and the giga-sequencing data, filters out regions with a high GC-content ratio, and returns the final result set from the remaining candidate CNVRs. To verify our approach, we performed extensive experiments using giga-sequencing data made publicly available by the 1000 Genomes Project and verified the accuracy by comparing our results with those registered in the DGV database. The results showed that our approach successfully finds CNVRs of various shapes (gains or losses) in the HLA (Human Leukocyte Antigen) region.
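
The filtering step can be illustrated with a minimal sketch that scans binned coverage for runs that are deep and wide enough. The abstract does not list the actual shape parameters, so `depth_ratio` and `min_width`, the median baseline, and the function name are assumptions; the post-processing step (reference-error revision and GC-content filtering) is not shown.

```python
from statistics import median

def candidate_cnvrs(coverage, min_width=3, depth_ratio=0.5):
    """Return (start, end, kind) runs over binned coverage, end exclusive.

    A bin is a 'loss' if coverage < baseline * (1 - depth_ratio) and a
    'gain' if coverage > baseline * (1 + depth_ratio), where baseline is
    the median coverage. Runs shorter than min_width bins are discarded.
    """
    baseline = median(coverage)
    low = baseline * (1 - depth_ratio)
    high = baseline * (1 + depth_ratio)

    def label(value):
        if value < low:
            return "loss"
        if value > high:
            return "gain"
        return None

    regions, start, kind = [], 0, None
    # A trailing baseline value acts as a sentinel that closes the last run.
    for i, value in enumerate(list(coverage) + [baseline]):
        current = label(value)
        if current != kind:
            if kind is not None and i - start >= min_width:
                regions.append((start, i, kind))
            start, kind = i, current
    return regions

# Hypothetical binned coverage: one loss, one gain, and one narrow spike.
coverage = [30, 31, 29, 30, 10, 9, 11, 12, 30, 30, 62, 65, 70, 30, 29, 75, 30]
regions = candidate_cnvrs(coverage)
```

The single-bin spike at index 15 is dropped by the width parameter, which shows how shape constraints suppress isolated noise.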

A CNV detection algorithm based on statistical analysis of the aligned reads (정렬된 리드의 통계적 분석을 기반으로 하는 CNV 검색 알고리즘)

  • Hong, Sang-Kyoon;Hong, Dong-Wan;Yoon, Jee-Hee;Kim, Baek-Sop;Park, Sang-Hyun
    • The KIPS Transactions:PartD / v.16D no.5 / pp.661-672 / 2009
  • Recent work has shown that various genetic structural variations, such as copy number variation (CNV), exist in the human genome and that these variations are closely related to disease susceptibility, response to treatment, and genetic characteristics. In this paper, we propose a new CNV detection algorithm that uses millions of short DNA sequences generated by giga-sequencing technology. Our method maps the DNA sequences onto the reference sequence and obtains the occurrence frequency of each read in the reference sequence. It then analyzes the distribution of the occurrence frequency and detects statistically significant regions longer than 1 kbp as candidate CNV regions. To select a suitable read alignment method, several methods are employed in our algorithm and their performance is compared. To verify our approach, we performed extensive experiments. The results of simulation experiments (using a reference sequence, NCBI build 35) showed that our approach successfully finds all the CNV regions, which have various shapes and arbitrary lengths (small, intermediate, or large).
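
The statistical step can be sketched as a z-score scan over per-window read counts that keeps only runs longer than 1 kbp. This is an illustration under assumptions, not the paper's test: the window size, the z-score cutoff, the use of the global mean and standard deviation, and the function name are all hypothetical.

```python
from statistics import mean, stdev

def significant_regions(window_counts, window_size=100,
                        z_cutoff=2.0, min_length=1000):
    """Return (start_bp, end_bp, kind) for runs of outlier windows.

    A window is an outlier when its z-score exceeds z_cutoff in absolute
    value. Adjacent outliers of the same sign are merged, and a run is
    kept only if it is longer than min_length base pairs.
    """
    mu, sigma = mean(window_counts), stdev(window_counts)
    if sigma == 0:
        return []

    def sign(count):
        z = (count - mu) / sigma
        if z > z_cutoff:
            return 1
        if z < -z_cutoff:
            return -1
        return 0

    regions, start, current = [], 0, 0
    # A trailing mean value acts as a sentinel that closes the last run.
    for i, count in enumerate(list(window_counts) + [mu]):
        s = sign(count)
        if s != current:
            if current != 0 and (i - start) * window_size > min_length:
                kind = "gain" if current > 0 else "loss"
                regions.append((start * window_size, i * window_size, kind))
            start, current = i, s
    return regions

# Hypothetical counts: a 1,200 bp gain and a 500 bp bump that is too short.
counts = [50] * 200
for i in range(40, 52):
    counts[i] = 100
for i in range(120, 125):
    counts[i] = 100
regions = significant_regions(counts)
```

Because the mean and standard deviation are computed over all windows, large CNVs inflate both; a robust baseline such as the median would be a natural refinement.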