• Title/Summary/Keyword: Copy Number Variation


Highly accurate detection of cancer-specific copy number variations with MapReduce (맵리듀스 기반의 암 특이적 유전자 단위 반복 변이 추출)

  • Shin, Jae-Moon;Hong, Sang-Kyoon;Lee, Un-Joo;Yoon, Jee-Hee
    • Proceedings of the Korean Information Science Society Conference / 2012.06c / pp.19-21 / 2012
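  • All cancer cells carry somatic mutations. Analysis of cancer genome variation can therefore identify cancer-causing genes and point to methods of diagnosis and treatment. In this study, we propose a new algorithm that uses next-generation sequencing data to identify cancer-specific copy number variation (CNV) patterns. The proposed method simultaneously analyzes the normal genome and the cancer genome obtained from a cancer patient's normal cells and cancer cells and extracts candidate CNV regions from each. It then selects cancer-specific candidate CNV regions through statistical significance analysis and, in a post-processing step, corrects for erroneous regions present in the reference sequence in order to extract accurate cancer-specific CNV regions. We also propose a parallel algorithm based on MapReduce for the simultaneous analysis of multiple large-scale genome data sets.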
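The MapReduce-style counting step behind a tumor/normal read-depth comparison might be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the bin width, the pseudocount, and the function names `map_reads`, `reduce_counts`, and `tumor_normal_ratio` are all hypothetical, and the code runs in a single process rather than on a cluster.

```python
from collections import Counter

BIN_SIZE = 1000  # hypothetical bin width in bp (assumption, not from the abstract)


def map_reads(sample, positions, bin_size=BIN_SIZE):
    """Map step: emit ((sample, bin_index), 1) for every aligned read start."""
    return [((sample, pos // bin_size), 1) for pos in positions]


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each (sample, bin_index) key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def tumor_normal_ratio(counts, bin_index, pseudocount=1.0):
    """Depth ratio of tumor to normal in one bin; the pseudocount avoids division by zero."""
    tumor = counts[("tumor", bin_index)]
    normal = counts[("normal", bin_index)]
    return (tumor + pseudocount) / (normal + pseudocount)


# Toy read start positions (made up for illustration only).
reads = {
    "normal": [10, 500, 1200, 1800, 2500, 2900],
    "tumor": [20, 400, 1100, 1300, 1500, 1900, 2600],
}
pairs = [pair for sample, positions in reads.items() for pair in map_reads(sample, positions)]
counts = reduce_counts(pairs)
```

Bins whose ratio departs strongly from 1 would then be passed to a significance test. The abstract does not say which test is used, so none is shown here.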

Detection of Microcystin Synthetic Cyanobacteria and Variation of Intracellular Microcystin Synthesis Using eDNA and eRNA in Freshwater Ecosystem (담수환경에서 eDNA와 eRNA를 이용한 Microcystin 합성 남조류 탐색 및 세포 내 Microcystin 생합성 활성 변화)

  • Keonhee Kim;Chaehong Park;Hyeonjin Cho;Daeryul Kwon;Soon-Jin Hwang
    • Korean Journal of Ecology and Environment / v.56 no.1 / pp.1-13 / 2023
  • Targeting microcystin (MC), the cyanotoxin most abundantly detected in the North-Han River water area, we analyzed the relationship between the MC biosynthesis gene (mcyA), cyanobacterial cell density, and MC concentration, derived RNA-MC conversion formulas, and predicted the concentration of MC present in cyanobacterial cells. In the North-Han River waters, the mcyA gene was found mainly at sites downstream of the Muk-Hyeon Stream junction, where copy numbers were higher on average than at other sites. In the Uiam Lake waters upstream of the North-Han River, the mcyA gene copy number increased at the Kong-Ji Stream point, and after September the mcyA gene copy number decreased throughout the North-Han River waters. The expression of the mcyA gene was concentrated in a short period in summer due to the spatio-temporal difference between upstream and downstream water bodies. The mcyA gene expression level was not only highly correlated with MC concentration but also correlated with the cell density of Microcystis aeruginosa and Dolichospermum circinale, which are known to biosynthesize MC. Six conversion formulas derived from the RNA-MC relationship were statistically significant (p<0.05) and had high correlation coefficients (r) of 0.9 or higher. The expression level of the MC biosynthesis gene measured in eRNA is tied to the synthesis of cyanotoxins in water and allows gene activity to be quantified quickly, so it can be used for early warning of MC occurrence.

Identification of copy number variations using high density whole-genome single nucleotide polymorphism markers in Chinese Dongxiang spotted pigs

  • Wang, Chengbin;Chen, Hao;Wang, Xiaopeng;Wu, Zhongping;Liu, Weiwei;Guo, Yuanmei;Ren, Jun;Ding, Nengshui
    • Asian-Australasian Journal of Animal Sciences / v.32 no.12 / pp.1809-1815 / 2019
  • Objective: Copy number variations (CNVs) are a major source of genetic diversity complementary to single nucleotide polymorphisms (SNPs) in animals. The aim of the study was to perform a comprehensive genomic analysis of CNVs based on high density whole-genome SNP markers in Chinese Dongxiang spotted pigs. Methods: We used customized Affymetrix Axiom Pig1.4M array plates containing 1.4 million SNPs and the PennCNV algorithm to identify porcine CNVs on autosomes in Chinese Dongxiang spotted pigs. Next generation sequence data were then used to confirm the detected CNVs. Next, functional analysis was performed for the gene content of copy number variation regions (CNVRs). In addition, we compared the identified CNVRs with previously reported CNVRs and with quantitative trait loci (QTL) in the pig QTL database. Results: We identified 871 putative CNVs belonging to 2,221 CNVRs on 17 autosomes. We further discarded CNVRs that were detected in only one individual, leaving 166 CNVRs in total. The 166 CNVRs ranged from 2.89 kb to 617.53 kb with a mean value of 93.65 kb and a genome coverage of 15.55 Mb, corresponding to 0.58% of the pig genome. A total of 119 (71.69%) of the identified CNVRs were confirmed by next generation sequence data. Moreover, functional annotation showed that these CNVRs are involved in a variety of molecular functions. More than half (56.63%) of the CNVRs (n = 94) have been reported in previous studies, while 72 CNVRs are reported for the first time. In addition, 162 (97.59%) CNVRs were found to overlap with 2,765 previously reported QTLs affecting 378 phenotypic traits. Conclusion: The findings improve the catalog of pig CNVs and provide insights and novel molecular markers for further genetic analyses of Chinese indigenous pigs.
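Collapsing per-animal CNV calls into CNVRs and then discarding regions seen in only one individual can be sketched as below. This is a minimal illustration, assuming that any overlap on the same chromosome merges two calls. The abstract does not state the merging criterion, and the function names and toy coordinates are hypothetical.

```python
def merge_cnvs_into_cnvrs(cnvs):
    """Merge overlapping CNV calls into CNV regions (CNVRs).

    cnvs: iterable of (chrom, start, end, sample) tuples.
    Assumption: two calls belong to the same CNVR if they overlap by at least 1 bp.
    """
    regions = []
    for chrom, start, end, sample in sorted(cnvs):
        last = regions[-1] if regions else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            # Overlaps the current region: extend it and record the carrier.
            last["end"] = max(last["end"], end)
            last["samples"].add(sample)
        else:
            regions.append({"chrom": chrom, "start": start, "end": end, "samples": {sample}})
    return regions


def keep_recurrent(regions, min_individuals=2):
    """Drop CNVRs detected in fewer than min_individuals animals."""
    return [r for r in regions if len(r["samples"]) >= min_individuals]


# Toy calls (made-up coordinates, for illustration only).
calls = [
    (1, 100, 500, "pigA"),
    (1, 400, 900, "pigB"),
    (1, 2000, 2500, "pigA"),
    (2, 100, 300, "pigC"),
    (2, 250, 600, "pigC"),
]
all_regions = merge_cnvs_into_cnvrs(calls)
recurrent = keep_recurrent(all_regions)
```

In this toy input, three regions are formed and only the chromosome 1 region carried by both pigA and pigB survives the filter.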

A CNV detection algorithm based on statistical analysis of the aligned reads (정렬된 리드의 통계적 분석을 기반으로 하는 CNV 검색 알고리즘)

  • Hong, Sang-Kyoon;Hong, Dong-Wan;Yoon, Jee-Hee;Kim, Baek-Sop;Park, Sang-Hyun
    • The KIPS Transactions:PartD / v.16D no.5 / pp.661-672 / 2009
  • Recently it was found that various genetic structural variations such as copy number variation (CNV) exist in the human genome, and that these variations are closely related to disease susceptibility, response to treatment, and genetic characteristics. In this paper we propose a new CNV detection algorithm that uses millions of short DNA sequences generated by giga-sequencing technology. Our method maps the DNA sequences onto the reference sequence and obtains the occurrence frequency of each read in the reference sequence. It then analyzes the distribution of the occurrence frequency and detects statistically significant regions longer than 1 kbp as candidate CNV regions. To select a proper read alignment method, several methods are employed in our algorithm and their performance is compared. To verify the superiority of our approach, we performed extensive experiments. Simulation experiments (using a reference sequence, NCBI build 35) showed that our approach successfully finds all the CNV regions of various shapes and arbitrary lengths (small, intermediate, or large).
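The idea of flagging statistically unusual read-depth regions longer than 1 kbp might be sketched as follows. This is a simplified stand-in, not the paper's statistic: it assumes per-window depths are already computed, uses a plain z-score against the global mean, and the window width, cutoff, and function name are hypothetical.

```python
from statistics import mean, pstdev


def candidate_cnv_regions(depths, window_bp=100, z_cutoff=2.0, min_len_bp=1000):
    """Return (start_bp, end_bp, kind) for runs of windows with unusual depth.

    depths: read depth per fixed-width window along the reference.
    A run is reported only if it spans at least min_len_bp.
    """
    mu = mean(depths)
    sigma = pstdev(depths)
    if sigma == 0:
        return []  # perfectly flat coverage: nothing to report
    regions = []
    run_start, run_sign = None, 0
    # A sentinel window at the mean closes any run that reaches the end.
    for i, depth in enumerate(list(depths) + [mu]):
        z = (depth - mu) / sigma
        sign = 1 if z >= z_cutoff else -1 if z <= -z_cutoff else 0
        if sign != run_sign:
            if run_sign != 0 and (i - run_start) * window_bp >= min_len_bp:
                kind = "gain" if run_sign > 0 else "loss"
                regions.append((run_start * window_bp, i * window_bp, kind))
            run_start, run_sign = i, sign
    return regions


# Toy profile: flat depth of 30 with a 1.2 kbp stretch at double depth.
long_gain = candidate_cnv_regions([30] * 40 + [60] * 12 + [30] * 40)
# Same idea, but the elevated stretch is only 500 bp and is filtered out.
short_gain = candidate_cnv_regions([30] * 40 + [60] * 5 + [30] * 40)
```

Because the global mean and standard deviation include the variant windows themselves, a real implementation would likely use a more robust baseline. That choice is left open here.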

A Genome-Wide Study of Moyamoya-Type Cerebrovascular Disease in the Korean Population

  • Joo, Sung-Pil;Kim, Tae-Sun;Lee, Il-Kwon;Kim, Joon-Tae;Park, Man-Seok;Cho, Ki-Hyun
    • Journal of Korean Neurosurgical Society / v.50 no.6 / pp.486-491 / 2011
  • Objective : Structural genetic variation, including copy-number variation (CNV), constitutes a substantial fraction of total genetic variability, and the importance of structural variants in modulating susceptibility is increasingly being recognized. CNV can change biological function and contribute to pathophysiological conditions of human disease. Its relationship with common, complex human disease in particular is not fully understood. Here, we searched the human genome to identify copy number variants that predispose to moyamoya-type cerebrovascular disease. Methods : We retrospectively analyzed patients who had unilateral or bilateral steno-occlusive lesions at the cerebral artery from March, 2007, to September, 2009. For the 20 subjects, including patients with moyamoya type pathologies and three normal healthy controls, we divided the subjects into 4 groups : typical moyamoya (n=6), unilateral moyamoya (n=9), progression unilateral to typical moyamoya (n=2) and non-moyamoya (n=3). Fragmented DNA was hybridized on Human610Quad v1.0 DNA analysis BeadChips (Illumina). Data analysis was performed with GenomeStudio v2009.1, Genotyping 1.1.9, cnvPartition_v2.3.4 software. Overall call rates were more than 99.8%. Results : In total, 1258 CNVs were identified across the whole genome. The average number of CNV was 45.55 per subject (CNV region was 45.4). The gain/loss of CNV was 52/249, having 4.7 fold higher frequencies in loss calls. The total CNV size was 904,657,868, and average size was 993,038. The largest portion of CNVs (613 calls) were 1M-10M in length. Interestingly, significant association between unilateral moyamoya disease (MMD) and progression of unilateral to typical moyamoya was observed. Conclusion : Significant association between unilateral MMD and progression of unilateral to typical moyamoya was observed. The finding was confirmed again with clustering analysis. These data demonstrate that certain CNV associate with moyamoya-type cerebrovascular disease.

Identification of CNVs and their association with the meat traits of Hanwoo

  • Chan Mi Bang;Khaliunaa Tseveen;Gwang Hyeon Lee;Gil Jong Seo;Hong Sik Kong
    • Journal of Animal Reproduction and Biotechnology / v.38 no.3 / pp.158-166 / 2023
  • Background: Copy number variation (CNV) can be identified using next-generation sequencing and microarray technologies, and research on its association with meat traits in livestock breeding has increased significantly in recent years. Hanwoo is a cattle breed indigenous to the Republic of Korea. It is now considered one of the most economically important livestock breeds and a major food source, raised mainly for meat (Hanwoo beef). Methods: In this study, CNVs were identified in Hanwoo steer samples (n = 473) using the Illumina Hanwoo SNP 50K bead chip and bioinformatic tools, and the relationship between the resulting CNV regions (CNVRs) and meat traits was investigated. The PennCNV software was used to identify CNVs, and the CNV Ruler software was then used to define the CNVRs. Furthermore, bioinformatics analysis was performed. Results: We found a total of 2,575 autosomal CNVs (933 losses, 1,642 gains) and 416 CNVRs (289 gains, 111 losses, and 16 mixed), which ranged in size from 2,183 bp to 983,333 bp and from 10,004 bp to 381,836 bp, respectively. When the association analysis was restricted to a minor allele frequency > 0.05, 6 CNVRs were associated with carcass weight, 2 with marbling score, 3 with backfat thickness, and 2 with longissimus muscle area. In addition, we identified an overlap of 347 CNVRs. Moreover, 3 CNVRs were determined to contain a gene that affects meat quality. Conclusions: Our results confirmed the relationship between Hanwoo CNVRs and meat traits and pointed to overlapping candidate genes, annotations, and quantitative trait loci, contributing to a greater understanding of CNVs in Hanwoo and their role in genetic variation among cattle.

CNVDAT: A Copy Number Variation Detection and Analysis Tool for Next-generation Sequencing Data (CNVDAT : 차세대 시퀀싱 데이터를 위한 유전체 단위 반복 변이 검출 및 분석 도구)

  • Kang, Inho;Kong, Jinhwa;Shin, JaeMoon;Lee, UnJoo;Yoon, Jeehee
    • Journal of KIISE:Databases / v.41 no.4 / pp.249-255 / 2014
  • Copy number variations (CNVs) are a recently recognized class of human structural variations and are associated with a variety of human diseases, including cancer. To find important cancer genes, researchers identify novel CNVs in patients with a particular cancer and analyze large amounts of genomic and clinical data. We present a tool called CNVDAT, which detects CNVs from NGS data and systematically analyzes the genomic and clinical data associated with the variations. CNVDAT consists of two modules, CNV Detection Engine and Sequence Analyser. CNV Detection Engine extracts CNVs using the multi-resolution system of scale-space filtering, which enables it to detect the types and exact locations of CNVs of all sizes even when the coverage level of the read data is low. Sequence Analyser is a user-friendly program for viewing and comparing variation regions between tumor and matched normal samples. It also provides a complete analysis function for refGene and OMIM data, making it possible to discover CNV-gene-phenotype relationships. CNVDAT source code is freely available from http://dblab.hallym.ac.kr/CNVDAT/.

Identification of a Copy Number Variation on Chromosome 20q13.12 Associated with Osteoporotic Fractures in the Korean Population

  • Park, Tae-Joon;Hwang, Mi Yeong;Moon, Sanghoon;Hwang, Joo-Yeon;Go, Min Jin;Kim, Bong-Jo
    • Genomics & Informatics / v.14 no.4 / pp.216-221 / 2016
  • Osteoporotic fractures (OFs) are critical hard outcomes of osteoporosis and are characterized by decreased bone strength induced by low bone density and microarchitectural deterioration in bone tissue. Most OFs cause acute pain, hospitalization, immobilization, and slow recovery in patients and are associated with increased mortality. A variety of genetic studies have suggested associations of genetic variants with the risk of OF. Genome-wide association studies have reported various single-nucleotide polymorphisms and copy number variations (CNVs) in European and Asian populations. To identify CNV regions associated with OF risk, we conducted a genome-wide CNV study in a Korean population. We performed logistic regression analyses in 1,537 Korean subjects (299 OF cases and 1,238 healthy controls) and identified a total of 8 CNV regions significantly associated with OF (p < 0.05). Then, one CNV region located on chromosome 20q13.12 was selected for experimental validation. The selected CNV region was experimentally validated by quantitative polymerase chain reaction. The CNV region of chromosome 20q13.12 is positioned upstream of a family of long non-coding RNAs, LINC01260. Our findings could provide new information on the genetic factors associated with the risk of OF.

Comparison of Normalization Methods for Defining Copy Number Variation Using Whole-genome SNP Genotyping Data

  • Kim, Ji-Hong;Yim, Seon-Hee;Jeong, Yong-Bok;Jung, Seong-Hyun;Xu, Hai-Dong;Shin, Seung-Hun;Chung, Yeun-Jun
    • Genomics & Informatics / v.6 no.4 / pp.231-234 / 2008
  • Precise and reliable identification of CNVs is still important for fully understanding the effect of CNVs on genetic diversity and the background of complex diseases. SNP markers have frequently been used to detect CNVs, but the analysis of SNP chip data for identifying CNVs has not been well established. We compared various normalization methods for CNV analysis and suggest an optimal normalization procedure for reliable CNV calling. Samples from four normal Koreans and the NA10851 HapMap male were genotyped using the Affymetrix Genome-Wide Human SNP array 5.0. We evaluated the effect of median and quantile normalization to find the optimal normalization for CNV detection based on SNP array data. We also explored the effect of Robust Multichip Average (RMA) background correction for each normalization process. In total, the following 4 combinations of normalization were tried: 1) Median normalization without RMA background correction, 2) Quantile normalization without RMA background correction, 3) Median normalization with RMA background correction, and 4) Quantile normalization with RMA background correction. CNVs were called using the SW-ARRAY algorithm. We applied the 4 combinations of normalization and compared their effects using intensity ratio profiles, box plots, and MA plots. When we applied median and quantile normalization without RMA background correction, both methods showed a similar normalization effect, and the final CNV calls were also similar in number and size. In both median and quantile normalization, RMA background correction widened the range of the intensity ratio distribution, which suggests that RMA background correction may help to detect more CNVs than no correction.
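The two normalization schemes compared above can be illustrated with a small sketch. It assumes positive probe intensities, equal-length samples, and naive tie handling, and it leaves out RMA background correction entirely. The function names and toy values are hypothetical and are not taken from the paper.

```python
from statistics import median


def median_normalize(samples):
    """Rescale each sample so that its median matches the median of all sample medians."""
    medians = [median(s) for s in samples]
    target = median(medians)
    return [[x * target / m for x in s] for s, m in zip(samples, medians)]


def quantile_normalize(samples):
    """Give every sample the same distribution: the mean of the values at each rank."""
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # probe indices by ascending intensity
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy intensities for three arrays and four probes (made up for illustration).
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
by_median = median_normalize(arrays)
by_quantile = quantile_normalize(arrays)
```

Median normalization only shifts the scale of each array, whereas quantile normalization also equalizes the shape of the distributions while preserving the rank order of probes within each array.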

Plant genome analysis using flow cytometry

  • Lee Jai-Heon;Kim Kee-Young;Chung Dae-Soo;Chung Won Bok;Kwon Oh-Chang
    • Proceedings of the Korean Society of Crop Science Conference / 1999.05a / pp.162-163 / 1999
  • The goal of this research was (1) to describe the conditions and parameters required for cell cycle synchronization and the accumulation of large numbers of metaphase cells in maize and other cereal root tips, (2) to isolate intact metaphase chromosomes from root tips suitable for characterization by flow cytometry, and (3) to construct chromosome-specific libraries from maize. Plant metaphase chromosomes have been successfully synchronized and isolated from many cereal root tips. A DNA synthesis inhibitor (hydroxyurea) was used to synchronize the cell cycle, followed by treatment with trifluralin to accumulate metaphase chromosomes. Maize flow karyotypes show substantial variation among inbred lines. This variation should be useful in isolating individual chromosome types. In addition, flow cytometry is a useful method for measuring the DNA content of individual chromosomes in a genotype and for detecting chromosomal variations. Individual chromosome peaks have been sorted from the maize hybrid B73/Mo17. Libraries were generated from the DOP-PCR amplification product from each peak. To date, we have analyzed clones from a library constructed from the maize chromosome 1 peak. Hybridization of labeled genomic DNA to clone inserts indicated that 24%, 18%, and 58% of the clones were highly repetitive, medium repetitive, and low copy, respectively. Fifty percent of the putative low copy clones showed single bands on inbred screening blots, and the remaining 50% were low copy repeats. Single copy clones showing polymorphism will be mapped using recombinant inbred mapping populations. Repetitive clones are being characterized by Southern blot analysis and will be screened by in situ hybridization for their potential utility as chromosome-specific markers.
