• Title/Summary/Keyword: whole genome analysis


Complete Genome Sequence of Enterococcus faecalis CAUM157 Isolated from Raw Cow's Milk

  • Elnar, Arxel G.;Lim, Sang-Dong;Kim, Geun-Bae
    • Journal of Dairy Science and Biotechnology
    • /
    • v.38 no.3
    • /
    • pp.142-145
    • /
    • 2020
  • Enterococcus faecalis CAUM157, isolated from raw cow's milk, is a Gram-positive, facultatively anaerobic, and non-spore-forming bacterium capable of inhabiting a wide range of environmental niches. E. faecalis CAUM157 was observed to produce a two-peptide bacteriocin that had a wide range of activity against several pathogens, including Listeria monocytogenes, Staphylococcus aureus, and periodontitis-causing bacteria. The whole genome of E. faecalis CAUM157 was sequenced using the PacBio RS II platform, revealing a genome size of 2,972,812 bp with a G+C ratio of 37.44%, assembled into two contigs. Annotation analysis revealed 2,830 coding sequences, 12 rRNAs, and 61 tRNAs. Further, in silico analysis of the genome identified a single bacteriocin gene cluster.
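The G+C ratio reported above is the share of G and C bases among all called bases. A minimal sketch of that arithmetic follows; the function name `gc_content` and the toy sequence are illustrative assumptions, not taken from the CAUM157 assembly.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Symbols other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * gc / (gc + at)


# Toy sequence (hypothetical): 4 of 10 bases are G or C.
toy = "ATGCGCATTA"
print(f"{gc_content(toy):.2f}%")
```

For a real assembly the same function would be applied to the concatenated contig sequences read from a FASTA file.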

High-Resolution Microarrays for Mapping Promoter Binding sites and Copy Number Variation in the Human Genome

  • Albert Thomas
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2006.02a
    • /
    • pp.125-126
    • /
    • 2006
  • NimbleGen has developed strategies to use its high-density oligonucleotide microarray platform (385,000 probes per array) to map both promoter binding sites and copy number variation at very high resolution in the human genome. Here we describe a genome-wide map of active promoters determined by experimentally locating the sites of transcription initiation complex binding throughout the human genome using microarrays combined with chromatin immunoprecipitation. This map defines 10,567 active promoters corresponding to 6,763 known genes and at least 1,196 unannotated transcriptional units. Microarray-based comparative genomic hybridisation (CGH) is an important research tool for investigating chromosomal aberrations frequently associated with complex diseases such as cancer, neuropsychiatric disorders, and congenital developmental disorders. NimbleGen array CGH is an ultra-high-resolution (0.5-50 Kb) oligo array platform that can be used to detect amplifications and deletions and to map the associated breakpoints at the whole-genome level or with custom fine-tiling arrays. For whole-genome array CGH, probes are tiled through genic and intergenic regions with a median probe spacing of 6 Kb, which provides a comprehensive, unbiased analysis of the genome.
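Array CGH data are commonly summarised as per-probe log2 ratios of test to reference signal, with amplifications and deletions showing up as runs of positive or negative values. The sketch below illustrates that general idea only; the thresholds of ±0.3, the function names, and the intensities are assumptions for illustration and do not describe NimbleGen's analysis software.

```python
import math


def log2_ratios(test, reference):
    """Per-probe log2(test / reference) signal ratios."""
    return [math.log2(t / r) for t, r in zip(test, reference)]


def call_probes(ratios, gain=0.3, loss=-0.3):
    """Label each probe by simple thresholding (illustrative cutoffs)."""
    labels = []
    for r in ratios:
        if r >= gain:
            labels.append("gain")
        elif r <= loss:
            labels.append("loss")
        else:
            labels.append("neutral")
    return labels


def breakpoints(labels):
    """Indices where the call changes between neighbouring probes."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Hypothetical intensities for four probes ordered along a chromosome.
test_signal = [100, 200, 50, 100]
ref_signal = [100, 100, 100, 100]
calls = call_probes(log2_ratios(test_signal, ref_signal))
print(calls, breakpoints(calls))
```

Real pipelines usually normalise the signals and segment the ratios statistically before calling; per-probe thresholding is used here only to keep the example short.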


Five Computer Simulation Studies of Whole-Genome Fragment Assembly: The Case of Assembling Zymomonas mobilis ZM4 Sequences

  • Jung, Cholhee;Choi, Jin-Young;Park, Hyun Seck;Seo, Jeong-Sun
    • Genomics & Informatics
    • /
    • v.2 no.4
    • /
    • pp.184-190
    • /
    • 2004
  • An approach to genome analysis based on assembling DNA fragments from the whole genome can be applied to obtain the complete nucleotide sequence of the genome of Zymomonas mobilis. However, the problem of fragment assembly raises thorny computational issues. Computer simulation studies of sequence assembly usually show some abnormal assemblies of artificial sequences containing repetitive or duplicated regions, and suggest methods to correct those abnormalities. In this paper, we describe five simulation studies that were performed prior to the actual genome assembly of Zymomonas mobilis ZM4.
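The kind of simulation described above can be sketched in a few lines: draw random fragments from an artificial genome that contains a duplicated region, then check coverage and how many genome positions a given fragment could have come from. This is a minimal illustration under stated assumptions, not the authors' simulation design; the toy genome, fragment length, and function names are all hypothetical.

```python
import random


def simulate_fragments(genome, n_fragments, frag_len, seed=0):
    """Sample (start, sequence) fragments uniformly at random from the genome."""
    rng = random.Random(seed)
    last_start = len(genome) - frag_len
    starts = [rng.randrange(last_start + 1) for _ in range(n_fragments)]
    return [(s, genome[s:s + frag_len]) for s in starts]


def per_base_coverage(genome_len, fragments):
    """Count how many fragments cover each genome position."""
    cov = [0] * genome_len
    for start, frag in fragments:
        for i in range(start, start + len(frag)):
            cov[i] += 1
    return cov


def occurrences(genome, frag):
    """Number of (possibly overlapping) positions where frag matches the genome."""
    return sum(genome.startswith(frag, i) for i in range(len(genome)))


# Artificial genome with one duplicated 13-bp unit (hypothetical).
unit = "ACGTTGCAAGGCT"
genome = "TTAGGC" + unit + "CCATGA" + unit + "GATTACA"

frags = simulate_fragments(genome, n_fragments=20, frag_len=8, seed=1)
ambiguous = [f for _, f in frags if occurrences(genome, f) > 1]
print(f"{len(ambiguous)} of {len(frags)} fragments match more than one position")
```

Fragments that fall entirely inside the duplicated unit match two positions, which is the source of the misassemblies that such simulations are meant to expose.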

Utilization of whole genome treasure for the library construction of industrial enzymes

  • Kim, Won-Ho;Cho, Kyoung-Won;Jung, In-Su;Choi, Keum-Hwa;Hur, Byung-Ki;Kim, Geun-Joong
    • Proceedings of the Korean Society for Biotechnology and Bioengineering Conference
    • /
    • 2003.10a
    • /
    • pp.815-820
    • /
    • 2003
  • The huge databases resulting from whole genome sequencing offer new information that is likely to extend the scope of, and thus change the approach to, the functional assignment of putative open reading frames annotated by whole genome sequence analyses. This is mainly realized through easy, one-step identification of putative genes using genomics or proteomics tools. A major remaining challenge in biotechnology is to translate this information into better ways to screen or select a gene as a representative sequence. Further attempts to mine related whole genes or partial DNA fragments from the whole genome treasure, followed by incorporation of these sequences into a representative template, will result in the use of putative genes that can be translated into functional proteins or allow the generation of new lineages as a valuable pool. Such screens enable rapid biochemical analysis and easy isolation of the target activity, thereby accelerating the screening of novel enzymes from the expanded library of related sequences. Information-based PCR amplification of whole genes and reconstitution of functional DNA fragments will provide a platform for expanding the functional space of potential enzymes, especially when mixed genomes or metagenomes are used as gene resources.


Whole Genome Analysis of Human Papillomavirus Type 16 Multiple Infection in Cervical Cancer Patients

  • Chansaenroj, Jira;Theamboonlers, Apiradee;Junyangdikul, Pairoj;Swangvaree, Sukumarn;Karalak, Anant;Poovorawan, Yong
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.13 no.2
    • /
    • pp.599-606
    • /
    • 2012
  • This study characterized the whole genome of human papillomavirus type 16 (HPV16) from cervical cancer specimens with multiple infections in comparison with single-infection samples, as the oncogenic potential of the virus may differ. Cervical carcinoma specimens positive for HPV16 by PCR and INNO-LiPA were randomly selected for whole genome characterization. Two HPV16 single-infection and six HPV16 multiple-infection specimens were subjected to whole genome analysis using conserved primers and subsequent sequencing. All HPV16 whole genomes from single-infection samples clustered in the European (E) lineage, while all multiple-infection specimens belonged to the non-European lineage. The variations in nucleotide sequences in E6, E7, E2, L1 and the long control region (LCR) were evaluated. In the E6 region, the amino acid change L83V was related to increased cancer progression. An amino acid variation, N29S, within the E7 oncoprotein that was significantly associated with lesion severity was also discovered. Nonsynonymous mutations were found in all three domains of the E2 gene. The L1 region showed various mutations that may be related to conformational changes of viral epitopes. Some transcription factor binding sites in the LCR correlated with virulence were shown on GRE/1, TEF-1, YY14 and Oct-1. The HPV16 European variant, prone to single infection, may harbor a major variation at L83V that significantly increases the risk of developing cervical carcinoma. HPV16 non-European variants, prone to multiple infections, may require many polymorphisms to enhance the risk of cervical cancer development.

Survey of the Applications of NGS to Whole-Genome Sequencing and Expression Profiling

  • Lim, Jong-Sung;Choi, Beom-Soon;Lee, Jeong-Soo;Shin, Chan-Seok;Yang, Tae-Jin;Rhee, Jae-Sung;Lee, Jae-Seong;Choi, Ik-Young
    • Genomics & Informatics
    • /
    • v.10 no.1
    • /
    • pp.1-8
    • /
    • 2012
  • Recently, technologies for DNA sequence variation analysis and gene expression profiling have been widely used in genome biology and genetics. Their application to genome studies has developed particularly with the introduction of the next-generation DNA sequencer (NGS) Roche/454 and Illumina/Solexa systems, along with bioinformatics analysis technologies for whole-genome de novo assembly, expression profiling, DNA variation discovery, and genotyping. Both massive whole-genome shotgun paired-end sequencing and mate paired-end sequencing data are important for constructing a de novo assembly of novel genome sequencing data. For effective de novo assembly, it is necessary to have DNA sequence information from multiplatform NGS with at least 2× and 30× depth of genome coverage using Roche/454 and Illumina/Solexa, respectively. Massive short-read data from the Illumina/Solexa system are sufficient to discover DNA variation, which reduces the cost of DNA sequencing. Whole-genome expression profile data are useful for approaching genome systems biology through quantification of expressed RNAs from a whole-genome transcriptome, depending on the tissue samples. Hybrid mRNA sequences from Roche/454 and Illumina/Solexa are more powerful for finding novel genes through de novo assembly in any whole-genome sequenced species. Coverage of 20× and 50× of the estimated transcriptome sequences using Roche/454 and Illumina/Solexa, respectively, is effective for creating novel expressed reference sequences. However, an average of only 30× coverage of a transcriptome with Illumina/Solexa short reads is enough to check expression quantification against the reference expressed sequence tag sequences.
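Depth figures such as 2× and 30× follow from the usual relation depth = (number of reads × read length) / genome size. The sketch below works that arithmetic in both directions; the genome size and read length in the example are hypothetical values chosen for illustration, not figures from the abstract.

```python
import math


def sequencing_depth(n_reads: int, read_len: int, genome_size: int) -> float:
    """Average depth of coverage: total sequenced bases divided by genome size."""
    return n_reads * read_len / genome_size


def reads_needed(genome_size: int, read_len: int, target_depth: float) -> int:
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_depth * genome_size / read_len)


# Hypothetical example: a 3 Mb genome sequenced with 100 bp reads to 30x.
n = reads_needed(genome_size=3_000_000, read_len=100, target_depth=30)
print(n, sequencing_depth(n, 100, 3_000_000))
```

This is an average only; actual per-base coverage varies along the genome, so projects typically sequence somewhat beyond the nominal target.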

Multi-omics techniques for the genetic and epigenetic analysis of rare diseases

  • Yeonsong Choi;David Whee-Young Choi;Semin Lee
    • Journal of Genetic Medicine
    • /
    • v.20 no.1
    • /
    • pp.1-5
    • /
    • 2023
  • Until now, rare disease studies have mainly been carried out by detecting simple variants such as single nucleotide substitutions and short insertions and deletions in protein-coding regions of disease-associated gene panels using diagnostic next-generation sequencing in association with patient phenotypes. However, several recent studies reported that the detection rate hardly exceeds 50% even when whole-exome sequencing is applied. Therefore, the need to introduce whole-genome sequencing is emerging, in order to discover more diverse genomic variants and examine their association with rare diseases. When whole-genome sequencing provides no diagnosis, additional omics techniques such as RNA-seq can also be considered to further interrogate causal variants. This paper describes these multi-omics techniques and their applications in rare disease studies.

Genomic Variations of Rice Regenerants from Tissue Culture Revealed by Whole Genome Re-Sequencing

  • Qin, Yang;Shin, Kong-Sik;Woo, Hee-Jong;Lim, Myung-Ho
    • Plant Breeding and Biotechnology
    • /
    • v.6 no.4
    • /
    • pp.426-433
    • /
    • 2018
  • Plant tissue culture is a technique that has been widely used for various purposes, such as obtaining transgenic plants for crop improvement or functional analysis of genes. However, this process can be associated with a variety of genetic and epigenetic instabilities in regenerated plants, termed somaclonal variation. In this study, we investigated the mutation spectrum, chromosomal distributions, and nucleotide substitution types of single-nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) by whole genome re-sequencing of Dongjin and Nipponbare along with regenerated plants of Dongjin from different induction periods. Results indicated that the mutation frequency in regenerated rice relative to the Dongjin genome ranged from $9.14 \times 10^{-5}$ to $1.37 \times 10^{-4}$ during one- to three-month callus inductions, while the natural mutation rate between the Dongjin and Nipponbare genomes was $6.97 \times 10^{-4}$. Non-random chromosomal distribution of SNPs and InDels was observed in both regenerant and Dongjin genomes, with the highest densities on chromosome 11. The transition to transversion ratio was 2.25 in common SNPs of regenerants relative to the Dongjin genome, with the highest frequency for C/T transitions, which was similar to that of Dongjin relative to the Nipponbare genome.
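A transition to transversion ratio like the one reported above is computed by classifying each SNP as a purine-to-purine or pyrimidine-to-pyrimidine change (transition) or a purine-to-pyrimidine change in either direction (transversion). The following is a minimal sketch; the SNP list is a toy example, not data from the study, and the function names are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for A<->G or C<->T substitutions."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(snps) -> float:
    """Transition/transversion ratio for (ref, alt) pairs with ref != alt."""
    ti = sum(is_transition(r, a) for r, a in snps)
    tv = len(snps) - ti
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ti / tv


# Toy SNP set (hypothetical): four transitions and two transversions.
toy_snps = [("C", "T"), ("A", "G"), ("G", "A"), ("A", "C"), ("T", "C"), ("G", "T")]
print(ti_tv_ratio(toy_snps))
```

In practice the (ref, alt) pairs would come from the SNP records of a variant call file.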

Chromosome-specific polymorphic SSR markers in tropical eucalypt species using low coverage whole genome sequences: systematic characterization and validation

  • Patturaj, Maheswari;Munusamy, Aiswarya;Kannan, Nithishkumar;Kandasamy, Ulaganathan;Ramasamy, Yasodha
    • Genomics & Informatics
    • /
    • v.19 no.3
    • /
    • pp.33.1-33.10
    • /
    • 2021
  • Eucalyptus is one of the major plantation species, with a wide variety of industrial uses. Polymorphic and informative simple sequence repeats (SSRs) have a broad range of applications in genetic analysis. In this study, two individuals of Eucalyptus tereticornis (ET217 and ET86) and one individual each of E. camaldulensis (EC17) and E. grandis (EG9) were subjected to whole genome resequencing. Low coverage (10×) genome sequencing was used to find polymorphic SSRs between the individuals. The average number of SSR loci identified was 95,513, and the density of SSRs per Mb ranged from 155.08 in EC17 to 157.39 in EG9. Among all the SSRs detected, the most abundant repeat motifs were di-nucleotide (59.6%-62.5%), followed by tri- (23.7%-27.2%), tetra- (5.2%-5.6%), penta- (5.0%-5.3%), and hexa-nucleotide (2.7%-2.9%). The predominant SSR motif units were AG/CT and AAG/TTC. Computational genome analysis predicted the SSR length variations between the individuals and identified the gene functions of SSR-containing sequences. A selected subset of polymorphic markers was validated in a full-sib family of eucalypts. Additionally, genome-wide characterization of single nucleotide polymorphisms, InDels and transcriptional regulators was carried out. These variations will find utility in genome-wide association studies as well as in understanding the molecular mechanisms involved in key economic traits. The genomic resources generated in this study should provide an impetus to integrate genomics into marker-trait associations and breeding of tropical eucalypts.
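SSR loci of the di- to hexa-nucleotide classes listed above can be located with a back-referencing regular expression. The sketch below shows one way to do this; the minimum repeat counts per motif length are illustrative assumptions (the abstract does not state the thresholds or the tool used), and the sequence is a toy example.

```python
import re

# Motif length -> minimum number of tandem copies (assumed thresholds).
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif: str) -> bool:
    """False if the motif is itself a repeat of a shorter unit (e.g. AA, AGAG)."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq: str, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    min_repeats = min_repeats or DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence (hypothetical) with (AG)7 and (AAG)5 repeats.
toy = "TTGC" + "AG" * 7 + "CCTA" + "AAG" * 5 + "GTCA"
print(find_ssrs(toy))
```

The primitive-motif check stops a dinucleotide run from also being reported as a tetra- or hexa-nucleotide repeat, and it excludes homopolymer runs.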