• Title/Summary/Keyword: prokaryotic genomes

Evaluation of DNA Microarray Approach for Identifying Strain-Specific Genes

  • Hwang, Keum-Ok;Cho, Jae-Chang
    • Journal of Microbiology and Biotechnology / v.16 no.11 / pp.1773-1777 / 2006
  • We evaluated the usefulness of DNA microarrays as a comparative genomics tool and tested the validity of the cutoff values for defining absent genes in test genomes. Three genome-sequenced E. coli strains (K-12, EDL933, and CFT073) were subjected to comparative genomic hybridization with DNA microarrays covering almost all ORFs of the reference strain K-12, and the microarray results were compared with the results obtained from in silico analyses of the genome sequences. To define the K-12 ORFs absent in test genomes (reference strain-specific ORFs), we applied and evaluated a cutoff level of -1. The average sequence similarity between ORFs whose corresponding spots showed a log-ratio of > -1 was $96.9 \pm 4.8$. The numbers of spots showing a log-ratio of < -1 (P < 0.05, t-test) were 90 (2.5%) and 417 (10.6%) for the EDL933 genome and the CFT073 genome, respectively. The frequency of false negatives (FN) was ca. 0.2, and a cutoff level of -1.3 was required to achieve an FN of 0.1. The average sequence similarity of the false negative ORFs was $77.8 \pm 14.8$, indicating that the majority of the false negatives were caused by highly divergent genes. We concluded that the microarray is useful for identifying missing or divergent ORFs in closely related prokaryotic genomes.
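
The cutoff evaluation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that an ORF is called absent when its log-ratio is below the cutoff, and that a false negative is an ORF called absent by the microarray although the in silico comparison finds it in the test genome; the abstract does not spell out either definition or the denominator of the FN frequency. The ORF names and values are invented toy data, and `call_absent` and `false_negative_frequency` are hypothetical helper names.

```python
def call_absent(log_ratios, cutoff=-1.0):
    """Return the ORF ids whose log-ratio falls below the cutoff."""
    return {orf for orf, value in log_ratios.items() if value < cutoff}


def false_negative_frequency(log_ratios, present_in_silico, cutoff=-1.0):
    """Fraction of absent calls that the in silico analysis contradicts.

    Assumption: the frequency is taken over all ORFs called absent.
    """
    absent_calls = call_absent(log_ratios, cutoff)
    if not absent_calls:
        return 0.0
    false_negatives = absent_calls & present_in_silico
    return len(false_negatives) / len(absent_calls)


# Toy data (hypothetical): log-ratios per K-12 ORF for one test genome.
log_ratios = {"orfA": 0.1, "orfB": -1.2, "orfC": -2.5, "orfD": -0.4, "orfE": -1.1}
# ORFs that a sequence comparison finds in the test genome (hypothetical).
present_in_silico = {"orfA", "orfB", "orfD"}

for cutoff in (-1.0, -1.3):
    fn = false_negative_frequency(log_ratios, present_in_silico, cutoff)
    print(f"cutoff {cutoff}: FN frequency {fn:.2f}")
```

In this toy data, tightening the cutoff from -1 to -1.3 removes the divergent-but-present ORF from the absent calls, which mirrors the trade-off the abstract reports.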

Genome-Wide Analysis of Type VI System Clusters and Effectors in Burkholderia Species

  • Nguyen, Thao Thi;Lee, Hyun-Hee;Park, Inmyoung;Seo, Young-Su
    • The Plant Pathology Journal / v.34 no.1 / pp.11-22 / 2018
  • The type VI secretion system (T6SS) has been discovered in a variety of gram-negative bacteria as a versatile weapon for killing eukaryotic cells or prokaryotic competitors. Type VI secretion effectors (T6SEs) are well known as key virulence factors for important pathogenic bacteria. In many Burkholderia species, the T6SS has evolved as the most complicated secretion pathway, with distinct types that translocate diverse T6SEs, suggesting their essential roles in this genus. Here we attempted to detect and characterize T6SSs and potential T6SEs in target genomes of plant-associated and environmental Burkholderia species based on computational analyses. In total, 66 potential functional T6SS clusters were found in 30 target Burkholderia genomes, of which 33% possess three or four clusters. The core proteins in each cluster were specified, and phylogenetic trees of three components (i.e., TssC, TssD, TssL) were constructed to elucidate the relationships among the identified T6SS clusters. Next, we identified 322 potential T6SEs in the target genomes based on homology searches and explored the important domains conserved in effector candidates. In addition, using a screening approach based on the profile hidden Markov model (pHMM) of T6SEs that possess markers for type VI effectors (MIX motif) (MIX T6SEs), 57 proteins that were not included in the training datasets were recognized as novel MIX T6SE candidates from the Burkholderia species. This approach could be useful for identifying potential T6SEs from other bacterial genomes.

Computational Detection of Prokaryotic Core Promoters in Genomic Sequences

  • Kim Ki-Bong;Sim Jeong Seop
    • Journal of Microbiology / v.43 no.5 / pp.411-416 / 2005
  • The high-throughput sequencing of microbial genomes has resulted in the relatively rapid accumulation of an enormous amount of genomic sequence data. In this context, the problem of detecting promoters in genomic DNA sequences via computational methods has attracted considerable research attention in recent years. This paper addresses the development of a predictive model, known as the dependence decomposition weight matrix model (DDWMM), which was designed to detect the core promoter region, including the -10 region and the transcription start sites (TSSs), in prokaryotic genomic DNA sequences. This is an issue of some importance with regard to genome annotation efforts. Our predictive model captures the most significant dependencies between positions (allowing for non-adjacent as well as adjacent dependencies) via the maximal dependence decomposition (MDD) procedure, which iteratively decomposes data sets into subsets based on the significant dependence between positions in the promoter region to be modeled. Such dependencies may be intimately related to biological and structural concerns, since promoter elements are present in a variety of combinations, which are separated by various distances. In this respect, the DDWMM may prove to be appropriate for the detection of core promoter regions and TSSs in long microbial genomic contigs. In order to demonstrate the effectiveness of our predictive model, we performed 10-fold cross-validation experiments on 607 experimentally verified promoter sequences, which evidenced good performance in terms of sensitivity.
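
The notion of "significant dependence between positions" in this abstract can be made concrete with a small Python sketch. It is an assumption-laden illustration, not the DDWMM itself: the abstract does not say which statistic is used, so the sketch uses a plain chi-square statistic on the 4x4 table of nucleotide pairs at two aligned positions, and it performs no decomposition into subsets. The function name `chi_square_dependence` and the toy sequences are hypothetical.

```python
from collections import Counter


def chi_square_dependence(seqs, i, j, alphabet="ACGT"):
    """Chi-square statistic for dependence between aligned positions i and j.

    Assumption: all sequences have equal length and use only `alphabet`.
    """
    n = len(seqs)
    pair_counts = Counter((s[i], s[j]) for s in seqs)
    counts_i = Counter(s[i] for s in seqs)
    counts_j = Counter(s[j] for s in seqs)
    stat = 0.0
    for a in alphabet:
        for b in alphabet:
            expected = counts_i[a] * counts_j[b] / n
            if expected > 0:  # skip cells that cannot occur
                stat += (pair_counts[(a, b)] - expected) ** 2 / expected
    return stat


# Toy alignments (hypothetical): position 1 copies position 0 in the first set
# and varies independently of it in the second.
dependent = ["AA", "AA", "CC", "CC"]
independent = ["AA", "AC", "CA", "CC"]

print(chi_square_dependence(dependent, 0, 1))    # 4.0
print(chi_square_dependence(independent, 0, 1))  # 0.0
```

A decomposition procedure of the kind the abstract describes would, under this reading, repeatedly pick the position with the strongest dependencies and split the training set on it; that step is left out here.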

A New Approach to Fragment Assembly in DNA Sequencing

  • Pevzner, Pavel A.;Tang, Haixu;Waterman, Michael S.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2001.08a / pp.11-35 / 2001
  • For the last twenty years, fragment assembly in DNA sequencing has followed the "overlap - layout - consensus" paradigm that is used in all currently available assembly tools. Although this approach proved to be useful in assembling clones, it faces difficulties in genomic shotgun assembly: the existing algorithms make assembly errors and are often unable to resolve repeats even in prokaryotic genomes. Biologists are well aware of these errors and are forced to carry out additional experiments to verify the assembled contigs. We abandon the classical "overlap - layout - consensus" approach in favor of a new Eulerian Superpath approach that, for the first time, resolves the problem of repeats in fragment assembly. Our main result is the reduction of fragment assembly to a variation of the classical Eulerian path problem. This reduction opens new possibilities for repeat resolution and allows one to generate error-free solutions of large-scale fragment assembly problems. The major improvement of EULER over other algorithms is that it resolves all repeats except long perfect repeats that are theoretically impossible to resolve without additional experiments.
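
The reduction to an Eulerian path problem mentioned in this abstract can be sketched in Python. This is a textbook toy, not the EULER or Eulerian Superpath algorithm: it assumes error-free reads, a single strand, and a target sequence with no repeated (k-1)-mers, so it does no error correction or repeat resolution. Each distinct k-mer is treated as an edge from its (k-1)-prefix to its (k-1)-suffix, and Hierholzer's algorithm walks every edge once. The function names and the reads are hypothetical.

```python
from collections import defaultdict


def distinct_kmers(reads, k):
    """Collect the distinct k-mers that occur in the reads."""
    return sorted({read[i:i + k] for read in reads for i in range(len(read) - k + 1)})


def eulerian_path(kmers):
    """Hierholzer's algorithm on the graph whose edges are the k-mers."""
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for kmer in kmers:
        prefix, suffix = kmer[:-1], kmer[1:]
        graph[prefix].append(suffix)
        balance[prefix] += 1
        balance[suffix] -= 1
    # Start at the node with one surplus outgoing edge, if there is one.
    start = next((node for node, b in balance.items() if b == 1), kmers[0][:-1])
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path of (k-1)-mers."""
    path = eulerian_path(distinct_kmers(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Toy overlapping reads (hypothetical) drawn from ATGGCGTGCA.
reads = ["ATGGCGT", "GGCGTGC", "CGTGCA"]
print(assemble(reads, 4))  # ATGGCGTGCA
```

Note that no pairwise read overlaps are computed; the reads only contribute k-mers, which is the contrast with the overlap-layout-consensus paradigm that the abstract draws.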

Basic Concept of Gene Microarray

  • Hwang, Seung Yong
    • Korean Journal of Biological Psychiatry / v.8 no.2 / pp.203-207 / 2001
  • The genome sequencing project has generated and will continue to generate enormous amounts of sequence data, including 5 eukaryotic and about 60 prokaryotic genomes. Given these ever-increasing amounts of sequence information, new strategies are necessary to efficiently pursue the next phase of the genome project: the elucidation of gene expression patterns and gene product function on a whole-genome scale. In order to assign functional information to the genome sequence, DNA chip (or gene microarray) technology was developed to efficiently identify the differential expression patterns of independent biological samples. The DNA chip provides a new tool for genome expression analysis that may revolutionize many aspects of biotechnology, including new drug discovery and disease diagnostics.

Investigation of COGs (Clusters of Orthologous Groups of proteins) in 1,309 Species of Prokaryotes

  • Lee, Dong-Geun;Lee, Sang-Hyeon
    • Journal of Life Science / v.31 no.9 / pp.834-839 / 2021
  • The authors previously reported the results of analyses of COGs (Clusters of Orthologous Groups of proteins) in 711 prokaryotes. The COG data were significantly updated in 2020 using 1,309 prokaryotic genomes. Here, we report the results of analyses of 3,455,853 proteins comprising 4,877 updated COGs in terms of COGs and prokaryotes. The number of COGs in each prokaryote ranged from 97 to 2,281, with an average of 1,430.0 and a standard deviation of 414.2. Mean numbers of COGs at the phylum level ranged from a minimum of 497.86 for Mollicutes to a maximum of 1,642.90 for Cyanobacteria. The top 10 species with the highest COG retention numbers were all Proteobacteria, and 9 of the bottom 10 were species that could not be cultured in vitro. The number of proteins belonging to each COG ranged from 2 to 22,048, with each of the top 11 COGs containing over 12,000 proteins. Five of the top 11 were COGs that bind to DNA and are involved in gene expression, indicating the importance of regulating gene expression in prokaryotes in a changing environment. COG data are expected to be widely utilized, as they can be used for the identification of genes included in a genome and the selection of genes for strain improvement.

Comparison of Mitochondria-related Conserved Genes in Eukaryotes and Prokaryotes

  • Lee, Dong-Geun
    • Journal of Life Science / v.24 no.7 / pp.791-797 / 2014
  • Sixty-two conserved orthologous groups (OGs) of proteins in 63 prokaryotes and seven eukaryotes were analyzed to identify essential proteins in the mitochondria of eukaryotes and their counterparts in prokaryotes. Twenty OGs were common in eukaryotic mitochondria, and all were translation related. Encephalitozoon cuniculi, an obligate parasitic eukaryote, shares no common mitochondrial OGs with the other 69 organisms. Seventeen conserved OGs were mitochondria related in the 69 organisms. In the distance value analysis, mitochondria-related and nonrelated OGs were separated in prokaryotic genomes (p < 0.001, paired t-test), unlike in eukaryotic genomes. The most commonly conserved mitochondria-related OG was COG0048-KOG1750 (ribosomal small subunit S12), whereas among nonrelated OGs it was COG0100-KOG0407 (ribosomal small subunit S11). These results could be applied in scientific research to determine phylogenetic relationships and in areas such as drug development.