• Title/Abstract/Keywords: prokaryotic genomes

14 search results (processing time: 0.026 s)

Evaluation of DNA Microarray Approach for Identifying Strain-Specific Genes

  • Hwang, Keum-Ok;Cho, Jae-Chang
    • Journal of Microbiology and Biotechnology / Vol. 16, No. 11 / pp. 1773-1777 / 2006
  • We evaluated the usefulness of DNA microarrays as a comparative genomics tool and tested the validity of the cutoff values used to define absent genes in test genomes. Three genome-sequenced E. coli strains (K-12, EDL933, and CFT073) were subjected to comparative genomic hybridization with DNA microarrays covering almost all ORFs of the reference strain K-12, and the microarray results were compared with the results of in silico analyses of the genome sequences. To define the K-12 ORFs absent in the test genomes (reference strain-specific ORFs), we applied and evaluated a cutoff level of -1. The average sequence similarity between ORFs whose corresponding spots showed a log-ratio of > -1 was $96.9 \pm 4.8$. The numbers of spots showing a log-ratio of < -1 (P < 0.05, t-test) were 90 (2.5%) and 417 (10.6%) for the EDL933 genome and the CFT073 genome, respectively. The frequency of false negatives (FN) was ca. 0.2, and a cutoff level of -1.3 was required to achieve an FN of 0.1. The average sequence similarity of the false negative ORFs was $77.8 \pm 14.8$, indicating that the majority of the false negatives were caused by highly divergent genes. We concluded that the microarray is useful for identifying missing or divergent ORFs in closely related prokaryotic genomes.
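The cutoff-based call described in this abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: a "false negative" is read here as a spot called absent (log-ratio below the cutoff) whose ORF is nonetheless found in the test genome by in silico analysis, and the FN frequency is taken as the fraction of absent calls that are false. The function names, ORF identifiers, and log-ratio values are hypothetical toy data, not results from the paper.

```python
def call_absent(log_ratios, cutoff=-1.0):
    """Return the ORF ids whose log-ratio falls below the cutoff."""
    return {orf for orf, value in log_ratios.items() if value < cutoff}


def false_negative_frequency(log_ratios, present_in_silico, cutoff=-1.0):
    """Fraction of absent calls whose ORF is actually present in silico.

    Assumption: this denominator (all absent calls) is one plausible
    reading of the abstract, not a definition confirmed by the paper.
    """
    absent_calls = call_absent(log_ratios, cutoff)
    if not absent_calls:
        return 0.0
    false_negatives = absent_calls & present_in_silico
    return len(false_negatives) / len(absent_calls)


# Hypothetical toy data: spot log-ratios (test vs. reference) per K-12 ORF.
toy_log_ratios = {
    "orfA": 0.1, "orfB": -1.1, "orfC": -1.2,
    "orfD": -2.4, "orfE": -3.0, "orfF": -0.4,
}
# Hypothetical set of ORFs that sequence comparison finds in the test genome.
toy_present = {"orfA", "orfB", "orfF"}

fn_at_minus_1 = false_negative_frequency(toy_log_ratios, toy_present, cutoff=-1.0)
fn_at_minus_1_3 = false_negative_frequency(toy_log_ratios, toy_present, cutoff=-1.3)
```

In this toy data, tightening the cutoff from -1 to -1.3 drops the weakly negative spot of a divergent but present ORF, which mirrors the trade-off the abstract reports.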

Genome-Wide Analysis of Type VI System Clusters and Effectors in Burkholderia Species

  • Nguyen, Thao Thi;Lee, Hyun-Hee;Park, Inmyoung;Seo, Young-Su
    • The Plant Pathology Journal / Vol. 34, No. 1 / pp. 11-22 / 2018
  • The type VI secretion system (T6SS) has been discovered in a variety of gram-negative bacteria as a versatile weapon to stimulate the killing of eukaryotic cells or prokaryotic competitors. Type VI secretion effectors (T6SEs) are well known as key virulence factors for important pathogenic bacteria. In many Burkholderia species, the T6SS has evolved as the most complicated secretion pathway, with distinct types that translocate diverse T6SEs, suggesting their essential roles in this genus. Here we attempted to detect and characterize T6SSs and potential T6SEs in target genomes of plant-associated and environmental Burkholderia species based on computational analyses. In total, 66 potentially functional T6SS clusters were found in 30 target Burkholderia genomes, of which 33% possess three or four clusters. The core proteins in each cluster were specified, and phylogenetic trees of three components (i.e., TssC, TssD, TssL) were constructed to elucidate the relationships among the identified T6SS clusters. Next, we identified 322 potential T6SEs in the target genomes based on homology searches and explored the important domains conserved in effector candidates. In addition, using a screening approach based on the profile hidden Markov model (pHMM) of T6SEs that possess markers for type VI effectors (MIX motif) (MIX T6SEs), 57 proteins that were not included in the training datasets were recognized as novel MIX T6SE candidates from the Burkholderia species. This approach could be useful for identifying potential T6SEs in other bacterial genomes.

Computational Detection of Prokaryotic Core Promoters in Genomic Sequences

  • Kim Ki-Bong;Sim Jeong Seop
    • Journal of Microbiology / Vol. 43, No. 5 / pp. 411-416 / 2005
  • The high-throughput sequencing of microbial genomes has resulted in the relatively rapid accumulation of an enormous amount of genomic sequence data. In this context, the problem of detecting promoters in genomic DNA sequences via computational methods has attracted considerable research attention in recent years. This paper addresses the development of a predictive model, known as the dependence decomposition weight matrix model (DDWMM), which was designed to detect the core promoter region, including the -10 region and the transcription start sites (TSSs), in prokaryotic genomic DNA sequences. This is an issue of some importance with regard to genome annotation efforts. Our predictive model captures the most significant dependencies between positions (allowing for non-adjacent as well as adjacent dependencies) via the maximal dependence decomposition (MDD) procedure, which iteratively decomposes data sets into subsets based on the significant dependence between positions in the promoter region to be modeled. Such dependencies may be intimately related to biological and structural concerns, since promoter elements are present in a variety of combinations, which are separated by various distances. In this respect, the DDWMM may prove to be appropriate for the detection of core promoter regions and TSSs in long microbial genomic contigs. To demonstrate the effectiveness of our predictive model, we ran 10-fold cross-validation experiments on 607 experimentally verified promoter sequences, which showed good performance in terms of sensitivity.
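As background for the weight matrix idea mentioned in this abstract, here is a minimal Python sketch of a plain position weight matrix (PWM) that treats every position independently. It is not the DDWMM: the MDD step that models dependencies between positions is deliberately left out. The function names, the pseudocount, the uniform background frequency, and the four toy -10 hexamers are illustrative assumptions rather than data or settings from the paper.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against a uniform background.
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds for one window of the same width."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the best-scoring window in the sequence."""
    width = len(pwm)
    scored = [
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Hypothetical toy training set built around the TATAAT -10 consensus.
toy_sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
toy_pwm = build_pwm(toy_sites)
hit_score, hit_start = best_hit(toy_pwm, "GGCCTATAATGCGC")
```

The sketch expects uppercase A, C, G, T only. Because each column is scored on its own, it cannot capture the adjacent and non-adjacent dependencies that the abstract says MDD is meant to model.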

A New Approach to Fragment Assembly in DNA Sequencing

  • Pevzner, Pavel-A.;Tang, Haixu;Waterman, Michael-S.
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, 2001, 2nd International Symposium on Bioinformatics / pp. 11-35 / 2001
  • For the last twenty years, fragment assembly in DNA sequencing has followed the "overlap - layout - consensus" paradigm that is used in all currently available assembly tools. Although this approach proved to be useful in assembling clones, it faces difficulties in genomic shotgun assembly: the existing algorithms make assembly errors and are often unable to resolve repeats even in prokaryotic genomes. Biologists are well aware of these errors and are forced to carry out additional experiments to verify the assembled contigs. We abandon the classical "overlap - layout - consensus" approach in favor of a new Eulerian Superpath approach that, for the first time, resolves the problem of repeats in fragment assembly. Our main result is the reduction of fragment assembly to a variation of the classical Eulerian path problem. This reduction opens new possibilities for repeat resolution and allows one to generate error-free solutions of large-scale fragment assembly problems. The major improvement of EULER over other algorithms is that it resolves all repeats except long perfect repeats, which are theoretically impossible to resolve without additional experiments.
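The reduction to an Eulerian path problem mentioned in this abstract can be illustrated with a minimal Python sketch: reads are cut into k-mers, each distinct k-mer becomes an edge between its (k-1)-mer prefix and suffix, and a path that uses every edge once spells a sequence. This is only the textbook de Bruijn graph idea, not the Eulerian Superpath algorithm or the EULER tool. It assumes error-free reads from one strand and that every k-mer occurs exactly once in the target, so repeats are not handled. All names and the toy reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the suffixes of the distinct k-mers."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes an Eulerian path exists."""
    in_degree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            in_degree[node] += 1
    nodes = set(graph) | set(in_degree)
    # Start where out-degree exceeds in-degree by one, if such a node exists.
    start = next(
        (n for n in sorted(nodes) if len(graph.get(n, [])) - in_degree[n] == 1),
        min(nodes),
    )
    remaining = {node: list(successors) for node, successors in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along an Eulerian path of the k-mer graph."""
    path = eulerian_path(de_bruijn_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Hypothetical overlapping reads sampled from one short toy sequence.
toy_reads = ["ATGGCGT", "GGCGTGC", "CGTGCA"]
toy_contig = assemble(toy_reads, k=4)
```

With these toy reads every (k-1)-mer is unique, so the path is unambiguous. Once a (k-1)-mer repeats, several Eulerian paths can exist, which is the repeat problem the abstract addresses.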

Basic Concept of Gene Microarray

  • Hwang, Seung Yong (황승용)
    • Korean Journal of Biological Psychiatry / Vol. 8, No. 2 / pp. 203-207 / 2001
  • The genome sequencing project has generated, and will continue to generate, enormous amounts of sequence data, including 5 eukaryotic and about 60 prokaryotic genomes. Given these ever-increasing amounts of sequence information, new strategies are necessary to efficiently pursue the next phase of the genome project: the elucidation of gene expression patterns and gene product function on a whole-genome scale. In order to assign functional information to the genome sequence, DNA chip (or gene microarray) technology was developed to efficiently identify the differential expression patterns of independent biological samples. The DNA chip provides a new tool for genome expression analysis that may revolutionize many aspects of biotechnology, including new drug discovery and disease diagnostics.

Investigation of COGs (Clusters of Orthologous Groups of proteins) in 1,309 Species of Prokaryotes

  • Lee, Dong-Geun (이동근);Lee, Sang-Hyeon (이상현)
    • Journal of Life Science / Vol. 31, No. 9 / pp. 834-839 / 2021
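  • The authors previously reported an analysis of COGs (Clusters of Orthologous Groups of proteins) in 711 prokaryotes. In 2020, the COG database was substantially updated using 1,309 prokaryotic genomes. Here we report an analysis, from the perspectives of both COGs and prokaryotes, of the 3,455,853 proteins that make up the 4,877 updated COGs. The number of distinct COGs held by each prokaryote ranged from 97 to 2,281, with a mean of 1,430.0 and a standard deviation of 414.2. At the phylum level, the mean number of COGs was lowest in Mollicutes (497.86) and highest in Cyanobacteria (1,642.90). The ten species with the most COGs were all Proteobacteria, and nine of the ten species with the fewest were Candidatus members that cannot be cultured in vitro. The number of proteins belonging to each COG ranged from 2 to 22,048, and the top 11 COGs each contained at least 12,000 proteins. Five of the top 11 were COGs that bind DNA and are involved in gene expression, which points to the importance of regulating gene expression in prokaryotes. The COG database can be used to identify the genes contained in a genome and to select genes for strain improvement, so it is expected to be widely used.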
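The two tallies reported in this abstract (distinct COGs per genome and proteins per COG) can be sketched in a few lines of Python. The input layout, one (genome, COG) pair per protein, is an assumption made for illustration and is not the COG database's actual file format. The genome names and COG identifiers are placeholders. The sketch uses the sample standard deviation, and the abstract does not say whether the authors used the sample or the population form.

```python
from collections import defaultdict
from statistics import mean, stdev


def cogs_per_genome(assignments):
    """Count the distinct COGs observed in each genome."""
    per_genome = defaultdict(set)
    for genome, cog in assignments:
        per_genome[genome].add(cog)
    return {genome: len(cogs) for genome, cogs in per_genome.items()}


def proteins_per_cog(assignments):
    """Count how many proteins (rows) are assigned to each COG."""
    counts = defaultdict(int)
    for _genome, cog in assignments:
        counts[cog] += 1
    return dict(counts)


# Hypothetical toy table: one (genome, COG) row per protein.
toy_assignments = [
    ("g1", "COG0001"), ("g1", "COG0001"), ("g1", "COG0002"),
    ("g2", "COG0001"), ("g2", "COG0003"), ("g2", "COG0004"),
    ("g3", "COG0001"),
]

toy_counts = cogs_per_genome(toy_assignments)
toy_mean = mean(toy_counts.values())
toy_sd = stdev(toy_counts.values())  # sample standard deviation (assumption)
toy_cog_sizes = proteins_per_cog(toy_assignments)
```

Grouping the per-genome counts by phylum before averaging would give a phylum-level summary like the one the abstract reports. That step needs a genome-to-phylum mapping, which this sketch does not include.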

Comparison of Mitochondria-related Conserved Genes in Eukaryotes and Prokaryotes

  • Lee, Dong-Geun (이동근)
    • Journal of Life Science / Vol. 24, No. 7 / pp. 791-797 / 2014
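  • OGs (Orthologous Groups of proteins) conserved in both prokaryotes and eukaryotes were divided into mitochondria-related and non-mitochondria-related OGs and analyzed. Of the 62 COGs (Clusters of OG) common to prokaryotes and eukaryotes, 20 were mitochondria-related OGs, all of them translation-related, which confirms the importance of proteins in life processes. The obligate intracellular parasite Encephalitozoon cuniculi had none of the mitochondria-related OGs that were common to all the other organisms compared. Seventeen mitochondria-related OGs were conserved in all six eukaryotes other than E. cuniculi and in all 63 prokaryotes. In a distance analysis of phylogenetic trees, the conserved OGs separated into two groups, mitochondria-related and non-mitochondria-related, in prokaryotes (p<0.001, paired t-test) but not in eukaryotes (p>0.05, paired t-test). The most highly conserved ortholog was COG0048-KOG1750 (ribosomal small subunit S12) among the mitochondria-related OGs and COG0100-KOG0407 (ribosomal small subunit S11) among the non-mitochondria-related OGs. These results may serve as data for basic research, such as studies of evolutionary relationships, and for applications such as the development of therapeutics.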