• Title/Summary/Keyword: Genome database


Considerations on gene chip data analysis

  • Lee, Jae-K.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2001.08a / pp.77-102 / 2001
  • Different high-throughput chip technologies are available for genome-wide gene expression studies. Quality control and prescreening analysis are important for rigorous analysis on each type of gene expression data. Statistical significance evaluation of differential expression patterns is needed. Major genome institutes develop database and analysis systems for information sharing of precious expression data.


Computing Post-translation Modification using FTMS

  • Shen, Wei;Sung, Wing-Kin;SZE, Siu Kwan
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.331-336 / 2005
  • Discovering post-translational modifications (PTMs) is an important problem in proteomics. In the past, PTMs were discovered by tandem mass spectrometry using a ‘bottom-up’ strategy. However, this strategy fails to discover all PTMs. Recently, thanks to improvements in proteomic technology, Taylor et al. proposed database software that discovers PTMs with a ‘top-down’ strategy using FTMS, which avoids the disadvantages of the ‘bottom-up’ approach. However, their algorithm runs in exponential time, requires a protein database, and needs prior knowledge of PTM sites. This paper proposes a new algorithm that works without a protein database, identifies modifications in polynomial time, and needs no prior knowledge of PTM sites.


In Silico Identification of 6-Phosphogluconolactonase Genes that are Frequently Missing from Completely Sequenced Bacterial Genomes

  • Jeong, Hae-Young;F. Kim, Ji-Hyun;Park, Hong-Seog
    • Genomics & Informatics / v.4 no.4 / pp.182-187 / 2006
  • 6-Phosphogluconolactonase (6PGL) is one of the key enzymes in the ubiquitous pathways of central carbon metabolism, but bacterial 6PGL had long been known as a missing enzyme even after complete bacterial genome sequences became available. Although recent experimental characterization suggests that there are two types of 6PGLs (DevB and YbhE), their phylogenetic distribution is severely biased. Here we show that proteins in the COG group previously described as 3-carboxymuconate cyclase (COG2706) are actually YbhE-type 6PGLs, which are widely distributed in Proteobacteria and Firmicutes. This case exemplifies how an erroneous functional description of a member of a reference database commonly used in transitive genome annotation causes systematic problems in gene prediction, even for genes with universal cellular functions.

Genome size estimation of 43 Korean Carex

  • LEE, Bora;CHO, Yanghoon;KIM, Sangtae
    • Korean Journal of Plant Taxonomy / v.49 no.4 / pp.334-344 / 2019
  • The genome size is defined as the amount of DNA in an unreplicated gametic chromosome complement and is expressed as the 1C value. It is a fundamental parameter of organisms that is useful for studies of the genome, as well as biodiversity and conservation. The genome sizes of Korean plants, including Carex (Cyperaceae), have been poorly reported. In this study, we report the genome sizes of 43 species and infraspecific taxa of Korean Carex using flow cytometry; these results represent about 24.4% of the Carex species and infraspecific taxa distributed on the Korean peninsula. The Plant DNA C-Value Database (release 7.1), updated to include our data (a total of 372 Carex accessions), shows that the average genome size of Carex species is 0.47 pg (1C), and the largest genome (C. cuspidata Bertol.; 1C = 1.64 pg) is 8.2 times larger than the smallest (C. brownii Tuck., C. kobomugi Ohwi, C. nubigena D. Don ex Tilloch & Taylor, and C. paxii Kuk.; 1C = 0.20 pg). Large genomes are frequently found in subgen. Carex, especially in sect. Aulocystis, sect. Digitatae, sect. Glaucae, sect. Paniceae, and sect. Siderostictae. Our data update the current understanding of genome sizes in Carex. This will serve as the basis for understanding the phylogeny and evolution of Carex and will be especially useful for future genome studies.
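
The fold difference quoted in this abstract can be checked with a few lines of arithmetic. The sketch below also converts 1C values from picograms to megabase pairs, assuming the commonly used factor of 1 pg = 978 Mbp; that factor and the function names are assumptions for illustration and do not appear in the abstract.

```python
# Assumed conversion factor (not stated in the abstract): 1 pg DNA = 978 Mbp.
MBP_PER_PG = 978.0


def pg_to_mbp(c_value_pg: float) -> float:
    """Convert a 1C value in picograms to megabase pairs."""
    return c_value_pg * MBP_PER_PG


def fold_difference(largest_pg: float, smallest_pg: float) -> float:
    """Ratio of the largest to the smallest genome size."""
    return largest_pg / smallest_pg


if __name__ == "__main__":
    # 1C values reported in the abstract.
    largest, smallest, mean = 1.64, 0.20, 0.47
    print(f"fold difference: {fold_difference(largest, smallest):.1f}")
    print(f"mean genome size: {pg_to_mbp(mean):.0f} Mbp")
```

The ratio 1.64 / 0.20 rounds to the 8.2-fold difference reported above.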

Pan-Genome Analysis Reveals Origin Specific Genome Expansion in Enterococcus mundtii Strains

  • Neeti Pandey;Raman Rajagopal;Shubham Dhara
    • Microbiology and Biotechnology Letters / v.52 no.2 / pp.163-178 / 2024
  • Pan-genome analysis is used to interpret genome heterogeneity and diversification of bacterial species. Here, we present a pan-genome analysis of 22 strains of Enterococcus mundtii. GenBank files of E. mundtii strains isolated from different sources (human fecal matter, soil, leaf, dairy products, and insects) were downloaded from the National Center for Biotechnology Information (NCBI) database and analyzed using the BPGA-1.3.0 (Bacterial Pan Genome Analysis) pipeline. Out of a total of 4,503 gene families, 1,843 belong to the core genes, whereas 1,762 gene families represent the accessory genes and 898 gene families are unique genes among the selected genomes. The majority of the core genes belong to the categories of Metabolism (37.83%) and Information storage & processing (29.84%), whereas the unique genes belong mainly to the category of Information storage & processing (48.08%). Accessory genes are almost equally present in both functional categories, Information storage & processing and Metabolism (34.34% and 32.27%, respectively). Further, subset analysis based on the origin of the isolates reveals the presence and absence of exclusive gene families. These observations suggest that even closely related strains of a species show extensive genomic disparity owing to their ability to adapt to a specific environment.
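
The core, accessory, and unique split described in this abstract can be illustrated with a minimal sketch. It assumes the usual definitions (core: present in every genome; unique: present in exactly one; accessory: everything in between) and a hypothetical presence table; it is not the BPGA pipeline itself, which clusters gene sequences before this step.

```python
from collections import Counter


def classify_gene_families(presence: dict[str, set[str]], n_genomes: int) -> dict[str, str]:
    """Label each gene family by how many genomes contain it."""
    labels = {}
    for family, genomes in presence.items():
        count = len(genomes)
        if count == n_genomes:
            labels[family] = "core"
        elif count == 1:
            labels[family] = "unique"
        else:
            labels[family] = "accessory"
    return labels


def summarize(labels: dict[str, str]) -> Counter:
    """Count gene families per category."""
    return Counter(labels.values())


if __name__ == "__main__":
    # Hypothetical toy data: three genomes, four gene families.
    toy = {
        "famA": {"g1", "g2", "g3"},
        "famB": {"g1", "g2"},
        "famC": {"g3"},
        "famD": {"g1", "g2", "g3"},
    }
    print(summarize(classify_gene_families(toy, n_genomes=3)))

    # Sanity check on the counts reported in the abstract.
    assert 1843 + 1762 + 898 == 4503
```

The three reported categories add up to the 4,503 gene families stated in the abstract.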

Conserved Genes and Metabolic Pathways in Prokaryotes of the Same Genus

  • Lee, Dong-Geun;Lee, Sang-Hyeon
    • Journal of Life Science / v.29 no.1 / pp.123-128 / 2019
  • The use of 16S rDNA is commonplace in the determination of prokaryotic species. However, it has limitations, and there are few studies at the genus level. We investigated conserved genes and metabolic pathways at the genus level in 28 strains of 13 genera of prokaryotes using the COG database (conserved genes) and the MetaCyc database (metabolic pathways). The ratio of conserved genes to total genes (core genome) at the genus level ranged from 27.62% (Nostoc genus) to 71.76% (Spiribacter genus), with an average of 46.72%. A lower core-genome ratio means a higher ratio of genes peculiar to each prokaryote, suggesting that specific biological activities or habitats may vary. The ratio of common metabolic pathways at the genus level was higher than the ratio of core genomes, from 58.79% (Clostridium genus) to 96.31% (Mycoplasma genus), with an average of 75.86%. When compared with other genera, members of the same genus were positioned in the closest nodes to each other. Interestingly, the Bacillus and Clostridium genera were positioned in closer nodes than the other genera. Archaebacterial genera were grouped together in the ortholog and metabolic pathway nodes in a phylogenetic tree. The genera Granulicella, Nostoc, and Bradyrhizobium of the Acidobacteria, Cyanobacteria, and Proteobacteria phyla, respectively, were grouped in an ortholog content tree. The results of this study can be used for (i) the identification of common genes and metabolic pathways at each phylogenetic level and (ii) the improvement of strains through horizontal gene transfer or site-directed mutagenesis.

D2GSNP: a web server for the selection of Single Nucleotide Polymorphisms within human disease genes

  • Kang Hyo-Jin;Hong Tae-Hui;Chung Won-Hyong;Kim Young-Uk;Jung Jin-Hee;Hwang So-Hyun;Han A-Reum;Kim Young-Joo
    • Genomics & Informatics / v.4 no.1 / pp.45-47 / 2006
  • D2GSNP is a web-based server for the selection of single nucleotide polymorphisms (SNPs) within genes related to human diseases. D2GSNP is based on a relational database created by downloading and parsing OMIM, GAD, and dbSNP, and merging them with positional information from UCSC Golden Path. In total, our server provides 5,142 and 1,932 non-redundant disease genes from OMIM and GAD, respectively. With the D2GSNP web interface, users can select SNPs within genes corresponding to certain diseases and get their flanking sequences for further genotyping experiments such as association studies.
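
The positional selection this abstract describes, finding SNPs that fall inside a gene and extracting their flanking sequence, might be sketched as follows. All record types, field names, identifiers, and coordinates here are hypothetical; the abstract does not describe the server's actual schema or queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    symbol: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


@dataclass(frozen=True)
class Snp:
    snp_id: str
    chrom: str
    pos: int    # 1-based


def snps_within_gene(gene: Gene, snps: list[Snp]) -> list[Snp]:
    """Return SNPs on the gene's chromosome that lie within its interval."""
    return [s for s in snps if s.chrom == gene.chrom and gene.start <= s.pos <= gene.end]


def flanking_sequence(chrom_seq: str, pos: int, flank: int) -> tuple[str, str, str]:
    """Return (left flank, reference base, right flank) around a 1-based position."""
    i = pos - 1
    left = chrom_seq[max(0, i - flank):i]
    right = chrom_seq[i + 1:i + 1 + flank]
    return left, chrom_seq[i], right


if __name__ == "__main__":
    # Hypothetical toy records, not real gene or SNP data.
    gene = Gene("DEMO1", "chr1", 3, 8)
    snps = [Snp("snp_a", "chr1", 5), Snp("snp_b", "chr1", 9), Snp("snp_c", "chr2", 5)]
    for snp in snps_within_gene(gene, snps):
        print(snp.snp_id, flanking_sequence("ACGTACGTAC", snp.pos, flank=2))
```

In a relational database the same filter would typically be a join on chromosome with a range condition on position.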

Gene annotation by the "interactome" analysis in KEGG

  • Kanehisa, Minoru
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.56-58 / 2000
  • Post-genomics may be defined in different ways depending on how one views the challenges after the genome. A popular view is to follow the concept of the central dogma in molecular biology, namely from genome to transcriptome to proteome. Projects are going on to analyze gene expression profiles both at the mRNA and protein levels and to catalog protein 3D structure families, which will no doubt help the understanding of information in the genome. However complete, such catalogs of genes, RNAs, and proteins only tell us about the building blocks of life. They do not tell us much about the wiring (interaction) of building blocks, which is essential for uncovering systemic functional behaviors of the cell or the organism. Thus, an alternative view of post-genomics is to go up from the molecular level to the cellular level, and to understand what I call the "interactome", or a complete picture of molecular interactions in the cell. KEGG (http://www.genome.ad.jp/kegg/) is our attempt to computerize current knowledge on various cellular processes as a collection of "generalized" protein-protein interaction networks, to develop new graph-based algorithms for predicting such networks from the genome information, and to actually reconstruct the interactomes for all the completely sequenced genomes and some partial genomes. During the reconstruction process, it becomes readily apparent that certain pathways and molecular complexes are present or absent in each organism, indicating modular structures of the interactome. In addition, the reconstruction uncovers missing components in an otherwise complete pathway or complex, which may result from misannotation of the genome or misrepresentation of the KEGG pathway. When combined with additional experimental data on protein-protein interactions, such as by yeast two-hybrid systems, the reconstruction possibly uncovers unknown partners for a particular pathway or complex. Thus, the reconstruction is tightly coupled with the annotation of individual genes, which is maintained in the GENES database in KEGG. We are also trying to expand our literature survey to include in the GENES database the most up-to-date information about gene functions.


Global Report 3 / Sex as Implemented in Games

  • Korea Database Promotion Center
    • Digital Contents / no.5 s.120 / pp.158-164 / 2003
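  • To create new and distinctive protagonists for computer games, game production requires a "genetic design" of the characters. Characters created this way have a genome that encodes all of their traits, including sex, shape, size, and color. Nothing is impossible through the genetic design of game characters: mutations, the determination of lifespan, the inheritance of acquired characteristics, and even learned behavior can be specified. In a computer game, anything one likes can be designed.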
