• Title/Summary/Keyword: National Center for Biotechnology Information

Patome: Database of Patented Bio-sequences

  • Kim, SeonKyu;Lee, ByungWook
    • Genomics & Informatics
    • /
    • v.3 no.3
    • /
    • pp.94-97
    • /
    • 2005
  • We have built a database server called Patome, which contains annotation information for patented bio-sequences from the Korean Intellectual Property Office (KIPO). The aims of Patome are to annotate Korean patent bio-sequences and to provide information on how public database entries relate to patents. The patent sequences were annotated against the Reference Sequence (RefSeq) database or NCBI's nr database. Both the raw patent data and the annotated data are stored in the database. The annotation information can be used to determine whether a particular RefSeq ID or NCBI nr ID is related to a Korean patent. The Patome infrastructure consists of three components: the database itself, a sequence data loader, and an online database query interface. The database can be queried by submission number, organism, title, applicant name, or accession number. Patome can be accessed at http://www.patome.net. The information will be updated every two months.

Genome Sequence of Bacillus cereus FORC_021, a Food-Borne Pathogen Isolated from a Knife at a Sashimi Restaurant

  • Chung, Han Young;Lee, Kyu-Ho;Ryu, Sangryeol;Yoon, Hyunjin;Lee, Ju-Hoon;Kim, Hyeun Bum;Kim, Heebal;Jeong, Hee Gon;Choi, Sang Ho;Kim, Bong-Soo
    • Journal of Microbiology and Biotechnology
    • /
    • v.26 no.12
    • /
    • pp.2030-2035
    • /
    • 2016
  • Bacillus cereus causes food-borne illness through contaminated foods; therefore, its pathogenicity and genome sequences have been analyzed in several studies. We sequenced and analyzed B. cereus strain FORC_021, isolated from a sashimi restaurant. The genome sequence consists of 5,373,294 bp with a GC content of 35.36%, 5,350 predicted CDSs, 42 rRNA genes, and 107 tRNA genes. Based on in silico DNA-DNA hybridization values, B. cereus ATCC $14579^T$ was the closest to FORC_021 among the strains with complete genome sequences. Three major enterotoxins were detected in FORC_021. Comparative genomic analysis of FORC_021 with ATCC $14579^T$ revealed that FORC_021 harbors an additional genomic region encoding virulence factors, such as a putative ADP-ribosylating toxin, spore germination protein, internalin, and sortase. Furthermore, in vitro cytotoxicity testing showed that FORC_021 exhibited a high level of cytotoxicity toward INT-407 human epithelial cells. This genomic information on FORC_021 will help us to understand its pathogenesis and assist in managing food contamination.
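The GC content reported above is a simple base-composition statistic. As a minimal sketch (not the authors' pipeline), it can be computed as the percentage of G and C among the unambiguous bases of a sequence. The function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions made for illustration.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored; this is an illustrative
    convention, not necessarily the one used for FORC_021.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / total
```

For a whole genome, the same function would be applied to the concatenated replicon sequences read from a FASTA file.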

Proteomic Analysis to Identify Tightly-Bound Cell Wall Protein in Rice Calli

  • Cho, Won Kyong;Hyun, Tae Kyung;Kumar, Dhinesh;Rim, Yeonggil;Chen, Xiong Yan;Jo, Yeonhwa;Kim, Suwha;Lee, Keun Woo;Park, Zee-Yong;Lucas, William J.;Kim, Jae-Yean
    • Molecules and Cells
    • /
    • v.38 no.8
    • /
    • pp.685-696
    • /
    • 2015
  • Rice is a model plant widely used for basic and applied research programs. Plant cell wall proteins play key roles in a broad range of biological processes. However, knowledge of the rice cell wall proteome is currently rudimentary. In the present study, the tightly-bound cell wall proteome of rice callus cultured cells was characterized using sequential extraction protocols, mass spectrometry, and bioinformatics methods, leading to the identification of 1568 candidate proteins. Based on bioinformatics analyses, 389 classical rice cell wall proteins, possessing a signal peptide, and 334 putative non-classical cell wall proteins, lacking a signal peptide, were identified. By combining previously established rice cell wall protein databases with the current data for the classical rice cell wall proteins, a comprehensive rice cell wall proteome comprising 496 proteins was constructed. A comparative analysis of the rice and Arabidopsis cell wall proteomes revealed a high level of homology, suggesting that cell wall proteins are largely conserved between monocots and eudicots. This study substantially expands the available information on cell wall proteins and provides a basis for future functional analyses of the identified rice cell wall proteins.

Meta- and Gene Set Analysis of Stomach Cancer Gene Expression Data

  • Kim, Seon-Young;Kim, Jeong-Hwan;Lee, Heun-Sik;Noh, Seung-Moo;Song, Kyu-Sang;Cho, June-Sik;Jeong, Hyun-Yong;Kim, Woo Ho;Yeom, Young-Il;Kim, Nam-Soon;Kim, Sangsoo;Yoo, Hyang-Sook;Kim, Yong Sung
    • Molecules and Cells
    • /
    • v.24 no.2
    • /
    • pp.200-209
    • /
    • 2007
  • We generated gene expression data from the tissues of 50 gastric cancer patients, and applied meta-analysis and gene set analysis to these data and three other stomach cancer gene expression data sets to define the gene expression changes in gastric tumors. By meta-analysis we identified genes consistently changed in gastric carcinomas, while gene set analysis revealed consistently changed biological themes. Genes and gene sets involved in digestion, fatty acid metabolism, and ion transport were consistently down-regulated in gastric carcinomas, while those involved in cellular proliferation, cell cycle, and DNA replication were consistently up-regulated. We also found significant differences between the genes and gene sets expressed in diffuse and intestinal type gastric carcinoma. By gene set analysis of cytogenetic bands, we identified many chromosomal regions with possible gross chromosomal changes (amplifications or deletions). A similar analysis of transcription factor binding sites (TFBSs) revealed transcription factors that may have caused the observed gene expression changes in gastric carcinomas, and we confirmed the overexpression of one of these, E2F1, in many gastric carcinomas by tissue array and immunohistochemistry. We have incorporated the results of our meta- and gene set analyses into a web-accessible database (http://human-genome.kribb.re.kr/stomach/).

Thoroughbred Horse Single Nucleotide Polymorphism and Expression Database: HSDB

  • Lee, Joon-Ho;Lee, Taeheon;Lee, Hak-Kyo;Cho, Byung-Wook;Shin, Dong-Hyun;Do, Kyoung-Tag;Sung, Samsun;Kwak, Woori;Kim, Hyeon Jeong;Kim, Heebal;Cho, Seoae;Park, Kyung-Do
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.27 no.9
    • /
    • pp.1236-1243
    • /
    • 2014
  • Genetics is important for the breeding and selection of horses, but there is a lack of well-established horse-related browsers or databases. To better understand horses, more variants and other integrated information are needed. We therefore constructed a horse genomic variants database that includes expression and other information. The Horse Single Nucleotide Polymorphism and Expression Database (HSDB) (http://snugenome2.snu.ac.kr/HSDB) reports the number of genomic variants, including rare variants, that remain to be identified in the horse genome, based on population genome sequences of eighteen horses and RNA-seq data from four horses. The identified single nucleotide polymorphisms (SNPs) were confirmed by comparing them with SNP chip data and RNA-seq variants, which showed concordance levels of 99.02% and 96.6%, respectively. Moreover, the database provides the genomic variants together with their corresponding transcriptional profiles from the same individuals to help understand the functional aspects of these variants. The database will contribute to genetic improvement and breeding strategies for Thoroughbreds.

Single Nucleotide Polymorphism (SNP) Discovery and Kompetitive Allele-Specific PCR (KASP) Marker Development with Korean Japonica Rice Varieties

  • Cheon, Kyeong-Seong;Baek, Jeongho;Cho, Young-il;Jeong, Young-Min;Lee, Youn-Young;Oh, Jun;Won, Yong Jae;Kang, Do-Yu;Oh, Hyoja;Kim, Song Lim;Choi, Inchan;Yoon, In Sun;Kim, Kyung-Hwan;Han, Jung-Heon;Ji, Hyeonso
    • Plant Breeding and Biotechnology
    • /
    • v.6 no.4
    • /
    • pp.391-403
    • /
    • 2018
  • Genome resequencing by next-generation sequencing technology can reveal numerous single nucleotide polymorphisms (SNPs) within a closely-related cultivar group, which would enable the development of sufficient SNP markers for mapping and the identification of useful genes present in the cultivar group. We analyzed genome sequence data from 13 Korean japonica rice varieties and discovered 740,566 SNPs. The SNPs were distributed at 100-kbp intervals throughout the rice genome, although the SNP density was uneven among the chromosomes. Of the 740,566 SNPs, 1,014 SNP sites were selected on the basis of a polymorphism information content (PIC) value higher than 0.4 per 200-kbp interval, and 506 of these SNPs were converted to Kompetitive Allele-Specific PCR (KASP) markers. The 506 KASP markers were tested for genotyping with the 13 sequenced Korean japonica rice varieties, and polymorphisms were detected in 400 KASP markers (79.1%), which would be suitable for genetic analysis and molecular breeding. Additionally, a genetic map comprising 205 KASP markers was successfully constructed with 188 $F_2$ progenies derived from a cross between the varieties Junam and Nampyeong. In a phylogenetic analysis with 81 KASP markers, the 13 Korean japonica varieties showed close genetic relationships and were divided into three groups. More KASP markers are being developed, and these markers will be utilized in gene mapping, quantitative trait locus (QTL) analysis, marker-assisted selection, and other strategies relevant to crop improvement.
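The PIC threshold mentioned above depends on which definition of polymorphism information content is used, and the abstract does not say. The sketch below shows two common definitions computed from allele frequencies: the Botstein et al. form, and the simpler expected-heterozygosity form (1 minus the sum of squared frequencies) that is sometimes also reported as PIC. For a biallelic SNP the Botstein form cannot exceed 0.375, so a 0.4 cutoff would be consistent with the simpler form; that is an inference from the arithmetic, not something the abstract states. The function names are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def gene_diversity(freqs: Sequence[float]) -> float:
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic_botstein(freqs: Sequence[float]) -> float:
    """PIC as 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Both measures peak for a biallelic SNP when the two alleles are equally common.
balanced = [0.5, 0.5]
print(gene_diversity(balanced), pic_botstein(balanced))
```

Allele frequencies here would come from the genotypes of the sequenced varieties at each SNP site; how they were estimated in the study is not described in the abstract.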

Applicability Evaluation of Male-Specific Coliphage-Based Detection Methods for Microbial Contamination Tracking

  • Kim, Gyungcheon;Park, Gwoncheol;Kang, Seohyun;Lee, Sanghee;Park, Jiyoung;Ha, Jina;Park, Kunbawui;Kang, Minseok;Cho, Min;Shin, Hakdong
    • Journal of Microbiology and Biotechnology
    • /
    • v.31 no.12
    • /
    • pp.1709-1715
    • /
    • 2021
  • Outbreaks of food poisoning due to the consumption of norovirus-contaminated shellfish continue to occur. Male-specific (F+) coliphage has been suggested as an indicator of viral species due to its association with animal and human wastes. Here, we compared two methods, the double agar overlay and the quantitative real-time PCR (RT-PCR)-based method, to evaluate the applicability of the F+ coliphage-based detection technique in microbial contamination tracking of shellfish samples. The RT-PCR-based method showed 1.6-39 times higher coliphage PFU values from spiked shellfish samples than the double agar overlay method. These differences indicated that the RT-PCR-based technique can detect both intact viruses and non-particle-protected viral DNA/RNA, suggesting that the RT-PCR-based method could be a more efficient tool for tracking microbial contamination in shellfish. However, the virome information on F+ coliphage-contaminated oyster samples revealed that the high specificity of the RT-PCR-based method is a limitation in microbial contamination tracking because of the genomic diversity of F+ coliphages. Further research on the development of appropriate primer sets for microbial contamination tracking is therefore necessary. This study provides preliminary insight that should be examined in the search for suitable microbial contamination tracking methods to control the sanitation of shellfish and related seawater.

Bioinformatics services for analyzing massive genomic datasets

  • Ko, Gunhwan;Kim, Pan-Gyu;Cho, Youngbum;Jeong, Seongmun;Kim, Jae-Yoon;Kim, Kyoung Hyoun;Lee, Ho-Yeon;Han, Jiyeon;Yu, Namhee;Ham, Seokjin;Jang, Insoon;Kang, Byunghee;Shin, Sunguk;Kim, Lian;Lee, Seung-Won;Nam, Dougu;Kim, Jihyun F.;Kim, Namshin;Kim, Seon-Young;Lee, Sanghyuk;Roh, Tae-Young;Lee, Byungwook
    • Genomics & Informatics
    • /
    • v.18 no.1
    • /
    • pp.8.1-8.10
    • /
    • 2020
  • The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution for addressing this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.

New Approach to Predict microRNA Gene by using data Compression technique

  • Kim, Dae-Won;Yang, Joshua SungWoo;Kim, Pan-Jun;Chu, In-Sun;Jeong, Ha-Woong;Park, Hong-Seog
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2005.09a
    • /
    • pp.361-365
    • /
    • 2005
  • Over the past few years, the complex and subtle roles of microRNA (miRNA) in gene regulation have been increasingly appreciated. Computational approaches have played an important role in identifying miRNAs from plants and animals, as well as in predicting their putative gene targets. We present a new approach that combines a comprehensive analysis of evolutionarily conserved element scores with a data compression technique to detect putative miRNA genes. We used the evolutionarily conserved elements [19] (see Methods and Materials for details) to calculate scores base by base along each candidate pre-miRNA gene region, detecting common conserved patterns in the target sequence. We applied the data compression technique [20] to detect unknown miRNA genes. This zipping method provides, without loss of generality with respect to the nature of the character strings, a way to measure the similarity between the strings under consideration [20]. Our experience with the new computational method for miRNA gene identification (or miRNA gene prediction) has been stratified, and we were able to find 28 putative miRNA genes.
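The idea of measuring string similarity with a compressor can be illustrated with the normalized compression distance (NCD), a widely used compression-based measure. This is a minimal sketch using Python's standard `zlib` module. It is a related technique chosen for illustration and is not claimed to be the exact zipping method of reference [20] or the authors' implementation. The helper `random_dna` and its seed parameter are hypothetical and exist only to produce test sequences.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two sequences.

    Smaller values mean the compressor found more shared structure.
    """
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(length: int, seed: int) -> str:
    """Hypothetical helper: reproducible random DNA string for demonstration."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    a = random_dna(400, seed=1)
    b = a[:200] + "T" + a[201:]       # copy of a with one position overwritten
    c = random_dna(400, seed=2)       # unrelated sequence
    print("near-identical:", ncd(a, b))
    print("unrelated:     ", ncd(a, c))
```

A candidate region would be compared against known pre-miRNA sequences, with low distances flagging candidates for closer inspection; the thresholds and reference sets used in the study are not given in the abstract.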
