KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok (National Genome Information Center (NGIC)) ;
  • Hahn, Yoon-Soo (Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health) ;
  • Kim, Nam-Soon (The Center for Functional Analysis of Human Genome, Korea Research Institute of Bioscience and Biotechnology (KRIBB)) ;
  • Yu, Ung-Sik (National Genome Information Center (NGIC)) ;
  • Woo, Hyun-Goo (National Genome Information Center (NGIC)) ;
  • Chu, In-Sun (National Genome Information Center (NGIC)) ;
  • Kim, Yong-Sung (The Center for Functional Analysis of Human Genome, Korea Research Institute of Bioscience and Biotechnology (KRIBB)) ;
  • Yoo, Hyang-Sook (The Center for Functional Analysis of Human Genome, Korea Research Institute of Bioscience and Biotechnology (KRIBB)) ;
  • Kim, Sang-Soo (Department of Bioinformatics and Life Science, Soongsil University)
  • Published: 2005.09.22

Abstract

The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Koreans. A total of about 157,000 high-throughput 5'-EST sequences, collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients, were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database, together with annotation obtained by running BLAST against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interface: attribute-based search and sequence similarity search. Attribute-based search relies on DBMS technology, while similarity search uses BLAST, which supports various search options. The search system allows not only multiple queries but also various query types. Search results include: 1) information on clones and libraries, 2) accession keys, genomic location, gene ontology, and pathway information linked to public databases, 3) links to external programs, and 4) sequence information for the contig and the 5'-end of clones. We believe that the KUGI database and search system may provide useful information for studies aimed at elucidating the causes of diseases prevalent in Koreans.
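
The representative-clone selection described above can be illustrated with a minimal sketch. The abstract does not say how the two criteria (longest high-quality sequence, presence of the start codon) are combined, so the ordering below, which prefers clones with a start codon and breaks ties by high-quality length, is an assumption made purely for illustration. The `Clone` fields and the `pick_representative` function are hypothetical names, not part of the KUGI system.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    clone_id: str          # hypothetical clone identifier
    hq_length: int         # length of the high-quality sequence region, in bases
    has_start_codon: bool  # whether a start codon was found in the 5'-EST


def pick_representative(cluster):
    """Pick one clone from a cluster.

    Assumed rule (not confirmed by the source): clones with a start codon
    are preferred; among equals, the longest high-quality sequence wins.
    """
    if not cluster:
        raise ValueError("cluster is empty")
    # Tuples compare element by element, and True sorts above False.
    return max(cluster, key=lambda c: (c.has_start_codon, c.hq_length))


cluster = [
    Clone("c1", 620, False),
    Clone("c2", 540, True),
    Clone("c3", 480, True),
]
print(pick_representative(cluster).clone_id)
```

A different weighting, for example taking the longest sequence first and using the start codon only as a tie-breaker, would only require reordering the tuple in the `key` function.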

Keywords