Functional Annotation and Analysis of Korean Patented Biological Sequences Using Bioinformatics

  • Lee, Byung Wook (Department of Biosystems, Korea Advanced Institute of Science and Technology)
  • Kim, Tae Hyung (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology)
  • Kim, Seon Kyu (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology)
  • Kim, Sang Soo (Department of Bioinformatics, Soongsil University)
  • Ryu, Gee Chan (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology)
  • Bhak, Jong (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology)
  • Received: 2005.12.27
  • Accepted: 2006.03.03
  • Published: 2006.04.30


A recent report of the Korean Intellectual Property Office (KIPO) showed that the number of biological sequence-based patents is rapidly increasing in Korea. We present biological features of Korean patented sequences through bioinformatic analysis. The analysis is divided into two steps. The first is an annotation step, in which the patented sequences were annotated with the Reference Sequence (RefSeq) database. The second is an association step, in which the patented sequences were linked to genes, diseases, pathways, and biological functions using the Entrez Gene, Online Mendelian Inheritance in Man (OMIM), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO) databases. Through the association analysis, we found that nearly 2.6% of human genes were associated with Korean patents, compared with 20% of human genes in U.S. patents. The association between biological functions and the patented sequences indicated that genes whose products act as hormones involved in defense responses in the extracellular environment were the most highly targeted for patenting. The analysis data are available at
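The two-step procedure in the abstract (annotation against RefSeq, then association with genes) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the BLAST results are in the standard 12-column tabular format, and every identifier in it (PAT1, REF_A, GENE_1), the identity and E-value cutoffs, and the helper names `best_hits` and `associate` are hypothetical, since the abstract does not state the search parameters.

```python
import csv
import io
from collections import defaultdict

# Hypothetical BLAST tabular rows (12 standard columns: query, subject,
# % identity, alignment length, mismatches, gap opens, query start/end,
# subject start/end, E-value, bit score). All identifiers are made up.
BLAST_TSV = (
    "PAT1\tREF_A\t99.5\t400\t2\t0\t1\t400\t60\t459\t0.0\t730\n"
    "PAT1\tREF_C\t95.0\t380\t19\t0\t1\t380\t60\t439\t1e-170\t600\n"
    "PAT2\tREF_B\t98.0\t500\t10\t0\t1\t500\t1\t500\t0.0\t880\n"
    "PAT3\tREF_A\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-5\t50\n"
)

# Hypothetical RefSeq-accession-to-gene table (stand-in for Entrez Gene data).
REFSEQ_TO_GENE = {"REF_A": "GENE_1", "REF_B": "GENE_2", "REF_C": "GENE_3"}


def best_hits(tsv_text, min_identity=90.0, max_evalue=1e-10):
    """Annotation step: keep the highest-scoring RefSeq hit per patented sequence.

    The cutoffs are illustrative assumptions, not values from the paper.
    """
    best = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        query, subject = row[0], row[1]
        identity, evalue, bitscore = float(row[2]), float(row[10]), float(row[11])
        if identity < min_identity or evalue > max_evalue:
            continue  # discard weak matches
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def associate(hits, refseq_to_gene):
    """Association step: group patented sequences by the gene of their best hit."""
    gene_to_patents = defaultdict(set)
    for patent_id, accession in hits.items():
        gene = refseq_to_gene.get(accession)
        if gene is not None:
            gene_to_patents[gene].add(patent_id)
    return dict(gene_to_patents)


hits = best_hits(BLAST_TSV)
gene_map = associate(hits, REFSEQ_TO_GENE)
total_genes = len(REFSEQ_TO_GENE)
coverage = len(gene_map) / total_genes
print(f"{len(gene_map)} of {total_genes} genes linked to patents ({coverage:.1%})")
```

The same gene-to-patent map could then be joined to disease, pathway, or function tables keyed by gene identifier, which is how a figure such as the share of human genes associated with patents would be derived.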


BLAST Analysis; Gene-, Disease-, Pathway-, and Biofunction-patent Association Maps; Korea Patented Biological Sequences; RefSeq


Supported by: Korean Ministry of Science and Technology

