Functional Annotation and Analysis of Korean Patented Biological Sequences Using Bioinformatics

  • Lee, Byung Wook (Department of Biosystems, Korea Advanced Institute of Science and Technology)
  • Kim, Tae Hyung (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology)
  • Kim, Seon Kyu (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology)
  • Kim, Sang Soo (Department of Bioinformatics, Soongsil University)
  • Ryu, Gee Chan (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology)
  • Bhak, Jong (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology)
  • Received : 2005.12.27
  • Accepted : 2006.03.03
  • Published : 2006.04.30

Abstract

A recent report of the Korean Intellectual Property Office (KIPO) showed that the number of biological sequence-based patents is rapidly increasing in Korea. We present the biological features of Korean patented sequences through bioinformatic analysis. The analysis is divided into two steps. The first is an annotation step, in which the patented sequences were annotated with the Reference Sequence (RefSeq) database. The second is an association step, in which the patented sequences were linked to genes, diseases, pathways, and biological functions using the Entrez Gene, Online Mendelian Inheritance in Man (OMIM), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO) databases. Through the association analysis, we found that nearly 2.6% of human genes were associated with Korean patents, compared with 20% of human genes associated with U.S. patents. The association between biological functions and the patented sequences indicated that genes whose products act as hormones in defense responses in the extracellular environment were the most highly targeted for patenting. The analysis data are available at http://www.patome.net.
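The two steps described in the abstract can be sketched as follows. This is only an illustrative outline, not the authors' pipeline: the identity and E-value cutoffs, the five-column hit tuples, the function names, and all identifiers in the toy data (`KR_SEQ_1`, `NM_TOY_1`, `GENE_A`, and so on) are assumptions made for the example.

```python
from collections import defaultdict


def best_hits(blast_rows, min_identity=95.0, max_evalue=1e-10):
    """Annotation step: keep the top-scoring reference hit per patented sequence.

    Each row is (query, subject, percent_identity, evalue, bitscore).
    The cutoffs are hypothetical, not values taken from the paper.
    """
    best = {}
    for query, subject, identity, evalue, bitscore in blast_rows:
        if identity < min_identity or evalue > max_evalue:
            continue  # discard weak hits
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def associate(annotations, refseq_to_gene, gene_to_terms):
    """Association step: link patented sequences to genes, then to terms.

    'Terms' stands in for any gene-level annotation, such as a disease,
    a pathway, or a GO term.
    """
    gene_to_patents = defaultdict(set)
    for patent_seq, refseq in annotations.items():
        gene = refseq_to_gene.get(refseq)
        if gene is not None:
            gene_to_patents[gene].add(patent_seq)

    term_to_patents = defaultdict(set)
    for gene, patents in gene_to_patents.items():
        for term in gene_to_terms.get(gene, ()):
            term_to_patents[term] |= patents
    return dict(gene_to_patents), dict(term_to_patents)


def patented_fraction(gene_to_patents, total_genes):
    """Share of all genes that have at least one associated patented sequence."""
    return len(gene_to_patents) / total_genes


# Toy data with made-up identifiers, for illustration only.
rows = [
    ("KR_SEQ_1", "NM_TOY_1", 99.0, 1e-50, 500.0),
    ("KR_SEQ_1", "NM_TOY_2", 96.0, 1e-20, 200.0),  # weaker hit, same query
    ("KR_SEQ_2", "NM_TOY_2", 80.0, 1e-5, 60.0),    # fails both cutoffs
    ("KR_SEQ_3", "NM_TOY_3", 98.0, 1e-40, 400.0),
]
refseq_to_gene = {"NM_TOY_1": "GENE_A", "NM_TOY_2": "GENE_B", "NM_TOY_3": "GENE_A"}
gene_to_terms = {"GENE_A": ["hormone activity"]}

annotations = best_hits(rows)
gene_map, term_map = associate(annotations, refseq_to_gene, gene_to_terms)
fraction = patented_fraction(gene_map, total_genes=4)
```

Keeping the gene as the pivot between patented sequences and every other annotation means that disease, pathway, and function maps can all be derived from a single sequence-to-gene table.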

Keywords

BLAST Analysis; Gene-, Disease-, Pathway-, and Biofunction-patent Association Maps; Korea Patented Biological Sequences; RefSeq

Acknowledgement

Supported by : Korean Ministry of Science and Technology
