Identification of 1,531 cSNPs from Full-length Enriched cDNA Libraries of the Korean Native Pig Using in Silico Analysis

  • Published : 2009.06.30


Sequences from clones of full-length enriched cDNA libraries are valuable resources for functional genomics studies, genome annotation, and SNP discovery. We analyzed 7,392 high-quality chromatograms (Phred value $\geq$ 30) obtained by sequencing the 5' ends of clones derived from full-length enriched cDNA libraries of Korean native pig brainstem, liver, cerebellum, neocortex, and spleen. These sequences were combined with 50,000 EST sequence trace files obtained from GenBank to identify cSNPs in silico. This process generated 11,324 contigs, of which 2,895 contained at least one SNP; of these, 610 included at least one sequence from Korean native pigs. From these 610 contigs, we randomly selected 262 for in silico identification of cSNPs. We identified 1,531 putative coding single nucleotide polymorphisms (cSNPs), corresponding to a detection frequency of one SNP per 465 bp. The large-scale sequence data from full-length enriched cDNA libraries and the identified cSNPs will serve as a useful resource for functional genomics projects such as a pig HapMap project in the near future.
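The figures in the abstract can be put in context with a little arithmetic. The following is a minimal sketch, not the authors' pipeline: the function names are hypothetical, and the implied sequence length assumes the reported frequency was computed as total analyzed base pairs divided by the number of SNPs, which the abstract does not state. The Phred conversion uses the standard definition $Q = -10 \log_{10} P$, where $P$ is the base-call error probability.

```python
def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to a base-call error probability P.

    Standard definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


def implied_sequence_length(num_snps: int, bp_per_snp: int) -> int:
    """Total base pairs implied by a 'one SNP per N bp' frequency.

    Assumption (not stated in the abstract): frequency = total bp / SNP count.
    """
    return num_snps * bp_per_snp


def fraction(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`."""
    return part / whole


# Error probability at the Phred threshold used for the chromatograms.
p_error = phred_to_error_prob(30)

# Sequence length implied by 1,531 cSNPs at one SNP per 465 bp.
implied_bp = implied_sequence_length(1531, 465)

# Contig filtering steps reported in the abstract.
share_with_snp = fraction(2895, 11324)   # contigs with at least one SNP
share_with_knp = fraction(610, 2895)     # of those, contigs with a Korean native pig sequence
share_sampled = fraction(262, 610)       # of those, contigs randomly selected for analysis

print(f"Error probability at Q30: {p_error:.4f}")
print(f"Implied analyzed length: {implied_bp} bp")
print(f"Contigs with a SNP: {share_with_snp:.1%}")
print(f"SNP contigs with a Korean native pig sequence: {share_with_knp:.1%}")
print(f"Eligible contigs sampled: {share_sampled:.1%}")
```

Under these assumptions, a Phred threshold of 30 corresponds to at most one expected base-call error per 1,000 bases, and the reported frequency would imply roughly 0.7 Mb of analyzed contig sequence.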


