Thoroughbred Horse Single Nucleotide Polymorphism and Expression Database: HSDB

  • Lee, Joon-Ho (Genomic Informatics Center, Hankyong National University) ;
  • Lee, Taeheon (Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University) ;
  • Lee, Hak-Kyo (Genomic Informatics Center, Hankyong National University) ;
  • Cho, Byung-Wook (Department of Animal Science, College of Life Sciences, Pusan National University) ;
  • Shin, Dong-Hyun (Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University) ;
  • Do, Kyoung-Tag (Department of Equine Sciences, Sorabol College) ;
  • Sung, Samsun (C&K Genomics, Seoul National University Research) ;
  • Kwak, Woori (C&K Genomics, Seoul National University Research) ;
  • Kim, Hyeon Jeong (C&K Genomics, Seoul National University Research) ;
  • Kim, Heebal (Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University) ;
  • Cho, Seoae (C&K Genomics, Seoul National University Research) ;
  • Park, Kyung-Do (Genomic Informatics Center, Hankyong National University)
  • Received : 2013.11.04
  • Accepted : 2014.06.21
  • Published : 2014.09.01

Abstract

Genetics is important for the breeding and selection of horses, but well-established horse-related browsers and databases are lacking. A better understanding of horses requires more variants and other integrated information. We therefore constructed a horse genomic variant database that also includes expression and other information. The Horse Single Nucleotide Polymorphism and Expression Database (HSDB) (http://snugenome2.snu.ac.kr/HSDB) reports genomic variants in the horse genome that had not previously been explored, including rare variants, identified from population genome sequences of eighteen horses and RNA-seq data of four horses. The identified single nucleotide polymorphisms (SNPs) were confirmed by comparing them with SNP chip data and with variants from RNA-seq, which showed concordance levels of 99.02% and 96.6%, respectively. Moreover, the database pairs the genomic variants with the corresponding transcriptional profiles from the same individuals to help clarify the functional aspects of these variants. The database will contribute to the genetic improvement of Thoroughbreds and to breeding strategies for them.
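The abstract does not say how the concordance levels were calculated. As a minimal sketch, assuming concordance means the percentage of sites called in both data sets whose genotypes agree, the comparison could look like the following. The function names, the genotype string format, and the toy data are hypothetical and are not taken from HSDB.

```python
from typing import Dict, Tuple

# A site is identified by (chromosome, 1-based position); this keying is an assumption.
Site = Tuple[str, int]


def normalize(genotype: str) -> str:
    """Sort the alleles so that 'A/G', 'G/A' and 'G|A' all compare equal."""
    return "/".join(sorted(genotype.replace("|", "/").split("/")))


def concordance(seq_calls: Dict[Site, str], chip_calls: Dict[Site, str]) -> float:
    """Percentage of sites present in both call sets with identical genotypes."""
    shared = seq_calls.keys() & chip_calls.keys()
    if not shared:
        raise ValueError("no sites were called in both data sets")
    matches = sum(
        normalize(seq_calls[site]) == normalize(chip_calls[site]) for site in shared
    )
    return 100.0 * matches / len(shared)


# Toy genotypes for illustration only (not real horse data).
seq = {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr2", 50): "T/T", ("chr2", 75): "G/G"}
chip = {("chr1", 100): "G/A", ("chr1", 200): "C/C", ("chr2", 50): "C/T", ("chr3", 10): "A/A"}

print(f"{concordance(seq, chip):.2f}%")  # prints 66.67%
```

Only the three sites present in both toy call sets are compared, and two of them agree. Sites called in just one data set are ignored rather than counted as mismatches, which is one of several possible conventions.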
