INTRODUCTION
Recent developments in high-throughput SNP chip technologies have enabled researchers to conduct large-scale genome-wide association studies (GWAS) (1-6). These studies have revealed an unprecedented number of genetic variants that are associated with complex traits (7). As of August 2012, there were 1,330 publications and 6,848 phenotype-associated SNPs in the NHGRI GWAS catalogue (http://www.genome.gov/gwastudies/). The availability of plentiful phenotype-related genomic information is expected to lead to clinically applicable personal genomics in the near future (8,9). However, a number of issues require attention before this can be widely applied. Firstly, the identification of causal variants and a functional investigation of known loci are required (10,11). GWAS have localized associated signals to specific genomic regions; however, most identified variants are located within intergenic, intronic, and gene desert regions, and are regarded as proxies for causal variants. Further analysis, such as fine mapping and resequencing, is required to unveil the causal variants underlying phenotypes. Only a small number of genes in close proximity to associated variants have been examined to identify possible functional relationships with phenotypes. Secondly, the majority of GWAS have been conducted on populations with European ancestry. Findings from European populations should be validated before they are applied to other ethnicities, such as those of Asian or African ancestry. Although some recent GWAS have been conducted on ethnic groups other than Europeans, sample sizes and numbers of target phenotypes have been relatively small compared with studies of Europeans (2,6,12). It is important to consider population-specific associations for personal genomics applications, as phenotype associations regularly vary across populations (3-5).
Population-specific or population-independent associations of variants can be identified by GWAS in a specific population, or by independent replication studies. However, these approaches require a large number of samples, compounding the high costs associated with genotyping. As an alternative, the genetic diversity of phenotype-associated regions may be examined. Large differences in genetic architecture among populations are an established cause of discrepancies in associations (3-5). The fixation index (Fst) is one of the most widely used metrics for measuring genetic differentiation between populations (13). Variation in linkage disequilibrium (VarLD) is another approach that measures population differences in LD patterns (14). Various web interfaces have been developed to provide user-friendly graphical user interfaces (GUIs) and browsers to access genetic variation data, including Haplotter, FstSNP-HapMap3, SNP@Evolution, and the Singapore Genome Variation Project (SGVP) (15-18). Three of these use only reference information, such as data from HapMap phase 2 and phase 3 (19,20). SGVP also provides information derived from three Southeast Asian populations, compared with HapMap populations (18).
Genetic diversity among East Asian populations has not previously been provided via a web service. In particular, the Korean population is one of the most intensely studied in East Asia, but there is no web resource providing genetic diversity data that includes Koreans. Although HapMap includes two East Asian populations, Han Chinese in Beijing (CHB) and Japanese in Tokyo (JPT), these should not be regarded as references for Koreans (21). We therefore developed EvoSNP-DB, a web resource for genetic diversity among East Asian populations.
RESULTS AND DISCUSSION
We constructed EvoSNP-DB by integrating GBrowse and genotype data from 108 Koreans (founders) and 210 HapMap phase II release #22 samples. After quality control, 1,147,845 SNPs overlapped between the Korean and HapMap populations. The EvoSNP-DB database and web server are implemented on a 24×2.66 GHz Xeon core server running Red Hat Enterprise Linux (version 5.2), Apache (version 2.0), Tomcat (version 5.5), and MySQL (version 5.5). EvoSNP-DB is viewable in all major web browsers and operating systems, and is available online at http://biomi.cdc.go.kr/EvoSNP/.
Database design and organization
Data flow through the application is described in Fig. 1. Briefly, genotype data were analyzed to calculate Fst, VarLD, allele frequency, and Hardy-Weinberg equilibrium (HWE). Processed data are stored in the database with annotation information retrieved from UCSC and OMIM. The database is wrapped by GBrowse and JSP, which provide the data query and visualization interfaces. Genotype datasets are derived from the International HapMap Phase II release #22 data repository (11,12), including data from 60 Utah residents with ancestry from Northern and Western Europe (CEU), 45 Han Chinese in Beijing (CHB), 45 Japanese in Tokyo (JPT), and 60 Yoruba in Ibadan, Nigeria (YRI). Considering the relatively small number of CHB and JPT samples, we pooled the data of both as a single geographical group, denoted ASN (Asian, 90 samples). The SNP information from 108 Korean founders of 54 trios was compared with that of the HapMap populations. The database has been integrated with Fst and VarLD metrics to facilitate the graphical representation of the data. Fst measures polymorphism within each population and differentiation among geographical groups (13). To quantify variation in population linkage disequilibrium patterns, we used the varLD program (14). HapMap, UCSC, OMIM, and the NHGRI GWAS catalogue were the major sources of annotation information.
Fig. 1. Flow diagram of EvoSNP-DB construction.
User interface and visualization
Within EvoSNP-DB there are user interfaces for data queries and visualization. Three types of query can be applied: (i) SNP identifier, (ii) mRNA ID or gene symbol, and (iii) specific chromosome region. For example, rs28218 could be used for a SNP based search, NM_002124 or ORF4F16 for a gene search, and chr1:157661000..157806000 to search for this chromosomal region.
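The three query types can be distinguished by their textual form alone. The following is a minimal sketch of such a dispatcher, not the actual EvoSNP-DB implementation, which is not described here. The regular expressions, the function name classify_query, and the returned dictionary layout are all assumptions made for illustration; any query that is neither an rs identifier, an mRNA accession, nor a region is treated as a gene symbol.

```python
import re

# Hypothetical patterns; the real EvoSNP-DB query parser may differ.
SNP_RE = re.compile(r"^rs\d+$", re.IGNORECASE)
MRNA_RE = re.compile(r"^[NX]M_\d+(\.\d+)?$", re.IGNORECASE)
REGION_RE = re.compile(r"^chr(\d{1,2}|X|Y):(\d+)\.\.(\d+)$", re.IGNORECASE)


def classify_query(query):
    """Classify a search string as a SNP, mRNA, region, or gene-symbol query."""
    q = query.strip()
    if not q:
        raise ValueError("empty query")
    if SNP_RE.match(q):
        return {"type": "snp", "id": q.lower()}
    if MRNA_RE.match(q):
        return {"type": "mrna", "id": q.upper()}
    m = REGION_RE.match(q)
    if m:
        start, end = int(m.group(2)), int(m.group(3))
        if start > end:
            raise ValueError("region start must not exceed region end")
        return {
            "type": "region",
            "chrom": "chr" + m.group(1).upper(),
            "start": start,
            "end": end,
        }
    # Fallback (assumption): anything else is looked up as a gene symbol.
    return {"type": "gene", "symbol": q}
```

With the example queries given above, rs28218 would be classified as a SNP query, NM_002124 as an mRNA query, ORF4F16 as a gene symbol, and chr1:157661000..157806000 as a region on chr1.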
Regardless of the query type, EvoSNP-DB returns three tables providing Region, Gene, and SNP statistics (Fig. 2). Each table contains summary variation scores. Fig. 2 illustrates the output when rs28218 was used as a search term; scores of the gene TRIO, which contains this SNP, are summarized in the gene statistics table. JSP and GMOD (http://gmod.org) were used to build the table and figure interfaces. Links to public online databases, including Entrez Nucleotide, dbSNP, OMIM (22), and HapMap (20), are provided in EvoSNP-DB together with the results (Fig. 2). EvoSNP-DB also offers a generic genome browser, which displays overviews of chromosomes, contigs, genes, mRNAs, and SNPs (23). Figs. 3 and 4 show the output when the query region contains a small or a large number of SNPs, respectively.
Fig. 2. A screenshot of the result table from EvoSNP-DB.
EvoSNP-DB provides an open-architecture website using a wiki interface for data access (a wiki is a website that allows its users to add, modify, or delete its content via a web browser), and wiki-based SNP annotation will be available in the near future. This will be particularly useful for constructing accurate and informative annotation for variants identified by the collaborative work of many researchers. MySQL, Python, JSP, and GBrowse were used in database construction, and to enhance interface utility (24).
Fig. 3. A detailed screenshot showing EvoSNP-DB search results. Top track: chromosomal overview, with SNP locations shown as diamonds and OMIM disease associations shown as rectangles. Second track: VarLD scores visualized along a 2 Mb chromosomal region. Third track: allele frequencies of SNPs, visualized as a pie chart for the Korean population or as towers for HapMap populations. Bottom track: genes in the region.
Fig. 4. A wide screenshot showing the search results with OMIM and GWAS Catalogue. Allele frequency is not displayed, but each SNP is indicated.
MATERIALS AND METHODS
Korean genotype data
Previously, we conducted GWAS for two independent Korean population-based cohorts (Ansung and Ansan) as part of the Korean Genome Epidemiology Study (KoGES) and the Korean Association REsource (KARE) project, which was initiated in 2007 (2,4). In the Ansung area, we recruited additional family members of the original participants to facilitate family-based association studies. Among these, 54 trios (162 samples) were genotyped using an Affymetrix Genome-Wide Human SNP array 6.0 and an Illumina human Omni1-Quad Chip. Genotypes were called with Birdseed and BeadStudio GenCall for the Affymetrix and Illumina arrays, respectively (25,26). Initially, ∼1.9 million SNPs from the two platforms (909,622 for the Affymetrix array and 1,010,624 for the Illumina array) were merged. For quality control, we excluded SNPs that met any of the following criteria: non-autosomal location, Mendelian errors, high missing genotype rate (> 5%), or deviation from HWE (P < 1E-6). Filtered SNPs were compared with HapMap SNPs with respect to allele, strand, and genomic position. After excluding 14 SNPs with annotation errors, 1,147,845 SNPs overlapped with HapMap SNPs (27).
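The exclusion criteria can be expressed as a simple per-SNP filter. The sketch below is illustrative only and is not the pipeline used in this study: the SnpQc record, its field names, and the function names are hypothetical, and it assumes that a single Mendelian error is sufficient for exclusion, since no error threshold is stated in the text.

```python
from dataclasses import dataclass

# Chromosome labels treated as autosomal (1-22).
AUTOSOMES = {str(i) for i in range(1, 23)}


@dataclass
class SnpQc:
    """Hypothetical per-SNP summary used only for this illustration."""
    snp_id: str
    chrom: str
    missing_rate: float   # fraction of samples with a missing genotype
    hwe_p: float          # P-value of the HWE test
    mendel_errors: int    # Mendelian inconsistencies observed in trios


def qc_failures(snp, max_missing=0.05, min_hwe_p=1e-6):
    """Return the list of exclusion criteria that the SNP meets."""
    reasons = []
    if snp.chrom not in AUTOSOMES:
        reasons.append("non_autosomal")
    if snp.mendel_errors > 0:  # assumption: any Mendelian error excludes the SNP
        reasons.append("mendelian_error")
    if snp.missing_rate > max_missing:
        reasons.append("high_missing_rate")
    if snp.hwe_p < min_hwe_p:
        reasons.append("hwe_deviation")
    return reasons


def filter_snps(snps):
    """Keep only SNPs that meet none of the exclusion criteria."""
    return [s for s in snps if not qc_failures(s)]
```

Returning the reasons rather than a single boolean makes it straightforward to tabulate how many SNPs were lost at each step.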
HapMap genotype data
HapMap phase II release #22 data (210 samples) were downloaded. Genotype data were converted to the PLINK binary genotype format, and genotype frequencies, allele frequencies, and P-values of HWE were calculated using PLINK (28).
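To make these quantities concrete, the sketch below computes an allele frequency and an HWE P-value from genotype counts. It is a simplified illustration that uses the one-degree-of-freedom chi-square approximation; it is not PLINK's own HWE test, so its P-values should not be expected to match PLINK output. The function names are hypothetical.

```python
import math


def allele_freq(n_aa, n_ab, n_bb):
    """Frequency of allele A from counts of AA, AB, and BB genotypes."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    return (2 * n_aa + n_ab) / (2 * n)


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Chi-square (1 d.f.) P-value for deviation from HWE proportions."""
    n = n_aa + n_ab + n_bb
    p = allele_freq(n_aa, n_ab, n_bb)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        # Monomorphic SNP: the test is undefined, so report no deviation.
        return 1.0
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))
```

For example, genotype counts of 25, 50, and 25 match HWE proportions exactly for an allele frequency of 0.5, whereas a sample with no heterozygotes at the same allele frequency deviates strongly.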
Analysis of genetic diversity among populations
Fst and VarLD were used as population genetic diversity metrics (13,14). Fst was calculated for each SNP by pairwise comparison of the four populations. Genome-wide VarLD analysis was performed; VarLD scores were calculated for windows of 50 SNPs, starting from the first SNP of each chromosome and ending with the last. All values from the 22 chromosomes were merged and converted to a standard normal distribution (mean = 0, standard deviation = 1). VarLD analysis procedures were performed for all pairs of populations. To assess the degree of genetic difference between populations, we calculated the quartiles of the Fst and VarLD score distributions.
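The calculations in this section can be sketched as follows. This is a hedged illustration rather than the code used for EvoSNP-DB: the exact Fst estimator is not specified in the text, so the sketch assumes a simple Wright-style Fst for a biallelic SNP with equal population weights, computed as (Ht - Hs) / Ht. It also assumes that conversion to a standard normal distribution means z-score standardization of the merged scores. All function names are hypothetical.

```python
import statistics


def pairwise_fst(p1, p2):
    """Wright-style Fst for one biallelic SNP from two allele frequencies.

    Assumes equal weighting of the two populations.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)          # total expected heterozygosity
    if h_t == 0:
        return 0.0                             # SNP is fixed in both populations
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 (z-scores)."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mean) / sd for s in scores]


def quartiles(values):
    """Return the three cut points (Q1, median, Q3) of a distribution."""
    return statistics.quantiles(values, n=4)
```

Under this estimator, two populations with identical allele frequencies give Fst = 0, and populations fixed for alternative alleles give Fst = 1. The quartile cut points depend on the interpolation method; statistics.quantiles uses its default "exclusive" method here.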
References
- Wellcome Trust Case Control Consortium (2007) Genomewide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661-678. https://doi.org/10.1038/nature05911
- Cho, Y. S., Go, M. J., Kim, Y. J., Heo, J. Y., Oh, J. H., Ban, H. J., Yoon, D., Lee, M. H., Kim, D. J., Park, M., Cha, S. H., Kim, J. W., Han, B. G., Min, H., Ahn, Y., Park, M. S., Han, H. R., Jang, H. Y., Cho, E. Y., Lee, J. E., Cho, N. H., Shin, C., Park, T., Park, J. W., Lee, J. K., Cardon, L., Clarke, G., McCarthy, M. I., Lee, J. Y., Oh, B. and Kim, H. L. (2009) A large-scale genome-wide association study of Asian populations uncovers genetic factors influencing eight quantitative traits. Nat. Genet. 41, 527-534. https://doi.org/10.1038/ng.357
- Kato, N., Takeuchi, F., Tabara, Y., Kelly, T. N., Go, M. J., Sim, X., Tay, W. T., Chen, C. H., Zhang, Y., Yamamoto, K., Katsuya, T., Yokota, M., Kim, Y. J., Ong, R. T., Nabika, T., Gu, D., Chang, L. C., Kokubo, Y., Huang, W., Ohnaka, K., Yamori, Y., Nakashima, E., Jaquish, C. E., Lee, J. Y., Seielstad, M., Isono, M., Hixson, J. E., Chen, Y. T., Miki, T., Zhou, X., Sugiyama, T., Jeon, J. P., Liu, J. J., Takayanagi, R., Kim, S. S., Aung, T., Sung, Y. J., Zhang, X., Wong, T. Y., Han, B. G., Kobayashi, S., Ogihara, T., Zhu, D., Iwai, N., Wu, J. Y., Teo, Y. Y., Tai, E. S., Cho, Y. S. and He, J. (2011) Meta-analysis of genome-wide association studies identifies common variants associated with blood pressure variation in east Asians. Nat Genet 43, 531-538. https://doi.org/10.1038/ng.834
- Kim, Y. J., Go, M. J., Hu, C., Hong, C. B., Kim, Y. K., Lee, J. Y., Hwang, J. Y., Oh, J. H., Kim, D. J., Kim, N. H., Kim, S., Hong, E. J., Kim, J. H., Min, H., Kim, Y., Zhang, R., Jia, W., Okada, Y., Takahashi, A., Kubo, M., Tanaka, T., Kamatani, N., Matsuda, K., Park, T., Oh, B., Kimm, K., Kang, D., Shin, C., Cho, N. H., Kim, H. L., Han, B. G. and Cho, Y. S. (2011) Large-scale genome-wide association studies in East Asians identify new genetic loci influencing metabolic traits. Nat. Genet. 43, 990-995. https://doi.org/10.1038/ng.939
- Soranzo, N., Spector, T. D., Mangino, M., Kuhnel, B., Rendon, A., Teumer, A., Willenborg, C., Wright, B., Chen, L., Li, M., Salo, P., Voight, B. F., Burns, P., Laskowski, R. A., Xue, Y., Menzel, S., Altshuler, D., Bradley, J. R., Bumpstead, S., Burnett, M. S., Devaney, J., Doring, A., Elosua, R., Epstein, S. E., Erber, W., Falchi, M., Garner, S. F., Ghori, M. J., Goodall, A. H., Gwilliam, R., Hakonarson, H. H., Hall, A. S., Hammond, N., Hengstenberg, C., Illig, T., Konig, I. R., Knouff, C. W., McPherson, R., Melander, O., Mooser, V., Nauck, M., Nieminen, M. S., O'Donnell, C. J., Peltonen, L., Potter, S. C., Prokisch, H., Rader, D. J., Rice, C. M., Roberts, R., Salomaa, V., Sambrook, J., Schreiber, S., Schunkert, H., Schwartz, S. M., Serbanovic-Canic, J., Sinisalo, J., Siscovick, D. S., Stark, K., Surakka, I., Stephens, J., Thompson, J. R., Volker, U., Volzke, H., Watkins, N. A., Wells, G. A., Wichmann, H. E., Van Heel, D. A., Tyler-Smith, C., Thein, S. L., Kathiresan, S., Perola, M., Reilly, M. P., Stewart, A. F., Erdmann, J., Samani, N. J., Meisinger, C., Greinacher, A., Deloukas, P., Ouwehand, W. H. and Gieger, C. (2009) A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium. Nat. Genet. 41, 1182-1190. https://doi.org/10.1038/ng.467
- Teslovich, T. M., Musunuru, K., Smith, A. V., Edmondson, A. C., Stylianou, I. M., Koseki, M., Pirruccello, J. P., Ripatti, S., Chasman, D. I., Willer, C. J., Johansen, C. T., Fouchier, S. W., Isaacs, A., Peloso, G. M., Barbalic, M., Ricketts, S. L, Bis, J. C., Aulchenko, Y. S., Thorleifsson, G., Feitosa, M. F., Chambers, J., Orho-Melander, M., Melander, O., Johnson, T., Li, X., Guo, X., Li, M., Shin Cho, Y., Jin Go, M., Jin Kim, Y., Lee, J. Y., Park, T., Kim, K., Sim, X., Twee-Hee Ong, R., Croteau-Chonka, D. C., Lange, L. A., Smith, J. D., Song, K., Hua Zhao, J., Yuan, X., Luan, J., Lamina, C., Ziegler, A., Zhang, W., Zee, R. Y., Wright, A. F., Witteman, J. C., Wilson, J. F., Willemsen, G., Wichmann, H. E., Whitfield, J. B., Waterworth, D. M., Wareham, N. J., Waeber, G., Vollenweider, P., Voight, B. F., Vitart, V., Uitterlinden, A. G., Uda, M., Tuomilehto, J., Thompson, J. R., Tanaka, T., Surakka, I., Stringham, H. M., Spector, T. D., Soranzo, N., Smit, J. H., Sinisalo, J., Silander, K., Sijbrands, E. J., Scuteri, A., Scott, J., Schlessinger, D., Sanna, S., Salomaa, V., Saharinen, J., Sabatti, C., Ruokonen, A., Rudan, I., Rose, L. M., Roberts, R., Rieder, M., Psaty, B. M., Pramstaller, P. P., Pichler, I., Perola, M., Penninx, B. W., Pedersen, N. L., Pattaro, C., Parker, A. N., Pare, G., Oostra, B. A., O'Donnell, C. J., Nieminen, M. S., Nickerson, D. A., Montgomery, G. W., Meitinger, T., McPherson, R., McCarthy, M. I., McArdle, W., Masson, D., Martin, N. G., Marroni, F., Mangino, M., Magnusson, P. K., Lucas, G., Luben, R., Loos, R. J., Lokki, M. L., Lettre, G., Langenberg, C., Launer, L. J., Lakatta, E. G., Laaksonen, R., Kyvik, K. O., Kronenberg, F., Konig, I. R., Khaw, K. T., Kaprio, J., Kaplan, L. M., Johansson, A., Jarvelin, M. R., Janssens, A. C., Ingelsson, E., Igl, W., Kees Hovingh, G., Hottenga, J. J., Hofman, A., Hicks, A. A., Hengstenberg, C., Heid, I. M., Hayward, C., Havulinna, A. S., Hastie, N. D., Harris, T. B., Haritunians, T., Hall, A. 
S., Gyllensten, U., Guiducci, C., Groop, L. C., Gonzalez, E., Gieger, C., Freimer, N. B., Ferrucci, L., Erdmann, J., Elliott, P., Ejebe, K. G., Doring, A., Dominiczak, A. F., Demissie, S., Deloukas, P., de Geus, E. J., de Faire, U., Crawford, G., Collins, F. S., Chen, Y. D., Caulfield, M. J., Campbell, H., Burtt, N. P., Bonnycastle, L. L., Boomsma, D. I., Boekholdt, S. M., Bergman, R. N., Barroso, I., Bandinelli, S., Ballantyne, C. M., Assimes, T. L., Quertermous, T., Altshuler, D., Seielstad, M., Wong, T. Y., Tai, E. S., Feranil, A. B., Kuzawa, C. W., Adair, L. S., Taylor, H. A. Jr, Borecki, I. B., Gabriel, S. B., Wilson, J. G., Holm, H., Thorsteinsdottir, U., Gudnason, V., Krauss, R. M., Mohlke, K. L., Ordovas, J. M., Munroe, P. B., Kooner, J. S., Tall, A. R., Hegele, R. A., Kastelein, J. J., Schadt, E. E., Rotter, J. I., Boerwinkle, E., Strachan, D. P., Mooser, V., Stefansson, K., Reilly, M. P., Samani, N. J., Schunkert, H., Cupples, L. A., Sandhu, M. S., Ridker, P. M., Rader, D. J., van Duijn, C. M., Peltonen, L., Abecasis, G. R., Boehnke, M. and Kathiresan, S. (2010) Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707-713. https://doi.org/10.1038/nature09270
- Hindorff, L. A., Sethupathy, P., Junkins, H. A., Ramos, E. M., Mehta, J. P., Collins, F. S. and Manolio, T. A. (2009) Potential etiologic and functional implications of genomewide association loci for human diseases and traits. Proc. Natl. Acad. Sci. U.S.A. 106, 9362-9367. https://doi.org/10.1073/pnas.0903103106
- Ashley, E. A., Butte, A. J., Wheeler, M. T., Chen, R., Klein, T. E., Dewey, F. E., Dudley, J. T., Ormond, K. E., Pavlovic, A., Morgan, A. A., Pushkarev, D., Neff, N. F., Hudgins, L., Gong, L., Hodges, L. M., Berlin, D. S., Thorn, C. F., Sangkuhl, K., Hebert, J. M., Woon, M., Sagreiya, H., Whaley, R., Knowles, J. W., Chou, M. F., Thakuria, J. V., Rosenbaum, A. M., Zaranek, A. W., Church, G. M., Greely, H. T., Quake, S. R. and Altman, R. B. (2010) Clinical assessment incorporating a personal genome. Lancet. 375, 1525-1535. https://doi.org/10.1016/S0140-6736(10)60452-7
- Chen, R., Mias, G. I., Li-Pook-Than, J., Jiang, L., Lam, H. Y., Miriami, E., Karczewski, K. J., Hariharan, M., Dewey, F. E., Cheng, Y., Clark, M. J., Im, H., Habegger, L., Balasubramanian, S., O'Huallachain, M., Dudley, J. T., Hillenmeyer, S., Haraksingh, R., Sharon, D., Euskirchen, G., Lacroute, P., Bettinger, K., Boyle, A. P., Kasowski, M., Grubert, F., Seki, S., Garcia, M., Whirl-Carrillo, M., Gallardo, M., Blasco, M. A., Greenberg, P. L., Snyder, P., Klein, T. E., Altman, R. B., Butte, A. J., Ashley, E. A., Gerstein, M., Nadeau, K. C., Tang, H. and Snyder, M. (2012) Personal omics profiling reveals dynamic molecular and medical phenotypes. Cell 148, 1293-1307. https://doi.org/10.1016/j.cell.2012.02.009
- Park, J. H., Wacholder, S., Gail, M. H., Peters, U., Jacobs, K. B., Chanock, S. J. and Chatterjee, N. (2010) Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nat. Genet. 42, 570-575. https://doi.org/10.1038/ng.610
- (2010) On beyond GWAS. Nat. Genet. 42, 551. https://doi.org/10.1038/ng0710-551
- Lettre, G., Palmer, C. D., Young, T., Ejebe, K. G., Allayee, H., Benjamin, E. J., Bennett, F., Bowden, D. W., Chakravarti, A., Dreisbach, A., Farlow, D. N., Folsom, A. R., Fornage, M., Forrester, T., Fox, E., Haiman, C. A., Hartiala, J., Harris, T. B., Hazen, S. L., Heckbert, S. R., Henderson, B. E., Hirschhorn, J. N., Keating, B. J., Kritchevsky, S. B., Larkin, E., Li, M., Rudock, M. E., McKenzie, C. A., Meigs, J. B., Meng, Y. A., Mosley, T. H., Newman, A. B., Newton-Cheh, C. H., Paltoo, D. N., Papanicolaou, G. J., Patterson, N., Post, W. S., Psaty, B. M., Qasim, A. N., Qu, L., Rader, D. J., Redline, S., Reilly, M. P., Reiner, A. P., Rich, S. S., Rotter, J. I., Liu, Y., Shrader, P., Siscovick, D. S., Tang, W. H., Taylor, H. A., Tracy, R. P., Vasan, R. S., Waters, K. M., Wilks, R., Wilson, J. G., Fabsitz, R. R., Gabriel, S. B., Kathiresan, S. and Boerwinkle, E. (2011) Genome-wide association study of coronary heart disease and its risk factors in 8,090 African Americans: the NHLBI CARe Project. PLoS Genet. 7, e1001300. https://doi.org/10.1371/journal.pgen.1001300
- Wright, S. (1937) The Distribution of Gene Frequencies in Populations. Proc. Natl. Acad. Sci. U.S.A. 23, 307-320. https://doi.org/10.1073/pnas.23.6.307
- Teo, Y. Y., Fry, A. E., Bhattacharya, K., Small, K. S., Kwiatkowski, D. P. and Clark, T. G. (2009) Genome-wide comparisons of variation in linkage disequilibrium. Genome Res. 19, 1849-1860. https://doi.org/10.1101/gr.092189.109
- Cheng, F., Chen, W., Richards, E., Deng, L. and Zeng, C. (2009) SNP@Evolution: a hierarchical database of positive selection on the human genome. BMC Evol. Biol. 9, 221. https://doi.org/10.1186/1471-2148-9-221
- Voight, B. F., Kudaravalli, S., Wen, X. and Pritchard, J. K. (2006) A map of recent positive selection in the human genome. PLoS Biol. 4, e72. https://doi.org/10.1371/journal.pbio.0040072
- Duan, S., Zhang, W., Cox, N. J. and Dolan, M. E. (2008) FstSNP-HapMap3: a database of SNPs with high population differentiation for HapMap3. Bioinformation 3, 139-141. https://doi.org/10.6026/97320630003139
- Teo, Y. Y., Sim, X., Ong, R. T., Tan, A. K., Chen, J., Tantoso, E., Small, K. S., Ku, C. S., Lee, E. J., Seielstad, M. and Chia, K. S. (2009) Singapore Genome Variation Project: a haplotype map of three Southeast Asian populations. Genome Res. 19, 2154-2162. https://doi.org/10.1101/gr.095000.109
- International HapMap Consortium (2005) A haplotype map of the human genome. Nature 437, 1299-1320. https://doi.org/10.1038/nature04226
- International HapMap Consortium, Frazer, K. A., Ballinger, D. G., Cox, D. R., Hinds, D. A., Stuve, L. L., Gibbs, R. A., Belmont, J. W., Boudreau, A., Hardenbol, P., Leal, S. M., Pasternak, S., Wheeler, D. A., Willis, T. D., Yu, F., Yang, H., Zeng, C., Gao, Y., Hu, H., Hu, W., Li, C., Lin, W., Liu, S., Pan, H., Tang, X., Wang, J., Wang, W., Yu, J., Zhang, B., Zhang, Q., Zhao, H., Zhao, H., Zhou, J., Gabriel, S. B., Barry, R., Blumenstiel, B., Camargo, A., Defelice, M., Faggart, M., Goyette, M., Gupta, S., Moore, J., Nguyen, H., Onofrio, R. C., Parkin, M., Roy, J., Stahl, E., Winchester, E., Ziaugra, L., Altshuler, D., Shen, Y., Yao, Z., Huang, W., Chu, X., He, Y., Jin, L., Liu, Y., Shen, Y., Sun, W., Wang, H., Wang, Y., Wang, Y., Xiong, X., Xu, L., Waye, M. M., Tsui, S. K., Xue, H., Wong, J. T., Galver, L. M., Fan, J. B., Gunderson, K., Murray, S. S., Oliphant, A. R., Chee, M. S., Montpetit, A., Chagnon, F., Ferretti, V., Leboeuf, M., Olivier, J. F., Phillips, M. S., Roumy, S., Sallee, C., Verner, A., Hudson, T. J., Kwok, P. Y., Cai, D., Koboldt, D. C., Miller, R. D., Pawlikowska, L., Taillon-Miller, P., Xiao, M., Tsui, L. C., Mak, W., Song, Y. Q., Tam, P. K., Nakamura, Y., Kawaguchi, T., Kitamoto, T., Morizono, T., Nagashima, A., Ohnishi, Y., Sekine, A., Tanaka, T., Tsunoda, T., Deloukas, P., Bird, C. P., Delgado, M., Dermitzakis, E. T., Gwilliam, R., Hunt, S., Morrison, J., Powell, D., Stranger, B. E., Whittaker, P., Bentley, D. R., Daly, M. J., de Bakker, P. I., Barrett, J., Chretien, Y. R., Maller, J., McCarroll, S., Patterson, N., Pe'er, I., Price, A., Purcell, S., Richter, D. J., Sabeti, P., Saxena, R., Schaffner, S. F., Sham, P. C., Varilly, P., Altshuler, D., Stein, L. D., Krishnan, L., Smith, A. V., Tello-Ruiz, M. K., Thorisson, G. A., Chakravarti, A., Chen, P. E., Cutler, D. J., Kashuk, C. S., Lin, S., Abecasis, G. R., Guan, W., Li, Y., Munro, H. M., Qin, Z. S., Thomas, D. 
J., McVean, G., Auton, A., Bottolo, L., Cardin, N., Eyheramendy, S., Freeman, C., Marchini, J., Myers, S., Spencer, C., Stephens, M., Donnelly, P., Cardon, L. R., Clarke, G., Evans, D. M., Morris, A. P., Weir, B. S., Tsunoda, T., Mullikin, J. C., Sherry, S. T., Feolo, M., Skol, A., Zhang, H., Zeng, C., Zhao, H., Matsuda, I., Fukushima, Y., Macer, D. R., Suda, E., Rotimi, C. N., Adebamowo, C. A., Ajayi, I., Aniagwu, T., Marshall, P. A., Nkwodimmah, C., Royal, C. D., Leppert, M. F., Dixon, M., Peiffer, A., Qiu, R., Kent, A., Kato, K., Niikawa, N., Adewole, I. F., Knoppers, B. M., Foster, M. W., Clayton, E. W., Watkin, J., Gibbs, R. A., Belmont, J. W., Muzny, D., Nazareth, L., Sodergren, E., Weinstock, G. M., Wheeler, D. A., Yakub, I., Gabriel, S. B., Onofrio, R. C., Richter, D. J., Ziaugra, L., Birren, B. W., Daly, M. J., Altshuler, D., Wilson, R. K., Fulton, L. L., Rogers, J., Burton, J., Carter, N. P., Clee, C. M., Griffiths, M., Jones, M. C., McLay, K., Plumb, R. W., Ross, M. T., Sims, S. K., Willey, D. L., Chen, Z., Han, H., Kang, L., Godbout, M., Wallenburg, J. C., L'Archeveque, P., Bellemare, G., Saeki, K., Wang, H., An, D., Fu, H., Li, Q., Wang, Z., Wang, R., Holden, A. L., Brooks, L. D., McEwen, J. E., Guyer, M. S., Wang, V. O., Peterson, J. L., Shi, M., Spiegel, J., Sung, L. M., Zacharia, L. F., Collins, F. S., Kennedy, K., Jamieson, R. and Stewart, J. (2007) A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851-861. https://doi.org/10.1038/nature06258
- He, M., Gitschier, J., Zerjal, T., de Knijff, P., Tyler-Smith, C. and Xue, Y. (2009) Geographical affinities of the HapMap samples. PLoS One 4, e4684. https://doi.org/10.1371/journal.pone.0004684
- Amberger, J., Bocchini, C. A., Scott, A. F. and Hamosh, A. (2009) McKusick's Online Mendelian Inheritance in Man (OMIM). Nucleic Acids Res. 37, D793-796. https://doi.org/10.1093/nar/gkn665
- Stein, L. D., Mungall, C., Shu, S., Caudy, M., Mangone, M., Day, A., Nickerson, E., Stajich, J. E., Harris, T. W., Arva, A. and Lewis, S. (2002) The generic genome browser: a building block for a model organism system database. Genome Res. 12, 1599-1610. https://doi.org/10.1101/gr.403602
- Donlin, M. J. (2009) Using the generic genome browser (GBrowse). Curr. Protoc. Bioinformatics Chapter 9, Unit 9.9.
- Korn, J. M., Kuruvilla, F. G., McCarroll, S. A., Wysoker, A., Nemesh, J., Cawley, S., Hubbell, E., Veitch, J., Collins, P. J., Darvishi, K., Lee, C., Nizzari, M. M., Gabriel, S. B., Purcell, S., Daly, M. J. and Altshuler, D. (2008) Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat. Genet. 40, 1253-1260. https://doi.org/10.1038/ng.237
- Oliphant, A., Barker, D. L., Stuelpnagel, J. R. and Chee, M. S. (2002) BeadArray technology: enabling an accurate, cost-effective approach to high-throughput genotyping. Biotechniques. Suppl 56-58, 60-51.
- Hong, C. B., Kim, Y. J., Moon, S., Shin, Y. A., Cho, Y. S. and Lee, J. Y. (2012) KAREBrowser: SNP database of Korea association resource project. BMB Rep. 45, 47-50. https://doi.org/10.5483/BMBRep.2012.45.1.47
- Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., Maller, J., Sklar, P., de Bakker, P. I., Daly, M. J. and Sham, P. C. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559-575. https://doi.org/10.1086/519795