• Title/Summary/Keyword: genome-wide association study (GWAS)


MPI-GWAS: a supercomputing-aided permutation approach for genome-wide association studies

  • Paik, Hyojung;Cho, Yongseong;Cho, Seong Beom;Kwon, Oh-Kyoung
    • Genomics & Informatics / v.20 no.1 / pp.14.1-14.4 / 2022
  • Permutation testing is a robust and popular approach to significance testing in genomic research that reduces inflated type 1 error rates; however, its computational cost in genome-wide association studies (GWAS) is notoriously high. Here, we developed a supercomputing-aided approach to accelerate permutation testing for GWAS, based on the message-passing interface (MPI) on a parallel computing architecture. Our application, MPI-GWAS, conducts MPI-based permutation testing in parallel on our supercomputing system, Nurion (8,305 compute nodes and 563,740 central processing units [CPUs]). In MPI-GWAS, $10^{7}$ permutations of one locus were computed in 600 s using 2,720 CPU cores. For $10^{7}$ permutations of ~30,000-50,000 loci in over 7,000 subjects, the total elapsed time was ~4 days on the Nurion supercomputer. Thus, MPI-GWAS makes permutation-based GWAS feasible within a reasonable time by harnessing parallel computing resources.
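The permutation test described in this abstract can be illustrated for a single locus with a minimal Python sketch. It is not the MPI-GWAS implementation: the test statistic (absolute difference in mean allele count between cases and controls), the function names, and the `n_workers` chunking are all assumptions made for illustration. The chunks are evaluated serially here; because each chunk uses its own seed and shares no state, they could in principle be handed to separate MPI ranks.

```python
import random


def mean_diff(genotypes, labels):
    """Absolute difference in mean allele count between cases (1) and controls (0)."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def count_extreme(genotypes, labels, observed, n_perm, seed):
    """Count label permutations whose statistic is at least the observed one."""
    rng = random.Random(seed)  # independent random stream per chunk
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        if mean_diff(genotypes, shuffled) >= observed:
            hits += 1
    return hits


def permutation_pvalue(genotypes, labels, n_perm=10_000, n_workers=4, seed=0):
    """Empirical p-value for one locus (hypothetical helper, serial execution)."""
    observed = mean_diff(genotypes, labels)
    per_worker = n_perm // n_workers
    # Each chunk is independent, so it could run on a separate process or rank.
    hits = sum(
        count_extreme(genotypes, labels, observed, per_worker, seed + w)
        for w in range(n_workers)
    )
    total = per_worker * n_workers
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (total + 1)
```

The smallest p-value this estimator can return is 1/(total + 1), which is why resolving genome-wide thresholds requires on the order of $10^{7}$ or more permutations per locus.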

Genome-Wide Association Study of Metabolic Syndrome in Koreans

  • Jeong, Seok Won;Chung, Myungguen;Park, Soo-Jung;Cho, Seong Beom;Hong, Kyung-Won
    • Genomics & Informatics / v.12 no.4 / pp.187-194 / 2014
  • Metabolic syndrome (METS) is a disorder of energy utilization and storage that increases the risk of developing cardiovascular disease and diabetes. To identify the genetic risk factors of METS, we carried out a genome-wide association study (GWAS) of 2,657 cases and 5,917 controls in Korean populations. We identified 2 single nucleotide polymorphisms (SNPs) with genome-wide significant p-values ($p < 5 \times 10^{-8}$), 8 SNPs with genome-wide suggestive p-values ($5 \times 10^{-8} \leq p < 1 \times 10^{-5}$), and 2 SNPs of more functional variants with borderline p-values ($5 \times 10^{-5} \leq p < 1 \times 10^{-4}$). However, while the multiple-testing correction criteria of conventional GWASs exclude false-positive loci, they also discard many true-positive loci. To reconsider the discarded true-positive loci, we attempted to include the functional variants (nonsynonymous SNPs [nsSNPs] and expression quantitative trait loci [eQTL]) among the top 5,000 SNPs ranked by the proportion of phenotypic variance explained by genotypic variance. In total, 159 eQTLs and 18 nsSNPs were present among the top 5,000 SNPs. Although they should be replicated in other independent populations, 6 eQTLs and 2 nsSNP loci were located in the molecular pathways of LPL, APOA5, and CHRM2, which were the significant or suggestive loci in the METS GWAS. In conclusion, our approach, which combines conventional GWAS with reconsideration of functional variants and pathway-based interpretation, offers a useful way to understand the GWAS results of complex traits and can be extended to other genome-wide association studies.
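Ranking SNPs by the proportion of phenotypic variance explained can be sketched as follows. The abstract does not state how that proportion was computed; this sketch assumes a single-SNP additive model, in which the proportion equals the squared Pearson correlation between allele count and phenotype. The function names and SNP identifiers are hypothetical.

```python
def variance_explained(genotype, phenotype):
    """Squared Pearson correlation between allele counts and a quantitative phenotype."""
    n = len(genotype)
    mean_g = sum(genotype) / n
    mean_p = sum(phenotype) / n
    s_gp = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotype, phenotype))
    s_gg = sum((g - mean_g) ** 2 for g in genotype)
    s_pp = sum((p - mean_p) ** 2 for p in phenotype)
    if s_gg == 0 or s_pp == 0:
        return 0.0  # a monomorphic SNP or constant phenotype explains nothing
    return s_gp ** 2 / (s_gg * s_pp)


def top_snps(genotypes_by_snp, phenotype, k):
    """Return the k SNP identifiers with the largest variance explained."""
    scored = {
        snp: variance_explained(genotype, phenotype)
        for snp, genotype in genotypes_by_snp.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]
```

Functional annotations (nsSNP or eQTL status) would then be looked up only for the identifiers returned by `top_snps`, which mirrors the filtering order described in the abstract.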

Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS

  • Kwon, Ji-Sun;Kim, Ji-Hye;Nam, Doug-U;Kim, Sang-Soo
    • Genomics & Informatics / v.10 no.2 / pp.123-127 / 2012
  • Gene set analysis (GSA) is useful for interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GSA-SNP and i-GSEA4GWAS, under the same inputs and parameter settings. GSA runs were made with two sets of p-values from a Korean type 2 diabetes mellitus GWAS: 259,188 and 1,152,947 SNPs from the original and imputed genotype datasets, respectively. When Gene Ontology terms were used as gene sets, i-GSEA4GWAS produced 283 and 1,070 hits for the unimputed and imputed datasets, respectively, whereas GSA-SNP reported 94 and 38 hits for the same two datasets. Similar trends, although to a lesser degree, were observed with Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets. The huge number of hits by i-GSEA4GWAS for the imputed dataset was probably an artifact of the scaling step in the algorithm. The decrease in hits by GSA-SNP for the imputed dataset may be because it relies on Z-statistics, which are sensitive to variations in the background level of association. Judicious evaluation of GSA outcomes, perhaps based on multiple programs, is recommended.
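Why a Z-statistic depends on the background level of association can be seen in a small sketch. This is a generic gene-set Z-score, not GSA-SNP's actual code: it assumes each gene is summarized by $-\log_{10}$ of its best SNP p-value and that a gene set is scored by comparing its mean to the mean and standard deviation of all genes. Both function names are hypothetical.

```python
import math
import statistics


def gene_score(snp_pvalues):
    """Summarize a gene by -log10 of its smallest SNP p-value (an assumption)."""
    return -math.log10(min(snp_pvalues))


def gene_set_z(gene_scores, gene_set):
    """Z-statistic of a gene set's mean score against the all-gene background."""
    background = list(gene_scores.values())
    mu = statistics.fmean(background)
    sigma = statistics.pstdev(background)
    members = [gene_scores[g] for g in gene_set if g in gene_scores]
    if not members or sigma == 0:
        raise ValueError("gene set is empty or background has no variance")
    # mu and sigma come from all genes, so a shift in the background
    # changes Z even when the member scores themselves are unchanged.
    return (statistics.fmean(members) - mu) / (sigma / math.sqrt(len(members)))
```

Because `mu` and `sigma` are computed over every gene, adding imputed SNPs that shift the scores of many genes can change the Z-statistic of a set whose own members are unaffected.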

Genome and chromosome wide association studies for growth traits in Simmental and Simbrah cattle

  • Calderon-Chagoya, Rene;Vega-Murillo, Vicente Eliezer;Garcia-Ruiz, Adriana;Rios-Utrera, Angel;Martinez-Velazquez, Guillermo;Montano-Bermudez, Moises
    • Animal Bioscience / v.36 no.1 / pp.19-28 / 2023
  • Objective: The objective of this study was to perform genome-wide (GWAS) and chromosome-wide (CWAS) association analyses to identify single nucleotide polymorphisms (SNPs) associated with growth traits in registered Simmental and Simbrah cattle. Methods: The phenotypes were deregressed best linear unbiased prediction (BLUP) estimated breeding values (EBVs) for birth weight, weaning weight direct, weaning weight maternal, and yearling weight. Genotyping was performed with the GGP Bovine 150k chip. After quality control, 105,129 autosomal SNPs from 967 animals (473 Simmental and 494 Simbrah) were used in the genotype association tests. The two association analyses were performed per breed and on the combined data of the two breeds. SNPs associated with growth traits were mapped to genes within 100 kb on either side. Results: The magnitude of the posterior probabilities differed across breeds between the genome-wide and chromosome-wide association analyses. A total of 110, 143, and 302 SNPs were associated with growth traits by GWAS and CWAS in the Simmental, Simbrah, and joint-data analyses, respectively. The enrichment analysis highlighted the RNA polymerase (POLR2G, POLR3E) and GABAergic synapse (GABRR1, GABRR3) pathways for Simmental cattle and the p53 signaling pathway (BID, SERPINB5) for Simbrah cattle. Conclusion: Only 6.265% of the markers associated with growth traits were found using both CWAS and GWAS. The markers associated in the CWAS analysis but not in the GWAS represent information that, owing to the model and priors, was not associated with the traits.

BioSMACK: a linux live CD for genome-wide association analyses

  • Hong, Chang-Bum;Kim, Young-Jin;Moon, Sang-Hoon;Shin, Young-Ah;Go, Min-Jin;Kim, Dong-Joon;Lee, Jong-Young;Cho, Yoon-Shin
    • BMB Reports / v.45 no.1 / pp.44-46 / 2012
  • Recent advances in high-throughput genotyping technologies have enabled genome-wide association studies (GWAS) on large cohorts. However, analyzing millions of single nucleotide polymorphisms (SNPs) is still a difficult task for researchers conducting a GWAS. Researchers often encounter difficulties, such as compatibility and dependency problems, when installing analytical software. This is a huge obstacle for any research institute without computing facilities and specialists. A proper research environment is therefore urgently needed by researchers working on GWAS. We developed BioSMACK to provide a research environment for GWAS that requires no configuration and is easy to use. BioSMACK is based on the Ubuntu Live CD, which offers a complete Linux-based operating system environment without installation. Moreover, we provide users with a GWAS manual consisting of a series of guidelines for GWAS and useful examples. BioSMACK is freely available at http://ksnp.cdc.go.kr/biosmack.

Genetic Variants Associated with Calorie and Macronutrient Intake in a Genome-Wide Association Study (열량 및 열량영양소 섭취량과 관련된 유전자 변이에 대한 전장유전체 연관성 분석연구)

  • Baik, In-Kyung;Ahn, Youn-Jhin;Lee, Seung-Ku;Kim, So-Ri-Wul;Han, Bok-Ghee;Shin, Chol
    • Journal of Nutrition and Health / v.43 no.4 / pp.357-366 / 2010
  • There has been no genome-wide association study (GWAS) of macronutrient intake as a quantitative trait. To explore genetic loci associated with total calorie and macronutrient intake, genome-wide association data of autosomal single nucleotide polymorphisms (SNPs) from Korean adults were analyzed. We conducted a GWAS in 3,690 men and women aged 40 to 60 years from an urban population-based cohort. At the baseline examination (June 18, 2001 through January 29, 2003), DNA samples of the study subjects were collected and genotyped. Average daily consumption of total calories, carbohydrate, protein, and fat was obtained from a semi-quantitative food frequency questionnaire and natural log-transformed for analysis after adjustment for calorie intake. Using multivariate linear regression analysis adjusted for age, sex, and height, we tested 352,021 SNPs and found weak associations with calorie and macronutrient intake that did not reach genome-wide significance. However, a number of SNPs showed potential associations with macronutrient intake; in particular, signals in SORBS1 and PRKCB1 were likely associated with carbohydrate and fat intake, respectively. We observed an inverse association between the minor alleles of the SNPs in these genes and the amount of carbohydrate or fat consumed. Our GWAS identified loci and minor alleles weakly associated with macronutrient intake. Because SORBS1 and PRKCB1 are reportedly associated with glucose and lipid metabolism as well as with obesity-related diseases, further investigation of the biological and functional roles of polymorphisms in these genes in relation to macronutrient intake is warranted.

Genome-wide Association Study (GWAS) and Its Application for Improving the Genomic Estimated Breeding Values (GEBV) of the Berkshire Pork Quality Traits

  • Lee, Young-Sup;Jeong, Hyeonsoo;Taye, Mengistie;Kim, Hyeon Jeong;Ka, Sojeong;Ryu, Youn-Chul;Cho, Seoae
    • Asian-Australasian Journal of Animal Sciences / v.28 no.11 / pp.1551-1557 / 2015
  • Missing heritability has been a major problem in analyses using best linear unbiased prediction (BLUP). We introduced the traditional genome-wide association study (GWAS) into BLUP to improve heritability estimation. We analyzed eight pork quality traits of the Berkshire breed using GWAS and BLUP. GWAS detects putative quantitative trait loci regions for the given traits. Single nucleotide polymorphisms (SNPs) were selected from the GWAS results at p < 0.01. BLUP with the significant SNPs was much more accurate than BLUP with all genotyped SNPs in terms of narrow-sense heritability. This implies that genomic estimated breeding values (GEBVs) of pork quality traits can be calculated by BLUP via GWAS. The GWAS model was linear regression in PLINK, and the BLUP models were G-BLUP and SNP-GBLUP; SNP-GBLUP uses a SNP-SNP relationship matrix. BLUP analysis with GWAS preprocessing is one possible approach to the missing heritability problem and provides an alternative BLUP method that can yield more accurate GEBVs.

Genome-Wide Association Studies of the Korea Association REsource (KARE) Consortium

  • Hong, Kyung-Won;Kim, Hyung-Lae;Oh, Berm-Seok
    • Genomics & Informatics / v.8 no.3 / pp.101-102 / 2010
  • During the last decade, large community cohorts have been established by the Korea National Institutes of Health (KNIH), and an enormous amount of epidemiological and clinical data has been accumulated. Using this information and the samples in the cohorts, KNIH set out to conduct a large-scale genome-wide association study (GWAS) in 2007, and the Korea Association REsource (KARE) consortium was launched to analyze the data and identify the underlying genetic risk factors of diseases and diverse health indexes, such as blood pressure, obesity, bone density, and blood biochemical traits. The consortium consisted of 6 research divisions, formed by 25 principal investigators in 19 organizations, including 18 universities, 2 institutes, and 1 company. Each division focused on one of the following subjects: the identification of genetic factors, the statistical analysis of gene-gene interactions, the genetic epidemiology of gene-environment interactions, copy number variation, the bioinformatics related to a GWAS, and a GWAS of nutrigenomics. In this special issue, the study results of the KARE consortium are presented in 9 articles. We hope that this special issue will encourage the genomics community to share data and encourage scientists, including clinicians, to analyze the valuable Korean data of KARE.

Prediction of Quantitative Traits Using Common Genetic Variants: Application to Body Mass Index

  • Bae, Sunghwan;Choi, Sungkyoung;Kim, Sung Min;Park, Taesung
    • Genomics & Informatics / v.14 no.4 / pp.149-159 / 2016
  • With the success of the genome-wide association studies (GWASs), many candidate loci for complex human diseases have been reported in the GWAS catalog. Recently, many disease prediction models based on penalized regression or statistical learning methods were proposed using candidate causal variants from significant single-nucleotide polymorphisms of GWASs. However, there have been only a few systematic studies comparing existing methods. In this study, we first constructed risk prediction models, such as stepwise linear regression (SLR), least absolute shrinkage and selection operator (LASSO), and Elastic-Net (EN), using a GWAS chip and GWAS catalog. We then compared the prediction accuracy by calculating the mean square error (MSE) value on data from the Korea Association Resource (KARE) with body mass index. Our results show that SLR provides a smaller MSE value than the other methods, while the numbers of selected variables in each model were similar.

Beta-Meta: a meta-analysis application considering heterogeneity among genome-wide association studies

  • Gyungbu Kim;Yoonsuk Lee;Jeong Ho Park;Dongmin Kim;Wonseok Lee
    • Genomics & Informatics / v.20 no.4 / pp.49.1-49.7 / 2022
  • Many packages for the meta-analysis of genome-wide association studies (GWAS) have been developed to discover genetic variants. Although variation across studies must be considered, few currently accessible packages estimate between-study heterogeneity. We therefore propose a Python-based application called Beta-Meta, which simplifies meta-analysis by automatically selecting between a fixed-effects and a random-effects model based on heterogeneity. Beta-Meta implements flexible input data manipulation to allow multiple meta-analyses of different genotype-phenotype associations in a single process. It provides a step-by-step meta-analysis of GWAS for each association in the following order: a heterogeneity test, two different calculations of an effect size and a p-value depending on heterogeneity, and the Benjamini-Hochberg p-value adjustment. These methods enable users to validate the results of individual studies with greater statistical power and better estimation precision. We elaborate on these methods and illustrate them with examples from several studies of infertility-related disorders.
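The three steps listed in this abstract (heterogeneity test, model-dependent effect size and p-value, Benjamini-Hochberg adjustment) can be sketched in plain Python. This is not Beta-Meta's code or API. The sketch assumes Cochran's Q with the I² statistic as the heterogeneity measure, an inverse-variance fixed-effects estimate, a DerSimonian-Laird random-effects estimate, and a hypothetical `i2_threshold` of 50% for switching models; the abstract specifies none of these choices.

```python
import math


def meta_analyze(betas, ses, i2_threshold=50.0):
    """Pool per-study effect sizes, choosing the model from I^2 (assumed rule)."""
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))  # Cochran's Q
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    if i2 < i2_threshold:
        model, beta, se = "fixed", fixed, math.sqrt(1.0 / sum(w))
    else:
        # DerSimonian-Laird estimate of the between-study variance tau^2
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)
        w_re = [1.0 / (s ** 2 + tau2) for s in ses]
        beta = sum(wi * b for wi, b in zip(w_re, betas)) / sum(w_re)
        model, se = "random", math.sqrt(1.0 / sum(w_re))
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return {"model": model, "beta": beta, "se": se, "i2": i2, "p": p}


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, `meta_analyze` would be called once per genotype-phenotype association, and the resulting p-values passed together to `benjamini_hochberg`.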