• Title/Summary/Keyword: Genome Wide Association

Copy Number Variations in the Human Genome: Potential Source for Individual Diversity and Disease Association Studies

  • Kim, Tae-Min;Yim, Seon-Hee;Chung, Yeun-Jun
    • Genomics & Informatics / v.6 no.1 / pp.1-7 / 2008
  • The widespread presence of large-scale genomic variations, termed copy number variations (CNVs), has recently been recognized in phenotypically normal individuals. Judging by the growing number of reports on CNVs, it is now evident that these variants contribute significantly to genetic diversity in the human genome. Like single nucleotide polymorphisms (SNPs), CNVs are expected to serve as potential biomarkers for disease susceptibility or drug response. However, technical and practical concerns remain to be addressed. In this review, we examine the current status of CNV databases and research, including ongoing efforts at CNV screening in the human genome. We also discuss the characteristics of currently available platforms and suggest the potential of CNVs in clinical research and application.

Comparison of the Affymetrix SNP Array 5.0 and Oligoarray Platforms for Defining CNV

  • Kim, Ji-Hong;Jung, Seung-Hyun;Hu, Hae-Jin;Yim, Seon-Hee;Chung, Yeun-Jun
    • Genomics & Informatics / v.8 no.3 / pp.138-141 / 2010
  • Together with single nucleotide polymorphisms (SNPs), copy number variations (CNVs) are recognized as a major component of human genetic diversity and are used as genetic markers in many disease association studies. Affymetrix Genome-wide SNP 5.0 is one of the commonly used SNP array platforms for SNP-GWAS as well as CNV analysis. However, no report has validated the accuracy and reproducibility of CNVs identified by Affymetrix SNP array 5.0. In this study, we compared the characteristics of CNVs detected in the same set of genomic DNAs by three different array platforms: Affymetrix SNP array 5.0, Agilent 2X244K CNV array, and NimbleGen 2.1M CNV array. In our analysis, Affymetrix SNP array 5.0 appears to detect CNVs reliably enough to be applied to association studies. However, for defining CNVs in detail, Affymetrix Genome-wide SNP 5.0 may be less suitable than the NimbleGen 2.1M CNV array and the Agilent 2X244K CNV array, which outperform the Affymetrix array in defining small single-copy variants. This result will help researchers select a suitable array platform for CNV analysis.

Pure additive contribution of genetic variants to a risk prediction model using propensity score matching: application to type 2 diabetes

  • Park, Chanwoo;Jiang, Nan;Park, Taesung
    • Genomics & Informatics / v.17 no.4 / pp.47.1-47.12 / 2019
  • The achievements of genome-wide association studies have suggested ways to predict diseases, such as type 2 diabetes (T2D), using single-nucleotide polymorphisms (SNPs). Most T2D risk prediction models have used SNPs in combination with demographic variables. However, it is difficult to evaluate the pure additive contribution of genetic variants to classically used demographic models. Since prediction models include some heritable traits, such as body mass index, the contribution of SNPs using unmatched case-control samples may be underestimated. In this article, we propose a method that uses propensity score matching to avoid underestimation by matching case and control samples, thereby determining the pure additive contribution of SNPs. To illustrate the proposed propensity score matching method, we used SNP data from the Korea Association Resources project and reported SNPs from the genome-wide association study catalog. We selected various SNP sets via stepwise logistic regression (SLR), least absolute shrinkage and selection operator (LASSO), and the elastic-net (EN) algorithm. Using these SNP sets, we made predictions using SLR, LASSO, and EN as logistic regression modeling techniques. The accuracy of the predictions was compared in terms of area under the receiver operating characteristic curve (AUC). The contribution of SNPs to T2D was evaluated by the difference in the AUC between models using only demographic variables and models that included the SNPs. The largest difference among our models showed that the AUC of the model using genetic variants with demographic variables could be 0.107 higher than that of the corresponding model using only demographic variables.
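
The two ideas in this abstract, matching cases to controls on a propensity score and then measuring the AUC gained by adding SNPs, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the greedy 1:1 nearest-neighbour rule, the caliper value, and the toy scores are all assumptions made for illustration, and the propensity scores are assumed to have been estimated beforehand.

```python
def greedy_match(case_ps, control_ps, caliper=0.05):
    """Pair each case with the nearest unused control within `caliper` (hypothetical rule)."""
    unused = dict(enumerate(control_ps))
    pairs = []
    for i, p in enumerate(case_ps):
        if not unused:
            break
        # Closest remaining control by absolute propensity-score distance.
        j = min(unused, key=lambda k: abs(unused[k] - p))
        if abs(unused[j] - p) <= caliper:
            pairs.append((i, j))
            del unused[j]
    return pairs


def auc(labels, scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def auc_gain(labels, demo_scores, demo_plus_snp_scores):
    """Difference in AUC between the SNP-augmented model and the demographic-only model."""
    return auc(labels, demo_plus_snp_scores) - auc(labels, demo_scores)


# Toy data (made up): 3 cases followed by 3 controls.
labels = [1, 1, 1, 0, 0, 0]
demo_only = [0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
demo_plus_snp = [0.8, 0.6, 0.9, 0.5, 0.3, 0.2]
matched = greedy_match([0.30, 0.62], [0.60, 0.10, 0.33])
gain = auc_gain(labels, demo_only, demo_plus_snp)
```

In practice the two score vectors would come from models fitted on the matched samples (for example SLR, LASSO, or EN, as in the abstract); here they are hard-coded so that only the comparison step is shown.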

Network Graph Analysis of Gene-Gene Interactions in Genome-Wide Association Study Data

  • Lee, Sungyoung;Kwon, Min-Seok;Park, Taesung
    • Genomics & Informatics / v.10 no.4 / pp.256-262 / 2012
  • Most common complex traits, such as obesity, hypertension, diabetes, and cancers, are known to be associated with multiple genes, environmental factors, and their epistasis. Recently, the development of advanced genotyping technologies has allowed us to perform genome-wide association studies (GWASs). Many approaches have been proposed for detecting the effects of multiple genes on complex traits in GWASs. Multifactor dimensionality reduction (MDR) is a powerful and efficient method for detecting high-order gene-gene ($G{\times}G$) interactions. However, the biological interpretation of $G{\times}G$ interactions identified by MDR analysis is not easy. To aid the interpretation of MDR results, we propose a network graph analysis to elucidate the meaning of identified $G{\times}G$ interactions. The proposed network graph analysis consists of three steps. The first step performs $G{\times}G$ interaction analysis using MDR. The second step draws the network graph from the MDR result. The third step provides biological evidence for the identified $G{\times}G$ interactions using external biological databases. The proposed method was applied to Korean Association Resource (KARE) data, containing 8,838 individuals with 327,632 single-nucleotide polymorphisms, to perform $G{\times}G$ interaction analysis of body mass index (BMI). Our network graph analysis showed that many identified $G{\times}G$ interactions have known biological evidence related to BMI. We expect that our network graph analysis will help in interpreting the biological meaning of $G{\times}G$ interactions.
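
The second step of the procedure above, turning pairwise interaction results into a network graph, might look roughly like the sketch below. The input format (pairs with a score), the score threshold, the node names, and the degree-based ranking are assumptions for illustration only; the abstract does not specify how the authors construct or rank the graph.

```python
from collections import defaultdict


def build_interaction_graph(pairs, min_score=0.0):
    """Build an undirected adjacency map from (node_a, node_b, score) tuples.

    Only pairs whose score reaches `min_score` become edges (hypothetical filter).
    """
    graph = defaultdict(set)
    for a, b, score in pairs:
        if a != b and score >= min_score:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def hubs(graph, top=3):
    """Return the `top` nodes with the most neighbours, ties broken by name."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:top]


# Made-up pairwise results; the scores stand in for an MDR-style accuracy measure.
results = [
    ("geneA", "geneB", 0.61),
    ("geneA", "geneC", 0.58),
    ("geneB", "geneC", 0.52),
    ("geneA", "geneD", 0.57),
]
graph = build_interaction_graph(results, min_score=0.55)
top_hub = hubs(graph, top=1)
```

Highly connected nodes in such a graph are natural starting points for the third step, where each interaction is checked against external biological databases.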

Short Reads Phasing to Construct Haplotypes in Genomic Regions That Are Associated with Body Mass Index in Korean Individuals

  • Lee, Kichan;Han, Seonggyun;Tark, Yeonjeong;Kim, Sangsoo
    • Genomics & Informatics / v.12 no.4 / pp.165-170 / 2014
  • Genome-wide association (GWA) studies have found many important genetic variants that affect various traits. Since these studies can be used to investigate untyped but causal variants through linkage disequilibrium (LD), it would be useful to explore the haplotypes of single-nucleotide polymorphisms (SNPs) within the same LD block as significant associations, based on high-density variants from population references. Here, we attempted to build a catalog of haplotypes affecting body mass index (BMI) through an integrative analysis of previously published whole-genome next-generation sequencing (NGS) data from 7 representative Korean individuals and previously known Korean GWA signals. We selected 435 SNPs that were significantly associated with BMI in the GWA analysis and searched 53 LD ranges near those SNPs. With the NGS data, the haplotypes were phased within the LD ranges. A total of 44 possible haplotype blocks for Korean BMI were cataloged. Although the current result is based on limited data, this study provides new insights that may help to identify important haplotypes for traits and low variants near significant SNPs. Furthermore, a more comprehensive catalog can be built as larger datasets become available.

Identification of SNPs Related to 19 Phenotypic Traits Using Genome-wide Association Study (GWAS) Approach in Korean Wheat Mini-core Collection

  • Yuna Kang;Yeonjun Sung;Seonghyeon Kim;Changsoo Kim
    • Proceedings of the Korean Society of Crop Science Conference / 2020.06a / pp.120-120 / 2020
  • Based on simple sequence repeat (SSR) markers, a Korean wheat core collection was established with 616 wheat accessions. SNP genotyping of the entire genome was then performed on these accessions using a DNA chip array to clarify their whole-genome SNP profiles. A total of 35,143 SNPs were found, and we re-established a mini-core collection with 247 accessions. Population diversity and phylogenetic analyses revealed the genetic diversity and relationships within the mini-core set. In addition, a genome-wide association study (GWAS) was performed on 19 phenotypic traits: ear type, awn length, culm length, ear length, awn color, seed coat color, culm color, ear color, loading, leaf length, leaf width, seeding stand, cold damage, weight, auricle, plant type, heading stage, maturation period, upright habit, and degree of flag leaf. The GWAS was performed using the fixed and random model circulating probability unification (FarmCPU) method, which identified 14 to 258 SNP loci related to the 19 phenotypic traits. Our study indicates that this Korean wheat mini-core collection is a set of germplasm useful for basic and applied research aimed at understanding and exploiting the genetic diversity of Korean wheat varieties.

Adjusting sampling bias in case-control genetic association studies

  • Seo, Geum Chu;Park, Taesung
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.1127-1135 / 2014
  • Genome-wide association studies (GWAS) are designed to discover genetic variants such as single nucleotide polymorphisms (SNPs) that are associated with human complex traits. Although there is increasing interest in applying GWAS methodologies to population-based cohorts, many published GWAS have adopted a case-control design, which raises the issue of sampling bias in both case and control samples. Because cases and controls have unequal selection probabilities, the samples are not representative of the population they are purported to represent. Therefore, non-random sampling in case-control studies can lead to inconsistent and biased estimates of SNP-trait associations. In this paper, we propose inverse-probability-of-sampling weights based on disease prevalence to eliminate case-control sampling bias in estimating and testing associations between SNPs and quantitative traits. We apply the proposed method to data from the Korea Association Resource project and show that the standard estimators applied to the weighted data yield unbiased estimates.
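
As a rough illustration of prevalence-based inverse-probability weighting, the sketch below reweights a case-control sample so that cases and controls contribute in proportion to their share of the population. The specific weight formula (population share divided by sample share), the function names, and the toy numbers are assumptions; the abstract does not give the authors' exact weights.

```python
def sampling_weights(labels, prevalence):
    """Weight each subject by population share / sample share of its group.

    `labels` holds 1 for cases and 0 for controls; `prevalence` is the assumed
    population disease prevalence (a value strictly between 0 and 1).
    """
    n = len(labels)
    n_case = sum(labels)
    n_control = n - n_case
    if n_case == 0 or n_control == 0:
        raise ValueError("need at least one case and one control")
    w_case = prevalence / (n_case / n)
    w_control = (1.0 - prevalence) / (n_control / n)
    return [w_case if y == 1 else w_control for y in labels]


def weighted_mean(values, weights):
    """Weighted average of a quantitative trait."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Made-up sample: cases are heavily oversampled relative to a 10% prevalence.
labels = [1, 1, 0, 0]
trait = [30.0, 32.0, 24.0, 26.0]
weights = sampling_weights(labels, prevalence=0.10)
naive = sum(trait) / len(trait)
adjusted = weighted_mean(trait, weights)
```

With these toy numbers the unweighted mean treats cases as half the population, while the weighted mean pulls the estimate toward the controls, who make up 90% of the assumed population. The same weights could be passed to a weighted regression of the trait on SNP genotype.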

Replication of genome-wide association studies on asthma and allergic diseases in Korean adult population

  • Yoon, Dan-Kyu;Ban, Hyo-Jeong;Kim, Young-Jin;Kim, Eun-Jin;Kim, Hyung-Cheol;Han, Bok-Ghee;Park, Jung-Won;Hong, Soo-Jong;Cho, Sang-Heon;Park, Kie-Jung;Lee, Joo-Shil
    • BMB Reports / v.45 no.5 / pp.305-310 / 2012
  • Allergic diseases such as asthma, allergic rhinitis, and atopic dermatitis are heterogeneous diseases characterized by multiple symptoms and phenotypes. Recent advances in genetic research have enabled the identification of disease-associated genetic factors. Numerous genome-wide association studies (GWAS) have revealed multiple loci associated with allergic diseases. However, the majority of previous studies have been conducted in populations of European ancestry. Moreover, the associations of single nucleotide polymorphisms (SNPs) with allergic diseases have not been studied in the large-scale general Korean population. Herein, we performed a replication study in the Korean population to validate variants previously reported to be associated with allergic diseases. We defined three allergy-related phenotypes, one allergy phenotype and two asthma-related phenotypes, based on self-reports of physician diagnosis and symptoms from 8,842 samples. As a result, we found nominally significant associations of 6 SNPs with at least one allergy-related phenotype in the Korean population.

Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS

  • Kwon, Ji-Sun;Kim, Ji-Hye;Nam, Doug-U;Kim, Sang-Soo
    • Genomics & Informatics / v.10 no.2 / pp.123-127 / 2012
  • Gene set analysis (GSA) is useful for interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GSA-SNP and i-GSEA4GWAS, under the same inputs and parameter settings. GSA runs were made with two sets of p-values from a Korean type 2 diabetes mellitus GWAS: 259,188 and 1,152,947 SNPs from the original and imputed genotype datasets, respectively. When Gene Ontology terms were used as gene sets, i-GSEA4GWAS produced 283 and 1,070 hits for the unimputed and imputed datasets, respectively. In contrast, GSA-SNP reported 94 and 38 hits for the two datasets, respectively. Similar trends, though to a lesser degree, were observed with Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets. The very large number of hits by i-GSEA4GWAS for the imputed dataset was probably an artifact of the scaling step in the algorithm. The decrease in hits by GSA-SNP for the imputed dataset may be because it relies on Z-statistics, which are sensitive to variations in the background level of associations. Judicious evaluation of GSA outcomes, perhaps based on multiple programs, is recommended.

Genome-Wide Association Study between Copy Number Variation and Trans-Gene Expression by Protein-Protein Interaction-Network

  • Park, Chi-Hyun;Ahn, Jae-Gyoon;Yoon, Young-Mi;Park, Sang-Hyun
    • The KIPS Transactions:PartD / v.18D no.2 / pp.89-100 / 2011
  • Copy number variation (CNV), one of the structural variations in the human genome, is closely related to gene function. Genome-wide association studies of CNVs have mainly been conducted in individuals with genetic diseases, whereas few studies have inferred the genetic function of CNVs in normal individuals. In this paper, we propose an analysis method that reveals the functional relationship between common CNVs and genes without considering their genomic loci. To this end, we propose a data integration method for heterogeneous biological data and a novel measure of the correlation between common CNVs and genes. To verify the significance of the proposed method, we conducted several verification tests with the GO database. The results showed that the novel measure was significant compared with a random test and that the proposed method could systematically produce candidate genetic functions that correlate strongly with common CNVs.