• Title/Summary/Keyword: PLINK


A Scheme for Filtering SNPs Imputed in 8,842 Korean Individuals Based on the International HapMap Project Data

  • Lee, Ki-Chan;Kim, Sang-Soo
    • Genomics & Informatics / v.7 no.2 / pp.136-140 / 2009
  • Genome-wide association (GWA) studies may benefit from the inclusion of imputed SNPs into their dataset. Due to its predictive nature, the imputation process is typically not perfect. Thus, it would be desirable to develop a scheme for filtering out the imputed SNPs by maximizing the concordance with the observed genotypes. We report such a scheme, which is based on the combination of several parameters that are calculated by PLINK, a popular GWA analysis software program. We imputed the genotypes of 8,842 Korean individuals, based on approximately 2 million SNP genotypes of the CHB+JPT panel in the International HapMap Project Phase II data, complementing the 352k SNPs in the original Affymetrix 5.0 dataset. A total of 333,418 SNPs were found in both datasets, with a median concordance rate of 98.7%. The concordance rates were calculated at different ranges of parameters, such as the number of proxy SNPs (NPRX), the fraction of successfully imputed individuals (IMPUTED), and the information content (INFO). The poor concordance that was observed at the lower values of the parameters allowed us to develop an optimal combination of the cutoffs (IMPUTED $\geq$ 0.9 and INFO $\geq$ 0.9). A total of 1,026,596 SNPs passed the cutoff, of which 94,364 were found in both datasets and had 99.4% median concordance. This study illustrates a conservative scheme for filtering imputed SNPs that would be useful in GWA studies.
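The filtering scheme in this abstract can be illustrated with a minimal Python sketch. The cutoffs (IMPUTED and INFO both at least 0.9) come from the abstract; the record layout, the SNP identifiers, and the toy genotype calls are hypothetical, and the concordance function is only one plausible way to compare imputed and observed genotypes, not necessarily the authors' procedure.

```python
def passes_cutoff(snp, min_imputed=0.9, min_info=0.9):
    """Return True if a SNP record meets both cutoffs named in the abstract."""
    return snp["IMPUTED"] >= min_imputed and snp["INFO"] >= min_info


def concordance(imputed_calls, observed_calls):
    """Fraction of individuals whose imputed genotype equals the observed one.

    Individuals with a missing call (None) in either list are skipped.
    """
    pairs = [(i, o) for i, o in zip(imputed_calls, observed_calls)
             if i is not None and o is not None]
    if not pairs:
        return None
    return sum(i == o for i, o in pairs) / len(pairs)


# Hypothetical per-SNP summary records (not data from the study).
snps = [
    {"id": "snp1", "IMPUTED": 0.95, "INFO": 0.97},
    {"id": "snp2", "IMPUTED": 0.99, "INFO": 0.62},
    {"id": "snp3", "IMPUTED": 0.80, "INFO": 0.95},
    {"id": "snp4", "IMPUTED": 0.90, "INFO": 0.90},
]
kept = [s["id"] for s in snps if passes_cutoff(s)]

# Hypothetical genotype calls for one SNP across four individuals.
rate = concordance(["AA", "AG", "GG", None], ["AA", "AG", "AG", "GG"])
```

Only SNPs that meet both thresholds are kept, so a SNP with a high IMPUTED value but a low INFO value is still dropped.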

The Effect of Increasing Control-to-case Ratio on Statistical Power in a Simulated Case-control SNP Association Study

  • Kang, Moon-Su;Choi, Sun-Hee;Koh, In-Song
    • Genomics & Informatics / v.7 no.3 / pp.148-151 / 2009
  • Generally, a larger sample size leads to greater statistical power to detect a significant difference. We may increase the sample size for both cases and controls in order to obtain greater power. However, increasing the number of cases is often not feasible for a variety of reasons. To examine the change in power as the ratio of controls to cases varies (1:1 to 4:1), we conducted association tests with simulated data generated by PLINK. The simulated data consist of 50 disease SNPs and 300 non-disease SNPs, and we computed power for the disease SNPs. Genetic Power Calculator was used to compute power while varying the ratio of controls to cases (1:1, 2:1, 3:1, 4:1). In this study, we show that the gains in statistical power resulting from increasing the ratio of controls to cases are substantial for the simulated data. Similar results might be expected for real data.
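To make the control-to-case idea concrete, the following Python sketch approximates the power of a two-sided two-proportion z-test as the number of controls grows while the number of cases stays fixed. This is a standard normal approximation, not the method used by Genetic Power Calculator, and the allele frequencies and sample sizes are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p_case, p_control, n_case, n_control, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Uses the normal approximation and ignores the negligible
    contribution of the opposite rejection tail.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    # Pooled proportion gives the standard error under the null hypothesis.
    p_bar = (n_case * p_case + n_control * p_control) / (n_case + n_control)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n_case + 1 / n_control))
    # Unpooled standard error under the alternative hypothesis.
    se_alt = sqrt(p_case * (1 - p_case) / n_case
                  + p_control * (1 - p_control) / n_control)
    delta = abs(p_case - p_control)
    return nd.cdf((delta - z_alpha * se_null) / se_alt)


# Hypothetical setting: 500 cases, risk-allele frequency 0.30 vs 0.25.
n_case = 500
powers = [two_proportion_power(0.30, 0.25, n_case, ratio * n_case)
          for ratio in (1, 2, 3, 4)]

for ratio, power in zip((1, 2, 3, 4), powers):
    print(f"controls:cases = {ratio}:1  approximate power = {power:.3f}")
```

Under these assumed inputs, the approximate power rises with each additional multiple of controls.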

Genome-wide association study on immune-response for improving healthiness in Holstein dairy cattle (Korean title: Immune responsiveness to respiratory disease vaccines and genome-wide association analysis in Holstein dairy cattle)

  • Ha, Seungmin;Lee, Donghui;Lee, Sangmyeong;Chae, Jungil;Seo, Kangseok
    • Korean Journal of Veterinary Service / v.42 no.4 / pp.217-225 / 2019
  • To detect single nucleotide polymorphism (SNP) markers associated with the bovine viral diarrhea virus (BVDV) and bovine respiratory syncytial virus (BRSV) S/P ratio in Korean Holstein dairy cattle, a genome-wide association study (GWAS) was performed using the Illumina BovineSNP50 BeadChip. The numbers of phenotype and genotype records were 107 and 294, respectively. Phenotype data were collected at four time points relative to vaccination (0, 1, 4, and 24 weeks, with week 0 being the unvaccinated period). A total of 36,257 SNPs remained after quality control with PLINK. The GWAS identified 6 SNP markers (BTB-01704243, BTB-01594395, ARS-BFGL-NGS-118070, ARS-BFGL-NGS-111365, BTA-65410-no-rs, Hapmap38331-BTA-61256) for BVDV and 4 SNP markers (ARS-BFGL-NGS-109861, Hapmap53701-rs29017064, ARS-BFGL-NGS-71055, BTA-11232-no-rs) for BRSV. In addition, 10 candidate genes were found through the 10 SNP markers (TBX18, CEP162, PAFAH1B1, METTL16, BRCA1, RND2, POLK, ENSBTAG00000051724, ADAM18, NRG3).

Effects of Single Nucleotide Polymorphism Marker Density on Haplotype Block Partition

  • Kim, Sun Ah;Yoo, Yun Joo
    • Genomics & Informatics / v.14 no.4 / pp.196-204 / 2016
  • Many researchers have found that one of the most important characteristics of the structure of linkage disequilibrium is that the human genome can be divided into non-overlapping block partitions in which only a small number of haplotypes are observed. The location and distribution of haplotype blocks can be seen as a population property influenced by population genetic events such as selection, mutation, recombination and population structure. In this study, we investigate the effects of the density of markers relative to the full set of all polymorphisms in the region on the results of haplotype partitioning for five popular haplotype block partition methods: three methods in Haploview (confidence interval, four gamete test, and solid spine), MIG++ implemented in PLINK 1.9, and S-MIG++. We used several experimental datasets obtained by sampling subsets of single nucleotide polymorphism (SNP) markers of a chromosome 22 region in the 1000 Genomes Project data and also the HapMap phase 3 data to compare the haplotype block partitions produced by the five methods. As the sampling ratio decreases down to 20% of the original SNP markers, the total number of haplotype blocks decreases and the length of haplotype blocks increases for all algorithms. When we examined the marker-independence of the haplotype block locations constructed from datasets of different density, the results using fewer than 50% of the SNP markers were very different from the results using the entire set of SNP markers. We conclude that haplotype block construction results should be used and interpreted carefully, depending on the selection of markers and the purpose of the study.
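One of the methods named in this abstract, the four gamete test, can be sketched in a few lines of Python. This is a simplified illustration of the idea (a pair of biallelic SNPs showing all four two-locus haplotypes is taken as evidence of recombination), combined with a greedy left-to-right partition. It is not Haploview's implementation, which may, for example, apply a frequency threshold before counting a gamete as observed. The haplotype strings are hypothetical.

```python
def gametes(haplotypes, i, j):
    """Set of distinct two-locus haplotypes observed at SNP positions i and j."""
    return {(h[i], h[j]) for h in haplotypes}


def has_four_gametes(haplotypes, i, j):
    """True if all four allele combinations are seen for the SNP pair (i, j)."""
    return len(gametes(haplotypes, i, j)) == 4


def greedy_blocks(haplotypes):
    """Partition SNPs into consecutive blocks with no four-gamete pair inside.

    Returns a list of (first_index, last_index) tuples, both inclusive.
    """
    n_snps = len(haplotypes[0])
    blocks, start = [], 0
    for k in range(1, n_snps):
        # Close the current block if SNP k recombines with any SNP in it.
        if any(has_four_gametes(haplotypes, i, k) for i in range(start, k)):
            blocks.append((start, k - 1))
            start = k
    blocks.append((start, n_snps - 1))
    return blocks


# Hypothetical phased haplotypes over four biallelic SNPs (alleles "0"/"1").
haps = ["0000", "1100", "0011", "1111"]
blocks = greedy_blocks(haps)
```

In this toy data, SNPs 0 and 1 always travel together, as do SNPs 2 and 3, while every combination occurs across the two pairs, so the partition splits between SNP 1 and SNP 2.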

Genome-wide Association Study (GWAS) and Its Application for Improving the Genomic Estimated Breeding Values (GEBV) of the Berkshire Pork Quality Traits

  • Lee, Young-Sup;Jeong, Hyeonsoo;Taye, Mengistie;Kim, Hyeon Jeong;Ka, Sojeong;Ryu, Youn-Chul;Cho, Seoae
    • Asian-Australasian Journal of Animal Sciences / v.28 no.11 / pp.1551-1557 / 2015
  • Missing heritability has been a major problem in analyses based on best linear unbiased prediction (BLUP). We introduced the traditional genome-wide association study (GWAS) into BLUP to improve heritability estimation. We analyzed eight pork quality traits of the Berkshire breed using GWAS and BLUP. GWAS detects the putative quantitative trait loci regions for given traits. The single nucleotide polymorphisms (SNPs) were selected from the GWAS results with a p value < 0.01. BLUP with the significant SNPs was much more accurate than BLUP with all genotyped SNPs in terms of narrow-sense heritability. This implies that genomic estimated breeding values (GEBVs) of pork quality traits can be calculated by BLUP via GWAS. The GWAS model was linear regression using PLINK, and the BLUP models were G-BLUP and SNP-GBLUP. SNP-GBLUP uses an SNP-SNP relationship matrix. BLUP analysis with GWAS preprocessing can be one possible approach to the missing heritability problem, and it can provide an alternative BLUP method that finds more accurate GEBVs.

A Pilot Genome-wide Association Study of Breast Cancer Susceptibility Loci in Indonesia

  • Haryono, Samuel J;Datasena, I Gusti Bagus;Santosa, Wahyu Budi;Mulyarahardja, Raymond;Sari, Kartika
    • Asian Pacific Journal of Cancer Prevention / v.16 no.6 / pp.2231-2235 / 2015
  • Genome-wide association studies (GWASs) of the entire genome provide a systematic approach for revealing novel genetic susceptibility loci for breast cancer. However, genetic association studies have hitherto been primarily conducted in women of European ancestry. Therefore, we performed a pilot GWAS with a single nucleotide polymorphism (SNP) array 5.0 platform from Affymetrix® that contains 443,813 SNPs to search for new genetic risk factors in 89 breast cancer cases and 46 healthy women of Indonesian ancestry. The case-control association of the GWAS finding set was evaluated using PLINK. The strengths of allelic and genotypic associations were assessed using logistic regression analysis and reported as odds ratios (ORs) and P values; P values less than $1.00{\times}10^{-8}$ and $5.00{\times}10^{-5}$ were required for significant association and suggestive association, respectively. After analyzing 292,887 SNPs, we recognized 11 chromosome loci that possessed suggestive associations with breast cancer risk. Of these, however, there were only four chromosome loci with identified genes: chromosome 2p.12 with the CTNNA2 gene [odds ratio (OR)=1.20, 95% confidence interval (CI)=1.13-1.33, $P=1.08{\times}10^{-7}$]; chromosome 18p11.2 with the SOGA2 gene (OR=1.32, 95%CI=1.17-1.44, $P=6.88{\times}10^{-6}$); chromosome 5q14.1 with the SSBP2 gene (OR=1.22, 95%CI=1.11-1.34, $P=4.00{\times}10^{-5}$); and chromosome 9q31.1 with the TEX10 gene (OR=1.24, 95%CI=1.12-1.35, $P=4.68{\times}10^{-5}$). This study identified 11 chromosome loci which exhibited suggestive associations with the risk of breast cancer among Indonesian women.
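The two-tier threshold rule stated in this abstract can be written out as a short Python sketch. The thresholds and the four P values are taken from the abstract; the function name and the label for results above both thresholds are illustrative choices.

```python
SIGNIFICANT = 1.00e-8   # threshold for significant association (from the abstract)
SUGGESTIVE = 5.00e-5    # threshold for suggestive association (from the abstract)


def classify(p_value):
    """Label a P value using the two thresholds given in the abstract."""
    if p_value < SIGNIFICANT:
        return "significant"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "below threshold"  # illustrative label, not from the abstract


# P values reported in the abstract for the four loci with identified genes.
reported = {
    "CTNNA2": 1.08e-7,
    "SOGA2": 6.88e-6,
    "SSBP2": 4.00e-5,
    "TEX10": 4.68e-5,
}
labels = {gene: classify(p) for gene, p in reported.items()}
```

All four reported P values fall between the two thresholds, which matches the abstract's description of these loci as suggestive rather than significant.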

No Association between Single Nucleotide Polymorphisms in Urocanase Domain Containing 1 (UROC1) and Autism Spectrum Disorders (ASDs) in the Korean Population (Korean title: Association analysis of the UROC1 gene with autism spectrum disorders in Koreans)

  • Park, Jung-Won;Ro, Myung-Ja;Nam, Min;Bang, Hee-Jung;Yang, Jae-Won;Choi, Kyung-Sik;Kim, Su-Kang;Chung, Joo-Ho;Kwack, Kyu-Bum
    • Journal of the Korean Academy of Child and Adolescent Psychiatry / v.23 no.1 / pp.8-13 / 2012
  • Objectives: Urocanase domain containing 1 (UROC1) has not been examined in prior studies on autism spectrum disorders (ASDs). UROC1 causes urocanic aciduria, one of the symptoms of which is mental retardation. The aim of this study was to investigate the association between the UROC1 gene and ASDs in a Korean population. Methods: A total of 258 controls and 214 patients with ASD were included as subjects of this study. SNPs selected from UROC1 were genotyped using the Illumina GoldenGate Genotyping assay with VeraCode® technology. Statistical analysis was performed using SAS and PLINK software. Results: We found no association of the 12 SNPs in the UROC1 gene with ASDs in a Korean population. Conclusion: Our study suggests that the 12 SNPs (11 SNPs and 1 SNP in the intron and 3'UTR region, respectively) in UROC1 were not associated with ASDs in a Korean population. Further study on the exon region of UROC1 is needed.

Genome-Wide Association Studies Associated with Backfat Thickness in Landrace and Yorkshire Pigs

  • Lee, Young-Sup;Shin, Donghyun
    • Genomics & Informatics / v.16 no.3 / pp.59-64 / 2018
  • Although pork quality traits are commercially important, genome-wide association studies (GWASs) worldwide have not adequately considered Landrace and Yorkshire pigs. Landrace and Yorkshire pigs are important pork-providing breeds. Although quantitative trait loci of pigs are well characterized, significant genes in GWASs of pigs in Korea remain to be studied. We therefore studied the significant genes in Korean pigs through a GWAS using the PLINK program. We conducted a GWAS and surveyed the gene ontology (GO) terms associated with the backfat thickness (BF) trait of these pigs. We included the breed information (Yorkshire and Landrace pigs) as a covariate. The significant genes after false discovery rate (<0.01) correction were AFG1L, SCAI, RIMS1, and SPDEF. The major GO terms for the top 5% of genes were related to neuronal genes, cell morphogenesis and actin cytoskeleton organization. The neuronal genes were previously reported as being associated with backfat thickness. However, the genes in our results were novel, and they included ZNF280D, BAIAP2, LRTM2, GABRA5, PCDH15, HERC1, DTNBP1, SLIT2, TRAPPC9, NGFR, APBB2, RBPJ, and ABL2. These novel genes might have roles in important cellular and physiological functions related to BF accumulation. The genes related to cell morphogenesis were NOX4, MKLN1, ZNF280D, BAIAP2, DNAAF1, LRTM2, PCDH15, NGFR, RBPJ, MYH9, APBB2, DTNBP1, TRIM62, and SLIT2. The genes that belonged to actin cytoskeleton organization were MKLN1, BAIAP2, PCDH15, BCAS3, MYH9, DTNBP1, ABL2, ADD2, and SLIT2.
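The false discovery rate step mentioned in this abstract can be illustrated with a short Python sketch. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption. The 0.01 cutoff is from the abstract, while the marker names and raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical markers and raw p-values (not results from the study).
markers = ["marker_a", "marker_b", "marker_c", "marker_d"]
raw_p = [0.0001, 0.004, 0.02, 0.5]

adjusted_p = benjamini_hochberg(raw_p)
FDR_CUTOFF = 0.01  # cutoff stated in the abstract
kept_markers = [name for name, q in zip(markers, adjusted_p) if q < FDR_CUTOFF]
```

Each raw p-value is scaled by the number of tests divided by its rank, so weaker signals are penalized more heavily than under an unadjusted cutoff.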

Interaction between Smoking and the STAB2 Gene in the Severity of Rheumatoid Arthritis

  • Min, Jin-Young;Min, Kyoung-Bok;Sung, Joo-Hon;Cho, Sung-Il
    • Genomics & Informatics / v.7 no.1 / pp.20-25 / 2009
  • Rheumatoid arthritis (RA) is a chronic autoimmune disorder that is characterized by inflammation of the synovial tissue and deterioration of the joint and bone. A recent study reported a potential gene-environment interaction between HLA-DR and smoking. The present study investigated whether a specific gene was related to the association between smoking and the severity of RA (rheumatoid factor levels > 20 IU/ml). We used the resources of the NARAC family collection of GAW 15 databases, and 1139 subjects with RF>20 IU/ml were included in the current analysis. The linkage panel contained 5858 SNP markers, and 5744 SNPs passed quality control criteria. Linear regression analyses, using PLINK software and generalized estimating equation regression models, were used to test for associations between the SNPs and the severity of RA according to smoking groups. Two major findings were established. First, the severity of RA in smokers was associated with rs703618 (p=$6{\times}10^{-5}$), which lies in the intronic region of the stabilin 2 (STAB2) gene on chromosome 12. Second, there were significant differences in the levels of RF between 'ever smokers' and 'never smokers' according to the rs703618 genotype (G/G, A/G, A/A). We investigated whether a specific gene acts as a mediator between smoking and the severity of RA and found that the STAB2 gene could affect this relationship. Our finding indicates that smoking may mediate RA severity by affecting the expression level of a specific gene.

Genomic diversity and admixture patterns among six Chinese indigenous cattle breeds in Yunnan

  • Li, Rong;Li, Chunqing;Chen, Hongyu;Liu, Xuehong;Xiao, Heng;Chen, Shanyuan
    • Asian-Australasian Journal of Animal Sciences / v.32 no.8 / pp.1069-1076 / 2019
  • Objective: Yunnan is not only a frontier zone that connects China with South and Southeast Asia, but also represents an admixture zone between taurine (Bos taurus) and zebu (Bos indicus) cattle. The purpose of this study is to understand the level of genomic diversity and the extent of admixture in each Yunnan native cattle breed. Methods: All 120 individuals were genotyped using the Illumina BovineHD BeadChip (777,962 single nucleotide polymorphisms [SNPs]). Quality control and genomic diversity indexes were calculated using PLINK software. Principal component analysis (PCA) was performed using the SMARTPCA program implemented in EIGENSOFT software. The ADMIXTURE software was used to reveal admixture patterns among breeds. Results: A total of 604,630 SNPs were obtained after quality control procedures. Among the six breeds, the highest level of mean heterozygosity was found in Zhaotong cattle from Northeastern Yunnan, whereas the lowest level of heterozygosity was detected in Dehong humped cattle from Western Yunnan. The PCA based on a pruned dataset of 233,788 SNPs clearly separated Dehong humped cattle (supposed to be a pure zebu breed) from the other five breeds. The admixture analysis further revealed two clusters (K = 2 with the lowest cross validation error), corresponding to taurine and zebu cattle lineages. All six breeds except for Dehong humped cattle showed different degrees of admixture between taurine and zebu cattle. As expected, Dehong humped cattle showed no signature of taurine cattle influence. Conclusion: Overall, considerable genomic diversity was found in the six Yunnan native cattle breeds except for Dehong humped cattle from Western Yunnan. Dehong humped cattle is a pure zebu breed, while the other five breeds had admixed origins with different extents of admixture between taurine and zebu cattle. Such admixture by crossbreeding between zebu and taurine cattle facilitated the spread of zebu cattle from tropical and subtropical regions to other highland regions in Yunnan.
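The mean heterozygosity comparison in this abstract can be illustrated with a minimal Python sketch. It assumes genotypes are coded as the count of one allele (0, 1, or 2, with None for a missing call) and computes observed heterozygosity per individual, then averages within a group. This is one common definition and may differ from the diversity indexes the authors computed with PLINK. The group names and genotypes are hypothetical.

```python
def observed_heterozygosity(genotypes):
    """Fraction of called genotypes that are heterozygous (coded as 1)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return sum(g == 1 for g in called) / len(called)


def mean_heterozygosity(individuals):
    """Average observed heterozygosity over individuals with called genotypes."""
    values = [observed_heterozygosity(g) for g in individuals]
    values = [v for v in values if v is not None]
    return sum(values) / len(values)


# Hypothetical genotype data: one list of allele counts per individual.
groups = {
    "group_A": [[0, 1, 1, 2], [1, 1, 0, None]],
    "group_B": [[0, 0, 2, 2], [0, 1, 2, 2]],
}
het_by_group = {name: mean_heterozygosity(inds) for name, inds in groups.items()}
```

Missing calls are dropped from the denominator, so an individual with fewer called genotypes is not treated as less heterozygous.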