The Role and Applications of Linear Mixed Models: A Focus on Genetic Epidemiological Analysis

Linear Mixed Models in Genetic Epidemiological Studies and Applications

  • Received : 2015.03.23
  • Reviewed : 2015.03.30
  • Published : 2015.04.30


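Over the past several decades, advances in genotyping technology have lowered the cost of obtaining individual-level genetic information, and many genetic epidemiological studies have been conducted to identify the causal genes of various human diseases. For example, genome-wide association studies have identified thousands of causal genes for hundreds of phenotypes. This flood of genomic data has prompted the development of a variety of algorithms for analyzing large-scale genomic data. Linear mixed models in particular are used widely in genetic epidemiological studies, from heritability estimation to association studies. In this paper, we list applications of the linear mixed models that are frequently used in genetic epidemiological studies and discuss the biological meaning of the estimates from each analysis model.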

Genotyping technology has improved substantially and its cost has dropped, enabling genetic epidemiological studies with large-scale genetic data. Genome-wide association studies have identified more than ten thousand causal variants. Many statistical methods based on linear mixed models have been developed for goals such as estimating heritability and identifying disease susceptibility loci. Empirical results also repeatedly stress the importance of linear mixed models. Therefore, we review the statistical methods related to linear mixed models and illustrate the meaning of their estimates.


