Recent advances in Bayesian inference of isolation-with-migration models

  • Chung, Yujin (Department of Applied Statistics, Kyonggi University)
  • Received: 2019.09.24
  • Accepted: 2019.10.23
  • Published: 2019.12.31

Abstract

Isolation-with-migration (IM) models have become popular for explaining population divergence in the presence of migration. Bayesian methods are commonly used to estimate IM models, but they have been limited to small datasets or simple models. Recently, three methods, IMa3, MIST, and AIM, have resolved these limitations. Here, we describe the major problems addressed by these three software packages and compare their inference methods, which differ even though all three use the same standard likelihood function.
