• Kum, Sangho (Department of Mathematics Education, Chungbuk National University)
  • Yun, Sangwoon (Department of Mathematics Education, Sungkyunkwan University)
  • Received : 2018.07.31
  • Accepted : 2018.08.29
  • Published : 2019.07.01


We are concerned with optimization methods for the $L^2$-Wasserstein least squares problem of Gaussian measures (alternatively, the $n$-coupling problem). Based on its equivalent form on the convex cone of positive definite matrices of fixed size and on the strict convexity of the variance function, we present an implementable (accelerated) gradient method for finding the unique minimizer. A global convergence rate analysis is provided, based on a derived upper bound on the Lipschitz constant of the gradient function.
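To make the matrix formulation concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the standard closed form of the squared $L^2$-Wasserstein distance between centered Gaussians with covariances $X$ and $A$, namely $\operatorname{tr} X + \operatorname{tr} A - 2\operatorname{tr}(A^{1/2} X A^{1/2})^{1/2}$, and minimizes the weighted sum of such distances by a plain projected gradient iteration. The box constraint on the eigenvalues (suggested by the parameters eiglb and eigub in Figure 1), the fixed step size, and all function names are illustrative assumptions.

```python
import numpy as np


def _eig_fun(M, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh((M + M.T) / 2)
    return (vecs * fun(vals)) @ vecs.T


def objective(X, mats, weights):
    """Weighted sum of squared Wasserstein distances from X to each A_i."""
    total = 0.0
    for A, w in zip(mats, weights):
        R = _eig_fun(A, np.sqrt)  # A^{1/2}
        cross = np.trace(_eig_fun(R @ X @ R, lambda v: np.sqrt(np.clip(v, 0, None))))
        total += w * (np.trace(X) + np.trace(A) - 2.0 * cross)
    return total


def gradient(X, mats, weights):
    """Euclidean gradient: sum_i w_i (I - A_i^{1/2} (A_i^{1/2} X A_i^{1/2})^{-1/2} A_i^{1/2})."""
    n = X.shape[0]
    G = np.zeros((n, n))
    for A, w in zip(mats, weights):
        R = _eig_fun(A, np.sqrt)
        T = R @ _eig_fun(R @ X @ R, lambda v: 1.0 / np.sqrt(v)) @ R
        G += w * (np.eye(n) - T)
    return G


def project(X, lo, hi):
    """Project onto {X symmetric : lo*I <= X <= hi*I} by clipping eigenvalues."""
    return _eig_fun(X, lambda v: np.clip(v, lo, hi))


def projected_gradient(mats, weights, lo, hi, step=1.0, iters=300):
    """Fixed-step projected gradient (step size is an assumption, not a derived bound)."""
    X = project(sum(w * A for A, w in zip(mats, weights)), lo, hi)
    for _ in range(iters):
        X = project(X - step * gradient(X, mats, weights), lo, hi)
    return X


# Commuting (diagonal) example, where the minimizer is known in closed form.
mats = [np.diag([1.0, 4.0]), np.diag([4.0, 1.0])]
weights = [0.5, 0.5]
X = projected_gradient(mats, weights, lo=1.0, hi=4.0)
```

For commuting covariances the minimizer is $(\sum_i w_i A_i^{1/2})^2$, so the returned `X` should be close to $2.25\,I$ in this example. In practice the step size would be chosen from a Lipschitz bound on the gradient over the feasible set rather than fixed by hand.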


$L^2$-Wasserstein least squares problem; $n$-coupling problem; Gaussian measure; positive definite matrix; Nesterov-Todd scaling; gradient projection method

DBSHBB_2019_v56n4_1001_f0001.png (image)

FIGURE 1. (a) Objective value versus iteration with eiglb = 1, eigub = 99. (b) Objective value versus iteration with eiglb = 0.1, eigub = 99.9. The estimated Lipschitz constant in (b) is more than 10 times larger than that in (a). We observe that the smaller the estimated Lipschitz constant, the larger the gap between the objective values of AGPM and GPM-A.

TABLE 1. Test results of the final objective values and CPU time in seconds for three methods GPM-A, GPM-C, and AGPM on 5 random data sets.

DBSHBB_2019_v56n4_1001_t0001.png (image)


Supported by: NRF

