Coordinate Descent Without Coordinates: Tangent Subspace Descent on Riemannian Manifolds

Published Online:https://doi.org/10.1287/moor.2022.1253

References

  • [1] Absil PA, Mahony R, Sepulchre R (2007) Optimization Algorithms on Matrix Manifolds (Princeton University Press, Princeton, NJ).Google Scholar
  • [2] Beck A, Tetruashvili L (2013) On the convergence of block coordinate descent type methods. SIAM J. Optim. 23(4):2037–2060.CrossrefGoogle Scholar
  • [3] Bjørstad PE, Mandel J (1991) On the spectra of sums of orthogonal projections with applications to parallel computing. BIT 31(1):76–88.CrossrefGoogle Scholar
  • [4] Chen S, Ma S, Man-Cho So A, Zhang T (2020) Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM J. Optim. 30(1):210–239.Google Scholar
  • [5] Deutsch F (1995) The angle between subspaces of a Hilbert space. Approximation Theory, Wavelets and Applications (Springer, Berlin), 107–130.CrossrefGoogle Scholar
  • [6] Edelman A, Arias TA, Smith ST (1998) The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2):303–353.CrossrefGoogle Scholar
  • [7] Ferreira OP, Oliveira PR (1998) Subgradient algorithm on Riemannian manifolds. J. Optim. Theory Appl. 97(1):93–104.CrossrefGoogle Scholar
  • [8] Ferreira OP, Louzeiro MS, Prudente LF (2019) Gradient method for optimization on Riemannian manifolds with lower bounded curvature. SIAM J. Optim. 29(4):2517–2541.CrossrefGoogle Scholar
  • [9] Frongillo R, Reid MD (2015) Convergence analysis of prediction markets via randomized subspace descent. Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R, eds. Advances in Neural Information Processing Systems (Curran Associates), 3034–3042.Google Scholar
  • [10] Gao B, Liu X, Chen X, Yuan YX (2018) A new first-order algorithmic framework for optimization problems with orthogonality constraints. SIAM J. Optim. 28(1):302–332.CrossrefGoogle Scholar
  • [11] Golub GH, Van Loan CF (2013) Matrix Computations, 4th ed. (Johns Hopkins University Press, Baltimore, MD).CrossrefGoogle Scholar
  • [12] Huang W, Wei K (2022) Riemannian proximal gradient methods. Math. Programming Forthcoming.CrossrefGoogle Scholar
  • [13] James AT (1954) Normal multivariate analysis and the orthogonal group. Ann. Math. Statist. 25(1):40–75.CrossrefGoogle Scholar
  • [14] Journée M, Absil PA, Sepulchre R (2007) Optimization on the orthogonal group for independent component analysis. Davies ME, James CJ, Abdallah SA, Plumbley MD, eds. Independent Component Analysis and Signal Separation (Springer, Berlin), 57–64.CrossrefGoogle Scholar
  • [15] Journée M, Nesterov Y, Richtárik P, Sepulchre R (2010) Generalized power method for sparse principal component analysis. J. Machine Learn. Res. 11(15):517–553.Google Scholar
  • [16] Learned-Miller EG, Fisher JW III (2003) ICA using spacings estimates of entropy. J. Machine Learn. Res. 4:1271–1295.Google Scholar
  • [17] Lee JM (2012) Introduction to Smooth Manifolds, vol. 218. Graduate Texts in Mathematics, 2nd ed. (Springer-Verlag New York).CrossrefGoogle Scholar
  • [18] Lee JM (2018) Introduction to Riemannian Manifolds, vol. 176. Graduate Texts in Mathematics, 2nd ed. (Springer International Publishing).CrossrefGoogle Scholar
  • [19] Leventhal D, Lewis AS (2010) Randomized methods for linear constraints: Convergence rates and conditioning. Math. Oper. Res. 35(3):641–654.LinkGoogle Scholar
  • [20] Lewis AS, Malick J (2008) Alternating projections on manifolds. Math. Oper. Res. 33(1):216–234.LinkGoogle Scholar
  • [21] Liu C, Boumal N (2020) Simple algorithms for optimization on Riemannian manifolds with constraints. Appl. Math. Optim. 82:949–981.CrossrefGoogle Scholar
  • [22] Liu Y, Shang F, Cheng J, Cheng H, Jiao L (2017) Accelerated first-order methods for geodesically convex optimization on Riemannian manifolds. Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems (Curran Associates), 4868–4877.CrossrefGoogle Scholar
  • [23] Nesterov Y (2012) Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2):341–362.CrossrefGoogle Scholar
  • [24] Richtárik P, Takáč M (2014) Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Programming 144:1–38.CrossrefGoogle Scholar
  • [25] Sato H, Kasai H, Mishra B (2019) Riemannian stochastic variance reduced gradient algorithm with retraction and vector transport. SIAM J. Optim. 29(2):1444–1472.CrossrefGoogle Scholar
  • [26] Shalit U, Chechik G (2014) Coordinate-descent for learning orthogonal matrices through Givens rotations. Xing EP, Jebara T, eds. Proc. 31st Internat. Conf. on Machine Learn., vol. 32 (PMLR), 548–556.Google Scholar
  • [27] Vishnoi NK (2018) Geodesic convex optimization: Differentiation on manifolds, geodesics, and convexity. Preprint, submitted June 17, https://arxiv.org/abs/1806.06373.Google Scholar
  • [28] Xu Y (2015) Alternating proximal gradient method for sparse nonnegative Tucker decomposition. Math. Programming Comput. 7(1):39–70.CrossrefGoogle Scholar
  • [29] Yang C, Meza JC, Wang LW (2007) A trust region direct constrained minimization algorithm for the Kohn-Sham equation. SIAM J. Sci. Comput. 29(5):1854–1875.CrossrefGoogle Scholar
  • [30] Zhang H, Sra S (2018) An estimate sequence for geodesically convex optimization. Bubeck S, Perchet V, Rigollet P, eds. Proc. 31st Conf. on Learn. Theory, vol. 75 (PMLR), 1703–1723.Google Scholar
  • [31] Zhang J, Ma S, Zhang S (2019) Primal-dual optimization algorithms over Riemannian manifolds: An iteration complexity analysis. Math. Programming 184:445–490.CrossrefGoogle Scholar
  • [32] Zhang HJ, Reddi S, Sra S (2016) Riemannian SVRG: Fast stochastic optimization on Riemannian manifolds. Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R, eds. Advances in Neural Information Processing Systems (Curran Associates), 4592–4600.Google Scholar
  • [33] Zohrizadeh F, Kheirandishfard M, Kamangar F, Madani R (2019) Non-smooth optimization over Stiefel manifolds with applications to dimensionality reduction and graph clustering. Proc. 28th Internat. Joint Conf. on Artificial Intelligence (AAAI Press), 1319–1326.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.