Dissolving Constraints for Riemannian Optimization

Published Online:https://doi.org/10.1287/moor.2023.1360

References

  • [1] Abrudan T, Eriksson J, Koivunen V (2009) Conjugate gradient algorithm for optimization under unitary matrix constraint. Signal Processing 89(9):1704–1714.CrossrefGoogle Scholar
  • [2] Abrudan TE, Eriksson J, Koivunen V (2008) Steepest descent algorithms for optimization under unitary matrix constraint. IEEE Trans. Signal Processing 56(3):1134–1147.CrossrefGoogle Scholar
  • [3] Absil PA, Baker CG, Gallivan KA (2007) Trust-region methods on Riemannian manifolds. Foundations Comput. Math. 7(3):303–330.CrossrefGoogle Scholar
  • [4] Absil PA, Mahony R, Sepulchre R (2009) Optimization Algorithms on Matrix Manifolds (Princeton University Press, Princeton, NJ).Google Scholar
  • [5] Bai Z, Li RC (2014) Minimization principles and computation for the generalized linear response eigenvalue problem. BIT Numerical Math. 54(1):31–54.CrossrefGoogle Scholar
  • [6] Bécigneul G, Ganea OE (2018) Riemannian adaptive optimization methods. Preprint, submitted October 1, https://arxiv.org/abs/1810.00760.Google Scholar
  • [7] Bolte J, Sabach S, Teboulle M (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1):459–494.CrossrefGoogle Scholar
  • [8] Boumal N (2020) An introduction to optimization on smooth manifolds. Accessed February 24, 2023, https://www.nicolasboumal.net/book/.Google Scholar
  • [9] Boumal N, Absil PA (2015) Low-rank matrix completion via preconditioned optimization on the grassmann manifold. Linear Algebra Appl. 475:200–239.CrossrefGoogle Scholar
  • [10] Boumal N, Mishra B, Absil PA, Sepulchre R (2014) Manopt, a matlab toolbox for optimization on manifolds. J. Machine Learn. Res. 15(1):1455–1459.Google Scholar
  • [11] Bradbury J, Frostig R, Hawkins P, Johnson MJ, Leary C, Maclaurin D, Necula G, et al. (2018) JAX: Composable transformations of Python+NumPy programs. Accessed, http://github.com/google/jax.Google Scholar
  • [12] Byrd RH, Lu P, Nocedal J, Zhu C (1995) A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 16(5):1190–1208.CrossrefGoogle Scholar
  • [13] Cartis C, Gould NI, Toint PL (2012) An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity. IMA J. Numerical Anal. 32(4):1662–1695.CrossrefGoogle Scholar
  • [14] Cartis C, Gould NI, Toint PL (2012) Complexity bounds for second-order optimality in unconstrained optimization. J. Complexity 28(1):93–108.CrossrefGoogle Scholar
  • [15] Cartis C, Gould NI, Toint PL (2018) Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization. Proc. Internat. Congress Mathematicians (World Scientific), 3711–3750.Google Scholar
  • [16] Conn AR, Gould NI, Toint PL (2000) Trust Region Methods (SIAM, Philadelphia).CrossrefGoogle Scholar
  • [17] Criscitiello C, Boumal N (2019) Efficiently escaping saddle points on manifolds. Preprint, submitted October 23, https://arxiv.org/abs/1906.04321.Google Scholar
  • [18] Criscitiello C, Boumal N (2022) An accelerated first-order method for non-convex optimization on manifolds. Foundations Comput. Math., https://doi.org/10.1007/s10208-022-09573-9.CrossrefGoogle Scholar
  • [19] Di Pillo G, Grippo L (1986) An exact penalty function method with global convergence properties for nonlinear programming problems. Math. Programming 36(1):1–18.CrossrefGoogle Scholar
  • [20] Edelman A, Arias TA, Smith ST (1998) The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2):303–353.CrossrefGoogle Scholar
  • [21] Estrin R, Friedlander MP, Orban D, Saunders MA (2020) Implementing a smooth exact penalty function for equality-constrained nonlinear optimization. SIAM J. Sci. Comput. 42(3):A1809–A1835.CrossrefGoogle Scholar
  • [22] Fletcher R (1970) A class of methods for nonlinear programming with termination and convergence properties. Abadie J, ed. Integer and Nonlinear Programming (North-Holland, Amsterdam), 157–173.Google Scholar
  • [23] Fletcher R, Leyffer S (2002) Nonlinear programming without a penalty function. Math. Programming 91(2):239–269.CrossrefGoogle Scholar
  • [24] Gao B, Liu X, Yuan Y-x (2019) Parallelizable algorithms for optimization problems with orthogonality constraints. SIAM J. Sci. Comput. 41(3):A1949–A1983.CrossrefGoogle Scholar
  • [25] Gao B, Son NT, Absil PA, Stykel T (2021) Riemannian optimization on the symplectic Stiefel manifold. SIAM J. Optim. 31(2):1546–1575.CrossrefGoogle Scholar
  • [26] Ge R, Huang F, Jin C, Yuan Y (2015) Escaping from saddle points–online stochastic gradient for tensor decomposition. Conf. Learn. Theory (PMLR), 797–842.Google Scholar
  • [27] Golub GH, Van Loan CF (2013) Matrix Computations (JHU Press, Baltimore).CrossrefGoogle Scholar
  • [28] Gould NI, Lucidi S, Roma M, Toint PL (1999) Solving the trust-region subproblem using the Lanczos method. SIAM J. Optim. 9(2):504–525.CrossrefGoogle Scholar
  • [29] Grippo L, Lampariello F, Lucidi S (1986) A nonmonotone line search technique for Newton’s method. SIAM J. Numerical Anal. 23(4):707–716.CrossrefGoogle Scholar
  • [30] Harris CR, Millman KJ, van der Walt SJ, Gommers R, Virtanen P, Cournapeau D, Wieser E, et al. (2020) Array programming with NumPy. Nature 585(7825):357–362.CrossrefGoogle Scholar
  • [31] Hestenes MR, Stiefel E (1952) Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bureau Standards 49(6):409–436.CrossrefGoogle Scholar
  • [32] Hosseini S (2015) Convergence of nonsmooth descent methods via Kurdyka–Lojasiewicz inequality on Riemannian manifolds. 2015 INS Technical report No. 1523, Hausdorff Center for Mathematics and Institute for Numerical Simulation, University of Bonn, Bonn, Germany.Google Scholar
  • [33] Hu J, Liu X, Wen ZW, Yuan YX (2020) A brief introduction to manifold optimization. J. Oper. Res. Soc. China 8(2):199–248.CrossrefGoogle Scholar
  • [34] Hu X, Liu X (2020) An efficient orthonormalization-free approach for sparse dictionary learning and dual principal component pursuit. Sensors (Basel) 20(11):3041.CrossrefGoogle Scholar
  • [35] Jin C, Netrapalli P, Jordan MI (2018) Accelerated gradient descent escapes saddle points faster than gradient descent. Conf. Learn. Theory (PMLR), 1042–1085.Google Scholar
  • [36] Jin C, Ge R, Netrapalli P, Kakade SM, Jordan MI (2017) How to escape saddle points efficiently. Internat. Conf. Machine Learn. (PMLR), 1724–1732.Google Scholar
  • [37] Kochurov M, Karimov R, Kozlukov S (2020) Geoopt: Riemannian optimization in pytorch. Preprint, submitted July 17, https://arxiv.org/abs/2005.02819.Google Scholar
  • [38] Lee JD, Panageas I, Piliouras G, Simchowitz M, Jordan MI, Recht B (2019) First-order methods almost always avoid strict saddle points. Math. Programming 176(1):311–337.CrossrefGoogle Scholar
  • [39] Lei Y, Hu T, Li G, Tang K (2019) Stochastic gradient descent for nonconvex learning without bounded gradient assumptions. IEEE Trans. Neural Networks Learn. Systems 31(10):4394–4400.CrossrefGoogle Scholar
  • [40] Lenders F, Kirches C, Potschka A (2018) trlib: A vector-free implementation of the GLTR method for iterative solution of the trust region problem. Optim. Methods Software 33(3):420–449.CrossrefGoogle Scholar
  • [41] Li H, Lin Z (2015) Accelerated proximal gradient methods for nonconvex programming. Adv. Neural Inform. Processing Systems 28 (NIPS 2015), 379–387.Google Scholar
  • [42] Li L, Toh KC (2010) An inexact interior point method for l 1-regularized sparse covariance selection. Math. Programming Comput. 2(3):291–315.CrossrefGoogle Scholar
  • [43] Łojasiewicz S (1961) Sur le probleme de la division (Instytut Matematyczny Polskiej Akademi Nauk, Warsaw).Google Scholar
  • [44] Lojasiewicz S (1963) Une propriété topologique des sous-ensembles analytiques réels, les équations aux dérivées partielles. Les Éditions aux Dérivées Partielles, vol. 117, 87–89.Google Scholar
  • [45] Maclaurin D, Duvenaud D, Adams RP (2015) Autograd: Effortless gradients in numpy. ICML 2015 AutoML Workshop, vol. 238.Google Scholar
  • [46] Meghwanshi M, Jawanpuria P, Kunchukuttan A, Kasai H, Mishra B (2018) McTorch, a manifold optimization library for deep learning. Preprint, submitted October 4, https://arxiv.org/abs/1810.01811.Google Scholar
  • [47] Morales JL, Nocedal J (2011) Remark on “algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization.” ACM Trans. Math. Software 38(1):7.CrossrefGoogle Scholar
  • [48] Nash SG (1984) Newton-type minimization via the Lanczos method. SIAM J. Numerical Anal. 21(4):770–788.CrossrefGoogle Scholar
  • [49] Nesterov Y, Polyak BT (2006) Cubic regularization of Newton method and its global performance. Math. Programming 108(1):177–205.CrossrefGoogle Scholar
  • [50] Nickel M, Kiela D (2018) Learning continuous hierarchies in the Lorentz model of hyperbolic geometry. Internat. Conf. Machine Learn. (PMLR), 3779–3788.Google Scholar
  • [51] Nocedal J, Wright S (2006) Numerical Optimization (Springer Science & Business Media, New York).Google Scholar
  • [52] Ochs P, Chen Y, Brox T, Pock T (2014) ipiano: Inertial proximal algorithm for nonconvex optimization. SIAM J. Imaging Sci. 7(2):1388–1419.CrossrefGoogle Scholar
  • [53] Powell MJ (1969) A method for nonlinear constraints in minimization problems. Fletcher R, ed. Optimization (Academic Press, New York), 283–298.Google Scholar
  • [54] Qi C, Gallivan KA, Absil PA (2010) Riemannian BFGS algorithm with applications. Diehl M, Glineur F, Jarlebring E, Michiels W, eds. Recent Advances in Optimization and Its Applications in Engineering (Springer, Berlin), 183–192.CrossrefGoogle Scholar
  • [55] Sato H (2016) A Dai–Yuan-type Riemannian conjugate gradient method with the weak Wolfe conditions. Comput. Optim. Appl. 64(1):101–118.CrossrefGoogle Scholar
  • [56] Scott DS (1981) Solving sparse symmetric generalized eigenvalue problems without factorization. SIAM J. Numerical Anal. 18(1):102–110.CrossrefGoogle Scholar
  • [57] Siegel JW (2019) Accelerated optimization with orthogonality constraints. Preprint, submitted October 20, https://arxiv.org/abs/1903.05204v3.Google Scholar
  • [58] Son NT, Absil PA, Gao B, Stykel T (2021) Computing symplectic eigenpairs of symmetric positive-definite matrices via trace minimization and Riemannian optimization. SIAM J. Matrix Anal. Appl. 42(4):1732–1757.CrossrefGoogle Scholar
  • [59] Steihaug T (1983) The conjugate gradient method and trust regions in large scale optimization. SIAM J. Numerical Anal. 20(3):626–637.CrossrefGoogle Scholar
  • [60] Toint P (1981) Toward an efficient sparsity exploiting Newton method for minimization. Duff IS, ed. Sparse Matrices and Their Uses (Academic Press, London), 57–88.Google Scholar
  • [61] Townsend J, Koep N, Weichwald S (2016) Pymanopt: A python toolbox for optimization on manifolds using automatic differentiation. Preprint, submitted September 8, https://arxiv.org/abs/1603.03236.Google Scholar
  • [62] Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, et al. (2020) Scipy 1.0: Fundamental algorithms for scientific computing in python. Nature Methods 17(3):261–272.CrossrefGoogle Scholar
  • [63] Wang L, Gao B, Liu X (2021) Multipliers correction methods for optimization problems over the Stiefel manifold. CSIAM Trans. Appl. Math. 2(2021):508–531.CrossrefGoogle Scholar
  • [64] Wang Z, Zhou Y, Liang Y, Lan G (2020) Cubic regularization with momentum for nonconvex optimization. Proc. 35th Uncertainty Artificial Intelligence Conf. (PMLR), 313–322.Google Scholar
  • [65] Wen Z, Yin W (2013) A feasible method for optimization with orthogonality constraints. Math. Programming 142(1–2):397–434.CrossrefGoogle Scholar
  • [66] Wu RB, Chakrabarti R, Rabitz H (2010) Critical landscape topology for optimization on the symplectic group. J. Optim. Theory Appl. 145(2):387–406.CrossrefGoogle Scholar
  • [67] Xiao N, Liu X (2021) Solving optimization problems over the Stiefel manifold by smooth exact penalty function. Preprint, submitted November 23, https://arxiv.org/abs/2110.08986v2.Google Scholar
  • [68] Xiao N, Liu X, Yuan Y-x (2020) A class of smooth exact penalty function methods for optimization problems with orthogonality constraints. Optim. Methods Software 37(4):1205–1241.CrossrefGoogle Scholar
  • [69] Xiao N, Liu X, Yuan Y-x (2021) A penalty-free infeasible approach for a class of nonsmooth opimtization problems over the Stiefel manifold. Preprint, submitted March 28, https://arxiv.org/abs/2103.03514.Google Scholar
  • [70] Xiao N, Liu X, Yuan Y-x (2021) Exact penalty function for ℓ2,1 norm minimization over the Stiefel manifold. SIAM J. Optim. 31(4):3097–3126.CrossrefGoogle Scholar
  • [71] Yuan Yx (2015) Recent advances in trust region algorithms. Math. Programming 151(1):249–281.CrossrefGoogle Scholar
  • [72] Zavala VM, Anitescu M (2014) Scalable nonlinear programming via exact differentiable penalty functions and trust-region Newton methods. SIAM J. Optim. 24(1):528–558.CrossrefGoogle Scholar
  • [73] Zhang E, Noakes L (2020) Riemannian cubics in quadratic matrix Lie groups. Appl. Math. Comput. 375:125082.Google Scholar
  • [74] Zhang H, Sra S (2018) Toward Riemannian accelerated gradient methods. Preprint, submitted June 7, https://arxiv.org/abs/1806.02812.Google Scholar
  • [75] Zhang J, Zhang H, Sra S (2018) R-SPIDER: A fast Riemannian stochastic optimization algorithm with curvature independent rate. Preprint, submitted December 14, https://arxiv.org/abs/1811.04194.Google Scholar
  • [76] Zhu C, Byrd RH, Lu P, Nocedal J (1997) Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans. Math. Software 23(4):550–560.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.