Dissolving Constraints for Riemannian Optimization
Published Online:8 Mar 2023https://doi.org/10.1287/moor.2023.1360
References
- [1] (2009) Conjugate gradient algorithm for optimization under unitary matrix constraint. Signal Processing 89(9):1704–1714.Crossref, Google Scholar
- [2] (2008) Steepest descent algorithms for optimization under unitary matrix constraint. IEEE Trans. Signal Processing 56(3):1134–1147.Crossref, Google Scholar
- [3] (2007) Trust-region methods on Riemannian manifolds. Foundations Comput. Math. 7(3):303–330.Crossref, Google Scholar
- [4] (2009) Optimization Algorithms on Matrix Manifolds (Princeton University Press, Princeton, NJ).Google Scholar
- [5] (2014) Minimization principles and computation for the generalized linear response eigenvalue problem. BIT Numerical Math. 54(1):31–54.Crossref, Google Scholar
- [6] (2018) Riemannian adaptive optimization methods. Preprint, submitted October 1, https://arxiv.org/abs/1810.00760.Google Scholar
- [7] (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1):459–494.Crossref, Google Scholar
- [8] (2020) An introduction to optimization on smooth manifolds. Accessed February 24, 2023, https://www.nicolasboumal.net/book/.Google Scholar
- [9] (2015) Low-rank matrix completion via preconditioned optimization on the grassmann manifold. Linear Algebra Appl. 475:200–239.Crossref, Google Scholar
- [10] (2014) Manopt, a matlab toolbox for optimization on manifolds. J. Machine Learn. Res. 15(1):1455–1459.Google Scholar
- [11] (2018) JAX: Composable transformations of Python+NumPy programs. Accessed, http://github.com/google/jax.Google Scholar
- [12] (1995) A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 16(5):1190–1208.Crossref, Google Scholar
- [13] (2012) An adaptive cubic regularization algorithm for nonconvex optimization with convex constraints and its function-evaluation complexity. IMA J. Numerical Anal. 32(4):1662–1695.Crossref, Google Scholar
- [14] (2012) Complexity bounds for second-order optimality in unconstrained optimization. J. Complexity 28(1):93–108.Crossref, Google Scholar
- [15] (2018) Worst-case evaluation complexity and optimality of second-order methods for nonconvex smooth optimization. Proc. Internat. Congress Mathematicians (World Scientific), 3711–3750.Google Scholar
- [16] (2000) Trust Region Methods (SIAM, Philadelphia).Crossref, Google Scholar
- [17] (2019) Efficiently escaping saddle points on manifolds. Preprint, submitted October 23, https://arxiv.org/abs/1906.04321.Google Scholar
- [18] (2022) An accelerated first-order method for non-convex optimization on manifolds. Foundations Comput. Math., https://doi.org/10.1007/s10208-022-09573-9.Crossref, Google Scholar
- [19] (1986) An exact penalty function method with global convergence properties for nonlinear programming problems. Math. Programming 36(1):1–18.Crossref, Google Scholar
- [20] (1998) The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2):303–353.Crossref, Google Scholar
- [21] (2020) Implementing a smooth exact penalty function for equality-constrained nonlinear optimization. SIAM J. Sci. Comput. 42(3):A1809–A1835.Crossref, Google Scholar
- [22] (1970) A class of methods for nonlinear programming with termination and convergence properties. Abadie J, ed. Integer and Nonlinear Programming (North-Holland, Amsterdam), 157–173.Google Scholar
- [23] (2002) Nonlinear programming without a penalty function. Math. Programming 91(2):239–269.Crossref, Google Scholar
- [24] (2019) Parallelizable algorithms for optimization problems with orthogonality constraints. SIAM J. Sci. Comput. 41(3):A1949–A1983.Crossref, Google Scholar
- [25] (2021) Riemannian optimization on the symplectic Stiefel manifold. SIAM J. Optim. 31(2):1546–1575.Crossref, Google Scholar
- [26] (2015) Escaping from saddle points–online stochastic gradient for tensor decomposition. Conf. Learn. Theory (PMLR), 797–842.Google Scholar
- [27] (2013) Matrix Computations (JHU Press, Baltimore).Crossref, Google Scholar
- [28] (1999) Solving the trust-region subproblem using the Lanczos method. SIAM J. Optim. 9(2):504–525.Crossref, Google Scholar
- [29] (1986) A nonmonotone line search technique for Newton’s method. SIAM J. Numerical Anal. 23(4):707–716.Crossref, Google Scholar
- [30] (2020) Array programming with NumPy. Nature 585(7825):357–362.Crossref, Google Scholar
- [31] (1952) Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bureau Standards 49(6):409–436.Crossref, Google Scholar
- [32] (2015) Convergence of nonsmooth descent methods via Kurdyka–Lojasiewicz inequality on Riemannian manifolds. 2015 INS Technical report No. 1523, Hausdorff Center for Mathematics and Institute for Numerical Simulation, University of Bonn, Bonn, Germany.Google Scholar
- [33] (2020) A brief introduction to manifold optimization. J. Oper. Res. Soc. China 8(2):199–248.Crossref, Google Scholar
- [34] (2020) An efficient orthonormalization-free approach for sparse dictionary learning and dual principal component pursuit. Sensors (Basel) 20(11):3041.Crossref, Google Scholar
- [35] (2018) Accelerated gradient descent escapes saddle points faster than gradient descent. Conf. Learn. Theory (PMLR), 1042–1085.Google Scholar
- [36] (2017) How to escape saddle points efficiently. Internat. Conf. Machine Learn. (PMLR), 1724–1732.Google Scholar
- [37] (2020) Geoopt: Riemannian optimization in pytorch. Preprint, submitted July 17, https://arxiv.org/abs/2005.02819.Google Scholar
- [38] (2019) First-order methods almost always avoid strict saddle points. Math. Programming 176(1):311–337.Crossref, Google Scholar
- [39] (2019) Stochastic gradient descent for nonconvex learning without bounded gradient assumptions. IEEE Trans. Neural Networks Learn. Systems 31(10):4394–4400.Crossref, Google Scholar
- [40] (2018) trlib: A vector-free implementation of the GLTR method for iterative solution of the trust region problem. Optim. Methods Software 33(3):420–449.Crossref, Google Scholar
- [41] (2015) Accelerated proximal gradient methods for nonconvex programming. Adv. Neural Inform. Processing Systems 28 (NIPS 2015), 379–387.Google Scholar
- [42] (2010) An inexact interior point method for l 1-regularized sparse covariance selection. Math. Programming Comput. 2(3):291–315.Crossref, Google Scholar
- [43] (1961) Sur le probleme de la division (Instytut Matematyczny Polskiej Akademi Nauk, Warsaw).Google Scholar
- [44] (1963) Une propriété topologique des sous-ensembles analytiques réels, les équations aux dérivées partielles. Les Éditions aux Dérivées Partielles, vol. 117, 87–89.Google Scholar
- [45] (2015) Autograd: Effortless gradients in numpy. ICML 2015 AutoML Workshop, vol. 238.Google Scholar
- [46] (2018) McTorch, a manifold optimization library for deep learning. Preprint, submitted October 4, https://arxiv.org/abs/1810.01811.Google Scholar
- [47] (2011) Remark on “algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization.” ACM Trans. Math. Software 38(1):7.Crossref, Google Scholar
- [48] (1984) Newton-type minimization via the Lanczos method. SIAM J. Numerical Anal. 21(4):770–788.Crossref, Google Scholar
- [49] (2006) Cubic regularization of Newton method and its global performance. Math. Programming 108(1):177–205.Crossref, Google Scholar
- [50] (2018) Learning continuous hierarchies in the Lorentz model of hyperbolic geometry. Internat. Conf. Machine Learn. (PMLR), 3779–3788.Google Scholar
- [51] (2006) Numerical Optimization (Springer Science & Business Media, New York).Google Scholar
- [52] (2014) ipiano: Inertial proximal algorithm for nonconvex optimization. SIAM J. Imaging Sci. 7(2):1388–1419.Crossref, Google Scholar
- [53] (1969) A method for nonlinear constraints in minimization problems. Fletcher R, ed. Optimization (Academic Press, New York), 283–298.Google Scholar
- [54] (2010) Riemannian BFGS algorithm with applications. Diehl M, Glineur F, Jarlebring E, Michiels W, eds. Recent Advances in Optimization and Its Applications in Engineering (Springer, Berlin), 183–192.Crossref, Google Scholar
- [55] (2016) A Dai–Yuan-type Riemannian conjugate gradient method with the weak Wolfe conditions. Comput. Optim. Appl. 64(1):101–118.Crossref, Google Scholar
- [56] (1981) Solving sparse symmetric generalized eigenvalue problems without factorization. SIAM J. Numerical Anal. 18(1):102–110.Crossref, Google Scholar
- [57] (2019) Accelerated optimization with orthogonality constraints. Preprint, submitted October 20, https://arxiv.org/abs/1903.05204v3.Google Scholar
- [58] (2021) Computing symplectic eigenpairs of symmetric positive-definite matrices via trace minimization and Riemannian optimization. SIAM J. Matrix Anal. Appl. 42(4):1732–1757.Crossref, Google Scholar
- [59] (1983) The conjugate gradient method and trust regions in large scale optimization. SIAM J. Numerical Anal. 20(3):626–637.Crossref, Google Scholar
- [60] (1981) Toward an efficient sparsity exploiting Newton method for minimization. Duff IS, ed. Sparse Matrices and Their Uses (Academic Press, London), 57–88.Google Scholar
- [61] (2016) Pymanopt: A python toolbox for optimization on manifolds using automatic differentiation. Preprint, submitted September 8, https://arxiv.org/abs/1603.03236.Google Scholar
- [62] (2020) Scipy 1.0: Fundamental algorithms for scientific computing in python. Nature Methods 17(3):261–272.Crossref, Google Scholar
- [63] (2021) Multipliers correction methods for optimization problems over the Stiefel manifold. CSIAM Trans. Appl. Math. 2(2021):508–531.Crossref, Google Scholar
- [64] (2020) Cubic regularization with momentum for nonconvex optimization. Proc. 35th Uncertainty Artificial Intelligence Conf. (PMLR), 313–322.Google Scholar
- [65] (2013) A feasible method for optimization with orthogonality constraints. Math. Programming 142(1–2):397–434.Crossref, Google Scholar
- [66] (2010) Critical landscape topology for optimization on the symplectic group. J. Optim. Theory Appl. 145(2):387–406.Crossref, Google Scholar
- [67] (2021) Solving optimization problems over the Stiefel manifold by smooth exact penalty function. Preprint, submitted November 23, https://arxiv.org/abs/2110.08986v2.Google Scholar
- [68] (2020) A class of smooth exact penalty function methods for optimization problems with orthogonality constraints. Optim. Methods Software 37(4):1205–1241.Crossref, Google Scholar
- [69] (2021) A penalty-free infeasible approach for a class of nonsmooth opimtization problems over the Stiefel manifold. Preprint, submitted March 28, https://arxiv.org/abs/2103.03514.Google Scholar
- [70] (2021) Exact penalty function for ℓ2,1 norm minimization over the Stiefel manifold. SIAM J. Optim. 31(4):3097–3126.Crossref, Google Scholar
- [71] (2015) Recent advances in trust region algorithms. Math. Programming 151(1):249–281.Crossref, Google Scholar
- [72] (2014) Scalable nonlinear programming via exact differentiable penalty functions and trust-region Newton methods. SIAM J. Optim. 24(1):528–558.Crossref, Google Scholar
- [73] (2020) Riemannian cubics in quadratic matrix Lie groups. Appl. Math. Comput. 375:125082.Google Scholar
- [74] (2018) Toward Riemannian accelerated gradient methods. Preprint, submitted June 7, https://arxiv.org/abs/1806.02812.Google Scholar
- [75] (2018) R-SPIDER: A fast Riemannian stochastic optimization algorithm with curvature independent rate. Preprint, submitted December 14, https://arxiv.org/abs/1811.04194.Google Scholar
- [76] (1997) Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans. Math. Software 23(4):550–560.Crossref, Google Scholar

