A Stochastic Sequential Quadratic Optimization Algorithm for Nonlinear-Equality-Constrained Optimization with Rank-Deficient Jacobians

Published Online:https://doi.org/10.1287/moor.2021.0154

References

  • [1] Berahas AS, Curtis FE, Robinson DP, Zhou B (2021) Sequential quadratic optimization for nonlinear equality constrained stochastic optimization. SIAM J. Optim. 31(2):1352–1379.CrossrefGoogle Scholar
  • [2] Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60(2):223–311.CrossrefGoogle Scholar
  • [3] Chang CC, Lin CJ (2011) LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3):1–27.CrossrefGoogle Scholar
  • [4] Chen C, Tung F, Vedula N, Mori G (2018) Constraint-aware deep neural network compression. Proceedings of the European Conference on Computer Vision (ECVC), 400–415.Google Scholar
  • [5] Cuomo S, Di Cola VS, Giampaolo F, Rozza G, Raissi M, Piccialli F (2022) Scientific machine learning through physics–informed neural networks: Where we are and what’s next. J. Sci. Comput. 92(88).CrossrefGoogle Scholar
  • [6] Curtis FE, Nocedal J, Wächter A (2009) A matrix-free algorithm for equality constrained optimization problems with rank deficient Jacobians. SIAM J. Optim. 20(3):1224–1249.CrossrefGoogle Scholar
  • [7] Curtis FE, O’Neill MJ, Robinson DP (2023) Worst-case complexity of an SQP method for nonlinear equality constrained stochastic optimization. Math. Program., ePub ahead of print June 7, https://doi.org/10.1007/s10107-023-01981-1.CrossrefGoogle Scholar
  • [8] Curtis FE, Robinson DP, Zhou B (2023) Sequential quadratic optimization for stochastic optimization with deterministic nonlinear inequality and equality constraints. Preprint, submitted February 28, https://arxiv.org/abs/2302.14790.Google Scholar
  • [9] Curtis FE, Kungurtsev V, Robinson DP, Wang Q (2023) A stochastic-gradient-based interior-point algorithm for solving smooth bound-constrained optimization problems. Preprint, submitted April 28, https://arxiv.org/abs/2304.14907.Google Scholar
  • [10] Davis D, Drusvyatskiy D, Kakade S (2020) Stochastic subgradient method converges on tame functions. Found. Comput. Math. 20:119–154.CrossrefGoogle Scholar
  • [11] Fang Y, Na S, Mahoney MW, Kolar M (2022) Fully stochastic trust-region sequential quadratic programming for equality-constrained optimization problems. Preprint, submitted November 29, https://arxiv.org/abs/2211.15943.Google Scholar
  • [13] Gould NIM, Orban D, Toint PL (2015) CUTEst: A constrained and unconstrained testing environment with safe threads for mathematical optimization. Comput. Optim. Appl. 60:545–557.CrossrefGoogle Scholar
  • [12] Gould NIM, Lucidi S, Roma M, Toint PL (1999) Solving the trust-region subproblem using the Lanczos method. SIAM J. Optim. 9(2):504–525.CrossrefGoogle Scholar
  • [14] Han SP (1977) A globally convergent method for nonlinear programming. J. Optim. Theory Appl. 22(3):297–309.CrossrefGoogle Scholar
  • [15] Han SP, Mangasarian OL (1979) Exact penalty functions in nonlinear programming. Math. Program. 17:251–269.CrossrefGoogle Scholar
  • [16] Hazan E, Luo H (2016) Variance-reduced and projection-free stochastic optimization. Proceedings of International Conference on Machine Learning (ICML), 1263–1271.Google Scholar
  • [17] Karniadakis GE, Kevrekidis IG, Lu L, Perdikaris P, Wang S, Yang L (2021) Physics-informed machine learning. Nat. Rev. Phys. 3:422–440.CrossrefGoogle Scholar
  • [18] Kumar Roy S, Mhammedi Z, Harandi M (2018) Geometry aware constrained optimization techniques for deep learning. Proceedings of Computer Vision and Pattern Recognition (CVPR), 4460–4469.Google Scholar
  • [19] Locatello F, Yurtsever A, Fercoq O, Cevher V (2019) Stochastic Frank-Wolfe for composite convex minimization. Proceedings of Neural Information Processing Systems (NeurIPS), 14269–14279.Google Scholar
  • [20] Lu H, Freund RM (2020) Generalized stochastic Frank-Wolfe algorithm with stochastic “substitute” gradient for structured convex optimization. Math. Program. 187:317–349.Google Scholar
  • [21] Lu L, Pestourie R, Yao W, Wang Z, Verdugo F, Johnson SG (2021) Physics-informed neural networks with hard constraints for inverse design SIAM J. Sci. Comput. 43(6):B1105–B1132.Google Scholar
  • [22] Na S, Anitescu M, Kolar M (2023) An adaptive stochastic sequential quadratic programming with differentiable exact augmented lagrangians. Math. Program. 199:721–791.CrossrefGoogle Scholar
  • [23] Nandwani Y, Pathak A, Singla P (2019) A primal-dual formulation for deep learning with constraints. Proceedings of Neural Information Processing Systems (NeurIPS), 12157–12168.Google Scholar
  • [24] Négiar G, Mahoney MW, Krishnapriyan AS (2023) Learning differentiable solvers for systems with hard constraints. Preprint, submitted July 18, https://arxiv.org/abs/2207.08675.Google Scholar
  • [25] Nocedal J, Wright S (2006) Numerical optimization. Springer Series in Operations Research and Financial Engineering (Springer-Verlag, New York).Google Scholar
  • [26] Omojokun EO (1989) Trust region algorithms for optimization with nonlinear equality and inequality constraints. PhD thesis, University of Colorado, Boulder, CO.Google Scholar
  • [27] Powell MJD (1978) A fast algorithm for nonlinearly constrained optimization calculations. Numerical Analysis, Lecture Notes in Mathematics, vol. 630 (Springer, Berlin), 144–157.Google Scholar
  • [28] Ravi SN, Dinh T, Lokhande VS, Singh V (2019) Explicitly imposing constraints in deep networks via conditional gradients gives improved generalization and faster convergence. Proc. Conf. AAAI Artif. Intell. 33:4772–4779.Google Scholar
  • [29] Reddi SJ, Sra S, Póczos B, Smola A (2016) Stochastic Frank-Wolfe methods for nonconvex optimization. 2016 54th Annual Allerton Conference, 1244–1251 (IEEE, Piscataway, NJ).Google Scholar
  • [30] Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Statist. 22(3):400–407.CrossrefGoogle Scholar
  • [31] Robbins H, Siegmund D (1971) A convergence theorem for nonnegative almost supermartingales and some applications. Rustagi JS, ed., Optimizing Methods in Statistics (Academic Press, New York).Google Scholar
  • [32] Steihaug T (1983) The conjugate gradient method and trust regions in large scale optimization. SIAM J. Numer. Anal. 20(3):626–637.CrossrefGoogle Scholar
  • [33] Stirzaker D (2003) Elementary Probability (Cambridge University Press, Cambridge).CrossrefGoogle Scholar
  • [34] Wächter A, Biegler LT (2000) Failure of global convergence for a class of interior point methods for nonlinear programming. Math. Program. 88(3):565–574.CrossrefGoogle Scholar
  • [35] Zhang M, Shen Z, Mokhtari A, Hassani H, Karbasi A (2020) One sample stochastic Frank-Wolfe. AISTATS 2020:4012–4023.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.