A Superlinearly Convergent Subgradient Method for Sharp Semismooth Problems
Published Online:16 Aug 2023https://doi.org/10.1287/moor.2023.1390
References
- [1] (2016) Tensorflow: A system for large-scale machine learning. 12th USENIX Sympos. Operating Systems Design Implementation (OSDI 16) (USENIX Association, Savannah, GA), 265–283.Google Scholar
- [2] (2006) On signal reconstruction without phase. Appl. Comput. Harmonic Anal. 20(3):345–356.Crossref, Google Scholar
- [3] (1922) Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales. Fundamenta Mathematicae 3(1):133–181.Crossref, Google Scholar
- [4] (2015) Linear and strong convergence of algorithms involving averaged nonexpansive operators. J. Math. Anal. Appl. 421(1):1–20.Crossref, Google Scholar
- [5] (2021) Conservative set valued fields, automatic differentiation, stochastic gradient methods and deep learning. Math. Programming 188(1):19–51.Crossref, Google Scholar
- [6] (2009) Tame functions are semismooth. Math. Programming 117(1):5–19.Google Scholar
- [7] (2023) An Introduction to Optimization on Smooth Manifolds (Cambridge University Press, Cambridge).Google Scholar
- [8] (2005) Decoding by linear programming. IEEE Trans. Inform. Theory 51(12):4203–4215.Crossref, Google Scholar
- [9] (2006) Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans. Inform. Theory 52(12):5406–5425.Google Scholar
- [10] (2021) Low-rank matrix recovery with composite optimization: Good conditioning and rapid convergence. Foundations Comput. Math. 21:1505–1593.Google Scholar
- [11] (2001) Atomic decomposition by basis pursuit. SIAM Rev. 43(1):129–159.Crossref, Google Scholar
- [12] (2015) An algebraic characterization of injectivity in phase retrieval. Appl. Comput. Harmonic Anal. 38(2):346–356.Crossref, Google Scholar
- [13] (2021) Conservative and semismooth derivatives are equivalent for semialgebraic maps. Set-Valued Variational Anal. 30:453–463.Google Scholar
- [14] (2021) Subgradient methods near active manifolds: Saddle point avoidance, local convergence, and asymptotic normality. Preprint, submitted August 26, https://doi.org/10.48550/arXiv.2108.11832.Google Scholar
- [15] (2020) Stochastic subgradient method converges on tame functions. Foundations Comput. Math. 20(1):119–154.Crossref, Google Scholar
- [16] (2018) Subgradient methods for sharp weakly convex functions. J. Optim. Theory Appl. 179(3):962–982.Google Scholar
- [17] (2023) Optimal convergence rates for the proximal bundle method. 33(2). https://epubs.siam.org/doi/10.1137/21M1428601.Google Scholar
- [18] (2006) Compressed sensing. IEEE Trans. Inform. Theory 52(4):1289–1306.Crossref, Google Scholar
- [19] (2013) Slope and geometry in variational mathematics. PhD thesis, Cornell University, Ithaca, NY.Google Scholar
- [20] (2015) Curves of descent. SIAM J. Control Optim. 53(1):114–138.Crossref, Google Scholar
- [21] (2015) Transversality and alternating projections for nonconvex sets. Foundations Comput. Math. 15(6):1637–1651.Crossref, Google Scholar
- [22] (2017) Rate of convergence of the bundle method. J. Optim. Theory Appl. 173(3):908–922.Crossref, Google Scholar
- [23] (2018) Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval. Inform. Inference 8(3):471–529.Google Scholar
- [24] (2010) Incremental-like bundle methods with application to energy planning. Comput. Optim. Appl. 46(2):305–332.Crossref, Google Scholar
- [25] (1965) The relaxation method of solving systems of inequalities with convex functions on the left-hand side. Proc. USSR Acad. Sci. 160:994–996.Google Scholar
- [26] (1998) Stochastic generalized gradient method for nonconvex nonsmooth stochastic optimization. Cybernetics Systems Anal. 34(2):196–215.Crossref, Google Scholar
- [27] (2007) Finite-Dimensional Variational Inequalities and Complementarity Problems (Springer Science & Business Media, New York).Google Scholar
- [28] (2014) An LP-Newton method: Nonsmooth equations, KKT systems, and nonisolated solutions. Math. Programming 146(1):1–36.Crossref, Google Scholar
- [29] (1977) On convergence rates of subgradient optimization methods. Math. Programming 13(3):329–347.Google Scholar
- [30] Golub GH, Van Loan CF (2013) Matrix Computations. 4th ed. (Johns Hopkins University Press, Baltimore).Google Scholar
- [31] (2012) Optimal rates of convergence for convex set estimation from support functions. Ann. Statist. 40(1):385–411.Google Scholar
- [32] (2010) A redistributed proximal bundle method for nonconvex optimization. SIAM J. Optim. 20(5):2442–2473.Crossref, Google Scholar
- [33] (1952) On approximate solutions of systems of linear inequalities. J. Res. National Bureau Standards 49(4):263–265.Crossref, Google Scholar
- [34] (2017) Variational Analysis of Regular Mappings: Theory and Applications, Springer Monographs in Mathematics (Springer, Cham, Switzerland).Crossref, Google Scholar
- [35] (2014) Newton-Type Methods for Optimization and Variational Problems (Springer, Berlin).Crossref, Google Scholar
- [36] (2020) Faster subgradient methods for functions with Hölderian growth. Math. Programming 180(1):417–450.Google Scholar
- [37] (1985) A linearization algorithm for nonsmooth minimization. Math. Oper. Res. 10(2):185–194.Link, Google Scholar
- [38] (2000) Efficiency of proximal bundle methods. J. Optim. Theory Appl. 104(3):589–603.Crossref, Google Scholar
- [39] (2006) Nonsmooth Equations in Optimization: Regularity, Calculus, Methods and Applications, vol. 60 (Springer Science & Business Media, New York).Google Scholar
- [40] (1955) Two comments on the method of successive approximations. Uspekhi Matematicheskikh Nauk 10:123–127.Google Scholar
- [41] (1988) Newton’s method for non-differentiable functions. Math. Res. 45:114–125.Google Scholar
- [42] (2013) Smooth manifolds. Introduction to Smooth Manifolds (Springer, Berlin), 1–31.Crossref, Google Scholar
- [43] (1975) An extension of davidon methods to nondifferentiable problems. Balinski ML, Wolfe P, eds. Nondifferentiable Optimization, Mathematical Programming Studies, vol. 3 (Springer, Berlin), 95–109.Crossref, Google Scholar
- [44] (2021) The structure of conservative gradient fields. 31(3). https://epubs.siam.org/doi/10.1137/21M1393637.Google Scholar
- [45] (2008) Alternating projections on manifolds. Math. Oper. Res. 33(1):216–234.Google Scholar
- [46] (2021) A proximal bundle variant with optimal iteration-complexity for a large range of prox stepsizes. SIAM J. Optim. 31(4):2955–2986.Crossref, Google Scholar
- [47] (1953) Mean value methods in iteration. Proc. Amer. Math. Soc. 4(3):506–510.Crossref, Google Scholar
- [48] (1977) Semismooth and semiconvex functions in constrained optimization. SIAM J. Optim. 15(6):959–972.Crossref, Google Scholar
- [49] (2005) A VU-algorithm for convex minimization. Math. Programming 104(2):583–608.Crossref, Google Scholar
- [50] (1986) Stochastic generalized-differentiable functions in the problem of nonconvex nonsmooth stochastic optimization. Cybernetics 22(6):804–809.Crossref, Google Scholar
- [51] (1973) The quasigradient method for the solving of the nonlinear programming problems. Cybernetics 9(1):145–150.Crossref, Google Scholar
- [52] (1974) Minimization of nondifferentiable functions in the presence of noise. Cybernetics 10(4):619–621.Crossref, Google Scholar
- [53] (2014) Bundle methods in the XXIst century: A bird’s-eye view. Pesquisa Operacional 34(3):647–670.Crossref, Google Scholar
- [54] (2015) Nonconvex set intersection problems: From projection methods to the Newton method for super-regular sets. Preprint, submitted June 27, https://doi.org/10.48550/arXiv.1506.08246.Google Scholar
- [55] (2015) Set intersection problems: Supporting hyperplanes and quadratic programming. Math. Programming 149(1):329–359.Crossref, Google Scholar
- [56] (1997) Error bounds in mathematical programming. Math. Programming 79(1–3):299–332.Crossref, Google Scholar
- [57] (1969) Minimization of unsmooth functionals. USSR Comput. Math. Math. Physics 9(3):14–29.Crossref, Google Scholar
- [58] (1978) Subgradient methods: A survey of Soviet research. Nonsmoothe Optim. Proc. IIASA Workshop, vol. 3 (Pergamon, Oxford, UK), 5–29.Google Scholar
- [59] (1990) Reconstructing convex sets from support line measurements. IEEE Trans. Pattern Anal. Machine Intelligence 12(4):377–389.Crossref, Google Scholar
- [60] (1993) Convergence analysis of some algorithms for solving nonsmooth equations. Math. Oper. Res. 18(1):227–244.Link, Google Scholar
- [61] (1993) A nonsmooth version of Newton’s method. Math. Programming 58(1):353–367.Crossref, Google Scholar
- [62] (1998) Variational Analysis, Grundlehren der Mathematischen Wissenschaften, vol. 317 (Springer, Berlin).Crossref, Google Scholar
- [63] (2012) Divide to conquer: Decomposition methods for energy optimization. Math. Programming 134(1):187–222.Crossref, Google Scholar
- [64] (1989) A storage-efficient WY representation for products of householder transformations. SIAM J. Sci. Statist. Comput. 10(1):53–57.Google Scholar
- [65] (1970) The rate of convergence of the method of the generalized gradient descent with expansion of space. Kibernetika (Kiev) 2:80–85.Google Scholar
- [66] (2016) Staggered time average algorithm for stochastic non-smooth optimization with O(1/t) convergence. Preprint, submitted July 11, https://doi.org/10.48550/arXiv.1607.02842.Google Scholar
- [67] (2019) Supermann: A superlinearly convergent algorithm for finding fixed points of nonexpansive operators. IEEE Trans. Automatic Control 64(12):4875–4890.Crossref, Google Scholar
- [68] (2013) The lasso problem and uniqueness. Electronic J. Statist. 7:1456–1490.Crossref, Google Scholar
- [69] (2010) Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Programming 125(2):263–295.Crossref, Google Scholar
- [70] (1950) Functional Operators (AM-22), Volume 2: The Geometry of Orthogonal Spaces, Annals of Mathematics Studies, vol. 22 (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
- [71] (2018) Phase retrieval with random gaussian sensing vectors by alternating projections. IEEE Trans. Inform. Theory 64(5):3301–3312.Crossref, Google Scholar
- [72] (1992) Local properties of analytic varieties. Eells J, Toledo D, eds. Hassler Whitney Collected Papers, Contemporary Mathematicians (Birkhäuser, Boston), 497–536.Crossref, Google Scholar
- [73] (1992) Tangents to an analytic variety. Eells J, Toledo D, eds. Hassler Whitney Collected Papers, Contemporary Mathematicians (Birkhäuser, Boston), 537–590.Crossref, Google Scholar
- [74] (1975) A method of conjugate subgradients for minimizing nondifferentiable functions. Balinski ML, Wolfe P, eds. Nondifferentiable Optimization, Mathematical Programming Studies, vol. 3 (Springer, Berlin), 145–173.Crossref, Google Scholar
- [75] (2018) RSG: Beating subgradient method without smoothness and strong convexity. J. Machine Learn. Res. 19(6):1–33.Google Scholar

