The Cost of Nonconvexity in Deterministic Nonsmooth Optimization

Published Online:https://doi.org/10.1287/moor.2022.0289

References

  • [1] Bianchi P, Hachem W, Schechtman S (2022) Convergence of constant step stochastic gradient descent for non-smooth non-convex functions. Set-Valued Variational Anal. 30(3):1117–1147.CrossrefGoogle Scholar
  • [2] Bolte J, Pauwels E (2021) Conservative set valued fields, automatic differentiation, stochastic gradient methods and deep learning. Math. Programming 188(1):19–51.CrossrefGoogle Scholar
  • [3] Bolte J, Daniilidis A, Lewis A (2009) Tame functions are semismooth. Math. Programming 117(1):5–19.CrossrefGoogle Scholar
  • [4] Davis D, Drusvyatskiy D (2018) Complexity of finding near-stationary points of convex functions stochastically. Preprint, submitted February 21, https://arxiv.org/abs/1802.08556.Google Scholar
  • [5] Davis D, Drusvyatskiy D (2019) Stochastic model-based minimization of weakly convex functions. SIAM J. Optim. 29(1):207–239.CrossrefGoogle Scholar
  • [6] Davis D, Drusvyatskiy D, Lee YT, Padmanabhan S, Ye G (2022) A gradient sampling method with complexity guarantees for Lipschitz functions in high and low dimensions. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. Adv. Neural Inf. Process, vol. 35 (Curran Associates, Inc., Red Hook, NY), 6692–6703.Google Scholar
  • [7] Dudley R (1977) On second derivatives of convex functions. Mathematica Scandinavica 41(1):159–174.CrossrefGoogle Scholar
  • [8] Evans L, Gariepy R (1992) Measure Theory and Fine Properties of Functions (CRC Press, Boca Raton, FL).Google Scholar
  • [9] Facchinei F, Pang JS (2003) Finite-Dimensional Variational Inequalities and Complementarity Problems, vol. II, Springer Series in Operations Research (Springer-Verlag, New York).Google Scholar
  • [10] Goldstein A (1977) Optimization of Lipschitz continuous functions. Math. Programming 13(1):14–22.CrossrefGoogle Scholar
  • [11] Hartman P (1959) On functions representable as a difference of convex functions. Pacific J. Math. 9(3):707–713.CrossrefGoogle Scholar
  • [12] Henrion R, Outrata J (2001) A subdifferential condition for calmness of multifunctions. J. Math. Anal. Appl. 258(1):110–130.CrossrefGoogle Scholar
  • [13] Jordan M, Lin T, Zampetakis M (2022) On the complexity of deterministic nonsmooth and nonconvex optimization. Preprint, submitted September 26, https://arxiv.org/abs/2209.12463.Google Scholar
  • [14] Jordan M, Kornowski G, Lin T, Shamir O, Zampetakis M (2023) Deterministic nonsmooth nonconvex optimization. Gergely N, Lorenzo R, eds. Proc. Thirty Sixth Conf. Learn. Theory, Proc. Machine Learn. Res. 4570–4597.Google Scholar
  • [15] Kornowski G, Shamir O (2022) On the complexity of finding small subgradients in nonsmooth optimization. Preprint, submitted September 21, https://arxiv.org/abs/2209.10346.Google Scholar
  • [16] Mahdavi-Amiri N, Yousefpour R (2012) An effective nonsmooth optimization algorithm for locally Lpschitz functions. J. Optim. Theory Appl. 155(1):180–195.CrossrefGoogle Scholar
  • [17] Mifflin R (1977) An algorithm for constrained optimization with semismooth functions. Math. Oper. Res. 2(2):191–207.LinkGoogle Scholar
  • [18] Nesterov Y (2012) Lecture 1: Intrinsic complexity of black-box optimization. Accessed November 20, 2023, https://people.montefiore.uliege.be/francqui/slides/Lect1_Complexity_Boadilla.pdf.Google Scholar
  • [19] Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, et al. (2019) PyTorch: An imperative style, high-Performance deep learning library. Adv. Neural Inf. Process, vol. 32 (Curran Associates, Inc., Red Hook, NY), 8024–8035.Google Scholar
  • [20] Rudin W (1966) Real and Complex Analysis (McGraw-Hill, New York).Google Scholar
  • [21] Sagastizábal C (2018) A VU-point of view of nonsmooth optimization. Sirakov B, de Souza PN, Viana M, eds. Proc. Internat. Congress Math., vol. 4 (World Scientific, Singapore), 3815–3836.Google Scholar
  • [22] Tian L, So AMC (2021) Computing Goldstein (ϵ,δ)-stationary points of Lipschitz functions in O˜(ϵ−3δ−1) iterations via random conic perturbation. Accessed November 20, 2023, https://arxiv.org/pdf/2112.09002v1.pdf.Google Scholar
  • [23] Wolfe P (1975) A method of conjugate subgradients for minimizing nondifferentiable functions. Balinski ML, Wolfe P, eds. Nondifferentiable Optimization, Mathematical Programming Studies, vol. 3 (Springer, Berlin), 145–173.CrossrefGoogle Scholar
  • [24] Zhang J, Lin H, Jegelka S, Sra S, Jadbabaie A (2020) Complexity of finding stationary points of nonconvex nonsmooth functions. Daumé H III, Singh A, eds. Proc. Machine Learn. Res., vol. 119, 11173–11182.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.