A Descent Lemma Beyond Lipschitz Gradient Continuity: First-Order Methods Revisited and Applications

Published Online:https://doi.org/10.1287/moor.2016.0817

References

  • Auslender A, Teboulle M (2006) Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16(3):697–725.CrossrefGoogle Scholar
  • Bauschke HH, Borwein JM (1997) Legendre functions and the method of Bregman projections. J. Convex Anal. 4:27–67.Google Scholar
  • Bauschke HH, Borwein JM (2001) Joint and separate convexity of the Bregman distance. Butnariu D, Censor Y, Reich S, eds. Inherently Parallel Algorithms in Feasibility and Optimization and Their Applications (Elsevier, Amsterdam), 23–36.CrossrefGoogle Scholar
  • Bauschke HH, Combettes PL (2011) Convex Analysis and Monotone Operator Theory in Hilbert Spaces (Springer, New York).CrossrefGoogle Scholar
  • Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183–202.CrossrefGoogle Scholar
  • Beck A, Teboulle M (2009) Gradient-based algorithms with applications to signal recovery problems. Palomar D, Eldar YC, eds. Convex Optimization in Signal Processing and Communications (Cambridge University Press, New York), 139–162.CrossrefGoogle Scholar
  • Becker SR, Candes EJ, Grant MC (2011) Templates for convex cone problems with applications to sparse signal recovery. Math. Programming Comput. 3:165–218.CrossrefGoogle Scholar
  • Bertero M, Boccacci P, Desider G, Vicidomini G (2009) Image deblurring with Poisson data: From cells to galaxies. Inverse Problems 25:123006.CrossrefGoogle Scholar
  • Bertsekas DP (1999) Nonlinear Programming, 2nd ed. (Athena Scientific, Belmont, MA).Google Scholar
  • Bolte J, Teboulle M (2003) Barrier operators and associated gradient-like dynamical systems for constrained minimization problems. SIAM J. Control Optim. 42:1266–1292.CrossrefGoogle Scholar
  • Bregman LM (1967) The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. U.S.S.R. Computational Math. Math. Phys. 7:200–217.CrossrefGoogle Scholar
  • Bruck R (1977) On the weak convergence of an ergodic iteration for the solution of variational inequalities for monotone operators in Hilbert space. J. Math. Anal. Appl. 61:159–164.CrossrefGoogle Scholar
  • Censor Y, Zenios SA (1992) Proximal minimization algorithm with D-functions. J. Optim. Theory Appl. 73:451–464.CrossrefGoogle Scholar
  • Chambolle A, Pock T (2011) A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging and Vision 40(1):120–145.CrossrefGoogle Scholar
  • Chen G, Teboulle M (1993) Convergence analysis of a proximal-like minimization algorithm using Bregman functions. SIAM J. Optim. 3:538–543.CrossrefGoogle Scholar
  • Combettes PL, Wajs VR (2005) Signal recovery by proximal forward-backward splitting. SIAM Multiscale Modeling and Simululation 4:1168–1200.CrossrefGoogle Scholar
  • Csiszár I (1991) Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems. Ann. Statist. 19:2032–2066.CrossrefGoogle Scholar
  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. B 39:1–38.Google Scholar
  • Eckstein J (1993) Nonlinear proximal point algorithms using Bregman functions, with applications to convex programming. Math. Oper. Res. 18(1):202–226.LinkGoogle Scholar
  • Engl HW, Hanke M, Neubauer A (2000) Regularization of Inverse Problems, Mathematics and Its Applications (Kluwer Academic Publisher, Dordrecht, Netherlands).Google Scholar
  • Fukushima M, Milne H (1981) A generalized proximal point algorithm for certain nonconvex minimization problems. Internat. J. Systems Sci. 12:989–1000.CrossrefGoogle Scholar
  • Gabay D (1983) Applications of the method of multipliers to variational inequalities. Fortin M, Glowinski R, eds. Applications to the Solution of Boundary-Valued Problems (North-Holland, Amsterdam), 299–331.Google Scholar
  • Glowinski R, Le Tallec P (1989) Augmented Lagrangian and Operator Splitting Methods in Nonlinear Mechanics, Vol. 9 (SIAM, Philadelphia).CrossrefGoogle Scholar
  • Güler O (1991) On the convergence of the proximal point algorithm for convex minimization. SIAM J. Control and Optim. 29:403–419.CrossrefGoogle Scholar
  • He B, Yuan X (2012) On the O(1/n) convergence rate of the Douglas-Rachford alternating direction method. SIAM J. Numerical Anal. 50:700–709.CrossrefGoogle Scholar
  • Moreau JJ (1965) Proximité et dualité dans un espace hilbertien. Bulletin de la Société Mathématique de France 90:273–299.CrossrefGoogle Scholar
  • Nesterov Y (1983) A method for solving the convex programming problem with convergence rate O(1/k2). Doklady Akad Nauk SSSR 269:543–547.Google Scholar
  • Opial Z (1967) Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. AMS 73:591–597.CrossrefGoogle Scholar
  • Palomar DP, Eldar YC (2010) Convex Optimization in Signal Processing and Communications (Cambridge University Press, New York).Google Scholar
  • Passty GB (1979) Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. J. Math. Anal. Appl. 72:383–390.CrossrefGoogle Scholar
  • Polyak BT (1987) Introduction to Optimization (Optimization Software, Inc., New York).Google Scholar
  • Rockafellar RT (1970) Convex Analysis (Princeton University Press, Princeton, NJ).CrossrefGoogle Scholar
  • Shefi R, Teboulle M (2014) Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex minimization. SIAM J. Optimization 24(1):269–297.CrossrefGoogle Scholar
  • Sra S, Nowozin S, Wright SJ (2011) Optimization for Machine Learning (MIT Press, Cambridge, MA).CrossrefGoogle Scholar
  • Teboulle M (1992) Entropic proximal mappings with application to nonlinear programming. Math. Oper. Res. 17(3):670–690.LinkGoogle Scholar
  • Tseng P (2010) Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Programming Ser. B 125:263–295.CrossrefGoogle Scholar
  • Vardi Y, Shepp L, Kaufman L (1985) A statistical model for positron emission tomography. J. Amer. Statist. Assoc. 80:8–37.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.