Inexact Bregman Proximal Gradient Method and Its Inertial Variant with Absolute and Partial Relative Stopping Criteria

Lei Yang
Corresponding Author
Lei Yang
[email protected]
https://orcid.org/0000-0001-9979-8371
School of Computer Science and Engineering and Guangdong Province Key Laboratory of Computational Science, Sun Yat-sen University, Guangzhou 510006, China
Search for more papers by this author
,
Kim-Chuan Toh
Kim-Chuan Toh
[email protected]
https://orcid.org/0000-0001-7204-8933
Department of Mathematics and Institute of Operations Research and Analytics, National University of Singapore, Singapore 119076, Singapore
Search for more papers by this author

Corresponding Author

Lei Yang

School of Computer Science and Engineering and Guangdong Province Key Laboratory of Computational Science, Sun Yat-sen University, Guangzhou 510006, China

Search for more papers by this author

Kim-Chuan Toh

[email protected]

https://orcid.org/0000-0001-7204-8933

Department of Mathematics and Institute of Operations Research and Analytics, National University of Singapore, Singapore 119076, Singapore

Search for more papers by this author

Published Online:28 Mar 2025https://doi.org/10.1287/moor.2023.0328

References

[1] Ahookhosh M, Nesterov Y (2024) High-order methods beyond the classical complexity bounds: Inexact high-order proximal-point methods. Math. Program. 208(1–2):365–407.Crossref, Google Scholar
[2] Ahookhosh M, Themelis A, Patrinos P (2021) A Bregman forward-backward linesearch algorithm for nonconvex composite optimization: Superlinear convergence to nonisolated local minima. SIAM J. Optim. 31(1):653–685.Crossref, Google Scholar
[3] Ahookhosh M, Hien L, Gillis N, Patrinos P (2021) A block inertial Bregman proximal algorithm for nonsmooth nonconvex problems with application to symmetric nonnegative matrix tri-factorization. J. Optim. Theory Appl. 190(1):234–258.Crossref, Google Scholar
[4] Ahookhosh M, Hien L, Gillis N, Patrinos P (2021) Multi-block Bregman proximal alternating linearized minimization and its application to orthogonal nonnegative matrix factorization. Comput. Optim. Appl. 79(3):681–715.Crossref, Google Scholar
[5] Altschuler J, Weed J, Rigollet P (2017) Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. Proc. 31st Internat. Conf. Neural Inform. Processing Systems, vol. 30 (Curran Associates Inc., Red Hook, NY), 1961–1971.Google Scholar
[6] Attouch H, Bolte J (2009) On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program. 116(1):5–16.Crossref, Google Scholar
[7] Attouch H, Bolte J, Svaiter B (2013) Convergence of descent methods for semi-algebraic and tame problems: Proximal algorithms, forward–Backward splitting, and regularized Gauss–Seidel methods. Math. Program. 137(1):91–129.Crossref, Google Scholar
[8] Attouch H, Bolte J, Redont P, Soubeyran A (2010) Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. 35(2):438–457.Link, Google Scholar
[9] Aujol JF, Dossal C (2015) Stability of over-relaxations for the forward-backward algorithm, application to FISTA. SIAM J. Optim. 25(4):2408–2433.Crossref, Google Scholar
[10] Auslender A, Teboulle M (2006) Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16(3):697–725.Crossref, Google Scholar
[11] Bauschke H, Borwein J (1997) Legendre functions and the method of random Bregman projections. J. Convex Anal. 4(1):27–67.Google Scholar
[12] Bauschke H, Borwein J (2001) Joint and separate convexity of the Bregman distance. Studies in Computational Mathematics, vol. 8 (Elsevier), 23–36.Google Scholar
[13] Bauschke H, Combettes P (2011) Convex Analysis and Monotone Operator Theory in Hilbert Spaces, vol. 408 (Springer).Crossref, Google Scholar
[14] Bauschke H, Bolte J, Teboulle M (2017) A descent lemma beyond Lipschitz gradient continuity: First-order methods revisited and applications. Math Oper Res. 42(2):330–348.Link, Google Scholar
[15] Bauschke H, Bolte J, Chen J, Teboulle M, Wang X (2019) On linear convergence of non-Euclidean gradient methods without strong convexity and Lipschitz gradient continuity. J. Optim. Theory Appl. 182(3):1068–1087.Crossref, Google Scholar
[16] Beck A, Teboulle M (2003) Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett. 31(3):167–175.Crossref, Google Scholar
[17] Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183–202.Crossref, Google Scholar
[18] Bello-Cruz Y, Gonçalves M, Krislock N (2023) On FISTA with a relative error rule. Comput. Optim. Appl. 84(2):295–318.Crossref, Google Scholar
[19] Blondel M, Seguy V, Rolet A (2018) Smooth and sparse optimal transport. Internat. Conf. Artificial Intelligence Statist., vol. 84, 880–889.Google Scholar
[20] Bolte J, Daniilidis A, Lewis A (2007) The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4):1205–1223.Crossref, Google Scholar
[21] Bolte J, Sabach S, Teboublle M (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1):459–494.Crossref, Google Scholar
[22] Bolte J, Sabach S, Teboulle M, Vaisbourd Y (2018) First order methods beyond convexity and Lipschitz gradient continuity with applications to quadratic inverse problems. SIAM J. Optim. 28(3):2131–2151.Crossref, Google Scholar
[23] Bonettini S, Porta F, Ruggiero V (2016) A variable metric forward-backward method with extrapolation. SIAM J. Sci. Comput. 38(4):A2558–A2584.Crossref, Google Scholar
[24] Bonettini S, Rebegoldi S, Ruggiero V (2018) Inertial variable metric techniques for the inexact forward-backward algorithm. SIAM J. Sci. Comput. 40(5):A3180–A3210.Crossref, Google Scholar
[25] Bregman L (1967) The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3):200–217.Crossref, Google Scholar
[26] Butnariu D, Byrne C, Censor Y (2003) Redundant axioms in the definition of Bregman functions. J. Convex Anal. 10(1):245–254.Google Scholar
[27] Censor Y, Lent A (1981) An iterative row-action method for interval convex programming. J. Optim. Theory Appl. 34(3):321–353.Crossref, Google Scholar
[28] Censor Y, Zenios S (1992) Proximal minimization algorithm with D-functions. J. Optim. Theory Appl. 73(3):451–464.Crossref, Google Scholar
[29] Chambolle A, Dossal C (2015) On the convergence of the iterates of the “fast iterative shrinkage/thresholding algorithm”. J. Optim. Theory Appl. 166(3):968–982.Crossref, Google Scholar
[30] Chao HH, Vandenberghe L (2018) Entropic proximal operators for nonnegative trigonometric polynomials. IEEE Trans. Signal Process. 66(18):4826–4838.Crossref, Google Scholar
[31] Chu H, Liang L, Toh KC, Yang L (2023) An efficient implementable inexact entropic proximal point algorithm for a class of linear programming problems. Comput. Optim. Appl. 85(1):107–146.Crossref, Google Scholar
[32] Combettes P, Wajs V (2005) Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4(4):1168–1200.Crossref, Google Scholar
[33] Dragomir RA, d’Aspremont A, Bolte J (2021) Quartic first-order methods for low-rank minimization. J. Optim. Theory Appl. 189:341–363.Crossref, Google Scholar
[34] Dragomir RA, Taylor A, d’Aspremont A, Bolte J (2022) Optimal complexity and certification of Bregman first-order methods. Math. Program. 194(1–2):41–83.Crossref, Google Scholar
[35] Eckstein J (1993) Nonlinear proximal point algorithms using Bregman functions, with applications to convex programming. Math. Oper. 18(1):202–226.Link, Google Scholar
[36] Eckstein J (1998) Approximate iterations in Bregman-function-based proximal algorithms. Math. Program. 83(1–3):113–123.Crossref, Google Scholar
[37] Essid M, Solomon J (2018) Quadratically regularized optimal transport on graphs. SIAM J. Sci. Comput. 40(4):A1961–A1986.Crossref, Google Scholar
[38] Fukushima M, Mine H (1981) A generalized proximal point algorithm for certain non-convex minimization problems. Internat. J. Syst. Sci. 12(8):989–1000.Crossref, Google Scholar
[39] Gutman D, Peña J (2023) Perturbed Fenchel duality and first-order methods. Math. Program. 198(1):443–469.Crossref, Google Scholar
[40] Hanzely F, Richtárik P, Xiao L (2021) Accelerated Bregman proximal gradient methods for relatively smooth convex optimization. Comput. Optim. Appl. 79(2):405–440.Crossref, Google Scholar
[41] Hien L, Phan D, Gillis N, Ahookhosh M, Patrinos P (2022) Block Bregman majorization minimization with extrapolation. SIAM J. Math. Data Sci. 4(1):1–25.Crossref, Google Scholar
[42] Hiriart-Urruty J, Lemaréchal C (1993) Convex Analysis and Minimization Algorithms II: Advanced Theory and Bundle Methods (Springer).Crossref, Google Scholar
[43] Jiang K, Sun D, Toh KC (2012) An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP. SIAM J. Optim. 22(3):1042–1064.Crossref, Google Scholar
[44] Jiang K, Si W, Chen C, Bao C (2020) Efficient numerical methods for computing the stationary states of phase field crystal models. SIAM J. Sci. Comput. 42(6):B1350–B1377.Crossref, Google Scholar
[45] Kabbadj S (2020) Inexact version of Bregman proximal gradient algorithm. Abstr. Appl. Anal. 2020:1963980.Google Scholar
[46] Lan G, Lu Z, Monteiro R (2011) Primal-dual first-order methods with O(1/ϵ) iteration-complexity for cone programming. Math. Program. 126(1):1–29.Crossref, Google Scholar
[47] Latafat P, Themelis A, Ahookhosh M, Patrinos P (2022) Bregman Finito/MISO for nonconvex regularized finite sum minimization without Lipschitz gradient continuity. SIAM J. Optim. 32(3):2230–2262.Crossref, Google Scholar
[48] Lions P, Mercier B (1979) Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6):964–979.Crossref, Google Scholar
[49] Lorenz D, Manns P, Meyer C (2021) Quadratically regularized optimal transport. Appl. Math. Optim. 83:1919–1949.Crossref, Google Scholar
[50] Lu H, Freund M, Nesterov Y (2018) Relatively smooth convex optimization by first-order methods, and applications. SIAM J. Optim. 28(1):333–354.Crossref, Google Scholar
[51] Mukkamala M, Ochs P, Pock T, Sabach S (2020) Convex-concave backtracking for inertial Bregman proximal gradient algorithms in nonconvex optimization. SIAM J. Math. Data Sci. 2(3):658–682.Crossref, Google Scholar
[52] Mukkamala M, Westerkamp F, Laude E, Cremers D, Ochs P (2019) Bregman proximal framework for deep linear neural networks. Preprint, submitted October 8, https://arxiv.org/abs/1910.03638.Google Scholar
[53] Nesterov Y (1983) A method of solving a convex programming problem with convergence rate O(1/k2). Soviet Math. Dokl. 27(2):372–376.Google Scholar
[54] Nesterov Y (1988) On an approach to the construction of optimal methods of minimization of smooth convex functions. Èkonom. i. Mat. Metody. 24:509–517.Google Scholar
[55] Nesterov Y (2003) Introductory Lectures on Convex Optimization: A Basic Course, vol. 87 (Springer Science & Business Media, New York).Google Scholar
[56] Nesterov Y (2005) Smooth minimization of non-smooth functions. Math. Program. 103(1):127–152.Crossref, Google Scholar
[57] Nesterov Y (2013) Gradient methods for minimizing composite functions. Math. Program. 140(1):125–161.Crossref, Google Scholar
[58] Nesterov Y (2023) Inexact accelerated high-order proximal-point methods. Math. Program. 197(1–2):1–26.Crossref, Google Scholar
[59] Peyré G, Cuturi M (2019) Computational optimal transport. Found. Trends Mach. Learn. 11(5–6):355–607.Crossref, Google Scholar
[60] Polyak B (1987) Introduction to Optimization (Optimization Software Inc., New York).Google Scholar
[61] Rebegoldi S, Bonettini S, Prato M (2018) A Bregman inexact linesearch–Based forward–Backward algorithm for nonsmooth nonconvex optimization. J. Phys. Conf. Ser. 1131:012013.Crossref, Google Scholar
[62] Rockafellar R (1970) Convex Analysis (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
[63] Romain M, d’Aspremont A (2020) A Bregman method for structure learning on sparse directed acyclic graphs. Preprint, submitted November 5, https://arxiv.org/abs/2011.02764.Google Scholar
[64] Salzo S (2017) The variable metric forward-backward splitting algorithm under mild differentiability assumptions. SIAM J. Optim. 27(4):2153–2181.Crossref, Google Scholar
[65] Schmidt M, Roux N, Bach F (2011) Convergence rates of inexact proximal-gradient methods for convex optimization. Adv. Neural Inform. Process. Syst., vol. 24 (Curran Associates Inc., Red Hook, NY).Google Scholar
[66] Solodov M, Svaiter B (2000) An inexact hybrid generalized proximal point algorithm and some new results on the theory of Bregman functions. Math. Oper. Res. 25(2):214–230.Link, Google Scholar
[67] Stonyakin F, Tyurin A, Gasnikov A, Dvurechensky P, Agafonov A, Dvinskikh D, Alkousa M, Pasechnyuk D, Artamonov S, Piskunova V (2021) Inexact model: A framework for optimization and variational inequalities. Optim. Methods Softw. 36(6):1155–1201.Crossref, Google Scholar
[68] Teboulle M (2018) A simplified view of first order methods for optimization. Math. Program. 170(1):67–96.Crossref, Google Scholar
[69] Tseng P (2008) On accelerated proximal gradient methods for convex-concave optimization. Technical report, University of Washington, Seattle.Google Scholar
[70] Tseng P (2010) Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Program. 125(2):263–295.Crossref, Google Scholar
[71] Van Nguyen Q (2017) Forward-backward splitting with Bregman distances. Vietnam J. Math. 45(3):519–539.Crossref, Google Scholar
[72] Villa S, Salzo S, Baldassarre L, Verri A (2013) Accelerated and inexact forward-backward algorithms. SIAM J. Optim. 23(3):1607–1633.Crossref, Google Scholar
[73] Yang L (2024) Proximal gradient method with extrapolation and line search for a class of nonconvex and nonsmooth problems. J. Optim. Theory Appl. 200(1):68–103.Crossref, Google Scholar
[74] Yang L, Toh KC (2022) Bregman proximal point algorithm revisited: A new inexact version and its variant. SIAM J. Optim. 32(3):1523–1554.Crossref, Google Scholar
[75] Zhou Y, Liang Y, Shen L (2019) A simple convergence analysis of Bregman proximal gradient algorithm. Comput. Optim. Appl. 73(3):903–912.Crossref, Google Scholar

cover image Mathematics of Operations Research

Volume 51, Issue 1

February 2026

Pages iv-viii, 1-851

Article Information

Metrics

Information

Received:October 23, 2023
Accepted:January 25, 2025
Published Online:March 28, 2025

Cite as

Lei Yang, Kim-Chuan Toh (2025) Inexact Bregman Proximal Gradient Method and Its Inertial Variant with Absolute and Partial Relative Stopping Criteria. Mathematics of Operations Research 51(1):686-714.

https://doi.org/10.1287/moor.2023.0328

Keywords

Acknowledgments

The authors thank the editor and referees for their valuable suggestions and comments, which have helped to improve the quality of this paper.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Inexact Bregman Proximal Gradient Method and Its Inertial Variant with Absolute and Partial Relative Stopping Criteria

References

Volume 51, Issue 1

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News