A Variable Sample-Size Stochastic Quasi-Newton Method for Smooth and Nonsmooth Stochastic Convex Optimization

Afrooz Jalilzadeh
Afrooz Jalilzadeh
[email protected]
https://orcid.org/0000-0002-3734-1082
Industrial and Manufacturing Engineering, Pennsylvania State University, State College, Pennsylvania 16802
Search for more papers by this author
,
Angelia Nedić
Angelia Nedić
[email protected]
https://orcid.org/0000-0001-9365-6321
School of Electrical, Computer, and Energy Engineering, Arizona State University, Tempe, Arizona 85287
Search for more papers by this author
,
Uday V. Shanbhag
Uday V. Shanbhag
[email protected]
https://orcid.org/0000-0001-5869-9561
Industrial and Manufacturing Engineering, Pennsylvania State University, State College, Pennsylvania 16802
Search for more papers by this author
,
Farzad Yousefian
Farzad Yousefian
[email protected]
https://orcid.org/0000-0003-2628-741X
School of Industrial Engineering and Management, Oklahoma State University, Stillwater, Oklahoma 74078
Search for more papers by this author

Industrial and Manufacturing Engineering, Pennsylvania State University, State College, Pennsylvania 16802

Search for more papers by this author

Angelia Nedić

[email protected]

https://orcid.org/0000-0001-9365-6321

School of Electrical, Computer, and Energy Engineering, Arizona State University, Tempe, Arizona 85287

Search for more papers by this author

Uday V. Shanbhag

[email protected]

https://orcid.org/0000-0001-5869-9561

Industrial and Manufacturing Engineering, Pennsylvania State University, State College, Pennsylvania 16802

Search for more papers by this author

Farzad Yousefian

[email protected]

https://orcid.org/0000-0003-2628-741X

School of Industrial Engineering and Management, Oklahoma State University, Stillwater, Oklahoma 74078

Search for more papers by this author

Published Online:3 Dec 2021https://doi.org/10.1287/moor.2021.1147

References

[1] Beck A (2017) First-Order Methods in Optimization (Society for Industrial and Applied Mathematics, Philadelphia).Crossref, Google Scholar
[2] Beck A, Teboulle M (2012) Smoothing and first order methods: A unified framework. SIAM J. Optim. 22(2):557–580.Crossref, Google Scholar
[3] Berahas AS, Nocedal J, Takác M (2016) A multi-batch L-BFGS method for machine learning. Preprint, submitted May 19, https://arxiv.org/abs/1605.06049.Google Scholar
[4] Bollapragada R, Byrd R, Nocedal J (2018) Adaptive sampling strategies for stochastic optimization. SIAM J. Optim. 28(4):3312–3343.Crossref, Google Scholar
[5] Bollapragada R, Mudigere D, Nocedal J, Shi HJM, Tang PTP (2018) A progressive batching L-BFGS method for machine learning. Preprint, submitted February 15, https://arxiv.org/abs/1802.05374.Google Scholar
[6] Byrd RH, Chin GM, Nocedal J, Wu Y (2012) Sample size selection in optimization methods for machine learning. Math. Programming 134(1):127–155.Google Scholar
[7] Byrd RH, Hansen SL, Nocedal J, Singer Y (2016) A stochastic quasi-Newton method for large-scale optimization. SIAM J. Optim. 26(2):1008–1031.Crossref, Google Scholar
[8] Cartis C, Scheinberg K (2018) Global convergence rate analysis of unconstrained optimization methods based on probabilistic models. Math. Programming 169(2):337–375.Google Scholar
[9] Deng G, Ferris MC (2009) Variable-number sample-path optimization. Math. Programming 117(1-2):81–109.Crossref, Google Scholar
[10] Facchinei F, Fischer A, Kanzow C (1996) Inexact Newton Methods for Semismooth Equations with Applications to Variational Inequality Problems. Nonlinear Optimization and Applications (Springer, Boston), 125–139.Google Scholar
[11] Facchinei F, Fischer A, Kanzow C (1997) A semismooth Newton method for variational inequalities: The case of box constraints. Ferris MC, Jong-Shi Pang, eds. Complementarity and Variational Problems: State of the Art, vol. 92 (Society for Industrial & Applied), 76.Google Scholar
[12] Fletcher R (1987) Practical Methods of Optimization (2nd ed.) (Wiley-Interscience, New York).Google Scholar
[13] Friedlander MP, Schmidt M (2012) Hybrid deterministic-stochastic methods for data fitting. SIAM J. Sci. Comput. 34(3):A1380–A1405.Crossref, Google Scholar
[14] Ghadimi S, Lan G (2016) Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Programming 156(1-2):59–99.Crossref, Google Scholar
[15] Gower R, Goldfarb D, Richtárik P (2016) Stochastic block BFGS: Squeezing more curvature out of data. Internat. Conf. Machine Learn., 1869–1878.Google Scholar
[16] Haarala M (2004) Large-Scale Nonsmooth Optimization: Variable Metric Bundle Method with Limited Memory. Number 40 (University of Jyväskylä, Jyväskylä, Finland).Google Scholar
[17] Homem-De-Mello T (2003) Variable-sample methods for stochastic optimization. ACM Trans. Model. Comput. Simul. 13(2):108–133.Crossref, Google Scholar
[18] Iusem AN, Jofré A, Oliveira RI, Thompson P (2019) Variance-based extragradient methods with line search for stochastic variational inequalities. SIAM J. Optim. 29(1):175–206.Crossref, Google Scholar
[19] Jalilzadeh A, Nedić A, Shanbhag UV, Yousefian F (2018) A variable sample-size stochastic quasi-Newton method for smooth and nonsmooth stochastic convex optimization. 2018 IEEE Conf. Decision Control (CDC) (IEEE), 4097–4102.Crossref, Google Scholar
[20] Jalilzadeh A, Shanbhag UV (2016) eg-VSSA: An extragradient variable sample-size stochastic approximation scheme: Error analysis and complexity trade-offs. Winter Simulation Conf., WSC 2016, Washington, DC, 690–701.Crossref, Google Scholar
[21] Jalilzadeh A, Shanbhag UV, Blanchet JH, Glynn PW (2018) Optimal smoothed variable sample-size accelerated proximal methods for structured nonsmooth stochastic convex programs. Preprint, submitted March 2, https://128.84.21.199/abs/1803.00718v1.Google Scholar
[22] Jofré A, Thompson P (2019) On variance reduction for stochastic smooth convex optimization with multiplicative noise. Math. Program. 174(1-2):253–292.Crossref, Google Scholar
[23] Kleywegt AJ, Shapiro A, Homem-de Mello T (2002) The sample average approximation method for stochastic discrete optimization. SIAM J. Optim. 12(2):479–502.Crossref, Google Scholar
[24] Lewis AS, Overton ML (2008) Behavior of BFGS with an exact line search on nonsmooth examples.Google Scholar
[25] Lewis DD, Yang Y, Rose TG, Li F (2004) Rcv1: A new benchmark collection for text categorization research. J. Machine Learn. Res. 5(Apr):361–397.Google Scholar
[26] Liu DC, Nocedal J (1989) On the limited memory BFGS method for large scale optimization. Math. Program. 45(1-3):503–528.Crossref, Google Scholar
[27] Lucchi A, McWilliams B, Hofmann T (2015) A variance reduced stochastic Newton method. Preprint, submitted March 28, https://arxiv.org/abs/1503.08316.Google Scholar
[28] Lukšan L, Vlček J (1999) Globally convergent variable metric method for convex nonsmooth unconstrained minimization. J. Optim. Theory Appl. 102(3):593–613.Crossref, Google Scholar
[29] Mokhtari A, Ribeiro A (2014) RES: Regularized stochastic BFGS algorithm. IEEE Trans. Signal Process. 62(23):6089–6104.Crossref, Google Scholar
[30] Mokhtari A, Ribeiro A (2015) Global convergence of online limited memory BFGS. J. Machine Learn. Res. 16(1):3151–3181.Google Scholar
[31] Moreau JJ (1965) Proximité et dualité dans un espace hilbertien. Bull. Soc. Math. France. 93(2):273–299.Crossref, Google Scholar
[32] Moritz P, Nishihara R, Jordan M (2016) A linearly-convergent stochastic L-BFGS algorithm. PMLR, vol. 51, 249–258.Google Scholar
[33] Nemirovski A, Juditsky A, Lan G, Shapiro A (2009) Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4):1574–1609.Crossref, Google Scholar
[34] Nocedal J, Wright SJ (2006) Numerical optimization. Springer Series in Operations Research and Financial Engineering (Springer, New York).Google Scholar
[35] Paquette C, Scheinberg K (2020) A stochastic line search method with expected complexity analysis. SIAM J. Optim. 30(1):349–376.Crossref, Google Scholar
[36] Pasupathy R (2010) On choosing parameters in retrospective-approximation algorithms for stochastic root finding and simulation optimization. Oper. Res. 58(4):889–901.Link, Google Scholar
[37] Pasupathy R, Glynn P, Ghosh S, Hashemi FS (2018) On sampling rates in simulation-based recursions. SIAM J. Optim. 28(1):45–73.Crossref, Google Scholar
[38] Planiden C, Wang X (2016) Strongly convex functions, Moreau envelopes, and the generic nature of convex functions with strong minimizers. SIAM J. Optim. 26(2):1341–1364.Crossref, Google Scholar
[39] Polyak BT (1987) Introduction to optimization. Translations Series in Mathematics and Engineering (Optimization Software, Inc., New York).Google Scholar
[40] Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Stat. 22:400–407.Crossref, Google Scholar
[41] Roux NL, Schmidt M, Bach FR (2012) A stochastic gradient method with an exponential convergence rate for finite training sets. Adv. Neural Inf. Process. Systems 2663–2671.Google Scholar
[42] Shanbhag UV, Blanchet JH (2015) Budget-constrained stochastic approximation. Proc. 2015 Winter Simulation Conf., Huntington Beach, CA, 368–379.Google Scholar
[43] Shashaani S, Hashemi FS, Pasupathy R (2018) ASTRO-DF: A class of adaptive sampling trust-region algorithms for derivative-free stochastic optimization. SIAM J. Optim. 28(4):3145–3176.Crossref, Google Scholar
[44] Wang X, Ma S, Goldfarb D, Liu W (2017) Stochastic quasi-Newton methods for nonconvex stochastic optimization. SIAM J. Optim. 27(2):927–956.Crossref, Google Scholar
[45] Xiao L, Zhang T (2014) A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24(4):2057–2075.Crossref, Google Scholar
[46] Xie Y, Shanbhag UV (2016) SI-ADMM: a stochastic inexact ADMM framework for resolving structured stochastic convex programs. Proc. 2016 Winter Simulation Conf., 714–725.Crossref, Google Scholar
[47] Yousefian F, Nedić A, Shanbhag UV (2017) On smoothing, regularization, and averaging in stochastic approximation methods for stochastic variational inequality problems. Math. Program. 165(1):391–431.Google Scholar
[48] Yousefian F, Nedich A, Shanbhag UV (2020) On stochastic and deterministic quasi-Newton methods for nonstrongly convex optimization: Asymptotic convergence and rate analysis. SIAM J. Optim. 30(2):1144–1172.Crossref, Google Scholar
[49] Yu J, Vishwanathan S, Günter S, Schraudolph NN (2010) A quasi-Newton approach to nonsmooth convex optimization problems in machine learning. J. Mach. Learn. Res. 11(Mar):1145–1200.Google Scholar
[50] Yuan G, Wei Z, Wang Z (2013) Gradient trust region algorithm with limited memory BFGS update for nonsmooth convex minimization. Comput. Optim. Appl. 54(1):45–64.Crossref, Google Scholar
[51] Zhou C, Gao W, Goldfarb D (2017) Stochastic adaptive quasi-Newton methods for minimizing expected values. Internat. Conf. Machine Learn., 4150–4159.Google Scholar

cover image Mathematics of Operations Research

Volume 47, Issue 1

February 2022

Pages 1-846, C2

Article Information

Metrics

Information

Received:December 31, 2018
Accepted:November 20, 2020
Published Online:December 03, 2021

Cite as

Afrooz Jalilzadeh, Angelia Nedić, Uday V. Shanbhag, Farzad Yousefian (2021) A Variable Sample-Size Stochastic Quasi-Newton Method for Smooth and Nonsmooth Stochastic Convex Optimization. Mathematics of Operations Research 47(1):690-719.

https://doi.org/10.1287/moor.2021.1147

Keywords

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

A Variable Sample-Size Stochastic Quasi-Newton Method for Smooth and Nonsmooth Stochastic Convex Optimization

References

Volume 47, Issue 1

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News