A Variable Sample-Size Stochastic Quasi-Newton Method for Smooth and Nonsmooth Stochastic Convex Optimization
Published Online:3 Dec 2021https://doi.org/10.1287/moor.2021.1147
References
- [1] (2017) First-Order Methods in Optimization (Society for Industrial and Applied Mathematics, Philadelphia).Crossref, Google Scholar
- [2] (2012) Smoothing and first order methods: A unified framework. SIAM J. Optim. 22(2):557–580.Crossref, Google Scholar
- [3] (2016) A multi-batch L-BFGS method for machine learning. Preprint, submitted May 19, https://arxiv.org/abs/1605.06049.Google Scholar
- [4] (2018) Adaptive sampling strategies for stochastic optimization. SIAM J. Optim. 28(4):3312–3343.Crossref, Google Scholar
- [5] (2018) A progressive batching L-BFGS method for machine learning. Preprint, submitted February 15, https://arxiv.org/abs/1802.05374.Google Scholar
- [6] (2012) Sample size selection in optimization methods for machine learning. Math. Programming 134(1):127–155.Google Scholar
- [7] (2016) A stochastic quasi-Newton method for large-scale optimization. SIAM J. Optim. 26(2):1008–1031.Crossref, Google Scholar
- [8] (2018) Global convergence rate analysis of unconstrained optimization methods based on probabilistic models. Math. Programming 169(2):337–375.Google Scholar
- [9] (2009) Variable-number sample-path optimization. Math. Programming 117(1-2):81–109.Crossref, Google Scholar
- [10] (1996) Inexact Newton Methods for Semismooth Equations with Applications to Variational Inequality Problems. Nonlinear Optimization and Applications (Springer, Boston), 125–139.Google Scholar
- [11] (1997) A semismooth Newton method for variational inequalities: The case of box constraints. Ferris MC, Jong-Shi Pang, eds. Complementarity and Variational Problems: State of the Art, vol. 92 (Society for Industrial & Applied), 76.Google Scholar
- [12] (1987) Practical Methods of Optimization (2nd ed.) (Wiley-Interscience, New York).Google Scholar
- [13] (2012) Hybrid deterministic-stochastic methods for data fitting. SIAM J. Sci. Comput. 34(3):A1380–A1405.Crossref, Google Scholar
- [14] (2016) Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Programming 156(1-2):59–99.Crossref, Google Scholar
- [15] (2016) Stochastic block BFGS: Squeezing more curvature out of data. Internat. Conf. Machine Learn., 1869–1878.Google Scholar
- [16] (2004) Large-Scale Nonsmooth Optimization: Variable Metric Bundle Method with Limited Memory. Number 40 (University of Jyväskylä, Jyväskylä, Finland).Google Scholar
- [17] (2003) Variable-sample methods for stochastic optimization. ACM Trans. Model. Comput. Simul. 13(2):108–133.Crossref, Google Scholar
- [18] (2019) Variance-based extragradient methods with line search for stochastic variational inequalities. SIAM J. Optim. 29(1):175–206.Crossref, Google Scholar
- [19] (2018) A variable sample-size stochastic quasi-Newton method for smooth and nonsmooth stochastic convex optimization. 2018 IEEE Conf. Decision Control (CDC) (IEEE), 4097–4102.Crossref, Google Scholar
- [20] (2016) eg-VSSA: An extragradient variable sample-size stochastic approximation scheme: Error analysis and complexity trade-offs. Winter Simulation Conf., WSC 2016, Washington, DC, 690–701.Crossref, Google Scholar
- [21] (2018) Optimal smoothed variable sample-size accelerated proximal methods for structured nonsmooth stochastic convex programs. Preprint, submitted March 2, https://128.84.21.199/abs/1803.00718v1.Google Scholar
- [22] (2019) On variance reduction for stochastic smooth convex optimization with multiplicative noise. Math. Program. 174(1-2):253–292.Crossref, Google Scholar
- [23] (2002) The sample average approximation method for stochastic discrete optimization. SIAM J. Optim. 12(2):479–502.Crossref, Google Scholar
- [24] (2008) Behavior of BFGS with an exact line search on nonsmooth examples.Google Scholar
- [25] (2004) Rcv1: A new benchmark collection for text categorization research. J. Machine Learn. Res. 5(Apr):361–397.Google Scholar
- [26] (1989) On the limited memory BFGS method for large scale optimization. Math. Program. 45(1-3):503–528.Crossref, Google Scholar
- [27] (2015) A variance reduced stochastic Newton method. Preprint, submitted March 28, https://arxiv.org/abs/1503.08316.Google Scholar
- [28] (1999) Globally convergent variable metric method for convex nonsmooth unconstrained minimization. J. Optim. Theory Appl. 102(3):593–613.Crossref, Google Scholar
- [29] (2014) RES: Regularized stochastic BFGS algorithm. IEEE Trans. Signal Process. 62(23):6089–6104.Crossref, Google Scholar
- [30] (2015) Global convergence of online limited memory BFGS. J. Machine Learn. Res. 16(1):3151–3181.Google Scholar
- [31] (1965) Proximité et dualité dans un espace hilbertien. Bull. Soc. Math. France. 93(2):273–299.Crossref, Google Scholar
- [32] (2016) A linearly-convergent stochastic L-BFGS algorithm. PMLR, vol. 51, 249–258.Google Scholar
- [33] (2009) Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4):1574–1609.Crossref, Google Scholar
- [34] (2006) Numerical optimization. Springer Series in Operations Research and Financial Engineering (Springer, New York).Google Scholar
- [35] (2020) A stochastic line search method with expected complexity analysis. SIAM J. Optim. 30(1):349–376.Crossref, Google Scholar
- [36] (2010) On choosing parameters in retrospective-approximation algorithms for stochastic root finding and simulation optimization. Oper. Res. 58(4):889–901.Link, Google Scholar
- [37] (2018) On sampling rates in simulation-based recursions. SIAM J. Optim. 28(1):45–73.Crossref, Google Scholar
- [38] (2016) Strongly convex functions, Moreau envelopes, and the generic nature of convex functions with strong minimizers. SIAM J. Optim. 26(2):1341–1364.Crossref, Google Scholar
- [39] (1987) Introduction to optimization. Translations Series in Mathematics and Engineering (Optimization Software, Inc., New York).Google Scholar
- [40] (1951) A stochastic approximation method. Ann. Math. Stat. 22:400–407.Crossref, Google Scholar
- [41] (2012) A stochastic gradient method with an exponential convergence rate for finite training sets. Adv. Neural Inf. Process. Systems 2663–2671.Google Scholar
- [42] (2015) Budget-constrained stochastic approximation. Proc. 2015 Winter Simulation Conf., Huntington Beach, CA, 368–379.Google Scholar
- [43] (2018) ASTRO-DF: A class of adaptive sampling trust-region algorithms for derivative-free stochastic optimization. SIAM J. Optim. 28(4):3145–3176.Crossref, Google Scholar
- [44] (2017) Stochastic quasi-Newton methods for nonconvex stochastic optimization. SIAM J. Optim. 27(2):927–956.Crossref, Google Scholar
- [45] (2014) A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24(4):2057–2075.Crossref, Google Scholar
- [46] (2016) SI-ADMM: a stochastic inexact ADMM framework for resolving structured stochastic convex programs. Proc. 2016 Winter Simulation Conf., 714–725.Crossref, Google Scholar
- [47] (2017) On smoothing, regularization, and averaging in stochastic approximation methods for stochastic variational inequality problems. Math. Program. 165(1):391–431.Google Scholar
- [48] (2020) On stochastic and deterministic quasi-Newton methods for nonstrongly convex optimization: Asymptotic convergence and rate analysis. SIAM J. Optim. 30(2):1144–1172.Crossref, Google Scholar
- [49] (2010) A quasi-Newton approach to nonsmooth convex optimization problems in machine learning. J. Mach. Learn. Res. 11(Mar):1145–1200.Google Scholar
- [50] (2013) Gradient trust region algorithm with limited memory BFGS update for nonsmooth convex minimization. Comput. Optim. Appl. 54(1):45–64.Crossref, Google Scholar
- [51] (2017) Stochastic adaptive quasi-Newton methods for minimizing expected values. Internat. Conf. Machine Learn., 4150–4159.Google Scholar

