A Variable Sample-Size Stochastic Quasi-Newton Method for Smooth and Nonsmooth Stochastic Convex Optimization

Published Online:https://doi.org/10.1287/moor.2021.1147

References

  • [1] Beck A (2017) First-Order Methods in Optimization (Society for Industrial and Applied Mathematics, Philadelphia).CrossrefGoogle Scholar
  • [2] Beck A, Teboulle M (2012) Smoothing and first order methods: A unified framework. SIAM J. Optim. 22(2):557–580.CrossrefGoogle Scholar
  • [3] Berahas AS, Nocedal J, Takác M (2016) A multi-batch L-BFGS method for machine learning. Preprint, submitted May 19, https://arxiv.org/abs/1605.06049.Google Scholar
  • [4] Bollapragada R, Byrd R, Nocedal J (2018) Adaptive sampling strategies for stochastic optimization. SIAM J. Optim. 28(4):3312–3343.CrossrefGoogle Scholar
  • [5] Bollapragada R, Mudigere D, Nocedal J, Shi HJM, Tang PTP (2018) A progressive batching L-BFGS method for machine learning. Preprint, submitted February 15, https://arxiv.org/abs/1802.05374.Google Scholar
  • [6] Byrd RH, Chin GM, Nocedal J, Wu Y (2012) Sample size selection in optimization methods for machine learning. Math. Programming 134(1):127–155.Google Scholar
  • [7] Byrd RH, Hansen SL, Nocedal J, Singer Y (2016) A stochastic quasi-Newton method for large-scale optimization. SIAM J. Optim. 26(2):1008–1031.CrossrefGoogle Scholar
  • [8] Cartis C, Scheinberg K (2018) Global convergence rate analysis of unconstrained optimization methods based on probabilistic models. Math. Programming 169(2):337–375.Google Scholar
  • [9] Deng G, Ferris MC (2009) Variable-number sample-path optimization. Math. Programming 117(1-2):81–109.CrossrefGoogle Scholar
  • [10] Facchinei F, Fischer A, Kanzow C (1996) Inexact Newton Methods for Semismooth Equations with Applications to Variational Inequality Problems. Nonlinear Optimization and Applications (Springer, Boston), 125–139.Google Scholar
  • [11] Facchinei F, Fischer A, Kanzow C (1997) A semismooth Newton method for variational inequalities: The case of box constraints. Ferris MC, Jong-Shi Pang, eds. Complementarity and Variational Problems: State of the Art, vol. 92 (Society for Industrial & Applied), 76.Google Scholar
  • [12] Fletcher R (1987) Practical Methods of Optimization (2nd ed.) (Wiley-Interscience, New York).Google Scholar
  • [13] Friedlander MP, Schmidt M (2012) Hybrid deterministic-stochastic methods for data fitting. SIAM J. Sci. Comput. 34(3):A1380–A1405.CrossrefGoogle Scholar
  • [14] Ghadimi S, Lan G (2016) Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Programming 156(1-2):59–99.CrossrefGoogle Scholar
  • [15] Gower R, Goldfarb D, Richtárik P (2016) Stochastic block BFGS: Squeezing more curvature out of data. Internat. Conf. Machine Learn., 1869–1878.Google Scholar
  • [16] Haarala M (2004) Large-Scale Nonsmooth Optimization: Variable Metric Bundle Method with Limited Memory. Number 40 (University of Jyväskylä, Jyväskylä, Finland).Google Scholar
  • [17] Homem-De-Mello T (2003) Variable-sample methods for stochastic optimization. ACM Trans. Model. Comput. Simul. 13(2):108–133.CrossrefGoogle Scholar
  • [18] Iusem AN, Jofré A, Oliveira RI, Thompson P (2019) Variance-based extragradient methods with line search for stochastic variational inequalities. SIAM J. Optim. 29(1):175–206.CrossrefGoogle Scholar
  • [19] Jalilzadeh A, Nedić A, Shanbhag UV, Yousefian F (2018) A variable sample-size stochastic quasi-Newton method for smooth and nonsmooth stochastic convex optimization. 2018 IEEE Conf. Decision Control (CDC) (IEEE), 4097–4102.CrossrefGoogle Scholar
  • [20] Jalilzadeh A, Shanbhag UV (2016) eg-VSSA: An extragradient variable sample-size stochastic approximation scheme: Error analysis and complexity trade-offs. Winter Simulation Conf., WSC 2016, Washington, DC, 690–701.CrossrefGoogle Scholar
  • [21] Jalilzadeh A, Shanbhag UV, Blanchet JH, Glynn PW (2018) Optimal smoothed variable sample-size accelerated proximal methods for structured nonsmooth stochastic convex programs. Preprint, submitted March 2, https://128.84.21.199/abs/1803.00718v1.Google Scholar
  • [22] Jofré A, Thompson P (2019) On variance reduction for stochastic smooth convex optimization with multiplicative noise. Math. Program. 174(1-2):253–292.CrossrefGoogle Scholar
  • [23] Kleywegt AJ, Shapiro A, Homem-de Mello T (2002) The sample average approximation method for stochastic discrete optimization. SIAM J. Optim. 12(2):479–502.CrossrefGoogle Scholar
  • [24] Lewis AS, Overton ML (2008) Behavior of BFGS with an exact line search on nonsmooth examples.Google Scholar
  • [25] Lewis DD, Yang Y, Rose TG, Li F (2004) Rcv1: A new benchmark collection for text categorization research. J. Machine Learn. Res. 5(Apr):361–397.Google Scholar
  • [26] Liu DC, Nocedal J (1989) On the limited memory BFGS method for large scale optimization. Math. Program. 45(1-3):503–528.CrossrefGoogle Scholar
  • [27] Lucchi A, McWilliams B, Hofmann T (2015) A variance reduced stochastic Newton method. Preprint, submitted March 28, https://arxiv.org/abs/1503.08316.Google Scholar
  • [28] Lukšan L, Vlček J (1999) Globally convergent variable metric method for convex nonsmooth unconstrained minimization. J. Optim. Theory Appl. 102(3):593–613.CrossrefGoogle Scholar
  • [29] Mokhtari A, Ribeiro A (2014) RES: Regularized stochastic BFGS algorithm. IEEE Trans. Signal Process. 62(23):6089–6104.CrossrefGoogle Scholar
  • [30] Mokhtari A, Ribeiro A (2015) Global convergence of online limited memory BFGS. J. Machine Learn. Res. 16(1):3151–3181.Google Scholar
  • [31] Moreau JJ (1965) Proximité et dualité dans un espace hilbertien. Bull. Soc. Math. France. 93(2):273–299.CrossrefGoogle Scholar
  • [32] Moritz P, Nishihara R, Jordan M (2016) A linearly-convergent stochastic L-BFGS algorithm. PMLR, vol. 51, 249–258.Google Scholar
  • [33] Nemirovski A, Juditsky A, Lan G, Shapiro A (2009) Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4):1574–1609.CrossrefGoogle Scholar
  • [34] Nocedal J, Wright SJ (2006) Numerical optimization. Springer Series in Operations Research and Financial Engineering (Springer, New York).Google Scholar
  • [35] Paquette C, Scheinberg K (2020) A stochastic line search method with expected complexity analysis. SIAM J. Optim. 30(1):349–376.CrossrefGoogle Scholar
  • [36] Pasupathy R (2010) On choosing parameters in retrospective-approximation algorithms for stochastic root finding and simulation optimization. Oper. Res. 58(4):889–901.LinkGoogle Scholar
  • [37] Pasupathy R, Glynn P, Ghosh S, Hashemi FS (2018) On sampling rates in simulation-based recursions. SIAM J. Optim. 28(1):45–73.CrossrefGoogle Scholar
  • [38] Planiden C, Wang X (2016) Strongly convex functions, Moreau envelopes, and the generic nature of convex functions with strong minimizers. SIAM J. Optim. 26(2):1341–1364.CrossrefGoogle Scholar
  • [39] Polyak BT (1987) Introduction to optimization. Translations Series in Mathematics and Engineering (Optimization Software, Inc., New York).Google Scholar
  • [40] Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Stat. 22:400–407.CrossrefGoogle Scholar
  • [41] Roux NL, Schmidt M, Bach FR (2012) A stochastic gradient method with an exponential convergence rate for finite training sets. Adv. Neural Inf. Process. Systems 2663–2671.Google Scholar
  • [42] Shanbhag UV, Blanchet JH (2015) Budget-constrained stochastic approximation. Proc. 2015 Winter Simulation Conf., Huntington Beach, CA, 368–379.Google Scholar
  • [43] Shashaani S, Hashemi FS, Pasupathy R (2018) ASTRO-DF: A class of adaptive sampling trust-region algorithms for derivative-free stochastic optimization. SIAM J. Optim. 28(4):3145–3176.CrossrefGoogle Scholar
  • [44] Wang X, Ma S, Goldfarb D, Liu W (2017) Stochastic quasi-Newton methods for nonconvex stochastic optimization. SIAM J. Optim. 27(2):927–956.CrossrefGoogle Scholar
  • [45] Xiao L, Zhang T (2014) A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24(4):2057–2075.CrossrefGoogle Scholar
  • [46] Xie Y, Shanbhag UV (2016) SI-ADMM: a stochastic inexact ADMM framework for resolving structured stochastic convex programs. Proc. 2016 Winter Simulation Conf., 714–725.CrossrefGoogle Scholar
  • [47] Yousefian F, Nedić A, Shanbhag UV (2017) On smoothing, regularization, and averaging in stochastic approximation methods for stochastic variational inequality problems. Math. Program. 165(1):391–431.Google Scholar
  • [48] Yousefian F, Nedich A, Shanbhag UV (2020) On stochastic and deterministic quasi-Newton methods for nonstrongly convex optimization: Asymptotic convergence and rate analysis. SIAM J. Optim. 30(2):1144–1172.CrossrefGoogle Scholar
  • [49] Yu J, Vishwanathan S, Günter S, Schraudolph NN (2010) A quasi-Newton approach to nonsmooth convex optimization problems in machine learning. J. Mach. Learn. Res. 11(Mar):1145–1200.Google Scholar
  • [50] Yuan G, Wei Z, Wang Z (2013) Gradient trust region algorithm with limited memory BFGS update for nonsmooth convex minimization. Comput. Optim. Appl. 54(1):45–64.CrossrefGoogle Scholar
  • [51] Zhou C, Gao W, Goldfarb D (2017) Stochastic adaptive quasi-Newton methods for minimizing expected values. Internat. Conf. Machine Learn., 4150–4159.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.