Gradient-Based Algorithms for Convex Discrete Optimization via Simulation

Haixiang Zhang
Haixiang Zhang
[email protected]
https://orcid.org/0000-0001-5386-1208
Department of Mathematics, University of California, Berkeley, Berkeley, California 94720;
Search for more papers by this author
,
Zeyu Zheng
Corresponding Author
Zeyu Zheng
[email protected]
https://orcid.org/0000-0001-5653-152X
Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720
Search for more papers by this author
,
Javad Lavaei
Javad Lavaei
[email protected]
https://orcid.org/0000-0003-4294-1338
Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720
Search for more papers by this author

Department of Mathematics, University of California, Berkeley, Berkeley, California 94720;

Corresponding Author

Zeyu Zheng

Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720

Search for more papers by this author

Javad Lavaei

[email protected]

https://orcid.org/0000-0003-4294-1338

Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720

Search for more papers by this author

Published Online:28 Apr 2022https://doi.org/10.1287/opre.2022.2295

References

Agarwal A, Foster DP, Hsu DJ, Kakade SM, Rakhlin A (2011) Stochastic convex optimization with bandit feedback. Adv. Neural Inform. Processing Systems (Curran Associates, Inc., Red Hook, NJ) 24:1035–1043.Google Scholar
Agrawal S, Juneja S, Glynn P (2020) Optimal δ-correct best-arm selection for heavy-tailed distributions. Proc. 31st Internat. Conf. Algorithmic Learn. Theory (PMLR, Cambridge, MA) 117:61–110.Google Scholar
Ajalloeian A, Stich SU (2020) On the convergence of SGD with biased gradients. Workshop Beyond First Order Methods ML Systems Internat. Conf. Machine Learn (PMLR, Cambridge, MA).Google Scholar
Altman E, Gaujal B, Hordijk A (2000) Multimodularity, convexity, and optimization properties. Math. Oper. Res. 25(2):324–347.Link, Google Scholar
Altman E, Gaujal B, Hordijk A (2003) Discrete-Event Control of Stochastic Networks: Multimodularity and Regularity (Springer, Berlin).Crossref, Google Scholar
Axelrod B, Liu YP, Sidford A (2020) Near-optimal approximate discrete and continuous submodular function minimization. Proc. 14th Annual ACM-SIAM Sympos. Discrete Algorithms (SIAM, Philadelphia), 837–853.Google Scholar
Bechhofer RE (1954) A single-sample multiple decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. 25(1):16–39.Crossref, Google Scholar
Belloni A, Liang T, Narayanan H, Rakhlin A (2015) Escaping the local minima via simulated annealing: Optimization of approximately convex functions. Conf. Learn. Theory (PMLR, Cambridge, MA) 240–265.Google Scholar
Chen J, Luss R (2018) Stochastic gradient descent with biased but consistent gradient estimators. Preprint, submitted July 31, https://arxiv.org/abs/1807.11880.Google Scholar
Chen X, Ankenman BE, Nelson BL (2013) Enhancing stochastic kriging metamodels with gradient estimators. Oper. Res. 61(2):512–528.Link, Google Scholar
Chen X, Zhou E, Hu J (2018) Discrete optimization via gradient-based adaptive stochastic search methods. IISE Trans. 50(9):789–805.Crossref, Google Scholar
Chick SE (2006) Subjective probability and Bayesian methodology. Henderson SG, Nelson BL, eds. Handbooks in Operations Research and Management Science, vol. 13 (Elsevier, New York), 225–257.Crossref, Google Scholar
Daniilidis A, Drusvyatskiy D (2020) Pathological subgradient dynamics. SIAM J. Optim. 30(2):1327–1338.Crossref, Google Scholar
Davis D, Drusvyatskiy D (2019) Stochastic model-based minimization of weakly convex functions. SIAM J. Optim. 29(1):207–239.Crossref, Google Scholar
Davis D, Drusvyatskiy D, Kakade S, Lee JD (2020) Stochastic subgradient method converges on tame functions. Foundations Comput. Math. 20(1):119–154.Crossref, Google Scholar
Devolder O, Glineur F, Nesterov Y (2014) First-order methods of smooth convex optimization with inexact oracle. Math. Programming 146(1–2):37–75.Crossref, Google Scholar
Eckman DJ, Henderson SG (2018) Guarantees on the probability of good selection. 2018 Winter Simulation Conf. (IEEE, Piscataway, NJ), 351–365.Google Scholar
Eckman DJ, Henderson SG (2020) Biased gradient estimators in simulation optimization. Bae KH, Feng B, Kim S, Lazarova-Molnar S, Zheng Z, Roeder T, Thiesing R, eds. Proc. 2020 Winter Simulation Conf. (IEEE, Piscataway, NJ).Google Scholar
Eckman DJ, Henderson SG (2021) Fixed-confidence, fixed-tolerance guarantees for ranking-and-selection procedures. ACM Trans. Model. Comput. Simulation 31(2):1–33.Crossref, Google Scholar
Eckman DJ, Plumlee M, Nelson BL (2022) Plausible screening using functional properties for simulations with large solution spaces. Oper. Res. Forthcoming.Google Scholar
Even-Dar E, Mannor S, Mansour Y (2002) PAC bounds for multi-armed bandit and markov decision processes. Internat. Conf. Comput. Learn. Theory (Springer, Berlin), 255–270.Google Scholar
Fan W, Hong LJ, Nelson BL (2016) Indifference-zone-free selection of the best. Oper. Res. 64(6):1499–1514.Link, Google Scholar
Favati P (1990) Convexity in nonlinear integer programming. Ricerca Operativa 53:3–44.Google Scholar
Freund D, Henderson SG, Shmoys DB (2017) Minimizing multimodular functions and allocating capacity in bike-sharing systems. Internat. Conf. Integer Programming Combin. Optim. (Springer), 186–198.Google Scholar
Fu MC (2002) Optimization for simulation: Theory vs. practice. INFORMS J. Comput. 14(3):192–215.Link, Google Scholar
Fu MC, Qu H (2014) Regression models augmented with direct stochastic gradient estimators. INFORMS J. Comput. 26(3):484–499.Link, Google Scholar
Fujishige S (1984) Theory of submodular programs: A Fenchel-type min-max theorem and subgradients of submodular functions. Math. Programming 29(2):142–155.Crossref, Google Scholar
Fujishige S (2005) Submodular Functions and Optimization (Elsevier, Amsterdam).Google Scholar
Futschik A, Pflug G (1995) Confidence sets for discrete stochastic optimization. Ann. Oper. Res. 56(1):95–108.Crossref, Google Scholar
Futschik A, Pflug GC (1997) Optimal allocation of simulation experiments in discrete stochastic optimization and approximative algorithms. Eur. J. Oper. Res. 101(2):245–260.Crossref, Google Scholar
Garivier A, Kaufmann E (2016) Optimal best arm identification with fixed confidence. Conf. Learn. Theory, (PMLR, Cambridge, MA) 998–1027.Google Scholar
Graur A, Pollner T, Ramaswamy V, Weinberg SM (2020) New query lower bounds for submodular function minimization. 11th Innovations Theoretical Comput. Sci. Conf. (Schloss Dagstuhl-Leibniz-Zentrum für Informatik, Saarbrücken/Wadern, Germany).Google Scholar
Gutjahr WJ, Pflug GC (1996) Simulated annealing for noisy cost functions. J. Global Optim. 8(1):1–13.Crossref, Google Scholar
Hong LJ, Nelson BL (2006) Discrete optimization via simulation using compass. Oper. Res. 54(1):115–129.Link, Google Scholar
Hong LJ, Fan W, Luo J (2021) Review on ranking and selection: A new perspective. Frontiers Engrg. Management 8(3):321–343.Crossref, Google Scholar
Hong LJ, Nelson BL, Xu J (2010) Speeding up COMPASS for high-dimensional discrete optimization via simulation. Oper. Res. Lett. 38(6):550–555.Crossref, Google Scholar
Hong LJ, Nelson BL, Xu J (2015) Discrete optimization via simulation. Fu MC, ed. Handbook of Simulation Optimization (Springer, New York), 9–44.Crossref, Google Scholar
Hu J, Fu MC, Marcus SI (2007) A model reference adaptive search method for global optimization. Oper. Res. 55(3):549–568.Link, Google Scholar
Hu J, Fu MC, Marcus SI (2008) A model reference adaptive search method for stochastic global optimization. Comm. Inform. Systems 8(3):245–276.Crossref, Google Scholar
Hu Y, Zhang S, Chen X, He N (2020) Biased stochastic first-order methods for conditional stochastic optimization and applications in meta learning. Adv. Neural Inform. Processing Systems 33:2759–2770.Google Scholar
Hunter SR, Nelson BL (2017) Parallel ranking and selection. Tolk A, Fowler J, Shao G, Yucesan E, eds. Advances in Modeling and Simulation (Springer, New York), 249–275.Crossref, Google Scholar
Ito S (2019) Submodular function minimization with noisy evaluation oracle. Adv. Neural Inform. Processing Systems 32:12103–12113.Google Scholar
Jian N (2017) Exploring and exploiting structure in large scale simulation optimization. Unpublished PhD thesis, Cornell University, Ithaca, New York.Google Scholar
Jian N, Freund D, Wiberg HM, Henderson SG (2016) Simulation optimization for a large-scale bike-sharing system. 2016 Winter Simulation Conf. (IEEE, Piscataway, NJ), 602–613.Google Scholar
Jin C, Liu LT, Ge R, Jordan MI (2018) On the local minima of the empirical risk. Proc. 32nd Internat. Conf. Neural Inform. Processing Systems, 4901–4910.Google Scholar
Kaufmann E, Kalyanakrishnan S (2013) Information complexity in bandit subset selection. Conf. Learn. Theory (PMLR, Cambridge, MA), 228–251.Google Scholar
Kaufmann E, Cappé O, Garivier A (2016) On the complexity of best-arm identification in multi-armed bandit models. J. Machine Learn. Res. 17(1):1–42.Google Scholar
Kim SH, Nelson BL (2006) Selecting the best system. Henderson SG, Nelson BL, eds. Handbooks in Operations Research and Management Science, vol. 13 (Elsevier, New York), 501–534.Crossref, Google Scholar
Kleywegt AJ, Shapiro A, Homem-de Mello T (2002) The sample average approximation method for stochastic discrete optimization. SIAM J. Optim. 12(2):479–502.Crossref, Google Scholar
L’Ecuyer P (1990) A unified view of the IPA, SF, and LR gradient estimation techniques. Management Sci. 36(11):1364–1383.Link, Google Scholar
Lee YT, Sidford A, Wong SCW (2015) A faster cutting plane method and its implications for combinatorial and convex optimization. 2015 IEEE 56th Annual Sympos. Foundations Comput. Sci. (IEEE, Piscataway, NJ), 1049–1065.Google Scholar
Lim E (2012) Stochastic approximation over multidimensional discrete sets with applications to inventory systems and admission control of queueing networks. ACM Trans. Model. Comput. Simulation 22(4):1–23.Crossref, Google Scholar
Lovász L (1983) Submodular functions and convexity. Bachem A, Grotschel M, Korte B, eds. Mathematical Programming the State of the Art (Springer, Berlin), 235–257.Crossref, Google Scholar
Luo J, Hong LJ, Nelson BL, Wu Y (2015) Fully sequential procedures for large-scale ranking-and-selection problems in parallel computing environments. Oper. Res. 63(5):1177–1194.Link, Google Scholar
Ma S, Henderson SG (2017) An efficient fully sequential selection procedure guaranteeing probably approximately correct selection. 2017 Winter Simulation Conf. (IEEE, Piscataway, NJ), 2225–2236.Google Scholar
Ma S, Henderson SG (2019) Predicting the simulation budget in ranking and selection procedures. ACM Trans. Modeling Comput. Simulation 29(3):1–25.Google Scholar
Mai V, Johansson M (2020) Convergence of a stochastic gradient method with momentum for non-smooth non-convex optimization. Internat. Conf. Machine Learn. (PMLR, Cambridge, MA), 6630–6639.Google Scholar
Mangoubi O, Vishnoi NK (2018) Convex optimization with unbounded nonconvex oracles using simulated annealing. Conference Learn. Theory (PMLR), 1086–1124.Google Scholar
Murota K (2003) Discrete Convex Analysis (Society for Industrial and Applied Mathematics, Philadelphia).Google Scholar
Nelson BL (2010) Optimization via simulation over discrete decision variables. Gray P, ed. Risk and Optimization in an Uncertain World (Informs, Cantonsville, MD), 193–207.Link, Google Scholar
Nemirovskij AS, Yudin DB (1983) Problem Complexity and Method Efficiency in Optimization (Wiley-Interscience, New York).Google Scholar
Ni EC, Ciocan DF, Henderson SG, Hunter SR (2017) Efficient ranking and selection in high performance computing environments. Oper. Res. 65(3):821–836.Link, Google Scholar
Park C, Kim SH (2015) Penalty function with memory for discrete optimization via simulation with stochastic constraints. Oper. Res. 63(5):1195–1212.Link, Google Scholar
Park C, Telci IT, Kim SH, Aral MM (2014) Designing an optimal water quality monitoring network for river systems using constrained discrete optimization via simulation. Engrg. Optim. 46(1):107–129.Crossref, Google Scholar
Qu H, Fu MC (2014) Gradient extrapolated stochastic kriging. ACM Trans. Model. Comput. Simulation 24(4):1–25.Crossref, Google Scholar
Semelhago M, Nelson BL, Song E, Wächter A (2020) Rapid discrete optimization via simulation with Gaussian Markov random fields. INFORMS J. Comput. 33(3):915–930.Link, Google Scholar
Sen S, Higle JL (2001) Stabilization of cutting plane algorithms for stochastic linear programming problems. Floudas C, Pardalos P, eds. Encyclopedia of Optimization (Springer, Boston), 2434–2440.Crossref, Google Scholar
Shaked M, Shanthikumar JG (1988) Stochastic convexity and its applications. Adv. Appl. Probab. 20(2):427–446.Crossref, Google Scholar
Singhvi D, Singhvi S, Frazier PI, Henderson SG, O’Mahony E, Shmoys DB, Woodard DB (2015) Predicting bike usage for New York City’s bike sharing system. AAAI Workshop: Comput. Sustainability.Google Scholar
Sun L, Hong LJ, Hu Z (2014) Balancing exploitation and exploration in discrete optimization via simulation through a Gaussian process-based search. Oper. Res. 62(6):1416–1438.Link, Google Scholar
Wang H, Pasupathy R, Schmeiser BW (2013) Integer-ordered simulation optimization using r-spline: Retrospective search with piecewise-linear interpolation and neighborhood enumeration. ACM Trans. Model. Comput. Simulation 23(3):1–24.Crossref, Google Scholar
Wang T, Xu J, Hu JQ, Chen CH (2021) Optimal computing budget allocation for regression with gradient information. Automatica, 134: 109927.Google Scholar
Wolff RW, Wang CL (2002) On the convexity of loss probabilities. J. Appl. Probab. 39(2):402–406.Crossref, Google Scholar
Xu Y, Lin Q, Yang T (2016) Accelerated stochastic subgradient methods under local error bound condition. Preprint, submitted July 4, https://arxiv.org/abs/1607.01027.Google Scholar
Xu J, Nelson BL, Hong JL (2010) Industrial strength COMPASS: A comprehensive algorithm and software for optimization via simulation. ACM Trans. Model. Comput. Simulation 20(1):1–29.Crossref, Google Scholar
Zhang S, He N (2018) On the convergence rate of stochastic mirror descent for nonsmooth nonconvex optimization. Preprint, submitted June 12, https://arxiv.org/abs/1806.04781.Google Scholar
Zhang J, Lin H, Jegelka S, Sra S, Jadbabaie A (2020) Complexity of finding stationary points of nonconvex nonsmooth functions. Proc. 37th Internat. Conf. Machine Learn., vol. 119 (PMLR, Cambridge, MA), 11173–11182.Google Scholar
Zhong Y, Hong LJ (2018) Fully sequential ranking and selection procedures with PAC guarantee. 2018 Winter Simulation Conf. (IEEE, Piscataway, NJ), 1898–1908.Google Scholar
Zhong Y, Hong LJ (2021) Knockout-tournament procedures for large-scale ranking and selection in parallel computing environments. Oper. Res. 70(1):432–453.Link, Google Scholar

Volume 71, Issue 5

September-October 2023

Pages iii-vii, 1441-1924, C2-C3

Article Information

Supplemental Material

Metrics

Information

Received:October 30, 2020
Accepted:February 11, 2022
Published Online:April 28, 2022

Cite as

Haixiang Zhang, Zeyu Zheng, Javad Lavaei (2022) Gradient-Based Algorithms for Convex Discrete Optimization via Simulation. Operations Research 71(5):1815-1834.

https://doi.org/10.1287/opre.2022.2295

Keywords

Acknowledgments

The authors are grateful to the anonymous reviewers, the associate editor, Jeff Hong, and Shane Henderson for very helpful comments and suggestions.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Gradient-Based Algorithms for Convex Discrete Optimization via Simulation

References

Volume 71, Issue 5

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News