Balancing Optimal Large Deviations in Sequential Selection

Published Online:https://doi.org/10.1287/mnsc.2022.4527

References

  • Agrawal S, Goyal N (2012) Analysis of Thompson sampling for the multi-armed bandit problem. Mannor S, Srebro N, Williamson RC, eds. Proc. 2012 Conf. Learn. Theory (PMLR, New York), 39:1–39:26.Google Scholar
  • Andradóttir S, Kim SH (2010) Fully sequential procedures for comparing constrained systems via simulation. Naval Res. Logist. 57(5):403–421.CrossrefGoogle Scholar
  • Audibert JY, Bubeck S (2010) Best arm identification in multi-armed bandits. Kalai AT, Mohri M, eds. Proc. 2010 Conf. Learn. Theory (Omnipress, Madison, WI), 41–53.Google Scholar
  • Boesel J, Nelson BL, Ishii N (2003) A framework for simulation-optimization software. IIE Trans. 35(3):221–229.CrossrefGoogle Scholar
  • Bubeck S, Munos R, Stoltz G (2009) Pure exploration in multi-armed bandits problems. Gavaldà R, Lugosi G, Zeugmann T, Zilles S, eds. Proc. 2009 Internat. Conf. Algorithmic Learn. Theory (Springer, Berlin), 23–37.Google Scholar
  • Chen CH, Lee LH (2010) Stochastic Simulation Optimization: An Optimal Computing Budget Allocation (World Scientific, Singapore).CrossrefGoogle Scholar
  • Chen Y, Ryzhov IO (2017) Rate-optimality of the complete expected improvement criterion. Chan WKV, D’Ambrogio A, Zacharewicz G, Mustafee N, Wainer G, Page E, eds. Proc. 2017 Winter Simulation Conf. (IEEE, Piscataway, NJ), 2173–2182.Google Scholar
  • Chen Y, Ryzhov IO (2019a) Balancing optimal large deviations in ranking and selection. Mustafee N, Bae KH, Lazarova-Molnar S, Rabe M, Szabo C, Haas P, Son YJ, eds. Proc. 2019 Winter Simulation Conf. (IEEE, Piscataway NJ), 3368–3379.Google Scholar
  • Chen Y, Ryzhov IO (2019b) Complete expected improvement converges to an optimal budget allocation. Adv. Appl. Probab. 51(1):209–235.CrossrefGoogle Scholar
  • Chen CH, Chick SE, Lee LH, Pujowidianto NA (2015) Ranking and selection: Efficient simulation budget allocation. Fu MC, ed. Handbook of Simulation Optimization (Springer, New York), 45–80.CrossrefGoogle Scholar
  • Chick SE, Branke J, Schmidt C (2010) Sequential sampling to myopically maximize the expected value of information. INFORMS J. Comput. 22(1):71–80.LinkGoogle Scholar
  • Dembo A, Zeitouni O (2009) Large Deviations Techniques and Applications, 2nd ed. (Springer-Verlag, New York).Google Scholar
  • Fan W, Hong LJ, Nelson BL (2016) Indifference-zone-free selection of the best. Oper. Res. 64(6):1499–1514.LinkGoogle Scholar
  • Gao F, Gao S (2016) Optimal computing budget allocation with exponential underlying distribution. Roeder TMK, Frazier PI, Szechtman R, Zhou E, Huschka T, Chick SE, eds. Proc. 2016 Winter Simulation Conf. (IEEE, Piscataway, NJ), 682–689.Google Scholar
  • Gao S, Chen W, Shi L (2017) A new budget allocation framework for the expected opportunity cost. Oper. Res. 65(3):787–803.LinkGoogle Scholar
  • Garivier A, Kaufmann E (2016) Optimal best arm identification with fixed confidence. Feldman V, Rakhlin A, Shamir O, eds. Proc. 2016 Conf. Learn. Theory, vol. 49. (PMLR, New York), 998–1027.Google Scholar
  • Glynn PW, Juneja S (2004) A large deviations perspective on ordinal optimization. Ingalls RG, Rossetti MD, Smith JS, Peters BA, eds. Proc. 2004 Winter Simulation Conf. (IEEE, Piscataway, NJ), 577–585.Google Scholar
  • Glynn PW, Juneja S (2011) Ordinal optimization: A nonparametric framework. Jain S, Creasey RR, Himmelspach J, White KP, Fu MC, eds. Proc. 2011 Winter Simulation Conf. (IEEE, Piscataway, NJ), 4062–4069.Google Scholar
  • Glynn PW, Juneja S (2018) Selecting the best system and multi-armed bandits. Preprint, submitted September 10, https://doi.org/10.48550/arXiv.1507.04564.Google Scholar
  • Hunter SR, Feldman G (2015) Optimal sampling laws for bi-objective simulation optimization on finite sets. Yilmaz L, Chan WKV, Moon I, Roeder TMK, Macal C, Rossetti MD, eds. Proc. 2015 Winter Simulation Conf. (IEEE, Piscataway, NJ), 3749–3757.Google Scholar
  • Hunter SR, McClosky B (2016) Maximizing quantitative traits in the mating design problem via simulation-based Pareto estimation. IIE Trans. 48(6):565–578.CrossrefGoogle Scholar
  • Jones DR, Schonlau M, Welch WJ (1998) Efficient global optimization of expensive black-box functions. J. Global Optim. 13(4):455–492.CrossrefGoogle Scholar
  • Kim SH, Nelson BL (2001) A fully sequential procedure for indifference-zone selection in simulation. ACM Trans. Model. Comput. Simulation 11(3):251–273.Google Scholar
  • Kim SH, Nelson BL (2006) On the asymptotic validity of fully sequential selection procedures for steady-state simulation. Oper. Res. 54(3):475–488.LinkGoogle Scholar
  • Ma S, Henderson SG (2017) An efficient fully sequential selection procedure guaranteeing probably approximately correct selection. Chan WKV, D’Ambrogio A, Zacharewicz G, Mustafee N, Wainer G, Page E, eds. Proc. 2017 Winter Simulation Conf. (IEEE, Piscataway, NJ), 2225–2236.Google Scholar
  • Nelson B, Matejcik F (1995) Using common random numbers for indifference-zone selection and multiple comparisons in simulation. Management Sci. 41(12):1935–1945.LinkGoogle Scholar
  • Pasupathy R, Hunter SR, Pujowidianto NA, Lee LH, Chen CH (2014) Stochastically constrained ranking and selection via SCORE. ACM Trans. Model. Comput. Simulation 25(1):1–26.Google Scholar
  • Pronzato L, Müller WG (2012) Design of computer experiments: Space filling and beyond. Statist. Comput. 22(3):681–701.CrossrefGoogle Scholar
  • Qian PZG, Wu H, Wu CFJ (2008) Gaussian process models for computer experiments with qualitative and quantitative factors. Technometrics 50(3):383–396.CrossrefGoogle Scholar
  • Qin C, Klabjan D, Russo D (2017) Improving the expected improvement algorithm. Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Adv. Neural Inform. Processing Systems, vol. 30 (Curran Associates, Red Hook, NY), 5381–5391.Google Scholar
  • Russo D (2020) Simple Bayesian algorithms for best arm identification. Oper. Res. 68(6):1625–1647.LinkGoogle Scholar
  • Russo D, Van Roy B (2014) Learning to optimize via posterior sampling. Math. Oper. Res. 39(4):1221–1243.LinkGoogle Scholar
  • Ryzhov IO (2016) On the convergence rates of expected improvement methods. Oper. Res. 64(6):1515–1528.LinkGoogle Scholar
  • Ryzhov IO (2018) The local time method for targeting and selection. Oper. Res. 66(5):1406–1422.LinkGoogle Scholar
  • Ryzhov IO, Powell WB (2011) The value of information in multi-armed bandits with exponentially distributed rewards. Sato M, Matsuoka S, Sloot PM, van Albada GD, Dongarra J, eds. Proc. 2011 Internat. Conf. Computational Sci. (Elsevier, Amsterdam), 1363–1372.Google Scholar
  • Salemi P, Nelson BL, Staum J (2014) Discrete optimization via simulation using Gaussian Markov random fields. Tolk A, Diallo SY, Ryzhov IO, Yilmaz L, Buckley S, Miller JA, eds. Proc. 2014 Winter Simulation Conf. (IEEE, Piscataway, NJ), 3809–3820.Google Scholar
  • Shin D, Broadie M, Zeevi A (2018) Tractable sampling strategies for ordinal optimization. Oper. Res. 66(6):1693–1712.LinkGoogle Scholar
  • Wu D, Zhou E (2018) Analyzing and provably improving fixed budget ranking and selection algorithms. Preprint, submitted November 26, https://doi.org/10.48550/arXiv.1811.12183.Google Scholar
  • Xu J, Nelson BL, Hong LJ (2010) Industrial strength COMPASS: A comprehensive algorithm and software for optimization via simulation. ACM Trans. Model. Comput. Simulation 20(1):3:1–3:29.Google Scholar
  • Zhang Q, Qian PZG (2013) Designs for crossvalidating approximation models. Biometrika 100(4):997–1004.CrossrefGoogle Scholar
  • Zhang S, Lee LH, Chew EP, Xu J, Chen CH (2016) A simulation budget allocation procedure for enhancing the efficiency of optimal subset selection. IEEE Trans. Automatic Control 61(1):62–75.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.