The (Surprising) Sample Optimality of Greedy Procedures for Large-Scale Ranking and Selection
References
- (2010) Best arm identification in multi-armed bandits. Kalai AT, Mohri M, eds. 23rd Conf. Learn. Theory (COLT) (Omnipress, Madison, WI), 41–53.Google Scholar
- (2021) Mostly exploration-free algorithms for contextual bandits. Management Sci. 67(3):1329–1349.Link, Google Scholar
- (2020) Unreasonable effectiveness of greedy algorithms in multi-armed bandit with many arms. Larochelle H, Ranzato M, Hadsell R, Balcan M-F, Lin H-T, eds. Adv. Neural Inform. Processing Systems (NeurIPS 2020), vol. 33 (Curran Associates, Inc., Red Hook, NY), 1713–1723.Google Scholar
- (1954) A single-sample multiple decision procedure for ranking means of normal populations with known variances. Ann. Math. Statist. 25(1):16–39.Crossref, Google Scholar
- (2007) Selecting a selection procedure. Management Sci. 53(12):1916–1932.Link, Google Scholar
- (2023) Balancing optimal large deviations in sequential selection. Management Sci. 69(6):3457–3473.Link, Google Scholar
- (2015) Ranking and selection: Efficient simulation budget allocation. Fu MC, ed. Handbook of Simulation Optimization (Springer, New York), 45–80.Crossref, Google Scholar
- (2000) Simulation budget allocation for further enhancing the efficiency of ordinal optimization. Discrete Event Dyn. Syst. 10(3):251–270.Crossref, Google Scholar
- (2013) Limit theorems for simulation-based optimization via random search. ACM Trans. Model. Comput. Simul. 23(3):1–18.Crossref, Google Scholar
- (2012) Sequential sampling with economics of selection procedures. Management Sci. 58(3):550–569.Link, Google Scholar
- (2001a) New procedures to select the best simulated system using common random numbers. Management Sci. 47(8):1133–1149.Link, Google Scholar
- (2001b) New two-stage and sequential procedures for selecting the best simulated system. Oper. Res. 49(5):732–743.Link, Google Scholar
- (2022) Posterior-based stopping rules for Bayesian ranking-and-selection procedures. INFORMS J. Comput. 34(3):1711–1728.Link, Google Scholar
- (2006) Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. J. Mach. Learn. Res. 7(39):1079–1105.Google Scholar
- (2016) Indifference-zone-free selection of the best. Oper. Res. 64(6):1499–1514.Link, Google Scholar
- (2014) A fully sequential elimination procedure for indifference-zone ranking and selection with tight bounds on probability of correct selection. Oper. Res. 62(4):926–942.Link, Google Scholar
- (2008) A knowledge-gradient policy for sequential information collection. SIAM J. Control Optim. 47(5):2410–2439.Crossref, Google Scholar
- (2017) A new budget allocation framework for the expected opportunity cost. Oper. Res. 65(3):787–803.Link, Google Scholar
- (2004) A large deviations perspective on ordinal optimization. Ingalls RG, Rossetti MD, Smith JS, Peters BA, eds. Proc. 36th Winter Simulation Conf. (WSC), vol. 1 (IEEE, Piscataway, NJ), 577–585.Google Scholar
- (1991) An improvement on Paulson’s procedure for selecting the poprlation with the largest mean from k normal populations with a common unknown variance. Sequential Anal. 10(1–2):1–16.Crossref, Google Scholar
- (2020) An optimal elimination algorithm for learning a best arm. Larochelle H, Ranzato M, Hadsell R, Balcan M-F, Lin H-T, eds. Adv. Neural Inform. Processing Systems (NeurIPS 2020), vol. 33 (Curran Associates, Inc., Red Hook, NY), 10788–10798.Google Scholar
- (2006) Fully sequential indifference-zone selection procedures with variance-dependent sampling. Naval Res. Logist. 53(5):464–476.Crossref, Google Scholar
- (2021) Review on ranking and selection: A new perspective. Front. Engrg. Management 8(3):321–343.Crossref, Google Scholar
- (2022) Solving large-scale fixed-budget ranking and selection problems. INFORMS J. Comput. 34(6):2930–2949.Link, Google Scholar
- (2017) Parallel ranking and selection. Tolk A, Fowler J, Shao G, Yücesan E, eds. Advances in Modeling and Simulation (Springer, Cham, Switzerland), 249–275.Crossref, Google Scholar
- (2013) On finding the largest mean among many. Preprint, submitted June 17, https://arxiv.org/abs/1306.3917.Google Scholar
- (2021) Be greedy in multi-armed bandits. Preprint, submitted January 4, https://arxiv.org/abs/2101.01086.Google Scholar
- (2018) On parallel policies for ranking and selection problems. J. Appl. Statist. 45(9):1690–1713.Crossref, Google Scholar
- (2018) A smoothed analysis of the greedy algorithm for the linear contextual bandit problem. Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Adv. Neural Inform. Processing Systems (NeurIPS 2018), vol. 31 (Curran Associates, Inc., Red Hook, NY), 2231–2241.Google Scholar
- (2013) Almost optimal exploration in multi-armed bandits. Dasgupta S, McAllester D, eds. Proc. 30th Internat. Conf. Machine Learn., vol. 28 (PMLR, New York), 1238–1246.Google Scholar
- (2001) A fully sequential procedure for indifference-zone selection in simulation. ACM Trans. Model. Comput. Simul. 11(3):251–273.Crossref, Google Scholar
- (2006) Selecting the best system. Henderson SG, Nelson BL, eds. Simulation. Handbooks in Operations Research and Management Science, vol. 13 (Elsevier, Amsterdam), 501–534.Crossref, Google Scholar
- (2015) Fully sequential procedures for large-scale ranking-and-selection problems in parallel computing environments. Oper. Res. 63(5):1177–1194.Link, Google Scholar
- (1995) Using common random numbers for indifference-zone selection and multiple comparisons in simulation. Management Sci. 41(12):1935–1945.Link, Google Scholar
- (2017) Efficient ranking and selection in parallel computing environments. Oper. Res. 65(3):821–836.Link, Google Scholar
- (1964) A sequential procedure for selecting the population with the largest mean from k normal populations. Ann. Math. Statist. 35(1):174–180.Crossref, Google Scholar
- (2022) Parallel adaptive survivor selection. Oper. Res. 72(1):336–354.Link, Google Scholar
- (2018) Gradient-based myopic allocation policy: An efficient sampling procedure in a low-confidence scenario. IEEE Trans. Automat. Control 63(9):3091–3097.Crossref, Google Scholar
- (2006) A sequential procedure for neighborhood selection-of-the-best in optimization via simulation. Eur. J. Oper. Res. 173(1):283–298.Crossref, Google Scholar
- (1978) On two-stage selection procedures and related probability-inequalities. Comm. Statist. Theory Methods 7(8):799–811.Crossref, Google Scholar
- (2019) Gaussian Markov random fields for discrete optimization via simulation: Framework and algorithms. Oper. Res. 67(1):250–266.Link, Google Scholar
- (2021) Rapid discrete optimization via simulation with Gaussian Markov random fields. INFORMS J. Comput. 33(3):915–930.Link, Google Scholar
- (1985) Sequential Analysis: Tests and Confidence Intervals (Springer Science & Business Media, New York).Crossref, Google Scholar
- (2023) Bonferroni-free and indifference-zone-flexible sequential elimination procedures for ranking and selection. Oper. Res., ePub ahead of print April 11, https://doi.org/10.1287/opre.2023.2447.Link, Google Scholar
- (2018a) Analyzing and provably improving fixed budget ranking and selection algorithms. Preprint, submitted November 26, https://arxiv.org/abs/1811.12183.Google Scholar
- (2018b) Provably improving the optimal computing budget allocation algorithm. Proc. 2018 Winter Simulation Conf. (IEEE, Piscataway, NJ), 1921–1932.Google Scholar
- (2020) Algorithm for calculating the initial sample size in a fully sequential ranking and selection procedure. Asia Pac. J. Oper. Res. 37(03):2050015.Crossref, Google Scholar
- (2000) Global stochastic optimization with low-dispersion point sets. Oper. Res. 48(6):939–950.Link, Google Scholar
- (2023) Revisiting simple regret: Fast rates for returning a good arm. Andreas K, Emma B, Kyunghyun C Barbara E, Sivan S, Jonathan S, eds. Proc. 40th Internat. Conf. Machine Learn., vol. 202 (PMLR, New York), 42110–42158.Google Scholar
- (2022) Knockout-tournament procedures for large-scale ranking and selection in parallel computing environments. Oper. Res. 70(1):432–453.Link, Google Scholar
- (2022) Speeding up Paulson’s procedure for large-scale problems using parallel computing. INFORMS J. Comput. 34(1):586–606.Link, Google Scholar

