Short-Lived High-Volume Bandits