Phase Transitions in Bandits with Switching Constraints
References
- (1988) Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost. IEEE Trans. Automatic Control 33(10):899–906.Crossref, Google Scholar
- (1990) Multi-armed bandit problems with multiple plays and switching cost. Stochastics Stochastic Rep. 29(4):437–459.Crossref, Google Scholar
- (2019) MNL-bandit: A dynamic learning approach to assortment selection. Oper. Res. 67(5):1453–1485.Link, Google Scholar
- (2021) Online learning over a finite action set with limited switching. Math. Oper. Res. 46(1):179–203.Link, Google Scholar
- (1996) Multi-armed bandits with switching penalties. IEEE Trans. Automatic Control 41(3):328–348.Crossref, Google Scholar
- (2002) Finite-time analysis of the multiarmed bandit problem. Machine Learn. 47(2–3):235–256.Crossref, Google Scholar
- (1994) Switching costs and the Gittins index. Econometrica 62(3):687–694.Crossref, Google Scholar
- (2015) Universality in polytope phase transitions and message passing algorithms. Ann. Appl. Probab. 25(2):753–822.Crossref, Google Scholar
- (2001) Stationary multi-choice bandit problems. J. Econom. Dynamic Control 25(10):1585–1594.Crossref, Google Scholar
- (2002) Optimal learning and experimentation in bandit problems. J. Econom. Dynamic Control 27(1):87–108.Crossref, Google Scholar
- (2013) Online learning with switching costs and other adaptive adversaries. Adv. Neural Inform. Processing Systems 1:1160–1168.Google Scholar
- (2019) Parametric demand learning with limited price explorations in a backlog stochastic inventory system. IISE Trans. 51(6):605–613.Crossref, Google Scholar
- (2020) Data-based dynamic pricing and inventory control with censored demand and limited price changes. Oper. Res. 68(5):1445–1456.Link, Google Scholar
- (2017) Dynamic pricing and demand learning with limited price experimentation. Oper. Res. 65(6):1722–1731.Link, Google Scholar
- (2014) Bandits with switching costs: T 2/3 regret. Proc. 46th Annual ACM Sympos. Theory Comput., 459–467.Google Scholar
- (2015) Dynamic pricing and learning: Historical origins, current research, and new directions. Surveys Oper. Res. Management Sci. 20(1):1–18.Crossref, Google Scholar
- (2000) Phase Transitions and Critical Phenomena, vol. 1 (Elsevier, Amsterdam).Google Scholar
- (2020) Multinomial logit bandit with low switching cost. Internat. Conf. Machine Learn. (PMLR), 2607–2615.Google Scholar
- (2019) Batched multi-armed bandits problem. Adv. Neural Inform. Processing Systems 33:503–513.Google Scholar
- (2009) Multi-armed bandits with metric switching costs. Internat. Colloquium Automata Languages Programming (Springer, Berlin), 496–507.Crossref, Google Scholar
- (2013) Approximation algorithms for Bayesian multi-armed bandit problems. Preprint, submitted June 14, https://arxiv.org/abs/1306.3525.Google Scholar
- (1998) Tracking the best expert. Machine Learn. 32(2):151–178.Crossref, Google Scholar
- (2020) Smooth contextual bandits: Bridging the parametric and non-differentiable regret regimes. Conf. Learn. Theory (PMLR), 2007–2010.Google Scholar
- (2021) Markdown pricing under unknown demand. Preprint, submitted June 8, https://dx.doi.org/10.2139/ssrn.3861379.Google Scholar
- (2004) A survey on the bandit problem with switching costs. Economist 152(4):513–541.Crossref, Google Scholar
- (2017) Online learning for changing environments using coin betting. Electronic J. Statist. 11(2):5282–5310.Crossref, Google Scholar
- (2015) Detest Uber’s surge pricing? Some drivers don’t like it either. Accessed July 11, 2023, https://www.cnet.com/tech/tech-industry/detest-ubers-surge-pricing-some-drivers-dont-like-it-either/.Google Scholar
- (2017) Bandits with movement costs and adaptive pricing. Conf. Learn. Theory (PMLR), 1242–1268.Google Scholar
- (1985) Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1):4–22.Crossref, Google Scholar
- (2020) Bandit Algorithms (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
- (2016) Batched bandit problems. Ann. Statist. 44(2):660–681.Crossref, Google Scholar
- (1984) Asymptotic behavior of m-estimators of p regression parameters when p2/n is large. I. Consistency. Ann. Statist. 12(4):1298–1309.Crossref, Google Scholar
- (1988) Asymptotic behavior of likelihood methods for exponential families when the number of parameters tends to infinity. Ann. Statist. 16(1):356–366.Crossref, Google Scholar
- (2017) How Uber uses psychological tricks to push its drivers’ buttons. The New York Times Online (April 2), https://www.nytimes.com/interactive/2017/04/02/technology/uber-drivers-psychological-tricks.html.Google Scholar
- (2019) Phase transitions and cyclic phenomena in bandits with switching constraints. Adv. Neural Inform. Processing Systems 32:7521–7530.Google Scholar
- (2008) Designing and Managing the Supply Chain: Concepts, Strategies and Case Studies (Tata McGraw-Hill Education, New York).Google Scholar
- (2019) Introduction to multi-armed bandits. Preprint, submitted April 15, https://arxiv.org/abs/1904.07272.Google Scholar
- (2008) Introduction to Nonparametric Estimation (Springer Science & Business Media, Berlin).Google Scholar
- (2009) Sharp thresholds for high-dimensional and noisy sparsity recovery using ℓ1-constrained quadratic programming (LASSO). IEEE Trans. Inform. Theory 55(5):2183–2202.Crossref, Google Scholar

