Blind Network Revenue Management and Bandits with Knapsacks Under Limited Switches

David Simchi-Levi
[email protected]
https://orcid.org/0000-0002-4650-1519
Department of Civil and Environmental Engineering, Operations Research Center, and Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Yunzong Xu (Corresponding Author)
[email protected]
https://orcid.org/0000-0002-1682-419X
Department of Industrial and Enterprise Systems Engineering, Grainger College of Engineering, University of Illinois, Urbana-Champaign, Illinois 61801

Jinglong Zhao
[email protected]
https://orcid.org/0000-0003-0986-0085
Questrom School of Business, Boston University, Boston, Massachusetts 02215

Published Online: 14 Apr 2025
https://doi.org/10.1287/opre.2020.0753

References

Adelman D (2007) Dynamic bid prices in revenue management. Oper. Res. 55(4):647–661.
Agrawal S, Devanur NR (2014) Bandits with concave rewards and convex knapsacks. Proc. 15th ACM Conf. Econom. Comput. (Association for Computing Machinery, New York), 989–1006.
Agrawal R, Hegde M, Teneketzis D (1988) Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost. IEEE Trans. Automatic Control 33(10):899–906.
Agrawal R, Hegde M, Teneketzis D (1990) Multi-armed bandit problems with multiple plays and switching cost. Stochastics Stochastic Rep. 29(4):437–459.
Ahn HS, Ryan C, Uichanco J, Zhang M (2019) Certainty equivalent pricing under sales-dependent and inventory-dependent demand. Preprint, submitted December 16, https://dx.doi.org/10.2139/ssrn.3502478.
Altschuler J, Talwar K (2018) Online learning over a finite action set with limited switching. Conf. Learn. Theory (PMLR, New York), 1569–1573.
Arlotto A, Gurvich I (2019) Uniformly bounded regret in the multisecretary problem. Stochastic Systems 9(3):231–260.
Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time analysis of the multiarmed bandit problem. Machine Learn. 47(2–3):235–256.
Badanidiyuru A, Kleinberg R, Slivkins A (2013) Bandits with knapsacks. 2013 IEEE 54th Annual Sympos. Foundations Comput. Sci. (IEEE, Piscataway, NJ), 207–216.
Badanidiyuru A, Kleinberg R, Slivkins A (2018) Bandits with knapsacks. J. ACM 65(3):1–55.
Balseiro SR, Brown DB, Chen C (2019) Dynamic pricing of relocating resources in large networks. Abstracts 2019 SIGMETRICS/Performance Joint Internat. Conf. Measurement Model. Comput. Systems (ACM, New York), 29–30.
Besbes O, Zeevi A (2012) Blind network revenue management. Oper. Res. 60(6):1537–1550.
Bray RL, Stamatopoulos I (2022) Menu costs and the bullwhip effect: Supply chain implications of dynamic pricing. Oper. Res. 70(2):748–765.
Bumpensanti P, Wang H (2020) A re-solving heuristic with uniformly bounded loss for network revenue management. Management Sci. 66(7):2993–3009.
Cesa-Bianchi N, Dekel O, Shamir O (2013) Online learning with switching costs and other adaptive adversaries. Adv. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 1160–1168.
Chen B, Chao X (2019) Parametric demand learning with limited price explorations in a backlog stochastic inventory system. IISE Trans. 51(6):605–613.
Chen Y, Shi C (2019) Network revenue management with online inverse batch gradient descent method. Preprint, submitted February 26, https://dx.doi.org/10.2139/ssrn.3331939.
Chen B, Chao X, Wang Y (2020) Data-based dynamic pricing and inventory control with censored demand and limited price changes. Oper. Res. 68(5):1445–1456.
Chen Q, Jasin S, Duenyas I (2015) Real-time dynamic pricing with minimal and flexible price adjustment. Management Sci. 62(8):2437–2455.
Chen Q, Jasin S, Duenyas I (2019) Nonparametric self-adjusting control for joint learning and optimization of multiproduct pricing with finite resource capacity. Math. Oper. Res. 44(2):601–631.
Cheung WC, Simchi-Levi D, Wang H (2017) Dynamic pricing and demand learning with limited price experimentation. Oper. Res. 65(6):1722–1731.
Cooper WL (2002) Asymptotic behavior of an allocation policy for revenue management. Oper. Res. 50(4):720–727.
Dekel O, Ding J, Koren T, Peres Y (2014) Bandits with switching costs: T^{2/3} regret. Proc. 46th Annual ACM Sympos. Theory Comput. (ACM, New York), 459–467.
Dong K, Li Y, Zhang Q, Zhou Y (2020) Multinomial logit bandit with low switching cost. Internat. Conf. Machine Learn. (PMLR, New York), 2607–2615.
Ferreira KJ, Lee BHA, Simchi-Levi D (2016) Analytics for an online retailer: Demand forecasting and price optimization. Manufacturing Service Oper. Management 18(1):69–88.
Ferreira KJ, Simchi-Levi D, Wang H (2018) Online network revenue management using Thompson sampling. Oper. Res. 66(6):1586–1602.
Gallego G, Van Ryzin G (1994) Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Sci. 40(8):999–1020.
Gallego G, Van Ryzin G (1997) A multiproduct dynamic pricing problem and its applications to network yield management. Oper. Res. 45(1):24–41.
Gao Z, Han Y, Ren Z, Zhou Z (2019) Batched multi-armed bandits problem. Adv. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 501–511.
Guha S, Munagala K (2009) Multi-armed bandits with metric switching costs. Internat. Colloquium Automata Languages Programming (Springer, New York), 496–507.
Hajiaghayi MT, Kleinberg R, Sandholm T (2007) Automated online mechanism design and prophet inequalities. AAAI, vol. 7, 58–65.
Immorlica N, Sankararaman KA, Schapire R, Slivkins A (2019) Adversarial bandits with knapsacks. 2019 IEEE 60th Annual Sympos. Foundations Comput. Sci. (IEEE, Piscataway, NJ), 202–219.
Jasin S (2014) Reoptimization and self-adjusting price control for network revenue management. Oper. Res. 62(5):1168–1178.
Jørgensen S, Taboubi S, Zaccour G (2003) Retail promotions with negative brand image effects: Is cooperation possible? Eur. J. Oper. Res. 150(2):395–405.
Keskin NB, Zeevi A (2014) Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. Oper. Res. 62(5):1142–1167.
Lattimore T, Szepesvári C (2020) Bandit Algorithms (Cambridge University Press, Cambridge, UK).
Levy D, Dutta S, Bergen M, Venable R (1998) Price adjustment at multiproduct retailers. Managerial Decision Econom. 19(2):81–120.
Liu Q, Van Ryzin G (2008) On the choice-based linear programming model for network revenue management. Manufacturing Service Oper. Management 10(2):288–310.
Ma W, Simchi-Levi D, Zhao J (2021) Dynamic pricing (and assortment) under a static calendar. Management Sci. 67(4):2292–2313.
Ma Y, Rusmevichientong P, Sumida M, Topaloglu H (2020) An approximation algorithm for network revenue management under nonstationary arrivals. Oper. Res. 68(3):834–855.
Maglaras C, Meissner J (2006) Dynamic pricing strategies for multiproduct revenue management problems. Manufacturing Service Oper. Management 8(2):136–148.
Miao S, Wang Y (2021) Network revenue management with nonparametric demand learning: √T-regret and polynomial dimension dependency. Preprint, submitted October 25, https://dx.doi.org/10.2139/ssrn.3948140.
Netessine S (2006) Dynamic pricing of inventory/capacity with infrequent price changes. Eur. J. Oper. Res. 174(1):553–580.
Perakis G, Singhvi D (2023) Dynamic pricing with unknown nonparametric demand and limited price changes. Oper. Res. 72(6):2726–2744.
Perchet V, Rigollet P, Chassang S, Snowberg E (2016) Batched bandit problems. Ann. Statist. 44(2):660–681.
Sankararaman KA, Slivkins A (2020) Advances in bandits with knapsacks. Preprint, submitted February 1, https://arxiv.org/abs/2002.00253.
Simchi-Levi D, Xu Y (2019) Phase transitions and cyclic phenomena in bandits with switching constraints. Adv. Neural Inform. Processing Systems, 7523–7532.
Simchi-Levi D, Xu Y (2023) Phase transitions in bandits with switching constraints. Management Sci. 69(12):7182–7201.
Slivkins A (2019) Introduction to multi-armed bandits. Preprint, submitted April 15, https://arxiv.org/abs/1904.07272.
Slivkins A, Vaughan JW (2014) Online decision making in crowdsourcing markets: Theoretical challenges. ACM SIGecom Exchanges 12(2):4–23.
Stamatopoulos I, Bassamboo A, Moreno A (2020) The effects of menu costs on retail performance: Evidence from adoption of the electronic shelf label technology. Management Sci. 67(1):242–256.
Sun R, Wang X, Zhou Z (2020) Near-optimal primal-dual algorithms for quantity-based network revenue management. Preprint, submitted November 12, https://arxiv.org/abs/2011.06327.
Talluri K, Van Ryzin G (1998) An analysis of bid-price controls for network revenue management. Management Sci. 44(11-part-1):1577–1593.
Topaloglu H (2009) Using Lagrangian relaxation to compute capacity-dependent bid prices in network revenue management. Oper. Res. 57(3):637–649.
Wang Z, Deng S, Ye Y (2014) Close the gaps: A learning-while-doing algorithm for single-product revenue management problems. Oper. Res. 62(2):318–331.
Zbaracki MJ, Ritson M, Levy D, Dutta S, Bergen M (2004) Managerial and customer costs of price adjustment: Direct evidence from industrial markets. Rev. Econom. Statist. 86(2):514–533.

Volume 73, Issue 5

September-October 2025

Pages iii-vii, 2297-2866, C2-C3

Received: December 01, 2020
Accepted: December 22, 2024
Published Online: April 14, 2025

Cite as

David Simchi-Levi, Yunzong Xu, Jinglong Zhao (2025) Blind Network Revenue Management and Bandits with Knapsacks Under Limited Switches. Operations Research 73(5):2496-2514.

https://doi.org/10.1287/opre.2020.0753
