Online Learning and Optimization for Revenue Management Problems with Add-on Discounts

David Simchi-Levi
https://orcid.org/0000-0002-4650-1519
Institute for Data, Systems, and Society, Department of Civil & Environmental Engineering, and Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Rui Sun
https://orcid.org/0000-0001-6273-6898
Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Huanan Zhang
https://orcid.org/0000-0002-0672-5227
Leeds School of Business, University of Colorado Boulder, Boulder, Colorado 80309


Published Online: 12 Jan 2022
https://doi.org/10.1287/mnsc.2021.4222


Volume 68, Issue 10

October 2022

Pages 7065-7791, iii-iv

Article Information


Received: May 02, 2020
Accepted: August 14, 2021
Published Online: January 12, 2022

Cite as

David Simchi-Levi, Rui Sun, Huanan Zhang (2022) Online Learning and Optimization for Revenue Management Problems with Add-on Discounts. Management Science 68(10):7402-7421.

https://doi.org/10.1287/mnsc.2021.4222


Acknowledgments

The authors thank the department editor (J. George Shanthikumar), the associate editor, and the referees whose comments and guidance throughout the review process have greatly improved both the content and the exposition of the paper. The numerical experiments were done when the second author interned at the Alibaba DAMO Academy of Alibaba Group (US) Inc. under the supervision of Dr. Xinshang Wang and Prof. Wotao Yin. The authors gratefully acknowledge the support of Dr. Wang and Prof. Yin during the design of the experiments and the revision of the paper. Huanan Zhang thanks Yajie Wu for her help at the early stage of this manuscript.
