Online Learning and Optimization for Revenue Management Problems with Add-on Discounts

Published Online: https://doi.org/10.1287/mnsc.2021.4222

References

  • Abbasi-Yadkori Y, Pál D, Szepesvári C (2011) Improved algorithms for linear stochastic bandits. Adv. Neural Inform. Processing Systems 24:2312–2320.
  • Abdallah T (2019) On the benefit (or cost) of large-scale bundling. Production Oper. Management 28(4):955–969.
  • Abdallah T, Asadpour A, Reed J (2021) Large-scale bundle size pricing: A theoretical analysis. Oper. Res. 69(4):1158–1185.
  • Agrawal S, Devanur N (2016) Linear contextual bandits with knapsacks. Adv. Neural Inform. Processing Systems 30:3450–3458.
  • Agrawal S, Avadhanula V, Goyal V, Zeevi A (2016) A near-optimal exploration-exploitation approach for assortment selection. Proc. 2016 ACM Conf. Econom. Comput. (ACM, New York), 599–600.
  • Agrawal S, Avadhanula V, Goyal V, Zeevi A (2017) Thompson sampling for the MNL-bandit. Proc. 2017 Conf. Learn. Theory (PMLR, Amsterdam, Netherlands), 65:76–78.
  • Agrawal S, Avadhanula V, Goyal V, Zeevi A (2019) MNL-bandit: A dynamic learning approach to assortment selection. Oper. Res. 67(5):1453–1485.
  • Auer P, Cesa-Bianchi N, Fischer P (2002a) Finite-time analysis of the multiarmed bandit problem. Machine Learn. 47(2–3):235–256.
  • Auer P, Cesa-Bianchi N, Freund Y, Schapire RE (2002b) The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1):48–77.
  • Badanidiyuru A, Kleinberg R, Slivkins A (2013) Bandits with knapsacks. 2013 IEEE 54th Annual Sympos. Foundations Comput. Sci. (IEEE), 207–216.
  • Bakos Y, Brynjolfsson E (1999) Bundling information goods: Pricing, profits, and efficiency. Management Sci. 45(12):1613–1630.
  • Bernstein F, Modaresi S, Sauré D (2018) A dynamic clustering approach to data-driven assortment personalization. Management Sci. 65(5):2095–2115.
  • Besbes O, Zeevi A (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407–1420.
  • Besbes O, Zeevi A (2012) Blind network revenue management. Oper. Res. 60(6):1537–1550.
  • Besbes O, Zeevi A (2015) On the surprising sufficiency of linear models for dynamic pricing with demand learning. Management Sci. 61(4):723–739.
  • Bubeck S, Cesa-Bianchi N (2012) Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations Trends Machine Learn. 5(1):1–122.
  • Cesa-Bianchi N, Lugosi G (2012) Combinatorial bandits. J. Comput. System Sci. 78(5):1404–1422.
  • Chen B, Chao X, Ahn H-S (2019a) Coordinating pricing and inventory replenishment with nonparametric demand learning. Oper. Res. 67(4):1035–1052.
  • Chen X, Ma W, Simchi-Levi D, Xin L (2019b) Assortment planning for recommendations at checkout under inventory constraints. Preprint, submitted October 1, https://dx.doi.org/10.2139/ssrn.2853093.
  • Cheung WC, Simchi-Levi D (2016) Efficiency and performance guarantees for choice-based network revenue management problems with flexible products. Preprint, submitted August 15, https://dx.doi.org/10.2139/ssrn.2823339.
  • Cheung WC, Simchi-Levi D (2017a) Assortment optimization under unknown multinomial logit choice models. Preprint, submitted April 1, https://arxiv.org/abs/1704.00108.
  • Cheung WC, Simchi-Levi D (2017b) Thompson sampling for online personalized assortment optimization problems with multinomial logit choice models. Preprint, submitted November 27, https://dx.doi.org/10.2139/ssrn.3075658.
  • Cheung WC, Ma W, Simchi-Levi D, Wang X (2018) Inventory balancing with online learning. Preprint, submitted October 11, https://arxiv.org/abs/1810.05640.
  • Chu CS, Leslie P, Sorensen A (2011a) Bundle-size pricing as an approximation to mixed bundling. Amer. Econom. Rev. 101(1):263–303.
  • Chu W, Li L, Reyzin L, Schapire R (2011b) Contextual bandits with linear payoff functions. Proc. 14th Internat. Conf. Artificial Intelligence Statist. (PMLR, Ft. Lauderdale, FL), 208–214.
  • Davis J, Gallego G, Topaloglu H (2013) Assortment planning under the multinomial logit model with totally unimodular constraint structures. Working paper, University of Illinois at Urbana Champaign, Champaign, IL.
  • Feldman JB, Topaloglu H (2017) Revenue management under the Markov chain choice model. Oper. Res. 65(5):1322–1342.
  • Feng Q, Shanthikumar JG, Xue M (2022) Consumer choice models and estimation: A review and extension. Production Oper. Management. Forthcoming.
  • Ferreira KJ, Simchi-Levi D, Wang H (2018) Online network revenue management using Thompson sampling. Oper. Res. 66(6):1586–1602.
  • Gallego G, Iyengar G, Phillips R, Dubey A (2004) Managing flexible products on a network. Working paper, HKUST, Hong Kong, China.
  • Gao X, Jasin S, Najafi S, Zhang H (2022) Joint learning and optimization for multi-product pricing (and ranking) under a general cascade click model. Management Sci. Forthcoming.
  • Golrezaei N, Nazerzadeh H, Rusmevichientong P (2014) Real-time optimization of personalized assortments. Management Sci. 60(6):1532–1551.
  • Hitt LM, Chen P (2005) Bundling with customer self-selection: A simple approach to bundling low-marginal-cost goods. Management Sci. 51(10):1481–1493.
  • Jin R, Simchi-Levi D, Wang L, Wang X, Yang S (2019) Shrinking the upper confidence bound: A dynamic product selection problem for urban warehouses. Preprint, submitted March 19, https://arxiv.org/abs/1903.07844.
  • Kallus N, Udell M (2016) Dynamic assortment personalization in high dimensions. Preprint, submitted October 18, https://arxiv.org/abs/1610.05604.
  • Kök AG, Fisher ML, Vaidyanathan R (2008) Assortment planning: Review of literature and industry practice. Agrawal N, Smith S, eds. Retail Supply Chain Management (Springer, Boston), 99–153.
  • Liu Q, Van Ryzin G (2008) On the choice-based linear programming model for network revenue management. Manufacturing Service Oper. Management 10(2):288–310.
  • Ma W, Simchi-Levi D (2021) Reaping the benefits of bundling under high production costs. Proc. 24th Internat. Conf. Artificial Intelligence Statist. (PMLR, online), 1342–1350.
  • Miao S, Chao X (2019) Fast algorithms for online personalized assortment optimization in a big data regime. Preprint, submitted August 8, https://dx.doi.org/10.2139/ssrn.3432574.
  • Miao S, Chao X (2022) Dynamic joint assortment and pricing optimization with demand learning. Manufacturing Service Oper. Management 23(2):525–545.
  • Miao S, Chen X, Chao X, Liu J, Zhang Y (2019) Context-based dynamic pricing with online clustering. Preprint, submitted February 17, https://arxiv.org/abs/1902.06199.
  • Rusmevichientong P, Tsitsiklis JN (2010) Linearly parameterized bandits. Math. Oper. Res. 35(2):395–411.
  • Rusmevichientong P, Shen ZJM, Shmoys DB (2010) Dynamic assortment optimization with a multinomial logit choice model and capacity constraint. Oper. Res. 58(6):1666–1680.
  • Russo D, Van Roy B (2014) Learning to optimize via posterior sampling. Math. Oper. Res. 39(4):1221–1243.
  • Slivkins A (2019) Introduction to multi-armed bandits. Preprint, submitted April 15, https://arxiv.org/abs/1904.07272.
  • Talluri K, Van Ryzin G (2004) Revenue management under a general discrete choice model of consumer behavior. Management Sci. 50(1):15–33.
  • VentureBeat (2019) NPD: U.S. game sales hit a record $43.4 billion in 2018. Accessed December 30, 2019, https://venturebeat.com/2019/01/22/npd-u-s-game-sales-hit-a-record-43-4-billion-in-2018/.
  • Wang Z, Deng S, Ye Y (2014) Close the gaps: A learning-while-doing algorithm for single-product revenue management problems. Oper. Res. 62(2):318–331.
  • Wu SY, Hitt LM, Chen PY, Anandalingam GA (2008) Customized bundle pricing for information goods: A nonlinear mixed-integer programming approach. Management Sci. 54(3):608–622.
  • Yuan H, Luo Q, Shi C (2021) Marrying stochastic gradient descent with bandits: Learning algorithms for inventory systems with fixed costs. Management Sci. 67(10):6089–6115.
  • Zhang D, Adelman D (2009) An approximate dynamic programming approach to network revenue management with customer choice. Transportation Sci. 43(3):381–394.
  • Zhang H, Chao X, Shi C (2018) Perishable inventory problems: Convexity results for base-stock policies and learning algorithms under censored demand. Oper. Res. 66(5):1276–1286.
  • Zhang H, Chao X, Shi C (2019) Closing the gap: A learning algorithm for lost-sales inventory systems with lead times. Management Sci. 66(5):1962–1980.