A Primal–Dual Learning Algorithm for Personalized Dynamic Pricing with an Inventory Constraint

Published Online: https://doi.org/10.1287/moor.2021.1220

References

  • [1] Agrawal S, Devanur NR (2014) Bandits with concave rewards and convex knapsacks. Proc. 15th ACM Conf. Econom. Comput. (ACM), 989–1006.
  • [2] Araman VF, Caldentey R (2009) Dynamic pricing for nonperishable products with demand learning. Oper. Res. 57(5):1169–1188.
  • [3] Auer P, Ortner R, Szepesvári C (2007) Improved rates for the stochastic continuum-armed bandit problem. Proc. 20th Annual Conf. Learn. Theory, San Diego, (Springer-Verlag, Berlin, Heidelberg), 454–468.
  • [4] Badanidiyuru A, Kleinberg R, Slivkins A (2013) Bandits with knapsacks. 2013 IEEE 54th Annual Sympos. Foundations Comput. Sci. (IEEE), 207–216.
  • [5] Badanidiyuru A, Langford J, Slivkins A (2014) Resourceful contextual bandits. Balcan M, Feldman V, Szepesvári C, eds. Proc. 27th Conf. Learn. Theory, vol. 35 (PMLR, Barcelona, Spain), 1109–1134.
  • [6] Ban GY, Keskin NB (2021) Personalized dynamic pricing with machine learning: High-dimensional features and heterogeneous elasticity. Management Sci. 67(9):5549–5568.
  • [7] Besbes O, Zeevi A (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407–1420.
  • [8] Besbes O, Zeevi A (2012) Blind network revenue management. Oper. Res. 60(6):1537–1550.
  • [9] Broder J, Rusmevichientong P (2012) Dynamic pricing under a general parametric choice model. Oper. Res. 60(4):965–980.
  • [10] Bubeck S, Cesa-Bianchi N (2012) Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems, Foundations and Trends in Machine Learning, 5(1):1–122.
  • [11] Bubeck S, Munos R, Stoltz G, Szepesvári C (2011) X-armed bandits. J. Machine Learn. Res. 12:1655–1695.
  • [12] Canonne C (2017) A short note on Poisson tail bounds. Accessed December 14, 2018, http://www.cs.columbia.edu/~ccanonne/files/misc/2017-poissonconcentration.pdf.
  • [13] Cesa-Bianchi N, Lugosi G (2006) Prediction, Learning, and Games (Cambridge University Press).
  • [14] Chen N, Gallego G (2021) Nonparametric pricing analytics with customer covariates. Oper. Res. 69(3):974–984.
  • [15] Chen Y, Shi C (2019) Network revenue management with online inverse batch gradient descent method. Working paper, Fox School of Business, Temple University.
  • [16] Chen Q, Jasin S, Duenyas I (2019) Nonparametric self-adjusting control for joint learning and optimization of multiproduct pricing with finite resource capacity. Math. Oper. Res. 44(2):601–631.
  • [17] Cheung WC, Simchi-Levi D, Wang H (2017) Dynamic pricing and demand learning with limited price experimentation. Oper. Res. 65(6):1722–1731.
  • [18] Cohen MC, Lobel I, Paes Leme R (2020) Feature-based dynamic pricing. Management Sci. 66(11):4921–4943.
  • [19] den Boer AV (2015) Dynamic pricing and learning: Historical origins, current research, and new directions. Surveys Oper. Res. Management Sci. 20(1):1–18.
  • [20] den Boer AV, Keskin NB (2020) Discontinuous demand functions: Estimation and pricing. Management Sci. 66(10):4516–4534.
  • [21] den Boer AV, Zwart B (2014) Simultaneously learning and optimizing using controlled variance pricing. Management Sci. 60(3):770–783.
  • [22] den Boer AV, Zwart B (2015) Dynamic pricing and learning with finite inventories. Oper. Res. 63(4):965–978.
  • [23] Farias VF, Van Roy B (2010) Dynamic pricing with a prior on market response. Oper. Res. 58(1):16–29.
  • [24] Ferreira KJ, Simchi-Levi D, Wang H (2018) Online network revenue management using Thompson sampling. Oper. Res. 66(6):1586–1602.
  • [25] Gallego G, Topaloglu H (2019) Revenue Management and Pricing Analytics, vol. 209 (Springer).
  • [26] Gallego G, Van Ryzin G (1997) A multiproduct dynamic pricing problem and its applications to network yield management. Oper. Res. 45(1):24–41.
  • [27] Javanmard A, Nazerzadeh H (2019) Dynamic pricing in high-dimensions. J. Machine Learn. Res. 20(1):315–363.
  • [28] Keskin NB, Birge JR (2019) Dynamic selling mechanisms for product differentiation and learning. Oper. Res. 67(4):1069–1089.
  • [29] Keskin NB, Zeevi A (2014) Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. Oper. Res. 62(5):1142–1167.
  • [30] Keskin NB, Zeevi A (2018) On incomplete learning and certainty-equivalence control. Oper. Res. 66(4):1136–1167.
  • [31] Kleinberg R (2005) Nearly tight bounds for the continuum-armed bandit problem. Saul L, Weiss Y, Bottou L, eds. Proc. 17th Internat. Conf. Neural Inform. Processing Systems, vol. 17 (MIT Press), 697–704.
  • [32] Kleinberg R, Slivkins A, Upfal E (2008) Multi-armed bandits in metric spaces. Proc. 40th Annual ACM Sympos. Theory Comput. Victoria, BC (ACM, New York), 681–690.
  • [33] Lei Y, Jasin S, Sinha A (2017) Near-optimal bisection search for nonparametric dynamic pricing with inventory constraint. Working paper, Stephen J.R. Smith School of Business, Queen’s University.
  • [34] Li W, Chen N, Hong LJ (2019) A dimension-free algorithm for contextual continuum-armed bandits. Working paper.
  • [35] Mahdavi M, Yang T, Jin R (2013) Stochastic convex optimization with multiple objectives. Adv. Neural Inform. Processing Systems 26:1115–1123.
  • [36] Qiang S, Bayati M (2016) Dynamic pricing with demand covariates. Working paper, Stanford University.
  • [37] Shalev-Shwartz S (2012) Online Learning and Online Convex Optimization, Foundations and Trends in Machine Learning, vol. 4, no. 2.
  • [38] Slivkins A (2014) Contextual bandits with similarity information. J. Machine Learn. Res. 15(1):2533–2568.
  • [40] Wang Z, Deng S, Ye Y (2014) Close the gaps: A learning-while-doing algorithm for single-product revenue management problems. Oper. Res. 62(2):318–331.