Minimax Optimality in Contextual Dynamic Pricing with General Valuation Models

Published Online: https://doi.org/10.1287/opre.2025.1779

References

  • Abbasi-Yadkori Y, Pál D, Szepesvári C (2011) Improved algorithms for linear stochastic bandits. Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 24 (Curran Associates Inc., Red Hook, NY), 2312–2320.
  • Auer P (2002) Using confidence bounds for exploitation-exploration trade-offs. J. Machine Learn. Res. 3(Nov):397–422.
  • Ban GY, Keskin NB (2021) Personalized dynamic pricing with machine learning: High-dimensional features and heterogeneous elasticity. Management Sci. 67(9):5549–5568.
  • Besbes O, Zeevi A (2015) On the (surprising) sufficiency of linear models for dynamic pricing with demand learning. Management Sci. 61(4):723–739.
  • Cesa-Bianchi N, Cesari T, Perchet V (2019) Dynamic pricing with finitely many unknown valuations. Garivier A, Kale S, eds. Algorithmic Learning Theory (PMLR, New York), 247–273.
  • Chen N, Gallego G (2021) Nonparametric pricing analytics with customer covariates. Oper. Res. 69(3):974–984.
  • Chen X, Liu Q, Wang Y (2023) Active learning for contextual search with binary feedback. Management Sci. 69(4):2165–2181.
  • Chen E, Chen X, Gao L, Li J (2024) Dynamic contextual pricing with doubly non-parametric random utility models. Preprint, submitted May 11, https://arxiv.org/abs/2405.06866.
  • Choi YG, Kim GS, Choi Y, Cho W, Paik MC, Oh MH (2023) Semi-parametric contextual pricing algorithm using Cox proportional hazards model. Krause A, Brunskill E, Cho K, Engelhardt B, Sabato S, Scarlett J, eds. Proc. 40th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 202 (PMLR, New York), 5771–5786.
  • Chu W, Li L, Reyzin L, Schapire R (2011) Contextual bandits with linear payoff functions. Gordon G, Dunson D, Dudík M, eds. Proc. 14th Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 208–214.
  • Cohen MC, Lobel I, Paes Leme R (2020) Feature-based dynamic pricing. Management Sci. 66(11):4921–4943.
  • den Boer AV (2015) Dynamic pricing and learning: Historical origins, current research, and new directions. Surveys Oper. Res. Management Sci. 20(1):1–18.
  • Fan J, Guo Y, Yu M (2024) Policy optimization using semiparametric models for dynamic pricing. J. Amer. Statist. Assoc. 119(545):552–564.
  • Foster D, Rakhlin A (2020) Beyond UCB: Optimal and efficient contextual bandits with regression oracles. Daumé H III, Singh A, eds. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 3199–3210.
  • Golrezaei N, Javanmard A, Mirrokni V (2019) Dynamic incentive-aware learning: Robust pricing in contextual auctions. Wallach H, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox E, Garnett R, eds. Proc. 33rd Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY).
  • Javanmard A, Nazerzadeh H (2019) Dynamic pricing in high-dimensions. J. Machine Learn. Res. 20(9):1–49.
  • Kleinberg R (2004) Nearly tight bounds for the continuum-armed bandit problem. Saul L, Weiss Y, Bottou L, eds. Advances in Neural Information Processing Systems, vol. 17 (MIT Press, Cambridge, MA), 697–704.
  • Lattimore T, Szepesvári C (2020) Bandit Algorithms (Cambridge University Press, Cambridge, UK).
  • Lei Y, Jasin S, Sinha A (2018) Joint dynamic pricing and order fulfillment for e-commerce retailers. Manufacturing Service Oper. Management 20(2):269–284.
  • Li Y, Wang Y, Zhou Y (2019) Nearly minimax-optimal regret for linearly parameterized bandits. Beygelzimer A, Hsu D, eds. Proc. 32nd Conf. Learn. Theory, Proceedings of Machine Learning Research, vol. 99 (PMLR, New York), 2173–2174.
  • Luo Y, Sun WW, Liu Y (2022) Contextual dynamic pricing with unknown noise: Explore-then-UCB strategy and improved regrets. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 37445–37457.
  • Luo Y, Sun WW, Liu Y (2024) Distribution-free contextual dynamic pricing. Math. Oper. Res. 49(1):599–618.
  • Mendelson S, Neeman J (2010) Regularization in kernel learning. Ann. Statist. 38(1):526–565.
  • Mourtada J (2022) Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices. Ann. Statist. 50(4):2157–2178.
  • Oh MH, Iyengar G, Zeevi A (2021) Sparsity-agnostic Lasso bandit. Meila M, Zhang T, eds. Proc. 38th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 139 (PMLR, New York), 8271–8280.
  • Ren Z, Zhou Z (2024) Dynamic batch learning in high-dimensional sparse linear contextual bandits. Management Sci. 70(2):1315–1342.
  • Saharan S, Bawa S, Kumar N (2020) Dynamic pricing techniques for intelligent transportation system in smart cities: A systematic review. Comput. Comm. 150:603–625.
  • Steinwart I, Hush DR, Scovel C (2009) Optimal rates for regularized least squares regression. Proc. 22nd Conf. Learn. Theory (University of Stuttgart, Stuttgart, Germany), 18–21.
  • Takemura K, Ito S, Hatano D, Sumita H, Fukunaga T, Kakimura N, Kawarabayashi K (2021) A parameter-free algorithm for misspecified linear contextual bandits. Banerjee A, Fukumizu K, eds. Proc. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 3367–3375.
  • Tullii M, Gaucher S, Merlis N, Perchet V (2024) Improved algorithms for contextual dynamic pricing. Globerson A, Mackey L, Belgrave D, Fan A, Paquet U, Tomczak J, Zhang C, eds. Advances in Neural Information Processing Systems (Curran Associates, Inc., Red Hook, NY), 126088–126117.
  • Wang Y, Chen B (2025) Tight regret bounds in contextual pricing with semi-parametric demand learning. Preprint, submitted February 24, https://doi.org/10.2139/ssrn.5133677.
  • Wang Y, Liu Q (2025) Estimation of high-dimensional contextual pricing models with nonparametric price confounders. Oper. Res. 73(6):2867–3452.
  • Wang Y, Chen B, Simchi-Levi D (2021) Multimodal dynamic pricing. Management Sci. 67(10):6136–6152.
  • Wang Z, Deng S, Ye Y (2014) Close the gaps: A learning-while-doing algorithm for single-product revenue management problems. Oper. Res. 62(2):318–331.
  • Wang H, Talluri K, Li X (2025) Technical note—On dynamic pricing with covariates. Oper. Res. 73(4):1723–2295.
  • Xu J, Wang YX (2022) Towards agnostic feature-based dynamic pricing: Linear policies vs linear valuation with unknown noise. Camps-Valls G, Ruiz FJR, Valera I, eds. Proc. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 9643–9662.