Minimax Optimality in Contextual Dynamic Pricing with General Valuation Models

Xueping Gong
https://orcid.org/0000-0001-6747-9886
School of Management, Xiamen University, Xiamen, Fujian 361005, China
Wei You
Corresponding Author
https://orcid.org/0000-0003-0844-4194
Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology, Hong Kong S.A.R., China
Jiheng Zhang
https://orcid.org/0000-0003-3025-1495
Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology, Hong Kong S.A.R., China

Published Online: 12 Dec 2025
https://doi.org/10.1287/opre.2025.1779

Volume 74, Issue 2

March-April 2026

Pages v-ix, 573-1152, iii-iv

Article Information

Received: March 07, 2025
Accepted: October 31, 2025
Published Online: December 12, 2025

Cite as

Xueping Gong, Wei You, Jiheng Zhang (2025) Minimax Optimality in Contextual Dynamic Pricing with General Valuation Models. Operations Research 74(2):879-897.

https://doi.org/10.1287/opre.2025.1779

Acknowledgments

The authors thank the area editor, associate editor, and anonymous reviewers for valuable feedback that improved the paper.
