Network Revenue Management with Nonparametric Demand Learning: √T-Regret and Polynomial Dimension Dependency

Published Online: https://doi.org/10.1287/moor.2022.0086

This paper studies the classic price-based network revenue management (NRM) problem with demand learning. The retailer dynamically sets the prices of n products over a finite selling season of length T, subject to m resource constraints, to maximize cumulative revenue. We focus on a nonparametric demand model under mild technical assumptions that are satisfied by most commonly used demand functions. We propose a robust ellipsoid method adapted to the NRM setting in a nontrivial manner. This is the first result in the current literature on the nonparametric NRM problem to achieve regret of the form O(poly(n, m, ln(T)) √T), where poly(n, m, ln(T)) is a polynomial function of n, m, and ln(T).

Funding: S. Miao gratefully acknowledges financial support provided by the Ruegg Family Scholar and the Leeds School of Business.

Supplemental Material: The online appendix is available at https://doi.org/10.1287/moor.2022.0086.
