Utility Fairness in Contextual Dynamic Pricing with Demand Learning

Published Online: https://doi.org/10.1287/mnsc.2023.03956

This paper introduces a novel contextual bandit algorithm for personalized pricing under utility fairness constraints in scenarios with uncertain demand, achieving an optimal regret upper bound. Our approach, which incorporates dynamic pricing and demand learning, addresses the critical challenge of fairness in pricing strategies. We first study the static full-information setting, in which we formulate the optimal pricing policy as a constrained optimization problem and propose an approximation algorithm that efficiently computes an approximation of the ideal policy. We also use mathematical analysis and computational studies to characterize the structure of optimal contextual pricing policies subject to fairness constraints, deriving simplified policies that lay the foundation for more in-depth research and extensions. Further, we extend our study to dynamic pricing problems with demand learning, establishing a nonstandard regret lower bound that highlights the complexity added by fairness constraints. Our research offers a comprehensive analysis of the cost of fairness and its impact on the balance between utility and revenue maximization. This work represents a step toward integrating ethical considerations into algorithmic efficiency in data-driven dynamic pricing.

This paper was accepted by J. George Shanthikumar, big data analytics.

Funding: X. Chen acknowledges support from the National Science Foundation [Grant IIS-1845444]. D. Simchi-Levi thanks the MIT Data Science Lab for support.

Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2023.03956.
