Optimal Dynamic Pricing Policies for an M/M/s Queue

Published Online: https://doi.org/10.1287/opre.22.3.545

We consider the problem of maximizing the long-run average expected reward per unit time in a queuing-reward system, which we formulate as a semi-Markov decision process. Control of the system is effected by increasing or decreasing the price charged for the facility's service in order to discourage or encourage the arrival of customers. We assume that the arrival process is Poisson with arrival rate a strictly decreasing function of the currently advertized price, and that the service times are independent exponentially distributed random variables. The reward structure consists of customer payments and holding costs (possibly nonlinear). At each transition (customer arrival or service completion), the manager of the facility must choose one of a finite number of prices to advertize until the next transition. We show that there exist optimal stationary policies and that each possesses the monotonicity property: the optimal price to advertize is a nondecreasing function of the number of customers in the system. An efficient computational algorithm is developed that, in a finite number of steps, produces a stationary policy that is optimal.
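
As a rough illustration of the setting described above, the following Python sketch solves a small instance numerically. It is not the algorithm developed in the paper: it swaps in relative value iteration on a uniformized version of the model. It also assumes a finite waiting room (`capacity`), payment collected when a customer arrives, prices listed in increasing order, and made-up numbers throughout. All function and parameter names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def solve_pricing(
    prices: Sequence[float],            # finite price set, sorted increasing (assumption)
    rates: Sequence[float],             # Poisson arrival rate under each price, strictly decreasing
    mu: float,                          # exponential service rate per server
    servers: int,                       # number of servers s
    capacity: int,                      # truncation level for the state space (assumption)
    holding_cost: Callable[[int], float],  # holding cost rate with n customers present
    tol: float = 1e-10,
    max_iter: int = 200_000,
) -> Tuple[float, List[int]]:
    """Return (average reward per unit time, price index chosen in each state)."""
    big = max(rates) + servers * mu     # uniformization rate
    v = [0.0] * (capacity + 1)          # relative values, normalized so v[0] == 0
    policy = [0] * (capacity + 1)
    for _ in range(max_iter):
        new_v = [0.0] * (capacity + 1)
        for n in range(capacity + 1):
            down = min(n, servers) * mu / big        # service completion probability
            best = None
            for k, (p, lam) in enumerate(zip(prices, rates)):
                lam_eff = lam if n < capacity else 0.0   # arrivals are blocked when full
                up = lam_eff / big
                # One-step reward: payments at rate lam_eff * p minus holding cost.
                q = (lam_eff * p - holding_cost(n)) / big
                if n < capacity:
                    q += up * v[n + 1]
                if n > 0:
                    q += down * v[n - 1]
                q += (1.0 - up - down) * v[n]        # fictitious self-transition
                if best is None or q >= best:        # ties go to the higher price
                    best, policy[n] = q, k
            new_v[n] = best
        diffs = [a - b for a, b in zip(new_v, v)]
        offset = new_v[0]
        v = [x - offset for x in new_v]
        # Stop when the one-step change is (nearly) the same constant in every state.
        if max(diffs) - min(diffs) < tol:
            gain = big * 0.5 * (max(diffs) + min(diffs))
            return gain, list(policy)
    raise RuntimeError("relative value iteration did not converge")


# Illustrative instance; every number here is made up.
example_prices = [1.0, 2.0, 3.0, 4.0]
example_rates = [3.0, 2.0, 1.0, 0.3]
gain, policy = solve_pricing(
    example_prices, example_rates, mu=1.0, servers=2, capacity=15,
    holding_cost=lambda n: 0.5 * n,
)
print("average reward per unit time:", gain)
print("price advertised in each state:", [example_prices[k] for k in policy])
```

Given the monotonicity result stated above, one would expect the printed prices to be nondecreasing in the number of customers. Comparing the entries of `policy` state by state is a simple way to check this on a given instance, keeping in mind that the truncation at `capacity` is an artifact of the sketch rather than part of the stated model.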
