Simulation-Based Approximate Policy Iteration with Generalized Logistic Functions
Abstract
We present an approximate dynamic programming method based on simulation, policy iteration, a postdecision state formulation, and a logistic value function approximation. This method was developed as part of our efforts to determine whether nonlinear value function approximations could provide cost-effective policies for advance patient scheduling problems, and as a way of identifying the main advantages and disadvantages of using simulation versus linear programming to approximately solve dynamic capacity allocation problems. We first apply the proposed method to a queueing problem and then study a more practical application based on an advance multipriority patient scheduling problem. We investigate the quality and practical implications of the resulting appointment scheduling policies using simulation, and compare their performance to that of four other policies. Patient scheduling policies obtained by the new method depend not only on the number of appointments already booked on each day but also on the overall system workload. In particular, these policies provide lower discounted cost values and shorter average wait times for higher priority patients than policies directly obtained using linear programming and an affine value function approximation in the predecision state variables.

