Mean-Variance Tradeoffs in an Undiscounted MDP: The Unichain Case

Published Online: https://doi.org/10.1287/opre.42.1.184

This paper analyzes the computation of Pareto optima, in the sense of high mean and low variance of the stationary distribution, in the unichain, undiscounted Markov decision process (MDP).
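
To make the two criteria concrete, the following minimal sketch computes the mean and variance of the one-step reward under the stationary distribution of a fixed policy. It is an illustration only, not the paper's algorithm. The two-state transition matrices `P_a` and `P_b`, the state-dependent reward vector `r`, and the helper names are all hypothetical assumptions introduced here.

```python
def stationary_distribution(P, iters=10_000, tol=1e-13):
    """Approximate the stationary distribution of a unichain matrix P.

    P is a row-stochastic matrix given as a list of lists (assumption).
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        # One step of the "lazy" chain (I + P) / 2. It has the same
        # stationary distribution as P but is aperiodic, so the
        # iteration converges even if P itself is periodic.
        step = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        new = [0.5 * (a + b) for a, b in zip(pi, step)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def mean_and_variance(P, r):
    """Mean and variance of the reward r under the stationary distribution of P."""
    pi = stationary_distribution(P)
    mean = sum(p * x for p, x in zip(pi, r))
    var = sum(p * (x - mean) ** 2 for p, x in zip(pi, r))
    return mean, var


# Hypothetical example: two stationary policies on the same two states.
r = [1.0, 3.0]                    # reward earned in each state (assumed)
P_a = [[0.9, 0.1], [0.5, 0.5]]    # policy a: spends most time in state 0
P_b = [[0.5, 0.5], [0.5, 0.5]]    # policy b: splits time evenly

mean_a, var_a = mean_and_variance(P_a, r)
mean_b, var_b = mean_and_variance(P_b, r)
print(f"policy a: mean={mean_a:.4f}, variance={var_a:.4f}")
print(f"policy b: mean={mean_b:.4f}, variance={var_b:.4f}")
```

In this toy instance policy b has the higher mean and policy a the lower variance, so neither dominates the other; both would be Pareto optimal if they were the only policies available.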
