On Dynamic Programming with Unbounded Rewards

Steven A. Lippman
University of California, Los Angeles

Published Online: 1 Jul 1975 | https://doi.org/10.1287/mnsc.21.11.1225

Abstract

Using the technique employed by the author in an earlier paper, the existence of an optimal stationary policy that can be obtained from the usual functional equation is again established in the presence of a bound (not necessarily polynomial) on the one-period reward of a semi-Markov decision process. This is done for both the discounted and the average cost case. In addition to allowing an uncountable state space, the law of motion of the system is rather general in that we permit any state to be reached in a single transition. There is, however, a bound on a weighted moment of the next state reached. Finally, we indicate the applicability of these results.

Cited by
- Finite-Time Optimal Policy Identification for the Stochastic Shortest Path Problem
  IEEE Control Systems Letters, Vol. 10
- Unbounded Markov dynamic programming with weighted supremum norm Perov contractions
  26 April 2024 | Economic Theory Bulletin, Vol. 12, No. 2
- Constrained Markov Decision Process for the Industry
  19 April 2024
- Dynamic Load Balancing Control of Multiclass Traffic in Multi-Link IEEE 802.11be Networks
- Equilibrium Defaultable Corporate Debt and Investment
  SSRN Electronic Journal, Vol. 73
- A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits
  José Niño-Mora
  24 October 2019 | Mathematics of Operations Research, Vol. 45, No. 2
- Condition-based maintenance policies under imperfect maintenance at scheduled and unscheduled opportunities
  24 August 2019 | Queueing Systems, Vol. 93, No. 3-4
- Optimized Security as a Service Platforms via Stochastic Modeling and Dynamic Programming
  10 March 2018
- Dynamic Programming and Markov Decision Processes
  15 February 2018
- Optimal denial-of-service attack on feedback channel against acknowledgment-based sensor power schedule for remote estimation
- Simplex Algorithm for Countable-State Discounted Markov Decision Processes
  Ilbin Lee,
  Marina A. Epelman,
  H. Edwin Romeijn,
  Robert L. Smith
  30 May 2017 | Operations Research, Vol. 65, No. 4
- The value of service rate flexibility in an M/M/1 queue with admission control
  5 April 2017 | IISE Transactions, Vol. 49, No. 6
- IDENTIFICATION OF DISCRETE CHOICE DYNAMIC PROGRAMMING MODELS WITH NONPARAMETRIC DISTRIBUTION OF UNOBSERVABLES
  21 March 2016 | Econometric Theory, Vol. 33, No. 3
- Structures of Optimal Policies in MDPs with Unbounded Jumps: The State of Our Art
  11 March 2017
- Markovian Decision Processes with Discrete Transition Law
  13 January 2017
- Dynamic server assignment in an extended machine-repair model
  22 December 2014 | IIE Transactions, Vol. 47, No. 4
- Bibliography
  19 February 2014
- Technology Adoption with Uncertain Future Costs and Quality
  James E. Smith,
  Canan Ulu
  1 April 2012 | Operations Research, Vol. 60, No. 2
- Semi‐Markov Decision Processes
  15 February 2011
- Total Expected Discounted Reward MDPs: Value Iteration Algorithm
  15 February 2011
- Continuity and differentiability of expected value functions in dynamic discrete choice models
  9 December 2010 | Quantitative Economics, Vol. 1, No. 2
- Weak Differentiability of Product Measures
  Bernd Heidergott,
  Haralambie Leahu
  8 December 2009 | Mathematics of Operations Research, Vol. 35, No. 1
- A Structured Multiarmed Bandit Problem and the Greedy Policy
  IEEE Transactions on Automatic Control, Vol. 54, No. 12
- Inventory, Discounts, and the Timing Effect
  Hyun-Soo Ahn,
  Mehmet Gümüş,
  Philip Kaminsky
  24 October 2008 | Manufacturing & Service Operations Management, Vol. 11, No. 4
- Strong bounds on perturbations
  1 July 2008 | Mathematical Methods of Operations Research, Vol. 70, No. 1
- A structured multiarmed bandit problem and the greedy policy
- Two Person Zero-Sum Semi-Markov Games with Unknown Holding Times Distribution on One Side: A Discounted Payoff Criterion
  5 September 2007 | Applied Mathematics and Optimization, Vol. 57, No. 3
- Approximate Fixed Point Iteration with an Application to Infinite Horizon Markov Decision Processes
  SIAM Journal on Control and Optimization, Vol. 47, No. 5
- Zero-sum continuous-time Markov games with unbounded transition and discounted payoff rates
  Bernoulli, Vol. 11, No. 6
- Optimal Control of Ergodic Continuous-Time Markov Chains with Average Sample-Path Rewards
  SIAM Journal on Control and Optimization, Vol. 44, No. 1
- An Explicit Solution for the Value Function of a Priority Queue
  Queueing Systems, Vol. 47, No. 3
- Time and Ratio Expected Average Cost Optimality for Semi-Markov Control Processes on Borel Spaces
  Communications in Statistics - Theory and Methods, Vol. 33, No. 3
- Continuous Time Markov Decision Processes with Expected Discounted Total Rewards
  18 June 2003
- Optimal QoS control of interacting service stations
  15 April 2003 | RAIRO - Operations Research, Vol. 36, No. 3
- Denumerable-state continuous-time Markov decision processes with unbounded transition and reward rates under the discounted criterion
  14 July 2016 | Journal of Applied Probability, Vol. 39, No. 2
- Average Optimality for Adaptive Markov Control Processes with Unbounded Costs and Unknown Disturbance Distribution
- Average Reward Optimization Theory for Denumerable State Spaces
- Optimal admission control for high speed networks: a dynamic programming approach
- Bibliography
  27 May 2008
- Valuing Oil Properties: Integrating Option Pricing and Decision Analysis Approaches
  James E. Smith,
  Kevin F. McCardle
  1 April 1998 | Operations Research, Vol. 46, No. 2
- Mixed Markov Decision Processes in a Semi-Markov Environment with Discounted Criterion
  Journal of Mathematical Analysis and Applications, Vol. 219, No. 1
- Robustness inequality for Markov control processes with unbounded costs
  Systems & Control Letters, Vol. 33, No. 2
- Stochastic Inventory Models with Limited Production Capacity and Periodically Varying Parameters
  27 July 2009 | Probability in the Engineering and Informational Sciences, Vol. 11, No. 1
- Optimality of a Threshold Policy in the M/M/1 queue with repeated vacations
  Mathematical Methods of Operations Research, Vol. 44, No. 1
- Optimal routing into two heterogeneous service stations with delayed information
  IEEE Transactions on Automatic Control, Vol. 40, No. 7
- Why does nuclear power performance differ across Europe?
  European Economic Review, Vol. 39, No. 6
- Myopia and R&D/production complementarities
  Economic Theory, Vol. 4, No. 3
- Bibliography
  27 May 2008
- Linear Programming and Average Optimality of Markov Control Processes on Borel Spaces—Unbounded Costs
  SIAM Journal on Control and Optimization, Vol. 32, No. 2
- Discrete type shock semi-markov decision processes with borel state space
  Optimization, Vol. 28, No. 3-4
- Jointly optimal admission and routing controls at a network node
  Communications in Statistics. Stochastic Models, Vol. 10, No. 1
- Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey
  SIAM Journal on Control and Optimization, Vol. 31, No. 2
- Optimal control of the M/G/1 queue with repeated vacations of the server
  IEEE Transactions on Automatic Control, Vol. 38, No. 12
- Optimal routing into two heterogeneous service stations with delayed information
- Time-average and asymptotically optimal flow control policies in networks with multiple transmitters
  Annals of Operations Research, Vol. 35, No. 5
- Optimal control of admission to a multiserver queue with two arrival streams
  IEEE Transactions on Automatic Control, Vol. 37, No. 6
- Optimal Control of the M/G/1 Queue with Repeated Vacations
- Continuous time shock markov decision processes with discounted criterion
  Optimization, Vol. 25, No. 2-3
- Optimal control of the M/G/1 queue with repeated vacations of the server
- Properties of optimal hop-by-hop allocation policies in networks with multiple transmitters and linear equal holding costs
  IEEE Transactions on Automatic Control, Vol. 36, No. 12
- Average cost optimal policies for Markov control processes with Borel state space and unbounded costs
  Systems & Control Letters, Vol. 15, No. 4
- Recursive utility and the Ramsey problem
  Journal of Economic Theory, Vol. 50, No. 2
- Chapter 8 Markov decision processes
- Value iteration and rolling plans for Markov control processes with unbounded rewards
- Optimal hop-by-hop flow control policies in computer networks with multiple transmitters: convexity and monotonicity properties. I. linear and equal holding costs
- Controlled semi-Markov models under long-run average rewards
  Journal of Statistical Planning and Inference, Vol. 22, No. 2
- Average Cost Semi-Markov Decision Processes and the Control of Queueing Systems
  27 July 2009 | Probability in the Engineering and Informational Sciences, Vol. 3, No. 2
- Controlled semi-markov models - the discounted case
  Journal of Statistical Planning and Inference, Vol. 21, No. 3
- Resource allocation and project selection: Control of r & d under dynamic process of data improvement
  Theory and Decision, Vol. 26, No. 1
- Semi-Markov decision processes with a reachable state-subset
  Optimization, Vol. 20, No. 3
- Control of economic systems under the process of data improvement
  Journal of Economic Dynamics and Control, Vol. 12, No. 4
- Maximum Likelihood Estimation of Discrete Control Processes
  SIAM Journal on Control and Optimization, Vol. 26, No. 5
- Optimal hop-by-hop flow control policies with multiple heterogeneous transmitters
- Dynamic Programming and Markov Decision Processes
  11 November 2016
- Estimation and control in discounted stochastic dynamic programming
  Stochastics, Vol. 20, No. 1
- Optimal hop-by-hop flow control in computer networks
  IEEE Transactions on Automatic Control, Vol. 31, No. 9
- Finite-state approximations for denumerable state discounted markov decision processes
  Applied Mathematics & Optimization, Vol. 14, No. 1
- Nonstationary value-iteration and adaptive control of discounted semi-Markov processes
  Journal of Mathematical Analysis and Applications, Vol. 112, No. 2
- Technological expectations and adoption of improved technology
  Journal of Economic Theory, Vol. 34, No. 2
- Optimal adaptive control of priority assignment in queueing systems
  Systems & Control Letters, Vol. 4, No. 2
- Optimal admission pricing and service rate control of an M[x]/M/s queue with reneging
  21 November 2006 | Naval Research Logistics Quarterly, Vol. 30, No. 2
- The Optimal Control of Partially Observable Semi-Markov Processes Over the Infinite Horizon: Discounted Costs
- Discounted Dynamic Programming
- Semi-Markov decision processes with polynomial reward
  14 July 2016 | Journal of Applied Probability, Vol. 19, No. 2
- Finite state approximations for denumerable state infinite horizon discounted Markov decision processes with unbounded rewards
  Journal of Mathematical Analysis and Applications, Vol. 86, No. 1
- A Contraction Theorem in Inventory Problems
  Journal of Information and Optimization Sciences, Vol. 3, No. 2
- On Semi-Markov Controlled Models with an Average Reward Criterion
  Theory of Probability & Its Applications, Vol. 26, No. 4
- Monotone optimal preventive maintenance policies for stochastically failing equipment
  5 July 2007 | Naval Research Logistics Quarterly, Vol. 28, No. 3
- Optimal control of price through restricted production
  5 July 2007 | Naval Research Logistics Quarterly, Vol. 28, No. 3
- Action-dependent stopping times and Markov decision process with unbounded rewards
  1 September 1981 | Operations-Research-Spektrum, Vol. 3, No. 3
- Chapter 6 The economics of uncertainty: Selected topics and probabilistic methods
- Optimal sequential selection and resource allocation under uncertainty
  1 July 2016 | Advances in Applied Probability, Vol. 12, No. 4
- Optimal admission pricing policies for M/E_k/1 queues
  21 November 2006 | Naval Research Logistics Quarterly, Vol. 27, No. 1
- The optimal frequency of information purchases
  European Journal of Operational Research, Vol. 4, No. 2
- On monotone optimal policies in a queueing model of M/G/1 type with controllable service time distribution
  1 July 2016 | Advances in Applied Probability, Vol. 11, No. 4
- On theory and algorithms for Markov decision problems with the total reward criterion
  8 May 1979 | Operations-Research-Spektrum, Vol. 1, No. 1
- Successive approximations for Markov decision processes and Markov games with unbounded rewards
  Mathematische Operationsforschung und Statistik. Series Optimization, Vol. 10, No. 3
- Markov decision processes and strongly excessive functions
  Stochastic Processes and their Applications, Vol. 8, No. 1
- Group random-access disciplines for multi-access broadcast channels
  IEEE Transactions on Information Theory, Vol. 24, No. 5
- The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms
  14 July 2016 | Journal of Applied Probability, Vol. 15, No. 2
- A Stochastic Game Model of a Weapons Development Competition
  SIAM Journal on Control and Optimization, Vol. 16, No. 3
- THE ANALYTIC THEORY OF POLICY ITERATION
- EXISTENCE OF AVERAGE OPTIMAL STRATEGIES IN MARKOVIAN DECISION PROBLEMS WITH STRICTLY UNBOUNDED COSTS
- Markov programming by successive approximations with respect to weighted supremum norms
  Journal of Mathematical Analysis and Applications, Vol. 58, No. 2
- Existence and uniqueness theorems for the optimal inventory equation: The backlogging case
  Journal of Mathematical Analysis and Applications, Vol. 57, No. 3
- Dynamic investment strategies for a risky R and D project
  14 July 2016 | Journal of Applied Probability, Vol. 14, No. 1
- THE ECONOMICS OF JOB SEARCH: A SURVEY
  Economic Inquiry, Vol. 14, No. 2
- On the optimality of a switch-over policy for controlling the queue size in an M/G/1 queue with variable service rate
  21 May 2005
- Monotone optimal policies for Markov decision processes
  24 February 2009

Volume 21, Issue 11

July 1975

Pages 1215-1354

Article Information

Information

Published Online: July 01, 1975

Cite as

Steven A. Lippman (1975) On Dynamic Programming with Unbounded Rewards. Management Science 21(11):1225-1233.

https://doi.org/10.1287/mnsc.21.11.1225
