The Value Iteration Algorithm in Risk-Sensitive Average Markov Decision Chains with Finite State Space

Rolando Cavazos-Cadena
[email protected]
Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista, Saltillo 25315 Coah, Mexico
Raúl Montes-de-Oca
[email protected]
Departamento de Matemáticas, Universidad Autónoma Metropolitana, Campus Iztapalapa, Avenida San Rafael, Atlixco #186, Colonia Vicentina, Mexico 09340, D.F. Mexico

Published Online: 1 Nov 2003
https://doi.org/10.1287/moor.28.4.752.20515

Abstract

This work concerns discrete-time Markov decision chains with finite state space and bounded costs. The controller has constant risk sensitivity λ, and the performance of a control policy is measured by the corresponding risk-sensitive average cost criterion. Assuming that the optimality equation has a solution, it is shown that the value iteration scheme can be implemented to obtain, in a finite number of steps, (1) an approximation to the optimal λ-sensitive average cost with an error less than a given tolerance, and (2) a stationary policy whose performance index is arbitrarily close to the optimal value. The argument used to establish these results is based on a modification of the original model, which extends a transformation introduced by Schweitzer (1971) to analyze the risk-neutral case.
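
As a rough illustration of the kind of value iteration scheme the abstract refers to, the following Python sketch iterates the standard exponential-utility recursion V_{n+1}(x) = min_a [ C(x, a) + (1/λ) log Σ_y p(y | x, a) exp(λ V_n(y)) ] and stops when the span of the successive differences V_{n+1} - V_n falls below a tolerance. It is not the authors' algorithm: it omits the model modification (the extension of Schweitzer's transformation) on which the paper's argument relies, and it simply assumes that λ > 0 and that the span shrinks for the model at hand. The names `risk_sensitive_value_iteration`, `P`, `C`, `lam`, `tol` and `max_iter` are hypothetical and chosen only for this example.

```python
import numpy as np


def risk_sensitive_value_iteration(P, C, lam, tol=1e-10, max_iter=10_000):
    """Illustrative sketch only (assumes lam > 0 and max_iter >= 1).

    P: array of shape (A, S, S), P[a, x, y] = transition probability x -> y under a.
    C: array of shape (S, A), C[x, a] = one-step cost.
    Returns (gain estimate, greedy policy, (lower bound, upper bound)).
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Subtract max(V) before exponentiating to avoid overflow.
        shift = V.max()
        expect = P @ np.exp(lam * (V - shift))      # shape (A, S)
        Q = C + (shift + np.log(expect) / lam).T    # shape (S, A)
        V_new = Q.min(axis=1)
        policy = Q.argmin(axis=1)
        diff = V_new - V
        if diff.max() - diff.min() < tol:
            break
        # The operator commutes with adding constants, so re-centring V
        # keeps the iterates bounded without changing the differences.
        V = V_new - V_new.min()
    gain = 0.5 * (diff.max() + diff.min())
    return gain, policy, (diff.min(), diff.max())


# Toy model (assumed for illustration): the next state is drawn uniformly
# from {0, 1} under both actions; action 1 always costs more than action 0.
P = np.full((2, 2, 2), 0.5)
C = np.array([[0.0, 2.0],
              [1.0, 2.0]])
gain, policy, bounds = risk_sensitive_value_iteration(P, C, lam=1.0)
# With i.i.d. transitions the risk-sensitive average cost of action 0 is
# log(0.5 * exp(0) + 0.5 * exp(1)) for lam = 1, which exceeds the
# risk-neutral average 0.5.
print(gain, policy, bounds)
```

When the optimality equation has a solution, the minimum and maximum of the successive differences bracket the optimal λ-sensitive average cost, which is why the sketch returns them alongside the midpoint estimate. Whether the span actually contracts depends on the transition structure, and that is the issue the model modification mentioned in the abstract is meant to address.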

Mathematics of Operations Research

Volume 28, Issue 4

November 2003

Pages 609-887

Article Information


Received: March 08, 2002
Published Online: November 01, 2003

Cite as

Rolando Cavazos-Cadena, Raúl Montes-de-Oca (2003) The Value Iteration Algorithm in Risk-Sensitive Average Markov Decision Chains with Finite State Space. Mathematics of Operations Research 28(4):752-776.

https://doi.org/10.1287/moor.28.4.752.20515
