Performance Loss Bounds for Approximate Value Iteration with State Aggregation

Benjamin Van Roy
Stanford University, Stanford, California 94305

Published Online: 1 May 2006. https://doi.org/10.1287/moor.1060.0188

Abstract

We consider approximate value iteration with a parameterized approximator in which the state space is partitioned and the optimal cost-to-go function over each partition is approximated by a constant. We establish performance loss bounds for policies derived from approximations associated with fixed points. These bounds identify benefits to using invariant distributions of appropriate policies as projection weights. Such projection weighting relates to what is done by temporal-difference learning. Our analysis also leads to the first performance loss bound for approximate value iteration with an average-cost objective.
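The iteration described in the abstract can be illustrated with a minimal sketch. It assumes a finite MDP with a discounted-cost objective only (the average-cost case is not covered), and every name in it (`aggregated_value_iteration`, `greedy_policy`, `cell_of`, `weights`) is hypothetical and not taken from the paper. Each pass applies the Bellman operator to a piecewise-constant cost-to-go estimate and then projects the result back onto the partition by taking a weighted average within each cell. The projection weights are passed in by the caller; the abstract suggests invariant distributions of appropriate policies, which this sketch does not compute.

```python
def aggregated_value_iteration(P, g, alpha, cell_of, weights, n_cells, iters=500):
    """Iterate r <- Pi T (Phi r) for a discounted-cost MDP with state aggregation.

    P[a][s][t]  transition probability from state s to state t under action a
    g[a][s]     one-stage cost of taking action a in state s
    alpha       discount factor in (0, 1)
    cell_of[s]  index of the partition cell containing state s
    weights[s]  positive projection weight of state s (an input assumption)
    """
    n_states = len(cell_of)
    n_actions = len(P)
    r = [0.0] * n_cells
    for _ in range(iters):
        # Phi r: every state inherits the constant assigned to its cell.
        J = [r[cell_of[s]] for s in range(n_states)]
        # Bellman operator T applied to the piecewise-constant J.
        TJ = [
            min(
                g[a][s] + alpha * sum(P[a][s][t] * J[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        # Weighted projection Pi: weighted average of TJ within each cell.
        num = [0.0] * n_cells
        den = [0.0] * n_cells
        for s in range(n_states):
            num[cell_of[s]] += weights[s] * TJ[s]
            den[cell_of[s]] += weights[s]
        r = [num[k] / den[k] for k in range(n_cells)]
    return r


def greedy_policy(P, g, alpha, cell_of, r):
    """Return, for each state, an action that is greedy with respect to Phi r."""
    n_states = len(cell_of)
    J = [r[cell_of[s]] for s in range(n_states)]

    def q(a, s):
        return g[a][s] + alpha * sum(P[a][s][t] * J[t] for t in range(n_states))

    return [min(range(len(P)), key=lambda a: q(a, s)) for s in range(n_states)]
```

With one state per cell the sketch reduces to ordinary value iteration. Merging states into a shared cell generally changes the fixed point, and the policy derived from such a fixed point is the object the performance loss bounds address.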

Mathematics of Operations Research

Volume 31, Issue 2

May 2006

Pages 217-432

Article Information

Received: August 2, 2004
Published Online: May 1, 2006

Cite as

Benjamin Van Roy (2006) Performance Loss Bounds for Approximate Value Iteration with State Aggregation. Mathematics of Operations Research 31(2):234-244.

https://doi.org/10.1287/moor.1060.0188
