On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies

Shie Mannor
Department of Electrical and Computer Engineering, McGill University, 3480 University Street, Montreal, Québec, Canada H3A 2A7

John N. Tsitsiklis
Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Published Online: 1 Aug 2005
https://doi.org/10.1287/moor.1050.0148

Abstract

We consider the empirical state-action frequencies and the empirical reward in weakly communicating finite-state Markov decision processes under general policies. We define a certain polytope and establish that every element of this polytope is the limit of the empirical frequency vector, under some policy, in a strong sense. Furthermore, we show that the probability of exceeding a given distance between the empirical frequency vector and the polytope decays exponentially with time under every policy. We provide similar results for vector-valued empirical rewards.
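To make the objects in the abstract concrete, the following is a minimal Python sketch of an empirical state-action frequency vector under a history-dependent policy. Everything in it is an assumption made for illustration: the two-state, two-action transition table `P`, the `alternating_policy`, and the helper names are hypothetical and do not come from the paper. The abstract also does not spell out the polytope; the sketch assumes it is the standard set of state-action frequency vectors that are nonnegative, sum to one, and satisfy the flow-balance equations, and it reports how far the simulated frequencies are from satisfying those equations.

```python
import random

# Hypothetical 2-state, 2-action MDP (an assumption for illustration only).
# P[s][a] is the distribution of the next state given state s and action a.
P = {
    0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
    1: {0: [0.5, 0.5], 1: [0.1, 0.9]},
}


def empirical_frequencies(policy, horizon, seed=0):
    """Return f[(s, a)]: the fraction of the first `horizon` steps spent in (s, a)."""
    rng = random.Random(seed)
    counts = {(s, a): 0 for s in P for a in P[s]}
    history = []  # full history, so the policy may be non-stationary
    state = 0
    for _ in range(horizon):
        action = policy(history, state, rng)
        counts[(state, action)] += 1
        history.append((state, action))
        state = rng.choices([0, 1], weights=P[state][action])[0]
    return {sa: c / horizon for sa, c in counts.items()}


def balance_residual(freq):
    """Largest flow-balance violation over next states s':
    | sum_a f(s', a) - sum_{s, a} f(s, a) * P(s' | s, a) |.
    """
    worst = 0.0
    for s_next in P:
        outflow = sum(freq[(s_next, a)] for a in P[s_next])
        inflow = sum(freq[(s, a)] * P[s][a][s_next] for s in P for a in P[s])
        worst = max(worst, abs(outflow - inflow))
    return worst


def alternating_policy(history, state, rng):
    # A non-stationary policy: the action depends on elapsed time, not on the state.
    return len(history) % 2


freq = empirical_frequencies(alternating_policy, horizon=20000)
residual = balance_residual(freq)
print(freq)
print("flow-balance residual:", residual)
```

Under the stated assumption about the polytope, the residual is only a proxy for the distance to it, but it illustrates the qualitative claim: even for a policy that is not stationary, the empirical frequencies come close to satisfying the balance equations over a long horizon.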

Mathematics of Operations Research, Volume 30, Issue 3, August 2005, Pages 545-784

Article Information

Received: March 06, 2003
Published Online: August 01, 2005

Cite as

Shie Mannor, John N. Tsitsiklis (2005) On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies. Mathematics of Operations Research 30(3):545-561.

https://doi.org/10.1287/moor.1050.0148
