Finite-Memory Strategies in POMDPs with Long-Run Average Objectives

Krishnendu Chatterjee
https://orcid.org/0000-0002-4561-241X
Institute of Science and Technology Austria, 3400 Klosterneuburg, Austria
Raimundo Saona
https://orcid.org/0000-0001-5103-038X
Institute of Science and Technology Austria, 3400 Klosterneuburg, Austria
Bruno Ziliotto
Corresponding Author
http://orcid.org/0000-0002-4448-1411
Centre de Recherche en Mathématiques de la Décision, Centre National de la Recherche Scientifique, Université Paris Dauphine, Université PSL, 75016 Paris, France

Published Online: 6 Apr 2021
https://doi.org/10.1287/moor.2020.1116

Abstract

Partially observable Markov decision processes (POMDPs) are standard models for dynamic systems with probabilistic and nondeterministic behaviour in uncertain environments. We prove that in POMDPs with long-run average objective, the decision maker has approximately optimal strategies with finite memory. This implies notably that approximating the long-run value is recursively enumerable, as well as a weak continuity property of the value with respect to the transition function.
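To make the notion of a finite-memory strategy concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption and not taken from the paper: the two-state toy POMDP, its probabilities and rewards, and the names `transition`, `observe`, `reward`, and `long_run_average`. A finite-memory strategy is encoded as a small automaton (an action per memory state, plus a memory update on each observation). Together with the POMDP it induces a finite Markov chain on (state, memory) pairs, so its long-run average reward can be estimated by averaging expected rewards over many steps. Values computed this way are lower bounds on the long-run value, which is one way to read the recursive-enumerability remark above.

```python
# Hypothetical toy POMDP (not from the paper): 2 states, 2 actions, 2 observations.

def transition(s, a):
    """Return {next_state: probability}. Action 0 mostly stays, action 1 mostly flips."""
    stay = 0.9 if a == 0 else 0.1
    return {s: stay, 1 - s: 1.0 - stay}

def observe(s):
    """Return {observation: probability}: a noisy signal of the current state."""
    return {s: 0.8, 1 - s: 0.2}

def reward(s, a):
    """Reward 1 only for playing action 0 while in state 1."""
    return 1.0 if (s == 1 and a == 0) else 0.0

def long_run_average(act, upd, init_state=0, init_mem=0, steps=5000):
    """Average expected reward over `steps` stages for a finite-memory strategy.

    act[m]    -> action played in memory state m
    upd[m][o] -> next memory state after observing o in memory state m
    """
    dist = {(init_state, init_mem): 1.0}  # distribution over (state, memory)
    total = 0.0
    for _ in range(steps):
        nxt = {}
        for (s, m), p in dist.items():
            a = act[m]
            total += p * reward(s, a)
            for s2, ps in transition(s, a).items():
                for o, po in observe(s2).items():
                    key = (s2, upd[m][o])
                    nxt[key] = nxt.get(key, 0.0) + p * ps * po
        dist = nxt
    return total / steps

# One memory state: always play action 0, ignoring observations.
constant_value = long_run_average(act={0: 0}, upd={0: {0: 0, 1: 0}})

# Two memory states: remember the last observation; stay if it was 1, flip if it was 0.
one_bit_value = long_run_average(
    act={0: 1, 1: 0},
    upd={0: {0: 0, 1: 1}, 1: {0: 0, 1: 1}},
)

print(constant_value, one_bit_value)
```

In this toy model the one-bit strategy should score higher than the constant one, which shows how even a small amount of memory can improve the long-run average.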

Mathematics of Operations Research

Volume 47, Issue 1

February 2022

Pages 1-846, C2

Information

Received: February 14, 2020
Accepted: August 08, 2020
Published Online: April 06, 2021

Cite as

Krishnendu Chatterjee, Raimundo Saona, Bruno Ziliotto (2021) Finite-Memory Strategies in POMDPs with Long-Run Average Objectives. Mathematics of Operations Research 47(1):100-119.

https://doi.org/10.1287/moor.2020.1116
