An Approximate Dynamic Programming Approach to Repeated Games with Vector Losses

Vijay Kamble (Corresponding Author)
[email protected]
https://orcid.org/0000-0002-9261-1612
Department of Information and Decision Sciences, University of Illinois, Chicago, Illinois 60607

Patrick Loiseau
[email protected]
University Grenoble Alpes, INRIA, Centre National de la Recherche Scientifique, Institut Polytechnique de Grenoble, Laboratoire d’Informatique de Grenoble, 38058 Grenoble, France; Max-Planck Institute for Software Systems, D-66123 Saarbrücken, Germany

Jean Walrand
[email protected]
Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720

Published Online: 29 Aug 2022
https://doi.org/10.1287/opre.2022.2334

Abstract

We describe an approximate dynamic programming (ADP) approach to compute approximations of the optimal strategies and of the minimal losses that can be guaranteed in discounted repeated games with vector-valued losses. Among other applications, such vector-valued games prominently arise in the analysis of worst-case regret in repeated decision making in unknown environments, also known as the adversarial online learning framework. At the core of our approach is a characterization of the lower Pareto frontier of the set of expected losses that a player can guarantee in these games as the unique fixed point of a set-valued dynamic programming operator. When applied to the problem of worst-case regret minimization with discounted losses, our approach yields algorithms that achieve markedly improved performance bounds compared with off-the-shelf online learning algorithms like Hedge. These results thus suggest the significant potential of ADP-based approaches in adversarial online learning.
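For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of Hedge (exponential weights) together with a computation of its discounted regret on a given loss sequence. It is not the ADP method of the paper. The function name `hedge_discounted_regret`, the geometric discount factor `beta`, the fixed learning rate `eta`, and the choice to update weights with undiscounted losses are all assumptions made for illustration.

```python
import math


def hedge_discounted_regret(losses, beta, eta):
    """Run plain Hedge on a loss sequence and return its discounted regret.

    losses: list of per-round loss vectors (one entry per action).
    beta:   discount factor in (0, 1); round t is weighted by beta**t
            (an assumed form of discounting).
    eta:    fixed learning rate (hypothetical choice, not a tuned value).
    """
    n_actions = len(losses[0])
    log_weights = [0.0] * n_actions       # log of the Hedge weights
    disc_action = [0.0] * n_actions       # discounted loss of each fixed action
    disc_alg = 0.0                        # discounted expected loss of Hedge
    discount = 1.0
    for loss in losses:
        # Play each action with probability proportional to its weight.
        top = max(log_weights)
        weights = [math.exp(lw - top) for lw in log_weights]
        total = sum(weights)
        probs = [w / total for w in weights]
        disc_alg += discount * sum(p * l for p, l in zip(probs, loss))
        for a in range(n_actions):
            disc_action[a] += discount * loss[a]
            log_weights[a] -= eta * loss[a]   # exponential-weights update
        discount *= beta
    # Regret relative to the best fixed action in hindsight.
    return disc_alg - min(disc_action)
```

With `eta = 0` the sketch plays uniformly at random, which gives a simple sanity check; a positive `eta` shifts probability toward actions with lower past loss and reduces the regret on a sequence where one action is always better.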

Funding: This work has been partially supported by the Multidisciplinary Institute in Artificial Intelligence (MIAI) at Grenoble Alpes [Grant ANR-19-P3IA-0003], by the French National Research Agency (ANR) [Grant ANR-20-CE23-0007], by the U.S. Air Force Office of Scientific Research (AFOSR) [Grant MURI FA9550-10-1-0573], by the France-Berkeley Fund, and by the Alexander von Humboldt Foundation.

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.2334.

Volume 72, Issue 1

January-February 2024

Pages iii-vi, 1-424, C2-C3

Information

Received: September 29, 2018
Accepted: June 03, 2022
Published Online: August 29, 2022

Cite as

Vijay Kamble, Patrick Loiseau, Jean Walrand (2022) An Approximate Dynamic Programming Approach to Repeated Games with Vector Losses. Operations Research 72(1):373-388.

https://doi.org/10.1287/opre.2022.2334
