Asymptotic Optimality of Semi-Open-Loop Policies in Markov Decision Processes with Large Lead Times

Xingyu Bai (Corresponding Author)
https://orcid.org/0000-0001-5410-0064
Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801

Xin Chen
https://orcid.org/0000-0002-5168-4823
H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332

Menglong Li
https://orcid.org/0000-0001-9770-0908
Department of Management Sciences, City University of Hong Kong, Hong Kong

Alexander Stolyar
https://orcid.org/0000-0002-1496-9803
Industrial and Enterprise Systems Engineering & Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801

Published Online: 15 Jun 2023
https://doi.org/10.1287/opre.2021.0088

Abstract

We consider a generic Markov decision process (MDP) with two controls: one control taking effect immediately and the other control whose effect is delayed by a positive lead time. As the lead time grows, one naturally expects that the effect of the delayed action only weakly depends on the current state, and decoupling the delayed action from the current state could provide good controls. The purpose of this paper is to substantiate this decoupling intuition by establishing asymptotic optimality of semi-open-loop policies, which specify open-loop controls for the delayed action and closed-loop controls for the immediate action. For MDPs defined on general spaces with uniformly bounded cost functions and a fast mixing property, we construct a periodic semi-open-loop policy for each lead time value and show that these policies are asymptotically optimal as the lead time goes to infinity. For MDPs defined on Euclidean spaces with linear dynamics and convex structures (convex cost functions and constraint sets), we impose another set of conditions under which semi-open-loop policies (actually, constant delayed-control policies) are asymptotically optimal. Moreover, we verify that these conditions hold for a broad class of inventory models, in which there are multiple controls with nonidentical lead times.
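
As an illustration of the semi-open-loop idea described above, the following is a minimal Python sketch, not the paper's model or construction. It assumes a hypothetical single-item dual-sourcing inventory system with backlogging: a regular order (the delayed control) arrives after `lead_time` periods and is held constant, while an express order (the immediate control) raises inventory to a target level after observing the current state. All names, cost parameters, and the demand distribution are assumptions chosen for illustration.

```python
import random
from collections import deque


def simulate_semi_open_loop(lead_time, regular_qty, express_target, demands,
                            express_cost=2.0, holding_cost=1.0, backlog_cost=5.0):
    """Average per-period cost of a constant delayed order plus a
    state-dependent immediate order (hypothetical toy model)."""
    # Orders already in transit; assumed to start full of the constant quantity.
    pipeline = deque([regular_qty] * lead_time)
    inventory = 0.0
    total_cost = 0.0
    for demand in demands:
        # Delayed control (open-loop): the same quantity every period,
        # chosen without looking at the current state.
        pipeline.append(regular_qty)
        inventory += pipeline.popleft()
        # Immediate control (closed-loop): top up to the target level
        # based on the inventory observed right now.
        express_qty = max(0.0, express_target - inventory)
        inventory += express_qty
        # Demand is realized; negative inventory represents backlog.
        inventory -= demand
        total_cost += (express_cost * express_qty
                       + holding_cost * max(inventory, 0.0)
                       + backlog_cost * max(-inventory, 0.0))
    return total_cost / len(demands)


# Same demand path, two very different lead times (illustrative values).
rng = random.Random(0)
demands = [rng.randint(0, 10) for _ in range(1000)]
cost_short = simulate_semi_open_loop(2, 4, 6, demands)
cost_long = simulate_semi_open_loop(50, 4, 6, demands)
print(cost_short, cost_long)
```

Because the delayed order is constant and the pipeline starts full of that same quantity, the simulated cost of this particular policy does not depend on the lead time. In this toy setting, that is the sense in which the delayed action is decoupled from the current state.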

Funding: Research of the first three authors was partly supported by the National Science Foundation [Grant CMMI-1635160].

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2021.0088.

Volume 71, Issue 6

November-December 2023

Pages iii-vii, 1925-2396, C2-C3

Article Information

Received: February 10, 2021
Accepted: April 30, 2023
Published Online: June 15, 2023

Cite as

Xingyu Bai, Xin Chen, Menglong Li, Alexander Stolyar (2023) Asymptotic Optimality of Semi-Open-Loop Policies in Markov Decision Processes with Large Lead Times. Operations Research 71(6):2061-2077.

https://doi.org/10.1287/opre.2021.0088

Acknowledgments

The authors gratefully acknowledge the comments and suggestions of the area editor John Birge, the anonymous associate editor, and three reviewers, which significantly improved the paper.
