Sequential Interdiction with Incomplete Information and Learning

Juan S. Borrero
School of Industrial Engineering & Management, Oklahoma State University, Stillwater, Oklahoma 74078
Oleg A. Prokopyev
Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261
Denis Sauré
http://orcid.org/0000-0002-8123-5009
Department of Industrial Engineering, University of Chile, Santiago 8370456, Chile

Published Online: 1 Feb 2019
https://doi.org/10.1287/opre.2018.1773

Abstract

We present a framework for a class of sequential decision-making problems in the context of general interdiction problems, in which a leader and a follower repeatedly interact. At each period, the leader allocates resources to disrupt the performance of the follower (e.g., as in defender–attacker or network interdiction problems), who, in turn, minimizes some cost function over a set of activities that depends on the leader’s decision. Although the follower has complete knowledge of its own problem, the leader has only partial information and needs to learn about the cost parameters, available resources, and the follower’s activities from the feedback generated by the follower’s actions. We measure policy performance in terms of its time-stability, defined as the number of periods it takes for the leader to match the actions of an oracle with complete information. In particular, we propose a class of greedy and robust policies and show that these policies are weakly optimal, eventually match the oracle’s actions, and provide a real-time certificate of optimality. We also study a lower bound on the performance of any policy based on the notion of a semioracle. Our numerical experiments demonstrate that the proposed policies consistently outperform a reasonable benchmark and perform fairly close to the semioracle.

The online appendix is available at https://doi.org/10.1287/opre.2018.1773.
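
As a rough illustration of the interaction loop and the time-stability measure described in the abstract, the following toy sketch may help. It is not the authors' model or any of their policies. It assumes a deliberately simplified setting: the follower uses the single cheapest unblocked activity each period, the leader may block up to `budget` activities, costs are distinct, and the leader learns an activity (and its cost) only once the follower has used it. The function name `simulate` and all parameters are hypothetical.

```python
def simulate(costs, known, budget, horizon):
    """Toy leader-follower loop (illustrative assumption, not the paper's model).

    costs   : dict mapping activity -> cost (assumed distinct)
    known   : activities the leader knows at the start
    budget  : number of activities the leader may block per period
              (assumed smaller than the number of activities)
    horizon : number of periods to simulate

    Returns the first period in which the leader's blocked set equals the
    set chosen by a full-information oracle, or None if that never happens
    within the horizon.
    """
    # Oracle with complete information: block the `budget` cheapest activities.
    oracle = set(sorted(costs, key=costs.get)[:budget])
    known = set(known)

    for t in range(1, horizon + 1):
        # Greedy leader: block the cheapest activities among those it knows.
        blocked = set(sorted(known, key=costs.get)[:budget])
        if blocked == oracle:
            return t

        # Follower: use the cheapest activity that is not blocked.
        free = [a for a in costs if a not in blocked]
        choice = min(free, key=costs.get)

        # Feedback: the leader observes the activity the follower used.
        known.add(choice)

    return None


costs = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
periods = simulate(costs, known={"d", "e"}, budget=2, horizon=10)
```

In this toy setting, a follower response that the leader already knew about indicates that the blocked activities are the cheapest ones overall, which loosely mirrors the idea of a real-time certificate of optimality; the paper's actual setting and guarantees are more general.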

Volume 67, Issue 1

January-February 2019

Pages ii-iv, 1-294

Information

Received: September 20, 2016
Accepted: May 10, 2018
Published Online: February 01, 2019

Cite as

Juan S. Borrero, Oleg A. Prokopyev, Denis Sauré (2019) Sequential Interdiction with Incomplete Information and Learning. Operations Research 67(1):72-89.

https://doi.org/10.1287/opre.2018.1773
