Reinforcement with Fading Memories

Kuang Xu
Corresponding Author
[email protected]
https://orcid.org/0000-0002-2221-1648
Graduate School of Business, Stanford University, Stanford, California 94305;
Se-Young Yun
[email protected]
Department of Industrial & Systems Engineering, KAIST, Daejeon, Republic of Korea

Published Online: 18 Jun 2020
https://doi.org/10.1287/moor.2019.1031

Abstract

We study the effect of imperfect memory on decision making in the context of a stochastic sequential action-reward problem. An agent chooses a sequence of actions, which generate discrete rewards at different rates. She is allowed to make new choices at rate β, whereas past rewards disappear from her memory at rate μ. We focus on a family of decision rules where the agent makes a new choice by randomly selecting an action with a probability approximately proportional to the amount of past rewards associated with each action in her memory. We provide closed-form formulas for the agent’s steady-state choice distribution in the regime where the memory span is large ($\mu \to 0$) and show that the agent’s success critically depends on how quickly she updates her choices relative to the speed of memory decay. If $\beta \gg \mu$, the agent almost always chooses the best action (that is, the one with the highest reward rate). Conversely, if $\beta \ll \mu$, the agent chooses an action with a probability roughly proportional to its reward rate.
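The dynamics described in the abstract can be explored numerically. The following is a minimal simulation sketch, not the authors' model specification: it assumes rewards arrive as a Poisson process only for the currently chosen action, that each remembered reward is forgotten independently at rate μ, and that "approximately proportional" is implemented by adding a small smoothing constant to each memory count. The function name `simulate` and the parameters `reward_rates`, `horizon`, `alpha`, and `seed` are hypothetical names introduced for illustration.

```python
import random


def simulate(reward_rates, beta, mu, horizon, alpha=1.0, seed=0):
    """Return the fraction of time spent on each action up to `horizon`.

    Assumptions (not taken from the paper): only the current action
    generates rewards, and a new choice picks action k with probability
    proportional to memory[k] + alpha.
    """
    rng = random.Random(seed)
    k = len(reward_rates)
    memory = [0] * k              # remembered rewards per action
    action = rng.randrange(k)     # arbitrary initial choice
    time_on = [0.0] * k
    t = 0.0

    while True:
        rate_reward = reward_rates[action]   # new reward for current action
        rate_decay = mu * sum(memory)        # some remembered reward fades
        total = rate_reward + rate_decay + beta

        # Time to the next event is exponential with the total rate.
        dt = rng.expovariate(total)
        if t + dt >= horizon:
            time_on[action] += horizon - t
            break
        time_on[action] += dt
        t += dt

        # Pick which event occurred, with probability proportional to its rate.
        u = rng.random() * total
        if u < rate_reward:
            memory[action] += 1
        elif u < rate_reward + rate_decay:
            # Each remembered reward is equally likely to be the one forgotten.
            j = rng.choices(range(k), weights=memory)[0]
            memory[j] -= 1
        else:
            # Choice update: smoothed proportional rule (alpha is an assumption).
            weights = [m + alpha for m in memory]
            action = rng.choices(range(k), weights=weights)[0]

    return [x / horizon for x in time_on]


if __name__ == "__main__":
    rates = [1.0, 2.0]
    # Fast updating relative to memory decay (beta much larger than mu).
    print(simulate(rates, beta=5.0, mu=0.05, horizon=2000.0))
    # Slow updating relative to memory decay (beta much smaller than mu).
    print(simulate(rates, beta=0.005, mu=0.05, horizon=2000.0))
```

Comparing the two printed distributions gives an informal feel for the two regimes contrasted in the abstract. A finite-horizon run with a fixed μ and an ad hoc smoothing constant is only a rough illustration and should not be expected to reproduce the paper's limiting formulas.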

Mathematics of Operations Research

Volume 45, Issue 4

November 2020

Pages 1193-1620, C2

Information

Received: September 01, 2017
Accepted: July 25, 2019
Published Online: June 18, 2020

Cite as

Kuang Xu, Se-Young Yun (2020) Reinforcement with Fading Memories. Mathematics of Operations Research 45(4):1258-1288.

https://doi.org/10.1287/moor.2019.1031
