Analytical Solution to a Discrete-Time Model for Dynamic Learning and Decision Making

Hao Zhang
https://orcid.org/0000-0002-5078-9252
Sauder School of Business, University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada

Published Online: 1 Feb 2022

https://doi.org/10.1287/mnsc.2021.4194

Abstract

Problems concerning dynamic learning and decision making are difficult to solve analytically. We study an infinite-horizon discrete-time model with a constant unknown state that may take two possible values. As a special partially observable Markov decision process (POMDP), this model unifies several types of learning-and-doing problems such as sequential hypothesis testing, dynamic pricing with demand learning, and multiarmed bandits. We adopt a relatively new solution framework from the POMDP literature based on the backward construction of the efficient frontier(s) of continuation-value vectors. This framework accommodates different optimality criteria simultaneously. In the infinite-horizon setting, with the aid of a set of signal quality indices, the extreme points on the efficient frontier can be linked through a set of difference equations and solved analytically. The solution carries structural properties analogous to those obtained under continuous-time models, and it provides a useful tool for making new discoveries through discrete-time models.
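As an illustration of the learning component only, the following minimal sketch shows a Bayesian belief update for a constant unknown state with two possible values. It is not the paper's solution method (the efficient-frontier construction is not reproduced here), and the binary signal model, the function names `update_belief` and `belief_path`, and the parameters `p_high` and `p_low` are assumptions introduced for the example.

```python
def update_belief(prior, lik_high, lik_low):
    """Posterior probability that the state is 'high' after one signal.

    prior    -- current probability that the state is 'high'
    lik_high -- likelihood of the observed signal if the state is 'high'
    lik_low  -- likelihood of the observed signal if the state is 'low'
    """
    numerator = prior * lik_high
    denominator = numerator + (1.0 - prior) * lik_low
    if denominator == 0.0:
        raise ValueError("signal has zero probability under the current belief")
    return numerator / denominator


def belief_path(prior, signals, p_high, p_low):
    """Beliefs after each binary signal (hypothetical signal model).

    p_high -- assumed P(signal = 1 | state is 'high')
    p_low  -- assumed P(signal = 1 | state is 'low')
    """
    path = [prior]
    for s in signals:
        lik_high = p_high if s == 1 else 1.0 - p_high
        lik_low = p_low if s == 1 else 1.0 - p_low
        path.append(update_belief(path[-1], lik_high, lik_low))
    return path


if __name__ == "__main__":
    # One positive and one negative signal, starting from an even prior.
    print(belief_path(0.5, [1, 0], p_high=0.8, p_low=0.4))
```

Because the state is constant, the belief is a sufficient statistic for the observation history under this assumed signal model, which is why a single scalar can be carried forward from period to period.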

This paper was accepted by Baris Ata, stochastic models and simulation.

Volume 68, Issue 8

August 2022

Pages 5557-6354, iv-v

Information

Received: September 10, 2020
Accepted: August 12, 2021
Published Online: February 1, 2022

Cite as

Hao Zhang (2022) Analytical Solution to a Discrete-Time Model for Dynamic Learning and Decision Making. Management Science 68(8):5924-5957.

https://doi.org/10.1287/mnsc.2021.4194
