Search in the Dark: The Case with Recall and Gaussian Learning

Published Online:https://doi.org/10.1287/opre.2023.0150

The classic sequential search problem rewards the decision maker with the highest sampled value minus a cost per sample. If the sampling distribution is unknown, then a Bayesian decision maker faces a complex balance between exploration and exploitation. We solve the stopping problem of sampling from a normal distribution with unknown mean and variance and a conjugate prior, a longstanding open problem. The optimal stopping region may be empty (it may be optimal to continue the search regardless of the offer one receives, especially at the early stages), or it may consist of one or two bounded intervals. Whereas a single reservation price cannot describe the optimal rule, we do find an optimal index policy taking the form of a standardized reservation rule: stop if and only if the standardized value of the current best exceeds a threshold that depends on the standardized search cost. We also provide an algorithm to compute the index function, producing a practical way to implement the optimal stopping rule for any given prior, sampling history, and sampling horizon.

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2023.0150.

INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.