Multiarmed bandit (MAB) algorithms are an efficient way to reduce the opportunity cost of online experimentation, and companies use them to find the best product in periodically refreshed product catalogs. However, these algorithms face the so-called cold-start problem at the onset of the experiment: because customer preferences for new products are unknown, an initial data collection phase, known as the burn-in period, is required. During this period, standard MAB algorithms operate like randomized experiments and incur large burn-in costs that scale with the number of products. We aim to reduce the burn-in by observing that many products can be cast as two-sided products, whose rewards are naturally modeled by a matrix with rows and columns representing the two sides. We then design two-phase bandit algorithms that first use subsampling and low-rank matrix estimation to obtain a substantially smaller targeted set of products and then apply an Upper Confidence Bound procedure to the targeted products to find the best one. We show theoretically that the proposed algorithms lower costs and expedite the experiment when experimentation time is limited and the product set is large. Our analysis also provides insights into three experiment regimes (long, short, and ultra-short horizon), which are determined by the dimensions of the matrix. Empirical evidence from both synthetic data and a real-world data set on music streaming services validates the superior performance suggested by our theory.
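
The two-phase idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the round-robin exploration, the truncated-SVD estimator, the UCB1 index, and every name and parameter (`two_phase_bandit`, `pulls_per_arm`, `n_targets`, and so on) are assumptions chosen for illustration, and the Gaussian reward simulator stands in for real customer feedback.

```python
import numpy as np


def truncated_svd(matrix, rank):
    """Best rank-`rank` approximation of `matrix` in Frobenius norm."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def two_phase_bandit(mean_rewards, horizon, rank=1, n_sub_rows=5, n_sub_cols=5,
                     pulls_per_arm=5, n_targets=3, noise_std=0.1, seed=0):
    """Illustrative two-phase bandit on a simulated reward matrix.

    Returns (best_arm, targets, total_pulls). All parameters are hypothetical.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = mean_rewards.shape

    def pull(i, j):
        # Simulated noisy reward for the product in row i, column j.
        return mean_rewards[i, j] + noise_std * rng.standard_normal()

    # Phase 1a: subsample rows and columns to form a small submatrix.
    rows = rng.choice(n_rows, size=min(n_sub_rows, n_rows), replace=False)
    cols = rng.choice(n_cols, size=min(n_sub_cols, n_cols), replace=False)
    n_targets = min(n_targets, len(rows) * len(cols))

    phase1_cost = pulls_per_arm * len(rows) * len(cols)
    remaining = horizon - phase1_cost
    if remaining < n_targets:
        raise ValueError("horizon too short for this exploration budget")

    # Phase 1b: pull every entry of the submatrix equally often, then
    # denoise the averaged rewards with a low-rank approximation.
    sums = np.zeros((len(rows), len(cols)))
    for _ in range(pulls_per_arm):
        for a, i in enumerate(rows):
            for b, j in enumerate(cols):
                sums[a, b] += pull(i, j)
    estimate = truncated_svd(sums / pulls_per_arm, rank)

    # Targeted set: the entries with the largest estimated rewards.
    flat = np.argsort(estimate, axis=None)[::-1][:n_targets]
    sub_idx = zip(*np.unravel_index(flat, estimate.shape))
    targets = [(int(rows[a]), int(cols[b])) for a, b in sub_idx]

    # Phase 2: UCB1 restricted to the targeted set.
    counts = np.zeros(n_targets)
    totals = np.zeros(n_targets)
    for t in range(1, remaining + 1):
        if t <= n_targets:
            arm = t - 1  # pull each target once to initialize
        else:
            index = totals / counts + np.sqrt(2.0 * np.log(t) / counts)
            arm = int(np.argmax(index))
        totals[arm] += pull(*targets[arm])
        counts[arm] += 1

    best_arm = targets[int(np.argmax(totals / counts))]
    return best_arm, targets, phase1_cost + remaining
```

In this sketch the exploration cost is `pulls_per_arm * n_sub_rows * n_sub_cols`, which depends on the size of the subsampled matrix rather than on the full number of products. The trade-off is that the best product overall may fall outside the subsampled rows and columns, in which case the sketch can only return the best product within the sample.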

This paper was accepted by J. George Shanthikumar, data science.

Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2022.03394.

Articles In Advance

Information

Received: November 02, 2022
Accepted: November 10, 2025
Published Online: June 26, 2026

Cite as

Mohsen Bayati, Junyu Cao, Wanning Chen (2026) Speed Up the Cold-Start Learning in Two-Sided Bandits with Many Arms. Management Science 0(0).

https://doi.org/10.1287/mnsc.2022.03394

Acknowledgments

All authors contributed equally.

Speed Up the Cold-Start Learning in Two-Sided Bandits with Many Arms
