Online Learning with Sample Selection Bias

Published Online: https://doi.org/10.1287/opre.2023.0223

We consider the problem of personalized recommendations on online platforms, where user preferences are unknown and users interact with the platform through a sequence of decisions (such as clicking to watch on video platforms or clicking to donate on donation platforms). The platform aims to maximize the final outcome (e.g., viewing duration on video platforms or donations on donation platforms). However, the platform observes the final outcome only for users who complete the first stage by clicking on the recommendation; for users who do not click, the final outcome remains unobserved, a phenomenon also referred to as funneling. This censoring of outcomes creates a selection bias, because the outcomes at different stages are often correlated. We demonstrate that failing to account for this selection bias results in biased estimates and suboptimal recommendations. In fact, personalized learning algorithms that perform well in standard settings incur linear regret in this setting. We therefore propose the sample selection bandit (SSB) algorithm, which combines Heckman’s two-step estimator with the “optimism under uncertainty” principle to address the sample selection bias. We show that the SSB algorithm achieves a regret of Õ(√T), which is rate-optimal up to logarithmic terms. Furthermore, we conduct extensive numerical experiments on both synthetic data and real donation data collected from GoFundMe (a crowdfunding platform), demonstrating significant improvements over state-of-the-art benchmark learning algorithms in this setting.
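
To make the selection bias concrete, the following is a minimal sketch of the classical Heckman two-step estimator on synthetic data. It is not the SSB algorithm from the paper: it omits the bandit and optimism components entirely. The data-generating process, the function names (`fit_probit`, `heckman_two_step`, `simulate`), the parameter values, and the extra first-stage covariate `w` (an assumed exclusion restriction) are all hypothetical choices made for illustration.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def fit_probit(Z, d, iters=50, tol=1e-8):
    """Probit maximum likelihood via Fisher scoring (first stage: click or not)."""
    gamma = np.zeros(Z.shape[1])
    for _ in range(iters):
        idx = Z @ gamma
        p = np.clip(norm_cdf(idx), 1e-10, 1.0 - 1e-10)
        phi = norm_pdf(idx)
        score = Z.T @ (phi * (d - p) / (p * (1.0 - p)))
        weights = phi ** 2 / (p * (1.0 - p))
        info = Z.T @ (Z * weights[:, None])
        step = np.linalg.solve(info, score)
        gamma = gamma + step
        if np.max(np.abs(step)) < tol:
            break
    return gamma

def heckman_two_step(X, Z, clicked, y):
    """Step 1: probit on all users. Step 2: OLS on clickers only,
    with the inverse Mills ratio added as a bias-correction regressor."""
    gamma = fit_probit(Z, clicked)
    idx = Z @ gamma
    mills = norm_pdf(idx) / np.clip(norm_cdf(idx), 1e-10, None)
    sel = clicked == 1
    design = np.column_stack([X[sel], mills[sel]])
    coef = np.linalg.lstsq(design, y[sel], rcond=None)[0]
    return coef[:-1], coef[-1]  # outcome coefficients, Mills-ratio coefficient

def simulate(n=20000, rho=0.8, seed=0):
    """Hypothetical two-stage data: the click noise u and outcome noise e
    have correlation rho, which is what creates the selection bias."""
    rng = np.random.default_rng(seed)
    x, w, u = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
    e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.normal(size=n)
    clicked = (0.5 + x + w + u > 0).astype(float)  # first stage
    y = 1.0 + 2.0 * x + e  # final outcome; only used where clicked == 1
    X = np.column_stack([np.ones(n), x])
    Z = np.column_stack([np.ones(n), x, w])
    return X, Z, clicked, y

X, Z, clicked, y = simulate()
sel = clicked == 1
# Naive estimator: ordinary least squares on the observed (clicked) outcomes.
beta_naive = np.linalg.lstsq(X[sel], y[sel], rcond=None)[0]
beta_heck, delta = heckman_two_step(X, Z, clicked, y)
print("true slope: 2.0")
print("naive OLS slope on clickers:", beta_naive[1])
print("Heckman two-step slope:", beta_heck[1])
```

In this simulation the naive regression uses only users who clicked, so its slope estimate absorbs the correlation between the click decision and the outcome noise; the two-step correction removes most of that distortion. A bandit algorithm that feeds the naive estimate into its decisions would inherit the same bias in every round.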

Supplemental Material: All supplemental materials, including the code, data, and files required to reproduce the results, are available at https://doi.org/10.1287/opre.2023.0223.
