Online Allocation of Reusable Resources in Nonstationary Environments
Abstract
We study a general reusable resource allocation model under both model uncertainty and nonstationarity. Our study involves a set of heterogeneous customers who arrive sequentially at the decision maker’s (DM’s) platform, each associated with a different customer type. Each arriving customer’s type is drawn from an unknown and time-varying probability distribution. Upon observing the customer’s type, the DM selects an allocation decision that generates a random amount of reward and occupies random amounts of resource units. Each resource unit is occupied for a bounded random duration before becoming available for future allocations. The DM aims to maximize the total reward while ensuring that capacity constraints are met with certainty. Our model captures a variety of applications, such as admission control and assortment planning, in changing environments. We develop dual learning with nonstationarity tests, a multiphase online algorithm that converges to the optimal reward as the resource inventories and horizon length increase, under the mild assumption that the inventory of each resource is at least logarithmic in the horizon length. The algorithm incorporates a dual-learning process for decision making and employs a set of judiciously designed tests to detect potential drifts in the latent nonstationary environment.
Funding: This work was supported by the Ministry of Education—Singapore [Grant MOE-T2EP20121-0012]. We acknowledge the funding support from the Singapore Ministry of Education Academic Research Fund Tier 2 [Grant T2EP20121-0035].
Supplemental Material: The online appendix is available at https://doi.org/10.1287/moor.2023.0250.

