Online Allocation of Reusable Resources in Nonstationary Environments
References
- [1] (2014) Fast algorithms for online stochastic convex programming. Proc. Twenty-Sixth Annual ACM-SIAM Sympos. Discrete Algorithms (SIAM, Philadelphia), 1405–1424.Google Scholar
- [2] (2014) A dynamic near-optimal algorithm for online linear programming. Oper. Res. 62(4):876–890.Link, Google Scholar
- [3] (2022) Representing random utility choice models with neural networks. Preprint, submitted July 26, https://arxiv.org/abs/2207.12877.Google Scholar
- [4] (2019) Adaptively tracking the best bandit arm with an unknown number of distribution changes. Conf. Learn. Theory, Proceedings of Machine Learning Research, vol. 99 (PMLR, Phoenix), 138–158.Google Scholar
- [5] (2008) Near-optimal regret bounds for reinforcement learning. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates Inc., Red Hook, NY).Google Scholar
- [6] (2020) Dual mirror descent for online allocation problems. Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 119 (PMLR, New York), 613–628.Google Scholar
- [7] (2023) The best of many worlds: Dual mirror descent for online allocation problems. Oper. Res. 71(1):101–119.Link, Google Scholar
- [8] (2022) Static pricing: Universal guarantees for reusable resources. Oper. Res. 70(2):1143–1152.Link, Google Scholar
- [9] (2014) Stochastic multi-armed-bandit problem with non-stationary rewards. Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates Inc., Red Hook, NY).Google Scholar
- [10] (2015) Non-stationary stochastic optimization. Oper. Res. 63(5):1227–1244.Link, Google Scholar
- [11] (2022) A unifying framework for online optimization with long-term constraints. Oh AH, Agarwal A, Belgrave D, Cho K, eds. Advances in Neural Information Processing Systems, vol. 35 (Curran Associates Inc., Red Hook, NY).Google Scholar
- [12] (2017) Revenue management of reusable resources with advanced reservations. Production Oper. Management 26(5):836–859.Crossref, Google Scholar
- [13] (2019) A new algorithm for non-stationary contextual bandits: Efficient, optimal and parameter-free. Beygelzimer A, Hsu D, eds. Proc. Thirty-Second Conf. Learn. Theory, Proceedings of Machine Learning Research, vol. 99 (PMLR, New York), 696–726.Google Scholar
- [14] (2022) A statistical learning approach to personalization in revenue management. Management Sci. 68(3):1923–1937.Link, Google Scholar
- [15] (2020) Reinforcement learning for non-stationary Markov decision processes: The blessing of (more) optimism. Daumé H III, Singh A, eds. Proc. 37th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 119 (PMLR, New York), 1843–1854.Google Scholar
- [16] (2022) Hedging the drift: Learning to optimize under nonstationarity. Management Sci. 68(3):1696–1713.Link, Google Scholar
- [17] (2019) Near optimal online algorithms and fast approximation algorithms for resource allocation problems. J. ACM 66(1):7.Crossref, Google Scholar
- [18] (2019) Linear programming based online policies for real-time assortment of reusable resources. Preprint, submitted July 17, http://dx.doi.org/10.2139/ssrn.3421227.Google Scholar
- [19] (2022) Near-optimal Bayesian online assortment of reusable resources. Pennock DM, Segal I, Seuken S, eds. Proc. 23rd ACM Conf. Econom. Comput. (Association for Computing Machinery, New York), 964–965.Google Scholar
- [20] (2018) A sliding-window algorithm for Markov decision processes with arbitrarily changing rewards and transitions. Preprint, submitted May 25, https://arxiv.org/abs/1805.10066.Google Scholar
- [21] (2014) Real-time optimization of personalized assortments. Management Sci. 60(6):1532–1551.Link, Google Scholar
- [22] (2022) Online assortment optimization with reusable resources. Management Sci. 68(7):4772–4785.Google Scholar
- [23] (2025) Asymptotically optimal competitive ratio for online allocation of reusable resources. Oper. Res., ePub ahead of print January 21, https://doi.org/10.1287/opre.2021.0695.Google Scholar
- [24] (2020) Online allocation of reusable resources via algorithms guided by fluid approximations. Preprint, submitted October 8, https://arxiv.org/abs/2010.03983.Google Scholar
- [25] (2020) A neural-embedded choice model: Tastenet-MNL modeling taste heterogeneity with flexibility and interpretability. Preprint, submitted February 3, https://arxiv.org/abs/2002.00922.Google Scholar
- [26] (2022) Adversarial bandits with knapsacks. J. ACM 69(6):1–47.Google Scholar
- [27] (2025) Online stochastic optimization with Wasserstein based non-stationarity. Management Sci., ePub ahead of print March 3, https://doi.org/10.1287/mnsc.2020.03850.Google Scholar
- [28] (2018) The value of dynamic pricing in large queueing systems. Oper. Res. 66(2):409–425.Link, Google Scholar
- [29] (2020) Real-time dynamic pricing for revenue management with reusable resources, advance reservation, and deterministic service time requirements. Oper. Res. 68(3):676–685.Link, Google Scholar
- [30] (2010) Provably near-optimal LP-based policies for revenue management in systems with reusable resources. Oper. Res. 58(2):503–507.Link, Google Scholar
- [31] (2022) Online linear programming: Dual convergence, new algorithms, and regret bounds. Oper. Res. 70(5):2948–2966.Link, Google Scholar
- [32] (2020) Simple and fast algorithm for binary integer and online linear programming. Larochelle H, Ranzato MA, Hadsell R, Balcan M-F, Lin H-T, eds. Advances in Neural Information Processing Systems, vol. 33 (Curran Associates Inc., Red Hook, NY), 9412–9421.Google Scholar
- [33] (2022) Non-stationary bandits with knapsacks. Oh AH, Agarwal A, Belgrave D, Cho K, eds. Thirty-Sixth Annual Conf. Neural Inform. Processing Systems, vol. 35 (Curran Associates Inc., Red Hook, NY).Google Scholar
- [34] (2018) Efficient contextual bandits in non-stationary worlds. Proc. 31st Conf. Learn. Theory (COLT), Proceedings of Machine Learning Research, vol. 75 (PMLR, New York), 1739–1776.Google Scholar
- [35] (2020) Algorithms for online matching, assortment, and pricing with tight weight-dependent competitive ratios. Oper. Res. 68(6):1787–1803.Link, Google Scholar
- [36] (2021) Near-optimal model-free reinforcement learning in non-stationary episodic MDPs. Meila M, Zhang T, eds. Proc. 38th Internat. Conf. Machine Learn. (ICML), Proceedings of Machine Learning Research, vol. 139 (PMLR, New York), 7447–7458.Google Scholar
- [37] (2019) Unifying the stochastic and the adversarial bandits with knapsack. Kraus S, ed. Proc. Twenty-Eighth Internat. Joint Conf. Artificial Intelligence (IJCAI-19) (International Joint Conferences on Artificial Intelligence Organization, Macao, China), 3311–3317.Google Scholar
- [38] (2020) Dynamic assortment optimization for reusable products with random usage durations. Management Sci. 66(7):2820–2844.Link, Google Scholar
- [39] (2019) Weighted linear bandits for non-stationary environments. Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates Inc., Red Hook, NY).Google Scholar
- [40] (2021) Non-stationary reinforcement learning without prior knowledge: An optimal black-box approach. Belkin M, Kpotufe S, eds. Proc. Thirty Fourth Conf. Learn. Theory (PMLR, New York), 4300–4354.Google Scholar
- [41] (2024) Dynamic allocation of reusable resources: Logarithmic regret in overloaded networks. Oper. Res., ePub ahead of print July 24, https://doi.org/10.1287/opre.2022.0429.Google Scholar
- [42] (2017) Online convex optimization with stochastic constraints. Guyon I, Von Luxburg U, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates Inc., Red Hook, NY).Google Scholar
- [43] (2018) Online convex optimization for cumulative constraints. Bengio S, Chockler H, Wehbe L, eds. Advances in Neural Information Processing Systems, vol. 31 (Curran Associates Inc., Red Hook, NY).Google Scholar
- [44] (2022) Online resource allocation for reusable resources. Preprint, submitted December 6, https://arxiv.org/abs/2212.02855.Google Scholar
- [45] (2022) Learning the scheduling policy in time-varying multiclass many server queues with abandonment. Preprint, submitted April 21, http://dx.doi.org/10.2139/ssrn.4090021.Google Scholar

