On Greedy-Like Policies in Online Matching with Reusable Network Resources and Decaying Rewards

David Simchi-Levi
David Simchi-Levi
[email protected]
https://orcid.org/0000-0002-4650-1519
Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
Search for more papers by this author
,
Zeyu Zheng
Zeyu Zheng
[email protected]
https://orcid.org/0000-0001-5653-152X
Department of Industrial Engineering and Operations Research, University of California, Berkeley, California 94720
Search for more papers by this author
,
Feng Zhu
Corresponding Author
Feng Zhu
[email protected]
https://orcid.org/0000-0003-4979-4879
Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
Search for more papers by this author

Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Search for more papers by this author

Zeyu Zheng

[email protected]

https://orcid.org/0000-0001-5653-152X

Department of Industrial Engineering and Operations Research, University of California, Berkeley, California 94720

Search for more papers by this author

Feng Zhu

Corresponding Author

Feng Zhu

[email protected]

https://orcid.org/0000-0003-4979-4879

Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Search for more papers by this author

Published Online:21 Feb 2025https://doi.org/10.1287/mnsc.2023.02588

References

Adelman D (2007) Dynamic bid prices in revenue management. Oper. Res. 55(4):647–661.Link, Google Scholar
Aggarwal G, Goel G, Karande C, Mehta A (2011) Online vertex-weighted bipartite matching and single-bid budgeted allocations. Proc. 22nd Annual ACM-SIAM Sympos. on Discrete Algorithms (SIAM, Philadelphia), 1253–1264.Google Scholar
Atsidakou A, Papadigenopoulos O, Basu S, Caramanis C, Shakkottai S (2021) Combinatorial blocking bandits with stochastic delays. Internat. Conf. Machine Learn. (PMLR, New York), 404–413.Google Scholar
Baek J, Ma W (2022) Bifurcating constraints to improve approximation ratios for network revenue management with reusable resources. Oper. Res. 70(4):2226–2236.Google Scholar
Ball MO, Queyranne M (2009) Toward robust revenue management: Competitive analysis of online booking. Oper. Res. 57(4):950–963.Link, Google Scholar
Balseiro SR, Besbes O, Pizarro D (2024) Survey of dynamic resource-constrained reward collection problems: Unified model and analysis. Oper. Res. 72(5):2168–2189.Google Scholar
Banerjee S, Freund D (2020) SIGMETRICS’20. Abstracts of the 2020 SIGMETRICS/Performance Joint Internat. Conf. Measurement and Modeling of Comput. Systems, 1–2.Google Scholar
Basu S, Papadigenopoulos O, Caramanis C, Shakkottai S (2020) Contextual blocking bandits. arXiv preprint arXiv:2003.03426.Google Scholar
Basu S, Sen R, Sanghavi S, Shakkottai S (2019) Blocking bandits. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates, Inc., Red Hook, NY), 4784–4793.Google Scholar
Bertsimas D, Popescu I (2003) Revenue management in a dynamic network environment. Transportation Sci. 37(3):257–277.Link, Google Scholar
Borodin A, Karavasilis C, Pankratov D (2020) An experimental study of algorithms for online bipartite matching. ACM J. Experiment. Algorithmics 25:1–37.Crossref, Google Scholar
Bumpensanti P, Wang H (2020) A re-solving heuristic with uniformly bounded loss for network revenue management. Management Sci. 66(7):2993–3009.Link, Google Scholar
Cella L, Cesa-Bianchi N (2020) Stochastic bandits with delay-dependent payoffs. Proc. Internat. Conf. Artificial Intelligence and Statist. (PMLR, New York), 1168–1177.Google Scholar
Chan CW, Farias VF (2009) Stochastic depletion problems: Effective myopic policies for a class of dynamic optimization problems. Math. Oper Res. 34(2):333–350.Link, Google Scholar
Chen W, Wang Y, Yuan Y (2013) Combinatorial multi-armed bandit: General framework and applications. Proc. Internat. Conf. Machine Learn. (PMLR), 151–159.Google Scholar
Dai J, Kleywegt AJ, Xiao Y (2019) Network revenue management with cancellations and no-shows. Production Oper. Management 28(2):292–318.Crossref, Google Scholar
Delong S, Farhadi A, Niazadeh R, Sivan B (2022) Online bipartite matching with reusable resources. Proc. 23rd ACM Conf. Econom. Comput. (ACM, New York), 962–963.Google Scholar
Dickerson J, Sankararaman K, Srinivasan A, Xu P (2021) Allocation problems in ride-sharing platforms: Online matching with offline reusable resources. ACM Trans. Econom. Comput. (TEAC) 9(3):1–17.Google Scholar
Feng Y, Niazadeh R, Saberi A (2019) Linear programming based online policies for real-time assortment of reusable resources. Chicago Booth Res. Paper, 20–25.Google Scholar
Feng Y, Niazadeh R, Saberi A (2022) Near-optimal Bayesian online assortment of reusable resources. Proc. 23rd ACM Conf. Econom. Comput. (ACM, New York), 964–965.Google Scholar
Golrezaei N, Nazerzadeh H, Rusmevichientong P (2014) Real-time optimization of personalized assortments. Management Sci. 60(6):1532–1551.Link, Google Scholar
Gong XY, Goyal V, Iyengar GN, Simchi-Levi D, Udwani R, Wang S (2022) Online assortment optimization with reusable resources. Management Sci. 68(7):4772–4785.Link, Google Scholar
Goyal V, Iyengar G, Udwani R (2025) Asymptotically optimal competitive ratio for online allocation of reusable resources. Oper. Res., ePub ahead of print January 21, https://doi.org/10.1287/opre.2021.0695.Google Scholar
Hazan E, Safra S, Schwartz O (2006) On the complexity of approximating k-set packing. Comput. Complex 15(1):20–39.Crossref, Google Scholar
Heidari H, Kearns MJ, Roth A (2016) Tight policy regret bounds for improving and decaying bandits. Proc. Twenty-Fifth Internat. Joint Conf. Artificial Intelligence (AAAI Press, Palo Alto, CA), 1562–1570.Google Scholar
Hu M, Zhou Y (2022) Dynamic type matching. Manufacturing Service Oper. Management 24(1):125–142.Link, Google Scholar
Jasin S, Kumar S (2012) A re-solving heuristic with bounded revenue loss for network revenue management with customer choice. Math. Oper. Res. 37(2):313–345.Link, Google Scholar
Jasin S, Kumar S (2013) Analysis of deterministic lp-based booking limit and bid price controls for revenue management. Oper. Res. 61(6):1312–1320.Link, Google Scholar
Karp RM, Vazirani UV, Vazirani VV (1990) An optimal algorithm for on-line bipartite matching. Proc. 22nd Annual ACM Sympos. Theory Comput. (ACM, New York), 352–358.Google Scholar
Kleinberg R, Immorlica N (2018) Recharging bandits. Proc. IEEE 59th Annual Sympos. Foundations Comput. Sci. (IEEE, New York), 309–319.Google Scholar
Kunnumkal S, Topaloglu H (2010) Computing time-dependent bid prices in network revenue management problems. Transportation Sci. 44(1):38–62.Link, Google Scholar
Lei Y, Jasin S (2020) Real-time dynamic pricing for revenue management with reusable resources, advance reservation, and deterministic service time requirements. Oper. Res. 68(3):676–685.Link, Google Scholar
Levi R, Radovanović A (2010) Provably near-optimal LP-based policies for revenue management in systems with reusable resources. Oper. Res. 58(2):503–507.Link, Google Scholar
Levine N, Crammer K, Mannor S (2017) Rotting bandits. Guyon I, Von Luxburg U, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Inc., Red Hook, NY).Google Scholar
Ma W, Simchi-Levi D (2020) Algorithms for online matching, assortment, and pricing with tight weight-dependent competitive ratios. Oper. Res. 68(6):1787–1803.Link, Google Scholar
Ma Y, Rusmevichientong P, Sumida M, Topaloglu H (2020) An approximation algorithm for network revenue management under nonstationary arrivals. Oper. Res. 68(3):834–855.Link, Google Scholar
Mehta A (2013) Online matching and ad allocation. Foundations Trends Theoretical Comput. Sci. 8(4):265–368.Crossref, Google Scholar
Owen Z, Simchi-Levi D (2018) Price and assortment optimization for reusable resources. Preprint, submitted February 19, https://dx.doi.org/10.2139/ssrn.3070625.Google Scholar
Papadigenopoulos O, Caramanis C (2021) Recurrent submodular welfare and matroid blocking semi-bandits. Adv. Neural Inform. Processing Systems 34:23334–23346.Google Scholar
Pike-Burke C, Grunewalder S (2019) Recovering bandits. Wallach H, Larochelle H, Beygelzimer A, d’Alche-Buc, Fox E, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates, Inc., Red Hook, NY), 14122–14131.Google Scholar
Reiman MI, Wang Q (2008) An asymptotically optimal policy for a quantity-based network revenue management problem. Math. Oper. Res. 33(2):257–282.Link, Google Scholar
Rusmevichientong P, Sumida M, Topaloglu H (2020) Dynamic assortment optimization for reusable products with random usage durations. Management Sci. 66(7):2820–2844.Link, Google Scholar
Rusmevichientong P, Sumida M, Topaloglu H, Bai Y (2023) Revenue management with heterogenous resources: Unit resource capacities, advance bookings, and itineraries over time intervals. Oper. Res. 71(6):2196–2216.Google Scholar
Seznec J, Locatelli A, Carpentier A, Lazaric A, Valko M (2023) Rotting bandits are no harder than stochastic ones. Proc. 22nd Internat. Conf. Artificial Intelligence and Statist. (PMLR, New York), 2564–2572.Google Scholar
Seznec J, Menard P, Lazaric A, Valko M (2020) A single algorithm for both restless and rested rotting bandits. Proc. Internat. Conf. Artificial Intelligence and Statist. (PMLR, New York), 3784–3794.Google Scholar
Simchi-Levi D, Zheng Z, Zhu F (2021) Dynamic planning and learning under recovering rewards. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 9702–9711.Google Scholar
Topaloglu H (2009) Using lagrangian relaxation to compute capacity-dependent bid prices in network revenue management. Oper. Res. 57(3):637–649.Link, Google Scholar
Udwani R (2022) Periodic reranking for online matching of reusable resources. Proc. 23rd ACM Conf. Econom. Comput., 966.Google Scholar
Vera A, Banerjee S (2021) The bayesian prophet: A low-regret framework for online decision making. Management Sci. 67(3):1368–1391.Link, Google Scholar
Zhang D, Adelman D (2009) An approximate dynamic programming approach to network revenue management with customer choice. Transportation Sci. 43(3):381–394.Link, Google Scholar
Zhu F, Liu S, Wang R, Wang Z (2023) Assign-to-seat: Dynamic capacity control for selling high-speed train tickets. Manufacturing Service Oper. Management 25(3):921–938.Google Scholar

Volume 71, Issue 10

October 2025

Pages 8097-8993, iv-vi

Article Information

Supplemental Material

Metrics

Information

Received:August 16, 2023
Accepted:September 13, 2024
Published Online:February 21, 2025

Cite as

David Simchi-Levi, Zeyu Zheng, Feng Zhu (2025) On Greedy-Like Policies in Online Matching with Reusable Network Resources and Decaying Rewards. Management Science 71(10):8908-8926.

https://doi.org/10.1287/mnsc.2023.02588

Keywords

Acknowledgments

The authors thank the editor and reviewers for their time and comments that improved the manuscript and Xiaoyue Gong, Rajan Udwani, and Yunzong Xu for helpful comments and discussions. The authors are listed in alphabetical order.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

On Greedy-Like Policies in Online Matching with Reusable Network Resources and Decaying Rewards

References

Volume 71, Issue 10

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News