Online Learning and Optimization for Queues with Unknown Arrival Rate and Service Distribution

Xinyun Chen
Xinyun Chen
[email protected]
https://orcid.org/0000-0003-1727-0923
School of Data Science, School of Management and Economics, The Chinese University of Hong Kong, Shenzhen 518172, China
Search for more papers by this author
,
Guiyu Hong
Guiyu Hong
[email protected]
https://orcid.org/0009-0006-9616-318X
College of Business, Shanghai University of Finance and Economics, Shanghai 200433, China
Search for more papers by this author
,
Yunan Liu
Corresponding Author
Yunan Liu
[email protected]
https://orcid.org/0000-0001-9961-2610
Supply Chain Optimization Technology, Amazon, New York, New York 10001; and Industrial and Systems Engineering, North Carolina State University, Raleigh, North Carolina 27695
Search for more papers by this author

School of Data Science, School of Management and Economics, The Chinese University of Hong Kong, Shenzhen 518172, China

Search for more papers by this author

Guiyu Hong

[email protected]

https://orcid.org/0009-0006-9616-318X

College of Business, Shanghai University of Finance and Economics, Shanghai 200433, China

Search for more papers by this author

Yunan Liu

Corresponding Author

Yunan Liu

[email protected]

https://orcid.org/0000-0001-9961-2610

Supply Chain Optimization Technology, Amazon, New York, New York 10001; and Industrial and Systems Engineering, North Carolina State University, Raleigh, North Carolina 27695

Search for more papers by this author

Published Online:18 Feb 2026https://doi.org/10.1287/opre.2023.0304

References

Abate J, Choudhury GL, Whitt W (1993) Calculation of the GI/G/1 waiting-time distribution and its cumulants from Pollaczek’s formulas. Arch. Elektronik Ubertragungstechnik 47(5/6):311–321.Google Scholar
Asmussen S (2003) Applied Probability and Queues, vol. 2 (Springer, Berlin).Google Scholar
Baron O, Krass D, Senderovich A, Sherzer E (2024) Supervised ML for solving the GI/GI/1 queue. INFORMS J. Comput. 36(3):766–786.Link, Google Scholar
Bergquist J, Elmachtoub AN (2023) Static pricing guarantees for queueing systems. Preprint, submitted May 16, https://arxiv.org/abs/2305.09168. Google Scholar
Besbes O, Zeevi A (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407–1420.Link, Google Scholar
Besbes O, Zeevi A (2015) On the (surprising) sufficiency of linear models for dynamic pricing with demand learning. Management Sci. 61(4):723–739.Link, Google Scholar
Blanchet J, Chen X (2020) Rates of convergence to stationarity for reflected Brownian motion. Math. Oper. Res. 45(2):660–681.Link, Google Scholar
Broadie M, Cicek D, Zeevi A (2011) General bounds and finite-time improvement for the Kiefer-Wolfowitz stochastic approximation algorithm. Oper. Res. 59(5):1211–1224.Link, Google Scholar
Broder J, Rusmevichientong P (2012) Dynamic pricing under a general parametric choice model. Oper. Res. 60(4):965–980.Link, Google Scholar
Chen X, Liu Y, Hong G (2024) An online learning approach to dynamic pricing and capacity sizing in service systems. Oper. Res. 72(6):2677–2697.Link, Google Scholar
Cheung WC, Simchi-Levi D, Wang H (2017) Dynamic pricing and demand learning with limited price experimentation. Oper. Res. 65(6):1722–1731.Link, Google Scholar
Chong EK, Ramadge PJ (1993) Optimization of queues using an infnitesimal perturbation analysis-based stochastic algorithm with general update times. SIAM J. Control Optim. 31(3):698–732.Crossref, Google Scholar
Comte C, Jonckheere M, Sanders J, Senen-Cerda A (2023) Score-aware policy-gradient methods and performance guarantees using local lyapunov conditions: Applications to product-form stochastic networks and queueing systems. Preprint, submitted December 5, https://arxiv.org/abs/2312.02804.Google Scholar
Cosmetatos GP (1976) Some approximate equilibrium results for the multi-server queue (M/G/r). J. Oper. Res. Soc. 27(3):615–620.Crossref, Google Scholar
Dai JG, Gluzman M (2021) Queueing network controls via deep reinforcement learning. Stochastic Systems 12(1):30–67.Link, Google Scholar
Elmachtoub AN, Shi J (2023) The power of static pricing for reusable resources. Preprint, submitted February 23, https://arxiv.org/abs/2302.11723.Google Scholar
Fu MC (1990) Convergence of a stochastic approximation algorithm for the GI/G/1 queue using infinitesimal perturbation analysis. J. Optim. Theory Appl. 65:149–160.Crossref, Google Scholar
Garyfallos S, Liu Y, Barlet-Ros P, Cabellos-Aparicio A (2024) Service level prediction in non-Markovian nonstationary queues: A simulation-based deep learning approach. Lam H, Azar E, Batur D, Gao S, Xie W, eds. Proc. Winter Simulation Conf. (IEEE, Piscataway, NJ), 2655–2666.Google Scholar
Garyfallos S, Liu Y, Barlet-Ros P, Cabellos-Aparicio A (2025) NeuraliNQ: A neural network method for the transient performance analysis in non-Markovian queues. Queueing Systems 109(4):24.Google Scholar
Glasserman P (1992) Stationary waiting time derivatives. Queueing Systems 12(3):369–390.Crossref, Google Scholar
Huh WT, Janakiraman G, Muckstadt JA, Rusmevichientong P (2009) An adaptive algorithm for finding the optimal base-stock policy in lost sales inventory systems with censored demand. Math. Oper. Res. 34(2):397–416.Link, Google Scholar
Jia H, Shi C, Shen S (2022) Online learning and pricing with reusable resources: Linear bandits with sub-exponential rewards. Chaudhuri K, Jegelka S, Song L, Szepesvari C, Niu G, Sabato S, eds. Proc. Internat. Conf. Machine Learn. (International Conference on Machine Learning (ICML), San Diego), 10135–10160.Google Scholar
Jia H, Shi C, Shen S (2024) Online learning and pricing for service systems with reusable resources. Oper. Res. 72(3):1203–1241.Link, Google Scholar
Keskin NB, Zeevi A (2014) Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. Oper. Res. 62(5):1142–1167.Link, Google Scholar
Kim J, Randhawa RS (2018) The value of dynamic pricing in large queueing systems. Oper. Res. 66(2):409–425.Link, Google Scholar
Krishnasamy S, Sen R, Johari R, Shakkottai S (2021) Learning unknown service rates in queues: A multiarmed bandit approach. Oper. Res. 69(1):315–330.Link, Google Scholar
Kumar S, Randhawa RS (2010) Exploiting market size in service systems. Manufacturing Service Oper. Management 12(3):511–526.Link, Google Scholar
L’Ecuyer P, Glynn PW (1994) Stochastic optimization by simulation: Convergence proofs for the GI/GI/1 queue in steady state. Management Sci. 40(11):1562–1578.Link, Google Scholar
L’Ecuyer P, Giroux N, Glynn PW (1994) Stochastic optimization by simulation: Numerical experiments with the M/M/1 queue in steady-state. Management Sci. 40(10):1245–1261.Link, Google Scholar
Lee C, Ward AR (2014) Optimal pricing and capacity sizing for the GI/GI/1 queue. Oper. Res. Lett. 42:527–531.Crossref, Google Scholar
Lee C, Ward AR (2019) Pricing and capacity sizing of a service facility: Customer abandonment effects. Production Oper. Management 28(8):2031–2043.Crossref, Google Scholar
Liu B, Xie Q, Modiano E (2019) Reinforcement learning for optimal control of queueing systems. Liberzon D, Dominguez-Garcia A, eds. Proc. 57th Annual Allerton Conf. Comm. Control Comput. (IEEE, Piscataway, NJ), 663–670.Google Scholar
Maglaras C, Zeevi A (2003) Pricing and capacity sizing for systems with shared resources: Approximate solutions and scaling relations. Management Sci. 49(8):1018–1038.Link, Google Scholar
Murthy Y, Grosof I, Maguluri ST, Srikant R (2024) Performance of NPG in countable state-space average-cost RL. Preprint, submitted May 30, https://arxiv.org/abs/2405.20467.Google Scholar
Nair J, Wierman A, Zwart B (2016) Provisioning of large-scale systems: The interplay between network effects and strategic behavior in the user base. Management Sci. 62(6):1830–1841.Link, Google Scholar
Pollaczek F (1930) Über eine aufgabe der wahrscheinlichkeitstheorie. I. Math. Z 32(1):64–100.Crossref, Google Scholar
Raeis M, Tizghadam A, Leon-Garcia A (2021) Queue-learning: A reinforcement learning approach for providing quality of service. Leyton-Brown K, Mausam M, eds. Proc. AAAI Conf. Artificial Intelligence, vol. 35 (AAAI Press, Palo Alto, CA), 461–468.Crossref, Google Scholar
Shah D, Xie Q, Xu Z (2020) Stable reinforcement learning with unbounded state space. Bayen AM, Jadbabaie A, Pappas G, Parrilo PA, Recht B, Tomlin C, Zeilinger M, eds. Proc. 2nd Conf. Learn. Dynamics Control, Proceedings of Machine Learning Research, vol. 120, 581.Google Scholar
Sutton RS, Barto AG (2018) Reinforcement Learning: An Introduction, 2nd ed. (MIT Press, Cambridge, MA).Google Scholar
Walton N, Xu K (2021) Learning and information in stochastic networks and queues. Carlsson JG, ed. Tutorials in Operations Research: Emerging Optimization Methods and Modeling Techniques with Applications (INFORMS, Catonsville, MD), 161–198.Link, Google Scholar
Yuan H, Luo Q, Shi C (2021) Marrying stochastic gradient descent with bandits: Learning algorithms for inventory systems with fixed costs. Management Sci. 67(10):6089–6115.Link, Google Scholar
Zhang H, Chao X, Shi C (2020) Closing the gap: A learning algorithm for lost-sales inventory systems with lead times. Management Sci. 66(5):1962–1980.Link, Google Scholar
Zhong Y, Birge JR, Ward AR (2025) Learning to schedule in multiclass many-server queues with abandonment. Oper. Res. 73(6):3085–3103.Link, Google Scholar

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Received:June 08, 2023
Accepted:December 18, 2025
Published Online:February 18, 2026

Cite as

Xinyun Chen, Guiyu Hong, Yunan Liu (2026) Online Learning and Optimization for Queues with Unknown Arrival Rate and Service Distribution. Operations Research 0(0).

https://doi.org/10.1287/opre.2023.0304

Keywords

Acknowledgments

The authors thank the editors and anonymous referees for helpful comments.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Online Learning and Optimization for Queues with Unknown Arrival Rate and Service Distribution

References

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News