Learning Payoffs While Routing in Skill-Based Queues

