Integrated Online Learning and Adaptive Control in Queueing Systems with Uncertain Payoffs

Wei-Kang Hsu (Corresponding Author)
https://orcid.org/0000-0001-5969-6530
School of Electrical and Computer Engineering, Purdue University, West Lafayette, Indiana 47907

Jiaming Xu
https://orcid.org/0000-0001-6104-4742
The Fuqua School of Business, Duke University, Durham, North Carolina 27708

Xiaojun Lin
School of Electrical and Computer Engineering, Purdue University, West Lafayette, Indiana 47907

Mark R. Bell
School of Electrical and Computer Engineering, Purdue University, West Lafayette, Indiana 47907

Published Online: 9 Mar 2021
https://doi.org/10.1287/opre.2021.2100

Volume 70, Issue 2

March-April 2022

Pages iii-viii, 641-1291, C2-C3

Received: December 11, 2018
Accepted: August 05, 2020
Published Online: March 09, 2021

Cite as

Wei-Kang Hsu, Jiaming Xu, Xiaojun Lin, Mark R. Bell (2021) Integrated Online Learning and Adaptive Control in Queueing Systems with Uncertain Payoffs. Operations Research 70(2):1166-1181.

https://doi.org/10.1287/opre.2021.2100
