Fair Exploration via Axiomatic Bargaining

Jackie Baek (Corresponding Author)
https://orcid.org/0000-0001-5538-509X
Stern School of Business, New York University, New York, New York 10012

Vivek F. Farias
https://orcid.org/0000-0002-5856-9246
Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02142

Published Online: 15 Mar 2024
https://doi.org/10.1287/mnsc.2022.01985

Volume 70, Issue 12

December 2024

Pages 8217-9119, iv-vi

Received: November 03, 2021
Accepted: July 21, 2023
Published Online: March 15, 2024

Cite as

Jackie Baek, Vivek F. Farias (2024) Fair Exploration via Axiomatic Bargaining. Management Science 70(12):8922-8939.

https://doi.org/10.1287/mnsc.2022.01985

Acknowledgments

The authors are grateful to the anonymous reviewers for their insightful and thorough comments that greatly improved this paper from its earlier versions. The authors also thank Retsef Levi, Thodoris Lykouris, Tianyi Peng, Manish Raghavan, Andy Zheng, and numerous seminar participants for many helpful discussions and comments. Both authors were partially supported by NSF Grant CMMI 1727239. An abridged version of this work appeared in the proceedings of Advances in Neural Information Processing Systems 34 (NeurIPS 2021).
