Open Access

On Statistical Discrimination as a Failure of Social Learning: A Multiarmed Bandit Approach

Junpei Komiyama
Corresponding Author
[email protected]
https://orcid.org/0000-0003-0095-6558
Leonard N. Stern School of Business, New York University, New York, New York 10012
Shunya Noda
[email protected]
https://orcid.org/0000-0001-6955-3373
Graduate School of Economics, The University of Tokyo, Tokyo 113-0033, Japan

Published Online: 29 Mar 2024
https://doi.org/10.1287/mnsc.2022.00893

References

Abbasi-Yadkori Y, Pál D, Szepesvári C (2011) Improved algorithms for linear stochastic bandits. Taylor JS, Zemel RS, Bartlett PL, Pereira FCN, Weinberger KQ, eds. Proc. Twenty-fourth Conf. Neural Inform. Processing Systems (Curran Associates, New York), 2312–2320.
Abe N, Long PM (1999) Associative reinforcement learning using linear probabilistic concepts. Bratko I, Dzeroski S, eds. Proc. Sixteenth Internat. Conf. Machine Learn. (Morgan Kaufmann, San Francisco), 3–11.
Al-Ali MN (2004) How to get yourself on the door of a job: A cross-cultural contrastive study of Arabic and English job application letters. J. Multilingual Multicultural Development 25(1):1–23.
Arrow K (1973) The theory of discrimination. Ashenfelter O, Rees A, eds. Discrimination in Labor Markets (Princeton University Press, Princeton, NJ), 3–33.
Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time analysis of the multi-armed bandit problem. Machine Learn. 47(2):235–256.
Banerjee AV (1992) A simple model of herd behavior. Quart. J. Econom. 107(3):797–817.
Bardhi A, Guo Y, Strulovici B (2020) Early-career discrimination: Spiraling or self-correcting? Working paper, Duke University, Durham, NC.
Bastani H, Bayati M, Khosravi K (2021) Mostly exploration-free algorithms for contextual bandits. Management Sci. 67(3):1329–1349.
Bechavod Y, Ligett K, Roth A, Waggoner B, Wu SZ (2019) Equal opportunity in online classification with partial feedback. Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, Garnett R, eds. Proc. Thirty-Third Conf. Neural Inform. Processing Systems (Curran Associates, New York), 8972–8982.
Bikhchandani S, Hirshleifer D, Welch I (1992) A theory of fads, fashion, custom, and cultural change as informational cascades. J. Political Econom. 100(5):992–1026.
Bohren JA, Imas A, Rosenberg M (2019) The dynamics of discrimination: Theory and evidence. Amer. Econom. Rev. 109(10):3395–3436.
Bohren JA, Haggag K, Imas A, Pope DG (2023) Inaccurate statistical discrimination: An identification problem. Rev. Econom. Statist., 1–45.
Calders T, Verwer S (2010) Three naive Bayes approaches for discrimination-free classification. Data Min. Knowl. Discov. 21(2):277–292.
Che YK, Kim K, Zhong W (2019) Statistical discrimination in ratings-guided markets. Working paper, Columbia University, New York.
Chen Y, Cuellar A, Luo H, Modi J, Nemlekar H, Nikolaidis S (2020) The fair contextual multi-armed bandit. Jonas P, David S, eds. Proc. Nineteenth Internat. Conf. Autonomous Agents Multiagent Systems (Journal of Machine Learning Research), 1810–1812.
Chu W, Li L, Reyzin L, Schapire R (2011) Contextual bandits with linear payoff functions. Geoffrey G, David D, Miroslav D, eds. Proc. Fourteenth Internat. Conf. Artificial Intelligence Statist. (Journal of Machine Learning Research), 208–214.
Coate S, Loury GC (1993) Will affirmative-action policies eliminate negative stereotypes? Amer. Econom. Rev. 83(5):1220–1240.
Cornell B, Welch I (1996) Culture, information, and screening discrimination. J. Political Econom. 104(3):542–571.
Dianat A, Echenique F, Yariv L (2022) Statistical discrimination and affirmative action in the laboratory. Games Econom. Behav. 132:41–58.
Foster D, Vohra R (1992) An economic argument for affirmative action. Rationality Soc. 4(2):176–188.
Frazier P, Kempe D, Kleinberg J, Kleinberg R (2014) Incentivizing exploration. Babaioff M, Conitzer V, Easley DA, eds. Proc. Fifteenth ACM Conf. Econom. Comput. (Association for Computing Machinery, New York), 5–22.
Hanna RN, Linden LL (2012) Discrimination in grading. Amer. Econom. J. Econom. Policy 4(4):146–168.
Hannák A, Wagner C, Garcia D, Mislove A, Strohmaier M, Wilson C (2017) Bias in online freelance marketplaces: Evidence from TaskRabbit and Fiverr. Lee CP, Poltrock SE, Barkhuus L, Borges M, Kellogg WA, eds. Proc. 2017 ACM Conf. Comput. Supported Cooperative Work Soc. Comput. (Association for Computing Machinery, New York), 1914–1933.
Hardt M, Price E, Srebro N (2016) Equality of opportunity in supervised learning. Lee DD, Sugiyama M, von Luxburg U, Guyon I, Garnett R, eds. Proc. Twenty-ninth Conf. Neural Inform. Processing Systems (Curran Associates, New York), 3315–3323.
Hu L, Chen Y (2018) A short-term intervention for long-term fairness in the labor market. Champin PA, Gandon G, Lalmas M, Ipeirotis PG, eds. Proc. 2018 World Wide Web Conf. (Association for Computing Machinery, New York), 1389–1398.
Immorlica N, Mao J, Slivkins A, Wu ZS (2020) Incentivizing exploration with selective data disclosure. Biró P, Hartline JD, Ostrovsky M, Procaccia AD, eds. Proc. Twenty-First ACM Conf. Econom. Comput. (Association for Computing Machinery, New York), 647–648.
Johari R, Kamble V, Krishnaswamy AK, Li H (2018) Exploration vs. exploitation in team formation. Christodoulou G, Harks T, eds. Proc. Fourteenth Conf. Web Internet Econom., vol. 11316 (Springer, Berlin, Heidelberg), 452.
Joseph M, Kearns M, Morgenstern JH, Roth A (2016) Fairness in learning: Classic and contextual bandits. Lee DD, Sugiyama M, von Luxburg U, Guyon I, Garnett R, eds. Proc. Twenty-Ninth Conf. Neural Inform. Processing Systems (Curran Associates, New York), 325–333.
Kannan S, Roth A, Ziani J (2019) Downstream effects of affirmative action. Boyd D, Morgenstern JH, eds. Proc. Second Conf. Fairness Accountability Transparency (Association for Computing Machinery, New York), 240–248.
Kannan S, Morgenstern JH, Roth A, Waggoner B, Wu ZS (2018) A smoothed analysis of the greedy algorithm for the linear contextual bandit problem. Bengio S, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Proc. Thirty-Second Conf. Neural Inform. Processing Systems (Curran Associates, New York), 2227–2236.
Kannan S, Kearns M, Morgenstern J, Pai M, Roth A, Vohra R, Wu ZS (2017) Fairness incentives for myopic agents. Proc. 2017 ACM Conf. Econom. Comput. (Association for Computing Machinery, New York), 369–386.
Kennedy P (2008) A Guide to Econometrics, 6th ed. (Wiley-Blackwell, Hoboken, NJ), 192–202.
Kleinberg JM, Mullainathan S, Raghavan M (2017) Inherent trade-offs in the fair determination of risk scores. Papadimitriou CH, ed. Proc. Eighth Conf. Innovations Theoret. Comput. Sci. (Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl), 43:1–43:23.
Kremer I, Mansour Y, Perry M (2014) Implementing the ‘wisdom of the crowd’. J. Political Econom. 122(5):988–1012.
Lai T, Robbins H (1985) Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1):4–22.
Langford J, Zhang T (2008) The epoch-greedy algorithm for contextual multi-armed bandits. Proc. Twentieth Conf. Neural Inform. Processing Systems (Curran Associates, New York), 817–824.
Li D, Raymond L, Bergman P (2020) Hiring as exploration. NBER Working Paper No. 27736, National Bureau of Economic Research, Cambridge, MA.
MacNell L, Driscoll A, Hunt AN (2015) What’s in a name: Exposing gender bias in student ratings of teaching. Innovative Higher Ed. 40(4):291–303.
Mailath GJ, Samuelson L, Shaked A (2000) Endogenous inequality in integrated labor markets with two-sided search. Amer. Econom. Rev. 90(1):46–72.
Makhlouf K, Zhioua S, Palamidessi C (2021) Machine learning fairness notions: Bridging the gap with real-world applications. Inform. Processing Management 58(5):102642.
Mansour Y, Slivkins A, Syrgkanis V (2020) Bayesian incentive-compatible bandit exploration. Oper. Res. 68(4):1132–1161.
Mitchell KM, Martin J (2018) Gender bias in student evaluations. PS Political Sci. Politics 51(3):648–652.
Monachou FG, Ashlagi I (2019) Discrimination in online markets: Effects of social bias on learning from reviews and policy design. Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, Garnett R, eds. Proc. Thirty-Second Conf. Neural Inform. Processing Systems (Curran Associates, New York), 2145–2155.
Moro A, Norman P (2004) A general equilibrium model of statistical discrimination. J. Econom. Theory 114(1):1–30.
Neumark D (2018) Experimental research on labor market discrimination. J. Econom. Literature 56(3):799–866.
Owen AB, Varian H (2020) Optimizing the tie-breaker regression discontinuity design. Electronic J. Statist. 14(2):4004–4027.
Papanastasiou Y, Bimpikis K, Savva N (2018) Crowdsourcing exploration. Management Sci. 64(4):1727–1746.
Pedreschi D, Ruggieri S, Turini F (2008) Discrimination-aware data mining. Li Y, Liu B, Sarawagi S, eds. Proc. Fourteenth ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 560–568.
Peña VH, Lai TL, Shao QM (2008) Self-Normalized Processes: Limit Theory and Statistical Applications (Springer Science & Business Media, Berlin).
Phelps ES (1972) The statistical theory of racism and sexism. Amer. Econom. Rev. 62(4):659–661.
Precht K (1998) A cross-cultural comparison of letters of recommendation. English Specific Purposes 17(3):241–265.
Raghavan M, Slivkins A, Vaughan JW, Wu ZS (2018) The externalities of exploration and how data diversity helps exploitation. Bubeck S, Perchet V, Rigollet P, eds. Proc. Machine Learn. Res., vol. 75 (PMLR), 1724–1738.
Robbins H (1952) Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. (N.S.) 58(5):527–535.
Rusmevichientong P, Tsitsiklis JN (2010) Linearly parameterized bandits. Math. Oper. Res. 35(2):395–411.
Smith L, Sørensen P (2000) Pathological outcomes of observational learning. Econometrica 68(2):371–398.
Thompson WR (1933) On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3–4):285–294.
Trix F, Psenka C (2003) Exploring the color of glass: Letters of recommendation for female and male medical faculty. Discourse Soc. 14(2):191–220.
Xu L, Honda J, Sugiyama M (2018) A fully adaptive algorithm for pure exploration in linear bandits. Perez-Cruz, ed. Proc. Twenty-First Internat. Conf. Artificial Intelligence Statist. (Journal of Machine Learning Research), 843–851.

Volume 72, Issue 1

January 2026

Pages 1-782, iv-vi

Information

Received: March 22, 2022
Accepted: October 31, 2023
Published Online: March 29, 2024

Cite as

Junpei Komiyama, Shunya Noda (2024) On Statistical Discrimination as a Failure of Social Learning: A Multiarmed Bandit Approach. Management Science 72(1):442-455.

https://doi.org/10.1287/mnsc.2022.00893
