
Volume 71, Issue 11

November 2025

Pages 8995-9868, iv-vi


Received: June 27, 2023
Accepted: December 15, 2024
Published Online: March 5, 2025

Cite as

Kai Feng, Han Hong, Ke Tang, Jingyuan Wang (2025) Statistical Tests for Replacing Human Decision Makers with Algorithms. Management Science 71(11):9145-9170.

https://doi.org/10.1287/mnsc.2023.01845


Acknowledgments

The authors thank the editors and four anonymous referees, Brendan Beare, Bo Honore, Xiaohong Chen, Ron Gallant, Bruce Hansen, Peter Hansen, Bentley MacLeod, Jack Porter, Adam Rosen, Andres Santos, George Tauchen, Valentin Verdier, and participants at various seminars and conferences for insightful comments. The authors also thank Yichuan Zhang, Xin Lin, and Zhonghao Huang for excellent research assistance. This study was approved by the National Health Commission. Informed consent was obtained from all NFPC participants.


Statistical Tests for Replacing Human Decision Makers with Algorithms
