Minimax-Optimal Policy Learning Under Unobserved Confounding

Nathan Kallus (Corresponding Author)
[email protected]
https://orcid.org/0000-0002-2757-1570
Cornell University, New York, New York 10044

Angela Zhou
[email protected]
https://orcid.org/0000-0003-2814-5693
Cornell University, New York, New York 10044

Published Online: 6 Oct 2020
https://doi.org/10.1287/mnsc.2020.3699

Volume 67, Issue 5

May 2021

Pages 2657-3320, iii-iv

Received: June 27, 2018
Accepted: May 01, 2020
Published Online: October 06, 2020

Cite as

Nathan Kallus, Angela Zhou (2020) Minimax-Optimal Policy Learning Under Unobserved Confounding. Management Science 67(5):2870-2890.

https://doi.org/10.1287/mnsc.2020.3699
