Calibration of Heterogeneous Treatment Effects in Randomized Experiments

Published Online: https://doi.org/10.1287/isre.2021.0343
