Causal Decision Making and Causal Effect Estimation Are Not the Same…and Why It Matters
Published Online: 10 Mar 2022 https://doi.org/10.1287/ijds.2021.0006

