Deep Learning-Based Causal Inference for Large-Scale Combinatorial Experiments: Theory and Empirical Evidence

Published Online: https://doi.org/10.1287/mnsc.2024.04625

References

  • Adcock B, Dexter N (2021) The gap between theory and practice in function approximation with deep neural networks. SIAM J. Math. Data Sci. 3(2):624–655.
  • Angrist JD, Pischke JS (2009) Mostly Harmless Econometrics: An Empiricist’s Companion (Princeton University Press, Princeton, NJ).
  • Arkhangelsky D, Athey S, Hirshberg DA, Imbens GW, Wager S (2021) Synthetic difference-in-differences. Amer. Econom. Rev. 111(12):4088–4118.
  • Athey S, Imbens G (2016) Recursive partitioning for heterogeneous causal effects. Proc. Natl. Acad. Sci. USA 113(27):7353–7360.
  • Athey S, Imbens GW, Wager S (2018) Approximate residual balancing: Debiased inference of average treatment effects in high dimensions. J. Roy. Statist. Soc. Ser. B Statist. Methodology 80(4):597–623.
  • Bertsimas D, Imai K, Li ML (2022) Distributionally robust causal inference with observational data. Preprint, submitted October 15, https://arxiv.org/abs/2210.08326.
  • Bojinov I, Simchi-Levi D, Zhao J (2023) Design and analysis of switchback experiments. Management Sci. 69(7):3759–3777.
  • Box GEP, Hunter WG, Hunter JS (1978) Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building (John Wiley and Sons, New York).
  • Burtch G, Ghose A, Wattal S (2015) The hidden cost of accommodating crowdfunder privacy preferences: A randomized field experiment. Management Sci. 61(5):949–962.
  • Candogan O, Chen C, Niazadeh R (2024) Correlated cluster-based randomized experiments: Robust variance minimization. Management Sci. 70(6):4069–4086.
  • Chernozhukov V, Newey WK, Singh R (2022) Automatic debiased machine learning of causal and structural effects. Econometrica 90(3):967–1027.
  • Chernozhukov V, Chetverikov D, Demirer M, Duflo E, Hansen C, Newey W, Robins J (2018) Double/debiased machine learning for treatment and structural parameters. Econom. J. 21(1):C1–C68.
  • Cheung WC, Simchi-Levi D, Wang H (2017) Dynamic pricing and demand learning with limited price experimentation. Oper. Res. 65(6):1722–1731.
  • Chiang HD, Kato K, Ma Y, Sasaki Y (2022) Multiway cluster robust double/debiased machine learning. J. Bus. Econom. Statist. 40(3):1046–1056.
  • Dasgupta T, Pillai NS, Rubin DB (2015) Causal inference from 2^K factorial designs by using potential outcomes. J. Roy. Statist. Soc. Ser. B Statist. Methodology 77(4):727–753.
  • Dube A, Jacobs J, Naidu S, Suri S (2020) Monopsony in online labor markets. Amer. Econom. Rev. Insights 2(1):33–46.
  • Edelman B, Luca M, Svirsky D (2017) Racial discrimination in the sharing economy: Evidence from a field experiment. Amer. Econom. J. Appl. Econom. 9(2):1–22.
  • Fan Q, Hsu YC, Lieli RP, Zhang Y (2022) Estimation of conditional average treatment effects with high-dimensional data. J. Bus. Econom. Statist. 40(1):313–327.
  • Farbmacher H, Huber M, Lafférs L, Langen H, Spindler M (2022) Causal mediation analysis with double machine learning. Econom. J. 25(2):277–300.
  • Farias V, Li A, Peng T (2021) Learning treatment effects in panels with general intervention patterns. Adv. Neural Inform. Processing Systems 34:14001–14013.
  • Farrell MH, Liang T, Misra S (2020) Deep learning for individual heterogeneity: An automatic inference framework. Preprint, submitted October 28, https://arxiv.org/abs/2010.14694.
  • Farrell MH, Liang T, Misra S (2021) Deep neural networks for estimation and inference. Econometrica 89(1):181–213.
  • Goli A, Lambrecht A, Yoganarasimhan H (2024) A bias correction approach for interference in ranking experiments. Marketing Sci. 43(3):590–614.
  • Gordon BR, Moakler R, Zettelmeyer F (2023) Close enough? A large-scale exploration of non-experimental approaches to advertising measurement. Marketing Sci. 42(4):768–793.
  • Guo Y, Coey D, Konutgan M, Li W, Schoener C, Goldman M (2021) Machine learning for variance reduction in online experiments. Adv. Neural Inform. Processing Systems 34:8637–8648.
  • Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Networks 2(5):359–366.
  • Imbens GW, Rubin DB (2015) Causal Inference in Statistics, Social, and Biomedical Sciences (Cambridge University Press, Cambridge, UK).
  • Johari R, Li H, Liskovich I, Weintraub GY (2022) Experimental design in two-sided platforms: An analysis of bias. Management Sci. 68(10):7069–7089.
  • Kallus N, Mao X, Udell M (2018) Causal inference with noisy and missing covariates via matrix factorization. Adv. Neural Inform. Processing Systems 31:6921–6932.
  • Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. Preprint, submitted December 22, https://arxiv.org/abs/1412.6980.
  • Knaus MC (2022) Double machine learning-based programme evaluation under unconfoundedness. Econom. J. 25(3):602–627.
  • Kohavi R, Thomke S (2017) The surprising power of online experiments. Harvard Bus. Rev. 95(5):74–82.
  • Kohavi R, Tang D, Xu Y (2020) Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing (Cambridge University Press, Cambridge, UK).
  • Lee MR, Shen M (2018) Winner’s curse: Bias estimation for total effects of features in online controlled experiments. Proc. 24th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (ACM, New York), 491–499.
  • Lim AE, Shanthikumar JG, Shen ZM (2006) Model uncertainty, robust optimization, and learning. Models, Methods, and Applications for Innovative Decision Making (INFORMS, Catonsville, MD), 66–94.
  • Nandy P, Venugopalan D, Lo C, Chatterjee S (2021) A/B testing for recommender systems in a two-sided marketplace. Adv. Neural Inform. Processing Systems 34:6466–6477.
  • Newey WK (1994) The asymptotic variance of semiparametric estimators. Econometrica 62(6):1349–1382.
  • Pashley NE, Bind MAC (2023) Causal inference for multiple treatments using fractional factorial designs. Canadian J. Statist. 51(2):444–468.
  • Song Y, Sun T (2024) Ensemble experiments to optimize interventions along the customer journey: A reinforcement learning approach. Management Sci. 70(8):5115–5130.
  • Tang D, Agarwal A, O’Brien D, Meyer M (2010) Overlapping experiment infrastructure: More, better, faster experimentation. Proc. 16th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (ACM, New York), 17–26.
  • Tang J, Qi Z, Fang E, Shi C (2025) Offline feature-based pricing under censored demand: A causal inference approach. Manufacturing Service Oper. Management 27(2):535–553.
  • Wager S, Athey S (2018) Estimation and inference of heterogeneous treatment effects using random forests. J. Amer. Statist. Assoc. 113(523):1228–1242.
  • Wooldridge JM (2010) Econometric Analysis of Cross Section and Panel Data (MIT Press, Cambridge, MA).
  • Wu CFJ, Hamada MS (2011) Experiments: Planning, Analysis, and Optimization (John Wiley & Sons, Hoboken, NJ).
  • Xie H, Aurisset J (2016) Improving the sensitivity of online controlled experiments: Case studies at Netflix. Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (ACM, New York), 645–654.
  • Xiong T, Wang Y, Zheng S (2020) Orthogonal traffic assignment in online overlapping A/B tests. EasyChair technical report, Tencent Inc., Shenzhen, China.
  • Xiong R, Chin A, Taylor S (2023) Bias-variance tradeoffs for designing simultaneous temporal experiments. The KDD’23 Workshop Causal Discovery, Prediction Decision (PMLR, New York), 115–131.
  • Yarotsky D (2017) Error bounds for approximations with deep ReLU networks. Neural Networks 94:103–114.
  • Ye Z, Zhang DJ, Zhang H, Zhang R, Chen X, Xu Z (2023) Cold start to improve market thickness on online advertising platforms: Data-driven algorithms and field experiments. Management Sci. 69(7):3838–3860.
  • Zeng Z, Dai H, Zhang DJ, Zhang H, Zhang RP, Xu Z, Shen Z-JM (2023) The impact of social nudges on user-generated content for social network platforms. Management Sci. 69(9):5189–5208.
  • Zhang Y, Politis DN (2022) Ridge regression revisited: Debiasing, thresholding and bootstrap. Ann. Statist. 50(3):1401–1422.
  • Zhang DJ, Dai H, Dong L, Qi F, Zhang N, Liu X, Liu Z, Yang J (2020) The long-term and spillover effects of price promotions on retailing platforms: Evidence from a large randomized experiment on Alibaba. Management Sci. 66(6):2589–2609.