Deep Learning-Based Causal Inference for Large-Scale Combinatorial Experiments: Theory and Empirical Evidence

Zikun Ye
[email protected]
https://orcid.org/0000-0001-9914-7966
Michael G. Foster School of Business, University of Washington, Seattle, Washington 98195

Zhiqi Zhang
[email protected]
https://orcid.org/0009-0005-4566-8148
Olin Business School, Washington University in St. Louis, St. Louis, Missouri 63130

Dennis J. Zhang
[email protected]
https://orcid.org/0000-0002-4544-775X
Olin Business School, Washington University in St. Louis, St. Louis, Missouri 63130

Heng Zhang
[email protected]
https://orcid.org/0000-0002-6105-6994
W. P. Carey School of Business, Arizona State University, Tempe, Arizona 85287

Renyu Zhang (Corresponding Author)
[email protected]
https://orcid.org/0000-0003-0284-164X
Chinese University of Hong Kong Business School, The Chinese University of Hong Kong, Hong Kong, China

Published Online: 15 Oct 2025
https://doi.org/10.1287/mnsc.2024.04625

Articles In Advance

Information

Received: January 23, 2024
Accepted: March 02, 2025
Published Online: October 15, 2025

Cite as

Zikun Ye, Zhiqi Zhang, Dennis J. Zhang, Heng Zhang, Renyu Zhang (2025) Deep Learning-Based Causal Inference for Large-Scale Combinatorial Experiments: Theory and Empirical Evidence. Management Science 0(0).

https://doi.org/10.1287/mnsc.2024.04625

Acknowledgments

The authors thank Department Editor Vivek Farias, the anonymous associate editor, and three referees for their very helpful and constructive comments, which have led to significant improvements in both the content and exposition of this study. The authors also thank the industry partner for their support in sharing the data, implementing the algorithm, and conducting the experiment.
