Received: August 15, 2023
Accepted: July 31, 2025
Published Online: April 9, 2026

Cite as

Shan Huang, Chen Wang, Yuan Yuan, Jinglong Zhao, Brocco (Jingjing) Zhang (2026) Estimating Effects of Long-Term Treatments. Management Science 0(0).

https://doi.org/10.1287/mnsc.2023.02575

Acknowledgments

The first four authors are listed in alphabetical order. The authors also thank Department Editor Omar Besbes, the anonymous Associate Editor, and the anonymous referees, whose comments significantly improved the manuscript throughout the review process.
