Estimating Effects of Long-Term Treatments

Published Online: https://doi.org/10.1287/mnsc.2023.02575

References

  • Abadie A, Zhao J (2021) Synthetic controls for experimental design. Preprint, submitted August 4, https://arxiv.org/abs/2108.02196.
  • Anderer A, Bastani H, Silberholz J (2022) Adaptive clinical trial designs with surrogates: When should we bother? Management Sci. 68(3):1982–2002.
  • Andersen TG, Bollerslev T, Diebold FX, Labys P (2003) Modeling and forecasting realized volatility. Econometrica 71(2):579–625.
  • Athey S, Chetty R, Imbens GW, Kang H (2025) The surrogate index: Combining short-term proxies to estimate long-term treatment effects more rapidly and precisely. Rev. Econom. Stud., ePub ahead of print September 30, https://doi.org/10.1093/restud/rdaf087.
  • Athey S, Bayati M, Doudchenko N, Imbens G, Khosravi K (2021) Matrix completion methods for causal panel data models. J. Amer. Statist. Assoc. 116(536):1716–1730.
  • Baiocchi M, Cheng J, Small DS (2014) Instrumental variable methods for causal inference. Statist. Medicine 33(13):2297–2340.
  • Bakshy E, Eckles D, Bernstein MS (2014) Designing and deploying online field experiments. Wang C-W, ed. Proc. 23rd Internat. Conf. World Wide Web (Association for Computing Machinery, New York), 283–292.
  • Basse G, Ding Y, Toulis P (2019) Minimax designs for causal effects in temporal experiments with treatment habituation. Preprint, submitted August 9, https://arxiv.org/abs/1908.03531.
  • Battocchi K, Dillon E, Hei M, Lewis G, Oprescu M, Syrgkanis V (2021) Estimating the long-term effects of novel treatments. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Adv. Neural Inform. Processing Systems 34 (NeurIPS 2021) (Curran Associates, Red Hook, NY), 2925–2935.
  • Berman R, Van den Bulte C (2022) False discovery in A/B testing. Management Sci. 68(9):6762–6782.
  • Bojinov I, Gupta S (2022) Online experimentation: Benefits, operational and methodological challenges, and scaling guide. Harvard Data Sci. Rev. 4(3).
  • Bojinov I, Simchi-Levi D, Zhao J (2023) Design and analysis of switchback experiments. Management Sci. 69(7):3759–3777.
  • Bright I, Delarue A, Lobel I (2022) Reducing marketplace interference bias via shadow prices. Preprint, submitted May 4, https://arxiv.org/abs/2205.02274v1.
  • Brown CA, Lilford RJ (2006) The stepped wedge trial design: A systematic review. BMC Medical Res. Methodology 6(1):54.
  • Chen W, Bayati M (2021) Learning to recommend using non-uniform data. Preprint, submitted October 21, https://arxiv.org/abs/2110.11248v1.
  • Cochran W, Autrey K, Cannon C (1941) A double change-over design for dairy cattle feeding experiments. J. Dairy Sci. 24(11):937–951.
  • Deng A, Lu J, Chen S (2016) Continuous monitoring of A/B tests without pain: Optional stopping in Bayesian testing. 2016 IEEE Internat. Conf. Data Sci. Adv. Anal. (DSAA) (IEEE, New York), 243–252.
  • Deng A, Xu Y, Kohavi R, Walker T (2013) Improving the sensitivity of online controlled experiments by utilizing pre-experiment data. Leonardi S, Panconesi A, eds. Proc. 6th ACM Internat. Conf. Web Search Data Mining (Association for Computing Machinery, New York), 123–132.
  • Doudchenko N, Gilinson D, Taylor S, Wernerfelt N (2019) Designing experiments with synthetic controls. Working paper, Google, New York.
  • Doudchenko N, Khosravi K, Pouget-Abadie J, Lahaie S, Lubin M, Mirrokni V, Spiess J, Imbens G (2021) Synthetic design: An optimization approach to experimental design with synthetic controls. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Adv. Neural Inform. Processing Systems 34 (NeurIPS 2021) (Curran Associates, Red Hook, NY), 8691–8701.
  • Duan W, Ba S, Zhang C (2021) Online experimentation with surrogate metrics: Guidelines and a case study. Lewin-Eytan L, Carmel D, Yom-Tov E, eds. Proc. 14th ACM Internat. Conf. Web Search Data Mining (Association for Computing Machinery, New York), 193–201.
  • Efron B (1987) Better bootstrap confidence intervals. J. Amer. Statist. Assoc. 82(397):171–185.
  • Efron B, Tibshirani RJ (1994) An Introduction to the Bootstrap (CRC Press, Boca Raton, FL).
  • Fabijan A, Gupchup J, Gupta S, Omhover J, Qin W, Vermeer L, Dmitriev P (2019) Diagnosing sample ratio mismatch in online controlled experiments: A taxonomy and rules of thumb for practitioners. Teredesai A, Kumar V, eds. Proc. 25th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 2156–2164.
  • Farias V, Li A, Peng T, Zheng A (2022) Markovian interference in experiments. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. Adv. Neural Inform. Processing Systems 35 (NeurIPS 2022) (Curran Associates, Red Hook, NY), 535–549.
  • Fuller WA (2009) Introduction to Statistical Time Series (John Wiley & Sons, Hoboken, NJ).
  • Glynn PW, Johari R, Rasouli M (2020) Adaptive experimental design with temporal interference: A maximum likelihood approach. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Adv. Neural Inform. Processing Systems 33 (NeurIPS 2020) (Curran Associates, Red Hook, NY), 15054–15064.
  • Gupta S, Kohavi R, Tang D, Xu Y, Andersen R, Bakshy E, Cardin N, et al. (2019) Top challenges from the first practical online controlled experiments summit. ACM SIGKDD Explorations Newsletter 21(1):20–35.
  • Hamilton JD (2020) Time Series Analysis (Princeton University Press, Princeton, NJ).
  • Hemming K, Haines TP, Chilton PJ, Girling AJ, Lilford RJ (2015) The stepped wedge cluster randomised trial: Rationale, design, analysis, and reporting. BMJ 350:h391.
  • Hohnhold H, O’Brien D, Tang D (2015) Focusing on the long-term: It’s good for users and business. Cao L, Zhang C, eds. Proc. 21st ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 1849–1858.
  • Holland PW (1986) Statistics and causal inference. J. Amer. Statist. Assoc. 81(396):945–960.
  • Hu Y, Wager S (2022) Switchback experiments under geometric mixing. Preprint, submitted September 1, https://arxiv.org/abs/2209.00197v1.
  • Hussey MA, Hughes JP (2007) Design and analysis of stepped wedge cluster randomized trials. Contemporary Clinical Trials 28(2):182–191.
  • Imbens GW, Rubin DB (2015) Causal Inference in Statistics, Social, and Biomedical Sciences (Cambridge University Press, Cambridge, UK).
  • Imbens G, Kallus N, Mao X, Wang Y (2022) Long-term causal inference under persistent confounding via data combination. Preprint, submitted February 15, https://arxiv.org/abs/2202.07234v1.
  • Joffe MM, Greene T (2009) Related causal frameworks for surrogate outcomes. Biometrics 65(2):530–538.
  • Johari R, Li H, Liskovich I, Weintraub GY (2022) Experimental design in two-sided platforms: An analysis of bias. Management Sci. 68(10):7069–7089.
  • Kohavi R, Tang D, Xu Y (2020) Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing (Cambridge University Press, Cambridge, UK).
  • Kohavi R, Deng A, Frasca B, Longbotham R, Walker T, Xu Y (2012) Trustworthy online controlled experiments: Five puzzling outcomes explained. Yang Q, ed. Proc. 18th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 786–794.
  • Kohavi R, Deng A, Frasca B, Walker T, Xu Y, Pohlmann N (2013) Online controlled experiments at large scale. Ghani R, Senator TE, Bradley P, Parekh R, He J, eds. Proc. 19th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 1168–1176.
  • Larsen N, Stallrich J, Sengupta S, Deng A, Kohavi R, Stevens N (2024) Statistical challenges in online controlled experiments: A review of A/B testing methodology. Amer. Statistician 78(2):135–149.
  • Leng Y, Dimmery D (2021) Calibration of heterogeneous treatment effects in random experiments. Preprint, submitted June 28, https://doi.org/10.2139/ssrn.3875850.
  • Li F, Turner EL, Preisser JS (2018) Optimal allocation of clusters in cohort stepped wedge designs. Statist. Probab. Lett. 137:257–263.
  • Munro E, Wager S, Xu K (2021) Treatment effects in market equilibrium. Preprint, submitted September 23, https://arxiv.org/abs/2109.11647v1.
  • Munro E, Jones D, Brennan J, Nelet R, Mirrokni V, Pouget-Abadie J (2023) Causal estimation of user learning in personalized systems. Leyton-Brown K, ed. Proc. 24th ACM Conf. Econom. Comput. (Association for Computing Machinery, New York), 992–1016.
  • Neyman J (1923) On the application of probability theory to agricultural experiments. Essay on principles. Ann. Agricultural Sci. 1–51.
  • Ni T, Bojinov I, Zhao J (2023) Design of panel experiments with spatial and temporal interference. Preprint, submitted June 2, http://dx.doi.org/10.2139/ssrn.4466598.
  • Pearl J (1995) Causal diagrams for empirical research. Biometrika 82(4):669–688.
  • Prentice RL (1989) Surrogate endpoints in clinical trials: Definition and operational criteria. Statist. Medicine 8(4):431–440.
  • Rubin DB (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J. Ed. Psych. 66(5):688–701.
  • Stock JH, Watson MW (2001) Vector autoregressions. J. Econom. Perspect. 15(4):101–115.
  • Stock JH, Watson MW (2020) Introduction to Econometrics (Pearson, London).
  • Wager S, Xu K (2021) Experimenting in equilibrium. Management Sci. 67(11):6694–6715.
  • Weir CJ, Walley RJ (2006) Statistical evaluation of biomarkers as surrogate endpoints: A literature review. Statist. Medicine 25(2):183–203.
  • Xiong R, Chin A, Taylor SJ (2023) Data-driven switchback designs: Theoretical tradeoffs and empirical calibration. Preprint, submitted November 7, https://doi.org/10.2139/ssrn.4626245.
  • Xiong R, Athey SC, Bayati M, Imbens GW (2019) Optimal experimental design for staggered rollouts. Preprint, submitted November 9, https://doi.org/10.2139/ssrn.3483934.
  • Xu Y, Chen N, Fernandez A, Sinno O, Bhasin A (2015) From infrastructure to culture: A/B testing challenges in large scale social networks. Cao L, Zhang C, eds. Proc. 21st ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 2227–2236.
  • Yang J, Eckles D, Dhillon P, Aral S (2023) Targeting for long-term outcomes. Management Sci. 70(6):3841–3855.
  • Ye Z, Zhang Z, Zhang D, Zhang H, Zhang RP (2023) Deep-learning-based causal inference for large-scale combinatorial experiments: Theory and empirical evidence. Preprint, submitted March 1, https://doi.org/10.2139/ssrn.4375327.
  • Ye Z, Zhang DJ, Zhang H, Zhang R, Chen X, Xu Z (2022) Cold start to improve market thickness on online advertising platforms: Data-driven algorithms and field experiments. Management Sci. 69(7):3838–3860.