Volume 70, Issue 3

May-June 2022

Pages iii-viii, 1293-1952, C2-C3

Information

Received: December 11, 2017
Accepted: November 02, 2020
Published Online: August 10, 2021

Cite as

Ramesh Johari, Pete Koomen, Leonid Pekelis, David Walsh (2021) Always Valid Inference: Continuous Monitoring of A/B Tests. Operations Research 70(3):1806-1821.

https://doi.org/10.1287/opre.2021.2135

Always Valid Inference: Continuous Monitoring of A/B Tests
