Always Valid Inference: Continuous Monitoring of A/B Tests
Published Online: 10 Aug 2021, https://doi.org/10.1287/opre.2021.2135

