Getting the Most Out of A/B Tests Using the Asymptotic Minimax-Regret Criteria

