Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments
Published Online: 20 Apr 2017
https://doi.org/10.1287/mksc.2016.1023