Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments

Eric M. Schwartz (Corresponding Author)
Stephen M. Ross School of Business, University of Michigan, Ann Arbor, Michigan 48109

Eric T. Bradlow
The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104

Peter S. Fader
The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104

Published Online: 20 Apr 2017
https://doi.org/10.1287/mksc.2016.1023


Volume 36, Issue 4

July-August 2017

Pages 471-643

Received: December 16, 2013
Accepted: March 29, 2016
Published Online: April 20, 2017

Cite as

Eric M. Schwartz, Eric T. Bradlow, Peter S. Fader (2017) Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments. Marketing Science 36(4):500-522.

https://doi.org/10.1287/mksc.2016.1023
