Volume 44, Issue 6

November-December 2025

Pages 1217-1459, ii

Information

Received: July 06, 2022
Accepted: January 01, 2025
Published Online: September 26, 2025

Cite as

Ian N. Weaver, Vineet Kumar, Lalit Jain (2025) Nonparametric Pricing Bandits Leveraging Informational Externalities to Learn the Demand Curve. Marketing Science 44(6):1299-1320.

https://doi.org/10.1287/mksc.2022.0247

Acknowledgments

The authors are grateful for discussions and comments from Kanishka Misra, Oded Netzer, Eric Schwartz, Jiwoong Shin, Herb Sussman, and Kosuke Uetake as well as participants at the Yale School of Management Marketing Seminar; the University of Colorado Boulder Marketing Seminar; the Marketing Science Conference 2021; the Conference on Artificial Intelligence, Machine Learning, and Business Analytics 2023; the Marketing Dynamics Conference 2023; and the AMA Summer Academic Conference 2024. The authors also thank Yanlei He for his excellent research assistance.
