Nonparametric Pricing Bandits Leveraging Informational Externalities to Learn the Demand Curve

