Optimal No-Regret Learning in Repeated First-Price Auctions

Published Online:https://doi.org/10.1287/opre.2020.0282

References

  • Alon N, Cesa-Bianchi N, Dekel O, Koren T (2015) Online learning with feedback graphs: Beyond bandits. Grünwald P, Hazan E, Kale S, eds. Proc. Annual Conf. Learn. Theory, vol. 40 (PMLR, New York), 23–35.Google Scholar
  • Alon N, Cesa-Bianchi N, Gentile C, Mannor S, Mansour Y, Shamir O (2017) Nonstochastic multi-armed bandits with graph-structured feedback. SIAM J. Comput. 46(6):1785–1826.CrossrefGoogle Scholar
  • AppNexus (2018) Demystifying auction dynamics for digital buyers and sellers. White paper, AppNexus, New York.Google Scholar
  • Auer P (2002) Using confidence bounds for exploitation-exploration trade-offs. J. Machine Learn. Res. 3(Nov):397–422.Google Scholar
  • Babaioff M, Kleinberg RD, Slivkins A (2015) Truthful mechanisms with implicit payment computation. J. ACM 62(2):1–37.CrossrefGoogle Scholar
  • Babaioff M, Sharma Y, Slivkins A (2014) Characterizing truthful multi-armed bandit mechanisms. SIAM J. Comput. 43(1):194–230.CrossrefGoogle Scholar
  • Balseiro S, Golrezaei N, Mahdian M, Mirrokni V, Schneider J (2023) Contextual bandits with cross-learning. Math. Oper. Res. 48(3):1607–1629.Google Scholar
  • Benes R (2018) First-price auctions are driving up ad prices. Accessed October 17, 2018, https://www.emarketer.com/content/first-price-auctions-are-driving-up-ad-prices?Google Scholar
  • Bercu B, Touati A (2008) Exponential inequalities for self-normalized martingales with applications. Ann. Appl. Probability 18(5):1848–1869.CrossrefGoogle Scholar
  • Blum A, Kumar V, Rudra A, Wu F (2004) Online learning in online auctions. Theoretical Comput. Sci. 324(2–3):137–146.CrossrefGoogle Scholar
  • Boucheron S, Lugosi G, Massart P (2013) Concentration Inequalities: A Nonasymptotic Theory of Independence (Oxford University Press, Oxford, UK).CrossrefGoogle Scholar
  • Cai Y, Daskalakis C (2017) Learning multi-item auctions with (or without) samples. Proc. IEEE 58th Annual Sympos. Foundations Comput. Sci. (IEEE, New York), 516–527.Google Scholar
  • Capen EC, Clapp RV, Campbell WM (1971) Competitive bidding in high-risk situations. J. Petroleum Tech. 23(06):641–653.CrossrefGoogle Scholar
  • Cesa-Bianchi N, Cesari T, Perchet V (2019) Dynamic pricing with finitely many unknown valuations. Garivier A, Kale S, eds. Proc. 30th Internat. Conf. Algorithmic Learn. Theory, vol. 98 (PMLR, New York), 247–273.Google Scholar
  • Cesa-Bianchi N, Gentile C, Mansour Y (2014) Regret minimization for reserve prices in second-price auctions. IEEE Trans. Inform. Theory 61(1):549–564.CrossrefGoogle Scholar
  • Cesa-Bianchi N, Gaillard P, Gentile C, Gerchinovitz S (2017) Algorithmic chaining and the role of partial feedback in online nonparametric learning. Proc. Conf. Learn. Theory, vol. 65 (PMLR, New York), 465–481.Google Scholar
  • Choi H, Mela CF, Balseiro SR, Leary A (2020) Online display advertising markets: A literature review and future directions. Inform. Systems Res. 31(2):556–575.LinkGoogle Scholar
  • Chu W, Li L, Reyzin L, Schapire R (2011) Contextual bandits with linear payoff functions. Gordon G, Dunson D, Dudík M, eds. Proc. 14th Internat. Conf. Artificial Intelligence Statist., vol. 15 (PMLR, New York), 208–214.Google Scholar
  • Cohen A, Hazan T, Koren T (2016) Online learning with feedback graphs without the graphs. Balcan M, Florina W, Kilian Q, eds. Proc. Internat. Conf. Machine Learn., vol. 48 (PMLR, New York), 811–819.Google Scholar
  • Dann C, Mansour Y, Mohri M, Sekhari A, Sridharan K (2020) Reinforcement learning with feedback graphs. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Advances in Neural Information Processing Systems, vol. 33 (Curran Associates Inc., Red Hook, NY), 16868–16878.Google Scholar
  • Davies J (2019) What to know about Google’s implementation of first-price ad auctions. Accessed September 6, 2019, https://digiday.com/media/buyers-welcome-auction-standardization-as-google-finally-goes-all-in-on-first-price/.Google Scholar
  • de la Pena VH, Klass MJ, Lai TL (2004) Self-normalized processes: Exponential inequalities, moment bounds and iterated logarithm laws. Ann. Probability 32(3):1902–1933.CrossrefGoogle Scholar
  • Despotakis S, Ravi R, Sayedi A (2021) First-price auctions in online display advertising. J. Marketing Res. 58(5):888–907.Google Scholar
  • Devanur NR, Kakade SM (2009) The price of truthfulness for pay-per-click auctions. Proc. 10th ACM Conf. Electronic Commerce (ACM, New York), 99–106.Google Scholar
  • Dilworth R (1950) A decomposition theorem for partially ordered sets. Ann. Math. 51(1):161–166.CrossrefGoogle Scholar
  • Esponda I (2008) Information feedback in first price auctions. RAND J. Econom. 39(2):491–508.CrossrefGoogle Scholar
  • Even-Dar E, Mannor S, Mansour Y (2006) Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. J. Machine Learn. Res. 7(Jun):1079–1105.Google Scholar
  • Feng Z, Podimata C, Syrgkanis V (2018) Learning to bid without knowing your value. Proc. ACM Conf. Econom. Comput. (ACM, New York), 505–522.Google Scholar
  • Gao Z, Han Y, Ren Z, Zhou Z (2019) Batched multi-armed bandits problem. Wallach H, Larochelle H, Beygelzimer A, Alché-Buc FD, Fox E, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates Inc., Red Hook, NY), 503–513.Google Scholar
  • Goke S, Weintraub GY, Mastromonaco R, Seljan S (2021) Learning new auction format by bidders in Internet display ad auctions. Preprint, submitted October 19, https://arxiv.org/abs/2110.13814.Google Scholar
  • Golrezaei N, Javanmard A, Mirrokni V (2019) Dynamic incentive-aware learning: Robust pricing in contextual auctions. Wallach H, Larochelle H, Beygelzimer A, Alché-Buc FD, Fox E, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates Inc., Red Hook, NY), 9756–9766.Google Scholar
  • Joag-Dev K, Proschan F (1983) Negative association of random variables with applications. Ann. Statist. 11(1):286–295.CrossrefGoogle Scholar
  • Kleinberg R, Leighton T (2003) The value of knowing a demand curve: Bounds on regret for online posted-price auctions. Proc. 44th Annual IEEE Sympos. Foundations Comput. Sci. (IEEE, New York), 594–605.Google Scholar
  • Klemperer P (2004) Auctions: Theory and Practice (Princeton University Press, Princeton, NJ).CrossrefGoogle Scholar
  • Kocák T, Neu G, Valko M, Munos R (2014) Efficient learning by implicit exploration in bandit problems with side observations. Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates Inc., Red Hook, NY), 613–621.Google Scholar
  • Li L, Lu Y, Zhou D (2017) Provably optimal algorithms for generalized linear contextual bandits. Precup D, Teh YW, eds. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (PMLR, New York), 2071–2080.Google Scholar
  • Lucking-Reiley D (2000) Vickrey auctions in practice: From nineteenth-century philately to twenty-first-century e-commerce. J. Econom. Perspective 14(3):183–192.CrossrefGoogle Scholar
  • Lucking-Reiley D, Bryan D, Prasad N, Reeves D (2007) Pennies from ebay: The determinants of price in online auctions. J. Industry Econom. 55(2):223–233.CrossrefGoogle Scholar
  • Lykouris T, Tardos E, Wali D (2020) Feedback graph regret bounds for Thompson sampling and UCB. Kontorovich A, Neu G, eds. Algorithmic Learning Theory, vol. 117 (PMLR, New York), 592–614.Google Scholar
  • McAfee RP (2011) The design of advertising exchanges. Rev. Industry Organ. 39(3):169–185.CrossrefGoogle Scholar
  • Medina AM, Mohri M (2014) Learning theory and algorithms for revenue optimization in second price auctions with reserve. Xing EP, Jebara T, eds. Proc. 31st Internat. Conf. Machine Learn., vol. 32 (PMLR, New York), 262–270.Google Scholar
  • Mitzenmacher M, Upfal E (2017) Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis (Cambridge University Press, Cambridge, UK).Google Scholar
  • Mohri M, Munoz A (2015) Revenue optimization against strategic buyers. Cortes C, Lawrence N, Lee D, Sugiyama M, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 28 (Curran Associates Inc., Red Hook, NY), 2530–2538.Google Scholar
  • Myerson RB (1981) Optimal auction design. Math. Oper. Res. 6(1):58–73.LinkGoogle Scholar
  • Riley JG, Samuelson WF (1981) Optimal auctions. Amer. Econom. Rev. 71(3):381–392.Google Scholar
  • Roughgarden T, Wang JR (2019) Minimizing regret with multiple reserves. ACM Trans. Econom. Comput. 7(3):1–18.CrossrefGoogle Scholar
  • Slefo GP (2019) Google’s ad manager will move to first-price auction. Accessed March 6, 2019, https://adage.com/article/digital/google-adx-moving-a-price-auction/316894.Google Scholar
  • Sluis S (2017) Big changes coming to auctions, as exchanges roll the dice on first-price. Accessed September 5, 2017, https://www.adexchanger.com/platforms/big-changes-coming-auctions-exchanges-roll-dice-first-price/.Google Scholar
  • Tsybakov A (2008) Introduction to Nonparametric Estimation (Springer-Verlag, Berlin).Google Scholar
  • van Schaik FD, Kleijnen JP (2001) Sealed-bid auctions: Case study. Working paper, Tilburg University, Tilburg, Netherlands.Google Scholar
  • Vickrey W (1961) Counterspeculation, auctions, and competitive sealed tenders. J. Finance 16(1):8–37.CrossrefGoogle Scholar
  • Wagner K (2019) Digital advertising in the US is finally bigger than print and television. Accessed February 20, 2019, https://www.vox.com/2019/2/20/18232433/digital-advertising-facebook-google-growth-tv-print-emarketer-2019.Google Scholar
  • Waisman C, Nair HS, Carrion C, Xu N (2019) Online causal inference for advertising in real-time bidding auctions. Preprint, submitted August 22, https://arxiv.org/abs/1908.08600.Google Scholar
  • Weed J, Perchet V, Rigollet P (2016) Online learning in repeated auctions. Feldman V, Rakhlin A, Shamir O, eds. Proc. Conf. Learn. Theory (PMLR, New York), 1562–1583.Google Scholar
  • Wilson RB (1969) Communications to the editor–competitive bidding with disparate information. Management Sci. 15(7):446–452.LinkGoogle Scholar
  • Wilson R (1985) Game-theoretic analysis of trading processes. Technical report, Stanford University Institute for Mathematical Studies in the Social Sciences, Stanford, CA.Google Scholar
  • Zhao H, Chen W (2020a) Online second price auction with semi-bandit feedback under the non-stationary setting. Proc. AAAI Conf. Artificial Intelligence 34(4):6893–6900.Google Scholar
  • Zhao H, Chen W (2020b) Stochastic one-sided full-information bandit. Brefeld U, Fromont E, Hotho A, Knobbe A, Maathuis M, Robardet C, eds. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019, Lecture Notes in Computer Science, vol. 11908 (Springer, Cham, Switzerland).Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.