Optimal No-Regret Learning in Repeated First-Price Auctions

Yanjun Han
Yanjun Han
[email protected]
https://orcid.org/0000-0002-8335-2364
Courant Institute of Mathematical Sciences and Center for Data Science, New York University, New York, New York 10012
Search for more papers by this author
,
Tsachy Weissman
Tsachy Weissman
[email protected]
https://orcid.org/0009-0008-1099-691X
Department of Electrical Engineering, Stanford University, Stanford, California 94305
Search for more papers by this author
,
Zhengyuan Zhou
Corresponding Author
Zhengyuan Zhou
[email protected]
https://orcid.org/0000-0002-0005-9411
Stern School of Business, New York University, New York, New York 10012
Search for more papers by this author

Courant Institute of Mathematical Sciences and Center for Data Science, New York University, New York, New York 10012

Search for more papers by this author

Tsachy Weissman

[email protected]

https://orcid.org/0009-0008-1099-691X

Department of Electrical Engineering, Stanford University, Stanford, California 94305

Search for more papers by this author

Zhengyuan Zhou

Corresponding Author

Zhengyuan Zhou

[email protected]

https://orcid.org/0000-0002-0005-9411

Stern School of Business, New York University, New York, New York 10012

Search for more papers by this author

Published Online:8 Jul 2024https://doi.org/10.1287/opre.2020.0282

References

Alon N, Cesa-Bianchi N, Dekel O, Koren T (2015) Online learning with feedback graphs: Beyond bandits. Grünwald P, Hazan E, Kale S, eds. Proc. Annual Conf. Learn. Theory, vol. 40 (PMLR, New York), 23–35.Google Scholar
Alon N, Cesa-Bianchi N, Gentile C, Mannor S, Mansour Y, Shamir O (2017) Nonstochastic multi-armed bandits with graph-structured feedback. SIAM J. Comput. 46(6):1785–1826.Crossref, Google Scholar
AppNexus (2018) Demystifying auction dynamics for digital buyers and sellers. White paper, AppNexus, New York.Google Scholar
Auer P (2002) Using confidence bounds for exploitation-exploration trade-offs. J. Machine Learn. Res. 3(Nov):397–422.Google Scholar
Babaioff M, Kleinberg RD, Slivkins A (2015) Truthful mechanisms with implicit payment computation. J. ACM 62(2):1–37.Crossref, Google Scholar
Babaioff M, Sharma Y, Slivkins A (2014) Characterizing truthful multi-armed bandit mechanisms. SIAM J. Comput. 43(1):194–230.Crossref, Google Scholar
Balseiro S, Golrezaei N, Mahdian M, Mirrokni V, Schneider J (2023) Contextual bandits with cross-learning. Math. Oper. Res. 48(3):1607–1629.Google Scholar
Benes R (2018) First-price auctions are driving up ad prices. Accessed October 17, 2018, https://www.emarketer.com/content/first-price-auctions-are-driving-up-ad-prices?Google Scholar
Bercu B, Touati A (2008) Exponential inequalities for self-normalized martingales with applications. Ann. Appl. Probability 18(5):1848–1869.Crossref, Google Scholar
Blum A, Kumar V, Rudra A, Wu F (2004) Online learning in online auctions. Theoretical Comput. Sci. 324(2–3):137–146.Crossref, Google Scholar
Boucheron S, Lugosi G, Massart P (2013) Concentration Inequalities: A Nonasymptotic Theory of Independence (Oxford University Press, Oxford, UK).Crossref, Google Scholar
Cai Y, Daskalakis C (2017) Learning multi-item auctions with (or without) samples. Proc. IEEE 58th Annual Sympos. Foundations Comput. Sci. (IEEE, New York), 516–527.Google Scholar
Capen EC, Clapp RV, Campbell WM (1971) Competitive bidding in high-risk situations. J. Petroleum Tech. 23(06):641–653.Crossref, Google Scholar
Cesa-Bianchi N, Cesari T, Perchet V (2019) Dynamic pricing with finitely many unknown valuations. Garivier A, Kale S, eds. Proc. 30th Internat. Conf. Algorithmic Learn. Theory, vol. 98 (PMLR, New York), 247–273.Google Scholar
Cesa-Bianchi N, Gentile C, Mansour Y (2014) Regret minimization for reserve prices in second-price auctions. IEEE Trans. Inform. Theory 61(1):549–564.Crossref, Google Scholar
Cesa-Bianchi N, Gaillard P, Gentile C, Gerchinovitz S (2017) Algorithmic chaining and the role of partial feedback in online nonparametric learning. Proc. Conf. Learn. Theory, vol. 65 (PMLR, New York), 465–481.Google Scholar
Choi H, Mela CF, Balseiro SR, Leary A (2020) Online display advertising markets: A literature review and future directions. Inform. Systems Res. 31(2):556–575.Link, Google Scholar
Chu W, Li L, Reyzin L, Schapire R (2011) Contextual bandits with linear payoff functions. Gordon G, Dunson D, Dudík M, eds. Proc. 14th Internat. Conf. Artificial Intelligence Statist., vol. 15 (PMLR, New York), 208–214.Google Scholar
Cohen A, Hazan T, Koren T (2016) Online learning with feedback graphs without the graphs. Balcan M, Florina W, Kilian Q, eds. Proc. Internat. Conf. Machine Learn., vol. 48 (PMLR, New York), 811–819.Google Scholar
Dann C, Mansour Y, Mohri M, Sekhari A, Sridharan K (2020) Reinforcement learning with feedback graphs. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Advances in Neural Information Processing Systems, vol. 33 (Curran Associates Inc., Red Hook, NY), 16868–16878.Google Scholar
Davies J (2019) What to know about Google’s implementation of first-price ad auctions. Accessed September 6, 2019, https://digiday.com/media/buyers-welcome-auction-standardization-as-google-finally-goes-all-in-on-first-price/.Google Scholar
de la Pena VH, Klass MJ, Lai TL (2004) Self-normalized processes: Exponential inequalities, moment bounds and iterated logarithm laws. Ann. Probability 32(3):1902–1933.Crossref, Google Scholar
Despotakis S, Ravi R, Sayedi A (2021) First-price auctions in online display advertising. J. Marketing Res. 58(5):888–907.Google Scholar
Devanur NR, Kakade SM (2009) The price of truthfulness for pay-per-click auctions. Proc. 10th ACM Conf. Electronic Commerce (ACM, New York), 99–106.Google Scholar
Dilworth R (1950) A decomposition theorem for partially ordered sets. Ann. Math. 51(1):161–166.Crossref, Google Scholar
Esponda I (2008) Information feedback in first price auctions. RAND J. Econom. 39(2):491–508.Crossref, Google Scholar
Even-Dar E, Mannor S, Mansour Y (2006) Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. J. Machine Learn. Res. 7(Jun):1079–1105.Google Scholar
Feng Z, Podimata C, Syrgkanis V (2018) Learning to bid without knowing your value. Proc. ACM Conf. Econom. Comput. (ACM, New York), 505–522.Google Scholar
Gao Z, Han Y, Ren Z, Zhou Z (2019) Batched multi-armed bandits problem. Wallach H, Larochelle H, Beygelzimer A, Alché-Buc FD, Fox E, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates Inc., Red Hook, NY), 503–513.Google Scholar
Goke S, Weintraub GY, Mastromonaco R, Seljan S (2021) Learning new auction format by bidders in Internet display ad auctions. Preprint, submitted October 19, https://arxiv.org/abs/2110.13814.Google Scholar
Golrezaei N, Javanmard A, Mirrokni V (2019) Dynamic incentive-aware learning: Robust pricing in contextual auctions. Wallach H, Larochelle H, Beygelzimer A, Alché-Buc FD, Fox E, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates Inc., Red Hook, NY), 9756–9766.Google Scholar
Joag-Dev K, Proschan F (1983) Negative association of random variables with applications. Ann. Statist. 11(1):286–295.Crossref, Google Scholar
Kleinberg R, Leighton T (2003) The value of knowing a demand curve: Bounds on regret for online posted-price auctions. Proc. 44th Annual IEEE Sympos. Foundations Comput. Sci. (IEEE, New York), 594–605.Google Scholar
Klemperer P (2004) Auctions: Theory and Practice (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
Kocák T, Neu G, Valko M, Munos R (2014) Efficient learning by implicit exploration in bandit problems with side observations. Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates Inc., Red Hook, NY), 613–621.Google Scholar
Li L, Lu Y, Zhou D (2017) Provably optimal algorithms for generalized linear contextual bandits. Precup D, Teh YW, eds. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (PMLR, New York), 2071–2080.Google Scholar
Lucking-Reiley D (2000) Vickrey auctions in practice: From nineteenth-century philately to twenty-first-century e-commerce. J. Econom. Perspective 14(3):183–192.Crossref, Google Scholar
Lucking-Reiley D, Bryan D, Prasad N, Reeves D (2007) Pennies from ebay: The determinants of price in online auctions. J. Industry Econom. 55(2):223–233.Crossref, Google Scholar
Lykouris T, Tardos E, Wali D (2020) Feedback graph regret bounds for Thompson sampling and UCB. Kontorovich A, Neu G, eds. Algorithmic Learning Theory, vol. 117 (PMLR, New York), 592–614.Google Scholar
McAfee RP (2011) The design of advertising exchanges. Rev. Industry Organ. 39(3):169–185.Crossref, Google Scholar
Medina AM, Mohri M (2014) Learning theory and algorithms for revenue optimization in second price auctions with reserve. Xing EP, Jebara T, eds. Proc. 31st Internat. Conf. Machine Learn., vol. 32 (PMLR, New York), 262–270.Google Scholar
Mitzenmacher M, Upfal E (2017) Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis (Cambridge University Press, Cambridge, UK).Google Scholar
Mohri M, Munoz A (2015) Revenue optimization against strategic buyers. Cortes C, Lawrence N, Lee D, Sugiyama M, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 28 (Curran Associates Inc., Red Hook, NY), 2530–2538.Google Scholar
Myerson RB (1981) Optimal auction design. Math. Oper. Res. 6(1):58–73.Link, Google Scholar
Riley JG, Samuelson WF (1981) Optimal auctions. Amer. Econom. Rev. 71(3):381–392.Google Scholar
Roughgarden T, Wang JR (2019) Minimizing regret with multiple reserves. ACM Trans. Econom. Comput. 7(3):1–18.Crossref, Google Scholar
Slefo GP (2019) Google’s ad manager will move to first-price auction. Accessed March 6, 2019, https://adage.com/article/digital/google-adx-moving-a-price-auction/316894.Google Scholar
Sluis S (2017) Big changes coming to auctions, as exchanges roll the dice on first-price. Accessed September 5, 2017, https://www.adexchanger.com/platforms/big-changes-coming-auctions-exchanges-roll-dice-first-price/.Google Scholar
Tsybakov A (2008) Introduction to Nonparametric Estimation (Springer-Verlag, Berlin).Google Scholar
van Schaik FD, Kleijnen JP (2001) Sealed-bid auctions: Case study. Working paper, Tilburg University, Tilburg, Netherlands.Google Scholar
Vickrey W (1961) Counterspeculation, auctions, and competitive sealed tenders. J. Finance 16(1):8–37.Crossref, Google Scholar
Wagner K (2019) Digital advertising in the US is finally bigger than print and television. Accessed February 20, 2019, https://www.vox.com/2019/2/20/18232433/digital-advertising-facebook-google-growth-tv-print-emarketer-2019.Google Scholar
Waisman C, Nair HS, Carrion C, Xu N (2019) Online causal inference for advertising in real-time bidding auctions. Preprint, submitted August 22, https://arxiv.org/abs/1908.08600.Google Scholar
Weed J, Perchet V, Rigollet P (2016) Online learning in repeated auctions. Feldman V, Rakhlin A, Shamir O, eds. Proc. Conf. Learn. Theory (PMLR, New York), 1562–1583.Google Scholar
Wilson RB (1969) Communications to the editor–competitive bidding with disparate information. Management Sci. 15(7):446–452.Link, Google Scholar
Wilson R (1985) Game-theoretic analysis of trading processes. Technical report, Stanford University Institute for Mathematical Studies in the Social Sciences, Stanford, CA.Google Scholar
Zhao H, Chen W (2020a) Online second price auction with semi-bandit feedback under the non-stationary setting. Proc. AAAI Conf. Artificial Intelligence 34(4):6893–6900.Google Scholar
Zhao H, Chen W (2020b) Stochastic one-sided full-information bandit. Brefeld U, Fromont E, Hotho A, Knobbe A, Maathuis M, Robardet C, eds. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019, Lecture Notes in Computer Science, vol. 11908 (Springer, Cham, Switzerland).Google Scholar

Volume 73, Issue 1

January-February 2025

Pages iii-vii, 1-582, C2-C3

Article Information

Metrics

Information

Received:May 08, 2020
Accepted:December 04, 2023
Published Online:July 08, 2024

Cite as

Yanjun Han; , Tsachy Weissman; , Zhengyuan Zhou (2024) Optimal No-Regret Learning in Repeated First-Price Auctions. Operations Research 73(1):209-238.

https://doi.org/10.1287/opre.2020.0282

Keywords

Acknowledgments

The authors thank Ruiyuan Huang, Yuxiao Wen, and Tiancheng Yu for spotting technical issues in previous versions; the area editor, associate editor, and three anonymous reviewers for various suggestions on improving this paper; and Harikesh S. Nair, Erik Ordentlich, and Caio Waisman for many helpful discussions on the problem setup.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Optimal No-Regret Learning in Repeated First-Price Auctions

References

Volume 73, Issue 1

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News