Revenue Maximization and Learning in Product Ranking

Ningyuan Chen
https://orcid.org/0000-0002-3948-1011
Rotman School of Management, University of Toronto, Toronto, Ontario M5S 3E6, Canada

Anran Li
https://orcid.org/0000-0001-7001-2240
Department of Decisions, Operations and Technology, The Chinese University of Hong Kong, Hong Kong SAR, China

Shuoguang Yang (Corresponding Author)
https://orcid.org/0000-0002-0915-196X
Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology, Hong Kong SAR, China

Published Online: 5 Jan 2026
https://doi.org/10.1287/opre.2020.0781

Volume 74, Issue 3

May-June 2026

Pages v-x, 1153-1728, iii-iv

Article Information


Received: December 03, 2020
Accepted: September 19, 2025
Published Online: January 05, 2026

Cite as

Ningyuan Chen, Anran Li, Shuoguang Yang (2026) Revenue Maximization and Learning in Product Ranking. Operations Research 74(3):1284-1303.

https://doi.org/10.1287/opre.2020.0781

Acknowledgments

The authors thank the editors and anonymous reviewers for their suggestions on this paper.
