Optimal Policy for Dynamic Assortment Planning Under Multinomial Logit Models

Xi Chen
[email protected]
https://orcid.org/0000-0002-9049-9452
Stern School of Business, New York University, New York, New York 10012

Yining Wang
[email protected]
Warrington College of Business, University of Florida, Gainesville, Florida 32611

Yuan Zhou
[email protected]
Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801

Published Online: 13 May 2021
https://doi.org/10.1287/moor.2021.1133

References

[1] Agrawal S, Avadhanula V, Goyal V, Zeevi A (2017) Thompson sampling for the MNL-bandit. Kale S, Shamir O, eds. Proc. 30th Annual Conf. Learn. Theory (ML Research Press), 76–78.
[2] Agrawal S, Avadhanula V, Goyal V, Zeevi A (2019) MNL-bandit: A dynamic learning approach to assortment selection. Oper. Res. 67(5):1453–1485.
[3] Agarwal A, Foster DP, Hsu D, Kakade SM, Rakhlin A (2013) Stochastic convex optimization with bandit feedback. SIAM J. Optim. 23(1):213–240.
[4] Audibert JY, Bubeck S (2009) Minimax policies for adversarial and stochastic bandits. Dasgupta S, Klivans A, eds. Proc. 22nd Annual Conf. Learn. Theory (ML Research Press), 217–226.
[5] Bubeck S, Cesa-Bianchi N (2012) Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations Trends Machine Learn. 5(1):1–122.
[6] Bubeck S, Munos R, Stoltz G (2009) Pure exploration in multi-armed bandits problems. Gavaldà R, Lugosi G, Zeugmann T, Zilles S, eds. Proc. Internat. Conf. Algorithmic Learn. Theory (Springer, Berlin), 23–37.
[7] Caro F, Gallien J (2007) Dynamic assortment with demand learning for seasonal consumer goods. Management Sci. 53(2):276–292.
[8] Chen X, Wang Y (2018) A note on tight lower bound for MNL-bandit assortment selection models. Oper. Res. Lett. 46(5):534–537.
[9] Chen X, Krishnamurthy A, Wang Y (2019) Robust dynamic assortment optimization in the presence of outlier customers. Preprint, submitted October 9, https://arxiv.org/abs/1910.04183.
[10] Chen X, Wang Y, Zhou Y (2020) Dynamic assortment optimization with changing contextual information. J. Machine Learn. Res. 21(216):1–44.
[11] Chen X, Wang Y, Zhou Y (2021) Dynamic assortment selection under nested logit models. Production Oper. Management 30(1):85–102.
[12] Chen X, Ma W, Simchi-Levi D, Xin L (2016) Dynamic recommendation at checkout under inventory constraint. Preprint, submitted October 17, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2853093.
[13] Cheung WC, Simchi-Levi D (2017) Thompson sampling for online personalized assortment optimization problems with multinomial logit choice models. Technical report, Massachusetts Institute of Technology, Cambridge, MA.
[14] Cheung WC, Ma W, Simchi-Levi D, Wang X (2018) Inventory balancing with online learning. Preprint, submitted October 11, https://arxiv.org/abs/1810.05640.
[15] Cohen M, Lobel I, Paes Leme R (2020) Feature-based dynamic pricing. Management Sci. 66(11):4921–4943.
[16] Cohen-Addad V, Kanade V (2017) Online optimization of smoothed piecewise constant functions. Singh A, Zhu J, eds. Proc. 20th Internat. Conf. Artificial Intelligence Statist. (ML Research Press), 412–420.
[17] Combes R, Proutiere A (2014) Unimodal bandits: Regret lower bounds and optimal algorithms. Xing EP, Jebara T, eds. Proc. 31st Internat. Conf. Machine Learn. (ML Research Press), 521–529.
[18] Cope EW (2009) Regret and convergence bounds for a class of continuum-armed bandit problems. IEEE Trans. Automatic Control 54(6):1243–1253.
[19] Gallego G, Iyengar G, Phillips R, Dubey A (2004) Managing flexible products on a network. Technical Report CORC TR-2004-01, Department of Industrial Engineering and Operations Research, Columbia University, New York.
[20] Golrezaei N, Nazerzadeh H, Rusmevichientong P (2014) Real-time optimization of personalized assortments. Management Sci. 60(6):1532–1551.
[21] Hoeffding W (1963) Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58(301):13–30.
[22] Leme RP, Schneider J (2018) Contextual search via intrinsic volumes. IEEE Annual Sympos. Found. Comput. Sci. (IEEE Computer Society, Piscataway, NJ), 268–282.
[23] Liu Q, van Ryzin G (2008) On the choice-based linear programming model for network revenue management. Manufacturing Service Oper. Management 10(2):288–310.
[24] Lobel I, Leme RP, Vladu A (2018) Multidimensional binary search for contextual decision-making. Oper. Res. 66(5):1346–1361.
[25] Mahajan S, van Ryzin G (2001) Stocking retail assortments under dynamic consumer substitution. Oper. Res. 49(3):334–351.
[26] McFadden D (1974) Conditional logit analysis of qualitative choice behavior. Zarembka P, ed. Frontiers in Econometrics (Academic Press, New York), 105–142.
[27] Rusmevichientong P, Topaloglu H (2012) Robust assortment optimization in revenue management under the multinomial logit choice model. Oper. Res. 60(4):865–882.
[28] Rusmevichientong P, Shen ZJ, Shmoys D (2010) Dynamic assortment optimization with a multinomial logit choice model and capacity constraint. Oper. Res. 58(6):1666–1680.
[29] Saure D, Zeevi A (2013) Optimal dynamic assortment planning with demand learning. Manufacturing Service Oper. Management 15(3):387–404.
[30] Talluri K, van Ryzin G (2004) Revenue management under a general discrete choice model of consumer behavior. Management Sci. 50(1):15–33.
[31] van Ryzin G, Mahajan S (1999) On the relationships between inventory costs and variety benefits in retail assortments. Management Sci. 45(11):1496–1509.
[32] Wang Y, Chen X, Zhou Y (2018) Near-optimal policies for dynamic multinomial logit assortment selection models. Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Proc. Adv. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 3105–3114.
[33] Yu JY, Mannor S (2011) Unimodal bandits. Getoor L, Scheffer T, eds. Proc. 28th Internat. Conf. Machine Learn. (Omnipress, Madison, WI), 41–48.

Mathematics of Operations Research

Volume 46, Issue 4

November 2021

Pages 1235-1657, C2

Information

Received: February 01, 2019
Accepted: November 15, 2020
Published Online: May 13, 2021

Cite as

Xi Chen, Yining Wang, Yuan Zhou (2021) Optimal Policy for Dynamic Assortment Planning Under Multinomial Logit Models. Mathematics of Operations Research 46(4):1639-1657.

https://doi.org/10.1287/moor.2021.1133
