Optimal Policy for Dynamic Assortment Planning Under Multinomial Logit Models
Published Online: 13 May 2021, https://doi.org/10.1287/moor.2021.1133
References
- [1] (2017) Thompson sampling for the MNL-bandit. Kale S, Shamir O, eds. Proc. 30th Annual Conf. Learn. Theory (ML Research Press), 76–78.
- [2] (2019) MNL-bandit: A dynamic learning approach to assortment selection. Oper. Res. 67(5):1453–1485.
- [3] (2013) Stochastic convex optimization with bandit feedback. SIAM J. Optim. 23(1):213–240.
- [4] (2009) Minimax policies for adversarial and stochastic bandits. Dasgupta S, Klivans A, eds. Proc. 22nd Annual Conf. Learn. Theory (ML Research Press), 217–226.
- [5] (2012) Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations Trends Machine Learn. 5(1):1–122.
- [6] (2009) Pure exploration in multi-armed bandits problems. Gavaldà R, Lugosi G, Zeugmann T, Zilles S, eds. Proc. Internat. Conf. Algorithmic Learn. Theory (Springer, Berlin), 23–37.
- [7] (2007) Dynamic assortment with demand learning for seasonal consumer goods. Management Sci. 53(2):276–292.
- [8] (2018) A note on tight lower bound for MNL-bandit assortment selection models. Oper. Res. Lett. 46(5):534–537.
- [9] (2019) Robust dynamic assortment optimization in the presence of outlier customers. Preprint, submitted October 9, https://arxiv.org/abs/1910.04183.
- [10] (2020) Dynamic assortment optimization with changing contextual information. J. Machine Learn. Res. 21(216):1–44.
- [11] Chen X, Wang Y, Zhou Y (2021) Dynamic assortment selection under nested logit models. Production Oper. Management 30(1):85–102.
- [12] (2016) Dynamic recommendation at checkout under inventory constraint. Preprint, submitted October 17, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2853093.
- [13] (2017) Thompson sampling for online personalized assortment optimization problems with multinomial logit choice models. Technical report, Massachusetts Institute of Technology, Cambridge, MA.
- [14] (2018) Inventory balancing with online learning. Preprint, submitted October 11, https://arxiv.org/abs/1810.05640.
- [15] (2020) Feature-based dynamic pricing. Management Sci. 66(11):4921–4943.
- [16] (2017) Online optimization of smoothed piecewise constant functions. Singh A, Zhu J, eds. Proc. 20th Internat. Conf. Artificial Intelligence Statist. (ML Research Press), 412–420.
- [17] (2014) Unimodal bandits: Regret lower bounds and optimal algorithms. Xing EP, Jebara T, eds. Proc. 31st Internat. Conf. Machine Learn. (ML Research Press), 521–529.
- [18] (2009) Regret and convergence bounds for a class of continuum-armed bandit problems. IEEE Trans. Automatic Control 54(6):1243–1253.
- [19] (2004) Managing flexible products on a network. Technical Report CORC TR-2004-01, Department of Industrial Engineering and Operations Research, Columbia University, New York.
- [20] (2014) Real-time optimization of personalized assortments. Management Sci. 60(6):1532–1551.
- [21] (1963) Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58(301):13–30.
- [22] (2018) Contextual search via intrinsic volumes. IEEE Annual Sympos. Found. Comput. Sci. (IEEE Computer Society, Piscataway, NJ), 268–282.
- [23] (2008) On the choice-based linear programming model for network revenue management. Manufacturing Service Oper. Management 10(2):288–310.
- [24] (2018) Multidimensional binary search for contextual decision-making. Oper. Res. 66(5):1346–1361.
- [25] (2001) Stocking retail assortments under dynamic consumer substitution. Oper. Res. 49(3):334–351.
- [26] (1974) Conditional logit analysis of qualitative choice behavior. Zarembka P, ed. Frontiers in Econometrics (Academic Press, New York), 105–142.
- [27] (2012) Robust assortment optimization in revenue management under the multinomial logit choice model. Oper. Res. 60(4):865–882.
- [28] (2010) Dynamic assortment optimization with a multinomial logit choice model and capacity constraint. Oper. Res. 58(6):1666–1680.
- [29] (2013) Optimal dynamic assortment planning with demand learning. Manufacturing Service Oper. Management 15(3):387–404.
- [30] (2004) Revenue management under a general discrete choice model of consumer behavior. Management Sci. 50(1):15–33.
- [31] (1999) On the relationships between inventory costs and variety benefits in retail assortments. Management Sci. 45(11):1496–1509.
- [32] (2018) Near-optimal policies for dynamic multinomial logit assortment selection models. Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Proc. Adv. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 3105–3114.
- [33] (2011) Unimodal bandits. Getoor L, Scheffer T, eds. Proc. 28th Internat. Conf. Machine Learn. (Omnipress, Madison, WI), 41–48.

