Online Joint Assortment-Inventory Optimization Under MNL Choices

