Continuous Assortment Optimization with Logit Choice Probabilities and Incomplete Information

Yannik Peeters
https://orcid.org/0000-0003-0335-8836
Amsterdam Business School, University of Amsterdam, 1018 TV Amsterdam, Netherlands

Arnoud V. den Boer
https://orcid.org/0000-0003-4779-0436
Amsterdam Business School, University of Amsterdam, 1018 TV Amsterdam, Netherlands; Korteweg-de Vries Institute for Mathematics, University of Amsterdam, 1098 XG Amsterdam, Netherlands

Michel Mandjes
Amsterdam Business School, University of Amsterdam, 1018 TV Amsterdam, Netherlands; Korteweg-de Vries Institute for Mathematics, University of Amsterdam, 1098 XG Amsterdam, Netherlands

Published Online: 8 Feb 2022
https://doi.org/10.1287/opre.2021.2235

Volume 70, Issue 3

May-June 2022

Pages iii-viii, 1293-1952, C2-C3

Article Information

Received: September 06, 2018
Accepted: October 13, 2021
Published Online: February 08, 2022

Cite as

Yannik Peeters, Arnoud V. den Boer, Michel Mandjes (2022) Continuous Assortment Optimization with Logit Choice Probabilities and Incomplete Information. Operations Research 70(3):1613-1628.

https://doi.org/10.1287/opre.2021.2235
