Continuous Assortment Optimization with Logit Choice Probabilities and Incomplete Information

Published Online: https://doi.org/10.1287/opre.2021.2235

References

  • Agarwal A, Foster DP, Hsu DJ, Kakade SM, Rakhlin A (2011) Stochastic convex optimization with bandit feedback. Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira FCN, Weinberger KQ, eds. Advances in Neural Information Processing Systems (Curran Associates Inc., Red Hook, NY), 1035–1043.
  • Agrawal R (1995) Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Adv. Appl. Probability 27(4):1054–1078.
  • Agrawal S, Avadhanula V, Goyal V, Zeevi A (2017) Thompson sampling for the MNL-bandit. Kale S, Shamir O, eds. Proc. Conf. on Learning Theory (Springer, Berlin, Germany), 76–78.
  • Agrawal S, Avadhanula V, Goyal V, Zeevi A (2019) MNL-bandit: A dynamic learning approach to assortment selection. Oper. Res. 67(5):1453–1485.
  • Auer P, Ortner R, Szepesvári C (2007) Improved rates for the stochastic continuum-armed bandit problem. Proc. 20th Internat. Conf. on Learn. Theory (Springer, Berlin, Germany), 454–468.
  • Auer P, Cesa-Bianchi N, Freund Y, Schapire RE (2002) The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1):48–77.
  • Ben-Akiva M, Lerman SR (1985) Discrete Choice Analysis: Theory and Application to Travel Demand (MIT Press, Cambridge, MA).
  • Bubeck S, Stoltz G, Yu JY (2011a) Lipschitz bandits without the Lipschitz constant. Proc. Internat. Conf. on Algorithmic Learn. Theory (Springer, Berlin, Germany), 144–158.
  • Bubeck S, Munos R, Stoltz G, Szepesvári C (2011b) X-armed bandits. J. Machine Learn. Res. 12(5):1655–1695.
  • Bubeck S, Stoltz G, Szepesvári C, Munos R (2009) Online optimization in x-armed bandits. Koller D, Schuurmans D, Bengio Y, Bottou L, eds. Advances in Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 201–208.
  • Cesa-Bianchi N, Lugosi G (2012) Combinatorial bandits. J. Comput. System Sci. 78(5):1404–1422.
  • Chen X, Wang Y (2018) A note on a tight lower bound for MNL-bandit assortment selection models. Oper. Res. Lett. 46(5):534–537.
  • Chen W, Wang Y, Yuan Y (2013) Combinatorial multi-armed bandit: General framework and applications. Dasgupta S, McAllester D, eds. Proc. 30th Internat. Conf. on Machine Learn. (JMLR, Cambridge, MA), 151–159.
  • Chen X, Wang Y, Zhou Y (2021) Optimal policy for dynamic assortment planning under multinomial logit models. Math. Oper. Res. 46(4):1639–1657.
  • Combes R, Talebi Mazraeh Shahi MS, Proutiere A, Lelarge M (2015) Combinatorial bandits revisited. Cortes C, Lee DD, Sugiyama M, Garnett R, eds. Advances in Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 2116–2124.
  • Cope EW (2009) Regret and convergence bounds for a class of continuum-armed bandit problems. IEEE Trans. Automated Control 54(6):1243–1253.
  • den Boer AV, Chen B, Wang Y (2021) Pricing and positioning of horizontally differentiated products with incomplete demand information. Preprint, submitted January 29, https://dx.doi.org/10.2139/ssrn.3682921.
  • Dewan R, Jing B, Seidmann A (2003) Product customization and price competition on the Internet. Management Sci. 49(8):1055–1070.
  • Fisher M, Vaidyanathan R (2014) A demand estimation procedure for retail assortment optimization with results from implementations. Management Sci. 60(10):2401–2415.
  • Flaxman AD, Kalai AT, McMahan HB (2005) Online convex optimization in the bandit setting: Gradient descent without a gradient. Annual ACM-SIAM Sympos. on Discrete Algorithms (SIAM, Philadelphia), 385–394.
  • Fogliatto FS, Da Silveira GJ, Borenstein D (2012) The mass customization decade: An updated review of the literature. Internat. J. Production Econom. 138(1):14–25.
  • Gaur V, Honhon D (2006) Assortment planning and inventory decisions under a locational choice model. Management Sci. 52(10):1528–1543.
  • Gill R, Levit B (1995) Applications of the van Trees inequality: A Bayesian Cramér-Rao bound. Bernoulli 1(1/2):59–79.
  • Keskin NB, Birge JR (2019) Dynamic selling mechanisms for product differentiation and learning. Oper. Res. 67(4):1069–1089.
  • Kleinberg R (2005) Nearly tight bounds for the continuum-armed bandit problem. Weiss Y, Schölkopf B, Platt J, eds. Advances in Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 697–704.
  • Kleinberg R, Slivkins A, Upfal E (2008) Multi-armed bandits in metric spaces. Proc. 40th ACM Sympos. Theory Comput. (ACM, New York), 681–690.
  • Kushner HJ, Yin GG (1997) Stochastic Approximation and Recursive Algorithms and Applications (Springer-Verlag, New York).
  • Lai TL, Robbins H (1985) Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1):4–22.
  • Mahajan S, van Ryzin G (2001) Inventory competition under dynamic consumer choice. Oper. Res. 49(5):646–657.
  • Moorthy KS (1984) Market segmentation, self-selection, and product line design. Marketing Sci. 3(4):288–307.
  • Mussa M, Rosen S (1978) Monopoly and product quality. J. Econom. Theory 18(2):301–317.
  • Ou M, Li N, Zhu S, Jin R (2018) Multinomial logit bandit with linear utility functions. Lang J, ed. Proc. 27th Internat. Joint Conf. on Artificial Intelligence, Pasadena, CA, 2602–2608.
  • Pan XA, Honhon D (2012) Assortment planning for vertically differentiated products. Production Oper. Management 21(2):253–275.
  • Pine BJ (1993) Mass Customization (Harvard Business School Press, Boston, MA).
  • Robbins H (1952) Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. (New Series) 58(5):527–535.
  • Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Statist. 22(3):400–407.
  • Rusmevichientong P, Shen ZJM, Shmoys DB (2010) Dynamic assortment optimization with a multinomial logit choice model and capacity constraint. Oper. Res. 58(6):1666–1680.
  • Sauré D, Zeevi A (2013) Optimal dynamic assortment planning with demand learning. Manufacturing Service Oper. Management 15(3):387–404.
  • Shamir O (2013) On the complexity of bandit and derivative-free stochastic convex optimization. Proc. 26th Internat. Conf. on Learn. Theory (Springer, Berlin, Germany), 3–24.
  • Talluri K, van Ryzin G (2004) Revenue management under a general discrete choice model of consumer behavior. Management Sci. 50(1):15–33.
  • Train KE (2009) Discrete Choice Methods with Simulation (Cambridge University Press, Cambridge, UK).