Online Learning for Constrained Assortment Optimization Under Markov Chain Choice Model

Published Online: https://doi.org/10.1287/opre.2022.0693

References

  • Agrawal S, Avadhanula V, Goyal V, Zeevi A (2017) Thompson sampling for the MNL-bandit. Kale S, Shamir O, eds. Proc. 30th Conf. Learning Theory (PMLR, New York), 76–78.
  • Agrawal S, Avadhanula V, Goyal V, Zeevi A (2019) MNL-bandit: A dynamic learning approach to assortment selection. Oper. Res. 67(5):1453–1485.
  • Auer P, Cesa-Bianchi N, Freund Y, Schapire RE (2002) The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1):48–77.
  • Balakrishnan S, Wainwright MJ, Yu B (2017) Statistical guarantees for the EM algorithm: From population to sample-based analysis. Ann. Statist. 45(1):77–120.
  • Berbeglia G (2016) Discrete choice models based on random walks. Oper. Res. Lett. 44(2):234–237.
  • Berbeglia G, Garassino A, Vulcano G (2022) A comparative empirical study of discrete choice models in retail operations. Management Sci. 68(6):4005–4023.
  • Bernstein F, Modaresi S, Sauré D (2019) A dynamic clustering approach to data-driven assortment personalization. Management Sci. 65(5):2095–2115.
  • Blanchet J, Gallego G, Goyal V (2016) A Markov chain approximation to choice modeling. Oper. Res. 64(4):886–905.
  • Chen X, Wang Y (2018) A note on a tight lower bound for capacitated MNL-bandit assortment selection models. Oper. Res. Lett. 46(5):534–537.
  • Chen X, Wang Y, Zhou Y (2020) Dynamic assortment optimization with changing contextual information. J. Machine Learn. Res. 21:216–1.
  • Chen X, Wang Y, Zhou Y (2021b) Optimal policy for dynamic assortment planning under multinomial logit models. Math. Oper. Res. 46(4):1639–1657.
  • Chen X, Shi C, Wang Y, Zhou Y (2021a) Dynamic assortment planning under nested logit models. Production Oper. Management 30(1):85–102.
  • Davis J, Gallego G, Topaloglu H (2013) Assortment planning under the multinomial logit model with totally unimodular constraint structures. Technical report, Cornell University, Ithaca, NY.
  • Désir A, Goyal V, Segev D, Ye C (2020) Constrained assortment optimization under the Markov chain-based choice model. Management Sci. 66(2):698–721.
  • Dong J, Şimşek AS, Topaloglu H (2019) Pricing problems under the Markov chain choice model. Production Oper. Management 28(1):157–175.
  • El Housni O, Goyal V, Humair S, Mouchtaki O, Sadighian A, Wu J (2021) Joint assortment and inventory planning for heavy tailed demand. Technical report, Cornell Tech, New York.
  • Feldman JB, Topaloglu H (2017) Revenue management under the Markov chain choice model. Oper. Res. 65(5):1322–1342.
  • Gallego G, Kim S (2020) Joint pricing and inventory decisions for substitutable products. Technical report, Hong Kong University of Science and Technology, Hong Kong.
  • Gallego G, Lu W (2021) An optimal greedy heuristic with minimal learning regret for the Markov chain choice model. Technical report, Hong Kong University of Science and Technology, Hong Kong.
  • Gallego G, Topaloglu H (2019) Revenue Management and Pricing Analytics (Springer, New York).
  • Gallego G, Ratliff R, Shebalov S (2015) A general attraction model and sales-based linear program for network revenue management under customer choice. Oper. Res. 63(1):212–232.
  • Gupta A, Hsu D (2020) Parameter identification in Markov chain choice models. Theoretical Comput. Sci. 808:99–107.
  • Kallus N, Udell M (2020) Dynamic assortment personalization in high dimensions. Oper. Res. 68(4):1020–1037.
  • Kosorok MR (2006) Introduction to Empirical Processes and Semiparametric Inference (Springer, New York).
  • Miao S, Chao X (2021) Dynamic joint assortment and pricing optimization with demand learning. Manufacturing Service Oper. Management 23(2):525–545.
  • Nip K, Wang Z, Wang Z (2021) Assortment optimization under a single transition choice model. Production Oper. Management 30(7):2122–2142.
  • Oh M, Iyengar G (2019) Thompson sampling for multinomial logit contextual bandits. Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, Garnett R, eds. Proc. Adv. Neural Inform. Processing Systems: Annual Conf. Neural Inform. Processing Systems (Curran Associates, Inc., Red Hook, NY), 3145–3155.
  • Perchet V, Rigollet P, Chassang S, Snowberg E (2016) Batched bandit problems. Ann. Statist. 44(2):660–681.
  • Ragain S, Ugander J (2016) Pairwise choice Markov chains. Lee DD, Sugiyama M, von Luxburg U, Guyon I, Garnett R, eds. Proc. Adv. Neural Inform. Processing Systems: Annual Conf. Neural Inform. Processing Systems (Curran Associates, Inc., Red Hook, NY), 3198–3206.
  • Rusmevichientong P, Shen ZJM, Shmoys DB (2010) Dynamic assortment optimization with a multinomial logit choice model and capacity constraint. Oper. Res. 58(6):1666–1680.
  • Sauré D, Zeevi A (2013) Optimal dynamic assortment planning with demand learning. Manufacturing Service Oper. Management 15(3):387–404.
  • Şimşek AS, Topaloglu H (2018) An expectation-maximization algorithm to estimate the parameters of the Markov chain choice model. Oper. Res. 66(3):748–760.
  • Udwani R (2021) Submodular order functions and assortment optimization. Technical report, University of California, Berkeley, Berkeley, CA.
  • Wang R (2013) Assortment management under the generalized attraction model with a capacity constraint. J. Revenue Pricing Management 12(3):254–270.
  • Wang Y, Chen X, Zhou Y (2018) Near-optimal policies for dynamic multinomial logit assortment selection models. Bengio S, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Proc. Adv. Neural Inform. Processing Systems: Annual Conf. Neural Inform. Processing Systems (Curran Associates, Inc., Red Hook, NY), 3105–3114.
  • Zhang D, Cooper WL (2005) Revenue management for parallel flights with customer-choice behavior. Oper. Res. 53(3):415–431.
  • Zhong Y, Birge JR, Ward A (2022) Learning the scheduling policy in time-varying multiclass many server queues with abandonment. Technical report, University of Chicago, Chicago.