Online Learning for Constrained Assortment Optimization Under Markov Chain Choice Model

Shukai Li
https://orcid.org/0009-0005-1406-5803
Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois 60208

Qi Luo
https://orcid.org/0000-0002-4103-7112
Department of Business Analytics, University of Iowa, Iowa City, Iowa 52242

Zhiyuan Huang
https://orcid.org/0000-0003-1284-2128
Department of Management Science and Engineering, Tongji University, Shanghai 200092, China

Cong Shi (Corresponding Author)
https://orcid.org/0000-0003-3564-3391
Management Science, Miami Herbert Business School, University of Miami, Coral Gables, Florida 33146

Published Online: 15 May 2024
https://doi.org/10.1287/opre.2022.0693

Volume 73, Issue 1

January-February 2025

Pages iii-vii, 1-582, C2-C3

Received: December 31, 2022
Accepted: March 07, 2024
Published Online: May 15, 2024

Cite as

Shukai Li, Qi Luo, Zhiyuan Huang, Cong Shi (2024) Online Learning for Constrained Assortment Optimization Under Markov Chain Choice Model. Operations Research 73(1):109-138.

https://doi.org/10.1287/opre.2022.0693

Acknowledgments

The authors thank the area editor Professor Ilan Lobel, associate editor, and anonymous referees for detailed and constructive comments that significantly improved the content and exposition of this paper.
