Multimodal Dynamic Pricing

Yining Wang
https://orcid.org/0000-0001-9410-0392
Department of Information Systems and Operations Management, Warrington College of Business, University of Florida, Gainesville, Florida 32611

Boxiao Chen (Corresponding Author)
https://orcid.org/0000-0002-5967-4822
Department of Information and Decision Sciences, College of Business Administration, University of Illinois, Chicago, Illinois 60607

David Simchi-Levi
https://orcid.org/0000-0002-4650-1519
Institute for Data, Systems, and Society, Department of Civil and Environmental Engineering, and Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Published Online: 27 Jan 2021
https://doi.org/10.1287/mnsc.2020.3819

Volume 67, Issue 10

October 2021

Pages 5969-6627, iii-iv

Article Information

Received: December 15, 2019
Accepted: July 14, 2020
Published Online: January 27, 2021

Cite as

Yining Wang, Boxiao Chen, David Simchi-Levi (2021) Multimodal Dynamic Pricing. Management Science 67(10):6136-6152.

https://doi.org/10.1287/mnsc.2020.3819
