Online Learning and Optimization of (Some) Cyclic Pricing Policies in the Presence of Patient Customers

Huanan Zhang
https://orcid.org/0000-0002-0672-5227
Leeds School of Business, University of Colorado Boulder, Boulder, Colorado 80309
Stefanus Jasin
https://orcid.org/0000-0003-3709-3928
Stephen M. Ross School of Business, University of Michigan, Ann Arbor, Michigan 48109

Published Online: 26 Oct 2021
https://doi.org/10.1287/msom.2021.0979

References

Agrawal S, Jia R (2019) Learning in structured MDPs with convex cost functions: Improved regret bounds for inventory management. Karlin A, ed. Proc. 2019 ACM Conf. Econom. Comput. (ACM, New York), 743–744.
Ahn H, Gümüş M, Kaminsky P (2007) Pricing and manufacturing decisions when demand is a function of prices in multiple periods. Oper. Res. 55(6):1039–1057.
Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time analysis of the multiarmed bandit problem. Machine Learn. 47(2–3):235–256.
Aviv Y, Levin Y, Nediak M (2009) Counteracting strategic consumer behavior in dynamic pricing systems. Netessine S, Tang C, eds. Consumer-Driven Demand and Operations Management Models. International Series in Operations Research and Management Science, vol. 131 (Springer, New York), 323–352.
Besbes O, Lobel I (2015) Intertemporal price discrimination: Structure and computation of optimal policies. Management Sci. 61(1):92–110.
Besbes O, Zeevi A (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407–1420.
Besbes O, Zeevi A (2012) Blind network revenue management. Oper. Res. 60(6):1537–1550.
Broder J, Rusmevichientong P (2012) Dynamic pricing under a general parametric choice model. Oper. Res. 60(4):965–980.
Caplin A, Dean M, Martin D (2011) Search and satisficing. Amer. Econom. Rev. 101(7):2899–2922.
Chen Q, Jasin S, Duenyas I (2019a) A non-parametric self-adjusting control for joint learning and optimization of multi-product pricing with finite resource capacity. Math. Oper. Res. 44(2):601–631.
Chen X, Wang Z (2016) Bayesian dynamic learning and pricing with strategic customers. Working paper, New York University, New York.
Chen Y, Farias VF (2018) Robust dynamic pricing with strategic customers. Math. Oper. Res. 43(4):1119–1142.
Chen Y, Farias VF, Trichakis NK (2019b) On the efficacy of static prices for revenue management in the face of strategic customers. Management Sci. 65(12):5535–5555.
Cheung WC, Simchi-Levi D, Wang H (2017) Dynamic pricing and demand learning with limited price experimentation. Oper. Res. 65(6):1722–1731.
Conlisk J, Gerstner E, Sobel J (1984) Cyclic pricing by a durable goods monopolist. Quart. J. Econom. 99(3):489–505.
den Boer AV (2015) Dynamic pricing and learning: Historical origins, current research, and new directions. Surveys Oper. Res. Management Sci. 20(1):1–18.
den Boer AV, Keskin NB (2020) Discontinuous demand functions: Estimation and pricing. Management Sci. 66(10):4516–4534.
den Boer AV, Zwart B (2014) Simultaneously learning and optimizing using controlled variance pricing. Management Sci. 60(3):770–783.
den Boer AV, Zwart B (2015) Dynamic pricing and learning with finite inventories. Oper. Res. 63(4):965–978.
Ferreira K, Simchi-Levi D, Wang H (2018) Online network revenue management using Thompson sampling. Oper. Res. 66(6):1586–1602.
Glazer A, Hassin R (1986) A deterministic single-item inventory model with seller holding cost and buyer holding and shortage costs. Oper. Res. 34(4):613–618.
Harrison JM, Keskin NB, Zeevi A (2012) Bayesian dynamic pricing policies: Learning and earning under a binary prior distribution. Management Sci. 58(3):570–586.
Huh WT, Janakiraman G, Muckstadt JA, Rusmevichientong P (2009) An adaptive algorithm for finding the optimal base-stock policy in lost sales inventory systems with censored demand. Math. Oper. Res. 34(2):397–416.
Kazerouni A, Van Roy B (2017) Learning to price with reference effects. Working paper, Stanford University, Stanford, CA.
Keskin NB, Zeevi A (2014) Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. Oper. Res. 62(5):1142–1167.
Kopalle PK, Rao AG, Assuncao JL (1996) Asymmetric reference price effects and dynamic pricing policies. Marketing Sci. 15(1):60–85.
Liu Y, Cooper WL (2015) Optimal dynamic pricing with patient customers. Oper. Res. 63(6):1307–1319.
Lobel I (2020) Technical note—Dynamic pricing with heterogeneous patience levels. Oper. Res. 68(4):1038–1046.
Nasiry J, Popescu I (2011) Dynamic pricing with loss-averse consumers and peak-end anchoring. Oper. Res. 59(6):1361–1368.
Osband I, Van Roy B (2014) Near-optimal reinforcement learning in factored MDPs. Adv. Neural Inform. Processing Systems 27:604–612.
Popescu I, Wu Y (2007) Dynamic pricing strategies with reference effects. Oper. Res. 55(3):413–429.
Rejwan I, Mansour Y (2020) Top-k combinatorial bandits with full-bandit feedback. Kontorovich A, Neu G, eds. Proc. 31st Internat. Conf. Algorithmic Learning Theory, vol. 117 (PMLR, San Diego), 752–776.
Reutskaja E, Nagel R, Camerer CF, Rangel A (2011) Search dynamics in consumer choice under time pressure: An eye-tracking study. Amer. Econom. Rev. 101(2):900–926.
Shen ZM, Su X (2007) Customer behavior modeling in revenue management and auctions: A review and new research opportunities. Production Oper. Management 16(6):713–728.
Simon HA (1955) A behavioral model of rational choice. Quart. J. Econom. 69(1):99–118.
Wang Z (2016) Intertemporal price discrimination via reference price effects. Oper. Res. 64(2):290–296.
Wang Z, Deng S, Ye Y (2014) Close the gaps: A learning-while-doing algorithm for single-product revenue management problems. Oper. Res. 62(2):318–331.
Zhang H, Chao X, Shi C (2020) Closing the gap: A learning algorithm for lost-sales inventory systems with lead times. Management Sci. 66(5):1962–1980.

Manufacturing & Service Operations Management

Volume 24, Issue 2

March-April 2022

Pages 691-1260, C2

Information

Received: July 23, 2019
Accepted: December 16, 2020
Published Online: October 26, 2021

Cite as

Huanan Zhang, Stefanus Jasin (2021) Online Learning and Optimization of (Some) Cyclic Pricing Policies in the Presence of Patient Customers. Manufacturing & Service Operations Management 24(2):1165-1182.

https://doi.org/10.1287/msom.2021.0979
