Simultaneously Learning and Optimizing Using Controlled Variance Pricing

Arnoud V. den Boer
Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands; and University of Amsterdam, 1098 XH Amsterdam, The Netherlands

Bert Zwart
Centrum Wiskunde and Informatica, 1098 XG Amsterdam, The Netherlands; and Department of Mathematics, VU University Amsterdam, 1081 HV Amsterdam, The Netherlands

Published Online: 10 Dec 2013
https://doi.org/10.1287/mnsc.2013.1788

References

Aghion P, Bolton P, Harris C, Jullien B (1991) Optimal learning by experimentation. Rev. Econom. Stud. 58(4):621–654.
Anderson TW, Taylor JB (1976) Some experimental results on the statistical properties of least squares estimates in control problems. Econometrica 44(6):1289–1302.
Araman VF, Caldentey R (2009) Dynamic pricing for nonperishable products with demand learning. Oper. Res. 57(5):1169–1188.
Balvers RJ, Cosimano TF (1990) Actively learning about demand and the dynamics of price adjustment. Econom. J. 100(402):882–898.
Bartlett MS (1951) An inverse matrix adjustment arising in discriminant analysis. Ann. Math. Statist. 22(1):107–111.
Bertsimas D, Perakis G (2006) Dynamic pricing: A learning approach. Hearn D, Lawphongpanich S, eds. Mathematical and Computational Models for Congestion Charging (Springer, New York), 45–79.
Besbes O, Zeevi A (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407–1420.
Besbes O, Zeevi A (2011) On the minimax complexity of pricing in a changing environment. Oper. Res. 59(1):66–79.
Broder J, Rusmevichientong P (2012) Dynamic pricing under a general parametric choice model. Oper. Res. 60(4):965–980.
Carvalho AX, Puterman ML (2005a) Dynamic optimization and learning: How should a manager set prices when the demand function is unknown? IPEA Discussion Paper 1117. Instituto de Pesquisa Economica Aplicada, Brasilia.
Carvalho AX, Puterman ML (2005b) Learning and pricing in an Internet environment with binomial demand. J. Revenue Pricing Management 3(4):320–336.
Cesa-Bianchi N, Lugosi G (2006) Prediction, Learning, and Games (Cambridge University Press, New York).
Chen K, Hu I (1998) On consistency of Bayes estimates in a certainty equivalence adaptive system. IEEE Trans. Automatic Control 43(7):943–947.
Chow YS, Teicher H (2003) Probability Theory: Independence, Interchangeability, Martingales, 3rd ed. (Springer Verlag, New York).
Cope E (2007) Bayesian strategies for dynamic pricing in e-commerce. Naval Res. Logist. 54(3):265–281.
den Boer AV, Zwart B (2012) Mean square convergence rates for maximum quasi-likelihood estimators. Working paper, University of Technology, Eindhoven, The Netherlands. https://www.researchgate.net/publication/257985753_Mean_square_convergence_rates_for_maximum_quasi-likelihood_estimators.
Duistermaat JJ, Kolk JAC (2004) Multidimensional Real Analysis: Differentiation, Cambridge Studies in Advanced Mathematics, Vol. 86 (Cambridge University Press, Cambridge, UK).
Easley D, Kiefer NM (1988) Controlling a stochastic process with unknown parameters. Econometrica 56(5):1045–1064.
Eren SS, Maglaras C (2010) Monopoly pricing with limited demand information. J. Revenue Pricing Management 9:23–48.
Farias VF, van Roy B (2010) Dynamic pricing with a prior on market response. Oper. Res. 58(1):16–29.
Gill J (2001) Generalized Linear Models: A Unified Approach (Sage Publications, Thousand Oaks, CA).
Gittins JC (1989) Multi-Armed Bandit Allocation Indices, Wiley Interscience Series in Systems and Optimization (John Wiley & Sons, New York).
Godambe VP, Heyde CC (1987) Quasi-likelihood and optimal estimation. Internat. Statist. Rev. 55(3):231–244.
Goldenshluger A, Zeevi A (2009) Woodroofe's one-armed bandit problem revisited. Ann. Appl. Probab. 19(4):1603–1633.
Harrison JM, Keskin NB, Zeevi A (2012) Bayesian dynamic pricing policies: Learning and earning under a binary prior distribution. Management Sci. 58(3):570–586.
Heyde CC (1997) Quasi-Likelihood and Its Application. Springer Series in Statistics (Springer Verlag, New York).
Keller G, Rady S (1999) Optimal experimentation in a changing environment. Rev. Econom. Stud. 66(3):475–507.
Keskin NB, Zeevi A (2013) Dynamic pricing with an unknown linear demand model: Asymptotically optimal semi-myopic policies. Working paper, University of Chicago Booth School of Business, Chicago. http://faculty.chicagobooth.edu/bora.keskin/pdfs/DynamicPricingUnknownDemandModel.pdf.
Kiefer J, Wolfowitz J (1952) Stochastic estimation of the maximum of a regression function. Ann. Math. Statist. 23(3):462–466.
Kiefer NM, Nyarko Y (1989) Optimal control of an unknown linear process with learning. Internat. Econom. Rev. 30(3):571–586.
Kleinberg R, Leighton T (2003) The value of knowing a demand curve: Bounds on regret for online posted-price auctions. Proc. 44th IEEE Sympos. Foundations Comput. Sci. (IEEE Computer Society, Washington, DC), 594–605.
Lai TL, Robbins H (1982) Iterated least squares in multiperiod control. Adv. Appl. Math. 3(1):50–73.
Lai TL, Robbins H (1985) Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1):4–22.
Lai TL, Wei CZ (1982) Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. Ann. Statist. 10(1):154–166.
Lai TL, Robbins H, Wei CZ (1979) Strong consistency of least squares estimates in multiple regression II. J. Multivariate Anal. 9(3):343–361.
Lim AEB, Shanthikumar JG (2007) Relative entropy, exponential utility, and robust dynamic pricing. Oper. Res. 55(2):198–214.
Lin KY (2006) Dynamic pricing with real-time demand learning. Eur. J. Oper. Res. 174(1):522–538.
Lobo MS, Boyd S (2003) Pricing and learning with uncertain demand. Working paper, Stanford University, Stanford, CA. http://www.stanford.edu/~boyd/papers/pdf/pric_learn_unc_dem.pdf.
McCullagh P (1983) Quasi-likelihood functions. Ann. Statist. 11(1):59–67.
McCullagh P, Nelder JA (1983) Generalized Linear Models (Chapman & Hall, London).
McLennan A (1984) Price dispersion and incomplete learning in the long run. J. Econom. Dynam. Control 7(3):331–347.
Nassiri-Toussi K, Ren W (1994) On the convergence of least squares estimates in white noise. IEEE Trans. Automatic Control 39(2):364–368.
Powell WB (2010) The knowledge gradient for optimal learning. Cochran JJ, Cox LA Jr, Keskinocak P, Kharoufeh JP, Smith JC, eds. Encyclopedia of Operations Research and Management Science (John Wiley & Sons, New York).
Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Statist. 22(3):400–407.
Taylor JB (1974) Asymptotic properties of multiperiod control rules in the linear regression model. Internat. Econom. Rev. 15(2):472–484.
Vermorel J, Mohri M (2005) Multi-armed bandit algorithms and empirical evaluation. Gama J, Camacho R, Brazdil P, Jorge A, Torgo L, eds. Proceedings of the 16th European Conference on Machine Learning, Lecture Notes in Computer Science, Vol. 3720 (Springer-Verlag, Berlin), 437–448.
Wedderburn RWM (1974) Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika 61(3):439–447.

Volume 60, Issue 3

March 2014

Pages iv-vii, 541-804

Article Information

Received: March 16, 2010
Accepted: June 04, 2013
Published Online: December 10, 2013

Cite as

Arnoud V. den Boer, Bert Zwart (2013) Simultaneously Learning and Optimizing Using Controlled Variance Pricing. Management Science 60(3):770-783.

https://doi.org/10.1287/mnsc.2013.1788
