Diffusion Approximations for a Class of Sequential Experimentation Problems

Victor F. Araman
[email protected]
https://orcid.org/0000-0002-3583-8124
Olayan School of Business, American University of Beirut, Beirut 1107 2020, Lebanon;
René A. Caldentey
[email protected]
https://orcid.org/0000-0002-6767-9770
Booth School of Business, The University of Chicago, Chicago, Illinois 60637

Published Online: 28 Dec 2021
https://doi.org/10.1287/mnsc.2021.4195

References

Agrawal S, Avadhanula V, Goyal V, Zeevi A (2019) MNL-bandit: A dynamic learning approach to assortment selection. Oper. Res. 67(5):1453–1485.
Alaei S, Malekian A, Mostagir M (2016) A dynamic model of crowdfunding. Working paper, Ross School of Business, University of Michigan, Ann Arbor.
Araman V, Caldentey R (2009) Dynamic pricing for nonperishable products with demand learning. Oper. Res. 57(5):1169–1188.
Araman V, Caldentey R (2011) Revenue management with incomplete demand information. Cochran JJ, Cox LA, Keskinocak P, Kharoufeh JP, Smith JC, eds. Wiley Encyclopedia of Operations Research and Management Science (John Wiley and Sons, Hoboken, NJ), 1–17.
Armitage P, Berry G, Matthews JNS (2002) Statistical Methods in Medical Research, 4th ed. (Blackwell Science, MA).
Bartroff J, Finkelman M, Lai TL (2008) Modern sequential analysis and its applications to computerized adaptive testing. Psychometrika 73(3):473–486.
Bastani H, Bayati M, Khosravi K (2021) Mostly exploration-free algorithms for contextual bandits. Management Sci. 67(3):1329–1349.
Besbes O, Zeevi A (2009) Dynamic pricing without knowing the demand function: Risk bounds and near-optimal algorithms. Oper. Res. 57(6):1407–1420.
Blackwell D (1951) Comparison of experiments. Proc. Second Berkeley Sympos. Math. Statist. Probab. (University of California Press, Berkeley, CA), 93–102.
Blackwell D (1965) Discounted dynamic programming. Ann. Math. Statist. 36(1):226–235.
Bolton P, Harris C (1999) Strategic experimentation. Econometrica 67(2):349–374.
Breakwell J, Chernoff H (1964) Sequential tests for the mean of a normal distribution II (large t). Ann. Math. Statist. 35(1):162–173.
Brezzi M, Lai TL (2002) Optimal learning and experimentation in bandit problems. J. Econom. Dynam. Control 27(1):87–108.
Broder J, Rusmevichientong P (2012) Dynamic pricing under a general parametric choice model. Oper. Res. 60(4):965–980.
Caro F, Gallien J (2007) Dynamic assortment with demand learning for seasonal consumer goods. Management Sci. 53(2):276–292.
Chang F, Lai TL (1987) Optimal stopping and dynamic allocation. Adv. Appl. Probab. 19(4):829–853.
Chernoff H (1959) Sequential design of experiments. Ann. Math. Statist. 30(3):755–770.
Chernoff H (1961) Sequential tests for the mean of a normal distribution. Proc. Fourth Berkeley Sympos. Math. Statist. Probab., vol. 1 (University of California Press, Berkeley, CA), 612–624.
Chernoff H (1972) Sequential Analysis and Optimal Design (SIAM, Philadelphia).
Chick S, Frazier P (2012) Sequential sampling with economics of selection procedures. Management Sci. 58(3):550–569.
Chick S, Gans N (2009) Economic analysis of simulation selection problems. Management Sci. 55(3):421–437.
Dayanik S, Poor HV, Sezer SO (2008) Sequential multi-hypothesis testing for compound Poisson processes. Stochastics 80(1):19–50.
den Boer AV (2015) Dynamic pricing and learning: Historical origins, current research, and new directions. Surveys Oper. Res. Management Sci. 20(1):1–18.
den Boer AV, Zwart B (2014) Simultaneously learning and optimizing using controlled variance pricing. Management Sci. 60(3):770–783.
Fan L, Glynn PW (2021) Diffusion approximations for Thompson sampling. Preprint, submitted May 19, https://arxiv.org/abs/2105.09232.
Feng Y, Caldentey R, Ryan CT (2021) Robust learning of consumer preferences. Oper. Res., ePub ahead of print December 8, https://doi.org/10.1287/opre.2021.2157.
Finkelman M (2008) On using stochastic curtailment to shorten the SPRT in sequential mastery testing. J. Ed. Behav. Statist. 33(4):442–463.
Gallego G, Talebian M (2012) Demand learning and dynamic pricing for multi-versions products. J. Revenue Pricing Management 11(3):303–318.
Garivier A, Kaufmann E (2016) Optimal best arm identification with fixed confidence. Proc. 29th Annual Conf. Learn. Theory, vol. 49 (Columbia University, New York), 998–1027.
Harrison JM, Sunar N (2015) Investment timing with incomplete information and multiple means of learning. Oper. Res. 63(2):442–457.
Harrison JM, Keskin NB, Zeevi A (2012) Bayesian dynamic pricing policies: Learning and earning under a binary prior distribution. Management Sci. 58(3):570–586.
Karatzas I, Shreve SE (1991) Brownian Motion and Stochastic Calculus (Springer-Verlag, New York).
Kaufmann E, Cappé O, Garivier A (2016) On the complexity of best arm identification in multi-armed bandit models. J. Machine Learn. Res. 17(1):1–42.
Keener R (1984) Second order efficiency in the sequential design of experiments. Ann. Statist. 12(2):510–532.
Keskin NB, Birge JR (2019) Dynamic selling mechanisms for product differentiation and learning. Oper. Res. 67(4):1069–1089.
Keskin NB, Zeevi A (2014) Dynamic pricing with an unknown demand model: Asymptotically optimal semi-myopic policies. Oper. Res. 62(5):1142–1167.
Keskin NB, Zeevi A (2018) On incomplete learning and certainty-equivalence control. Oper. Res. 66(4):1136–1167.
Kohavi R, Thomke S (2017) The surprising power of online experiments. Harvard Bus. Rev.
Kök AG, Fisher ML, Vaidyanathan R (2009) Assortment planning: Review of literature and industry practice. Agrawal N, Smith SA, eds. Retail Supply Chain Management: Quantitative Models and Empirical Studies, International Series in Operations Research and Management Science (Springer, New York), 175–236.
Lai TL (2001) Sequential analysis: Some classical problems and new challenges. Statist. Sinica 11(2):303–351.
Le Cam L (1996) Comparison of experiments: A short review. Lecture Notes-Monograph Series, vol. 30 (Institute of Mathematical Statistics), 127–138.
Lewis RA, Rao JR (2015) The unfavorable economics of measuring the returns to advertising. Quart. J. Econom. 130(4):1941–1973.
Lindley DV (1956) On a measure of the information provided by an experiment. Ann. Math. Statist. 27(4):986–1005.
Marinesi S, Girotra K (2013) Information acquisition through customer voting systems. INSEAD Working Paper No. 2013/99/TOM, INSEAD, Fontainebleau, France.
Naghshvar M, Javidi T (2013) Active sequential hypothesis testing. Ann. Statist. 41(6):2703–2738.
Nahm M (2012) Data quality in clinical research. Richesson RL, Andrews JE, eds. Clinical Research Informatics (Springer, New York), 175–201.
Oh M, Iyengar G (2019) Thompson sampling for multinomial logit contextual bandits. Wallach H, Larochelle H, Beygelzimer A, d'Alché-Buc F, Fox E, Garnett R, eds. Adv. Neural Inform. Processing Systems, vol. 32 (Curran Associates, Inc., Vancouver), 3145–3155.
Papanastasiou Y, Bimpikis K, Savva N (2018) Crowdsourcing exploration. Management Sci. 64(4):1727–1746.
Peskir G, Shiryaev AN (2006) Optimal Stopping and Free-Boundary Problems (Birkhäuser Verlag, Basel, Switzerland).
Powell WB (2016) Perspectives of approximate dynamic programming. Ann. Oper. Res. 241:319–356.
Puterman ML (2005) Markov Decision Processes: Discrete Stochastic Dynamic Programming, 2nd ed. (John Wiley & Sons, Hoboken, NJ).
Qiu P (2014) Introduction to Statistical Process Control (Chapman & Hall/CRC, Boca Raton, FL).
Robbins H (1952) Some aspects of the sequential design of experiments. Bull. Amer. Math. Soc. (N.S.) 58(5):527–535.
Russo D (2020) Simple Bayesian algorithms for best arm identification. Oper. Res. 68(6):1625–1647.
Sauré D, Zeevi A (2013) Optimal dynamic assortment planning with demand learning. Manufacturing Service Oper. Management 15(3):387–404.
Siegmund D (1985) Sequential Analysis: Tests and Confidence Intervals (Springer-Verlag, New York).
Soare M, Lazaric A, Munos R (2014) Best-arm identification in linear bandits. Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ, eds. Adv. Neural Inform. Processing Systems, vol. 27 (Curran Associates, Inc., Montreal), 1–9.
Sunar N, Birge JR, Vitavasiri S (2019) Optimal dynamic product development and launch for a network of customers. Oper. Res. 67(3):770–790.
Ulu C, Honhon D, Alptekinoğlu A (2012) Learning consumer tastes through dynamic assortments. Oper. Res. 60(4):833–849.
Wager S, Xu K (2021) Diffusion asymptotics for sequential experiments. Preprint, submitted January 25, https://arxiv.org/abs/2101.09855.
Wald A (1945) Sequential tests of statistical hypotheses. Ann. Math. Statist. 16(2):117–186.
Wald A (1947) Sequential Analysis (John Wiley and Sons, New York).
Wald A, Wolfowitz J (1948) Optimum character of the sequential probability ratio test. Ann. Math. Statist. 19(3):326–339.
Zenios S, Wang Z (2021) Adaptive design of clinical trials: A sequential learning approach. Preprint, submitted January 27, http://dx.doi.org/10.2139/ssrn.3713924.

Volume 68, Issue 8

August 2022

Pages 5557-6354, iv-v

Received: October 31, 2019
Accepted: June 23, 2021
Published Online: December 28, 2021

Cite as

Victor F. Araman, René A. Caldentey (2022) Diffusion Approximations for a Class of Sequential Experimentation Problems. Management Science 68(8):5958-5979.

https://doi.org/10.1287/mnsc.2021.4195
