Diffusion Approximations for a Class of Sequential Experimentation Problems
Published Online: 28 Dec 2021, https://doi.org/10.1287/mnsc.2021.4195

