Technical Note—Consistency Analysis of Sequential Learning Under Approximate Bayesian Inference

Ye Chen
Corresponding Author
Ye Chen
Department of Statistical Sciences and Operations Research, Virginia Commonwealth University, Richmond, Virginia 23284;
Search for more papers by this author
,
Ilya O. Ryzhov
Corresponding Author
Ilya O. Ryzhov
http://orcid.org/0000-0002-4191-084X
Robert H. Smith School of Business, University of Maryland, College Park, Maryland 20742;Institute for Systems Research, A. James Clark School of Engineering, University of Maryland, College Park, Maryland 20742
Search for more papers by this author

Ye Chen

Corresponding Author

Ye Chen

Department of Statistical Sciences and Operations Research, Virginia Commonwealth University, Richmond, Virginia 23284;

Search for more papers by this author

Ilya O. Ryzhov

Corresponding Author

Ilya O. Ryzhov

http://orcid.org/0000-0002-4191-084X

Robert H. Smith School of Business, University of Maryland, College Park, Maryland 20742;Institute for Systems Research, A. James Clark School of Engineering, University of Maryland, College Park, Maryland 20742

Search for more papers by this author

Published Online:2 Jan 2020https://doi.org/10.1287/opre.2019.1850

References

Abounadi J, Bertsekas DP, Borkar VS (2002) Stochastic approximation for nonexpansive maps: Application to Q-learning algorithms. SIAM J. Control Optim. 41(1):1–22.Crossref, Google Scholar
Anderson CK, Xie X (2012) A choice-based dynamic programming approach for setting opaque prices. Production Oper. Management 21(3):590–605.Crossref, Google Scholar
Andradóttir S (1995) A stochastic approximation algorithm with varying bounds. Oper. Res. 43(6):1037–1048.Link, Google Scholar
Asmussen S, Glynn PW (2011) A new proof of convergence of MCMC via the ergodic theorem. Statist. Probab. Lett. 81(10):1482–1485.Crossref, Google Scholar
Bertsimas D, Kallus N (2015) From predictive to prescriptive analytics. Working paper, Massachusetts Institute of Technology, Cambridge.Google Scholar
Borkar VS (2008) Stochastic Approximation: A Dynamical Systems Viewpoint (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
Borkar VS, Meyn SP (2000) The ODE method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control Optim. 38(2):447–469.Crossref, Google Scholar
Bottou L (1998) Online learning and stochastic approximations. Saad D, ed. On-Line Learning in Neural Networks (Cambridge University Press, Cambridge, UK), 9–42.Google Scholar
Broadie M, Cicek D, Zeevi A (2011) General bounds and finite-time improvement for the Kiefer-Wolfowitz stochastic approximation algorithm. Oper. Res. 59(5):1211–1224.Link, Google Scholar
Chakraborty M, Das S, Magdon-Ismail M (2011) Near-optimal target learning with stochastic binary signals. Cozman F, Pfeffer A, eds. Proc. 27th Conf. Uncertainty Artificial Intelligence (AUAI Press, Corvallis, OR), 69–76.Google Scholar
Chakraborty M, Das S, Lavoie A, Magdon-Ismail M, Naamad Y (2013) Instructor rating markets. Proc. 27th AAAI Conf. Artificial Intelligence (AAAI Press, Palo Alto, CA), 159–165.Google Scholar
Chau M, Fu MC, Qu H, Ryzhov IO (2014) Simulation optimization: A tutorial overview and recent developments in gradient-based methods. Tolk A, Diallo SY, Ryzhov IO, Yilmaz L, Buckley S, Miller JA, eds. Proc. 2014 Winter Simulation Conf. (IEEE, Piscataway, NJ), 21–35.Crossref, Google Scholar
Chen C-H, Lee LH (2010) Stochastic Simulation Optimization: An Optimal Computing Budget Allocation (World Scientific, Singapore).Crossref, Google Scholar
Chen Y, Ryzhov IO (2016) Approximate Bayesian inference as a form of stochastic approximation: A new consistency theory with applications. Roeder TMK, Frazier PI, Szechtman R, Zhou E, Huschka T, Chick SE, eds. Proc. 2016 Winter Simulation Conf. (IEEE, Piscataway, NJ), 534–544.Crossref, Google Scholar
Chen C-H, Chick SE, Lee LH, Pujowidianto NA (2015) Ranking and selection: Efficient simulation budget allocation. Fu MC, ed. Handbook of Simulation Optimization (Springer, New York), 45–80.Crossref, Google Scholar
Chhabra M, Das S (2011) Learning the demand curve in posted-price digital goods auctions. Proc. 10th Internat. Conf. Autonomous Agents Multi-Agent Systems (International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC), 63–70.Google Scholar
Chick SE (2006) Subjective probability and bayesian methodology. Henderson S, Nelson B, eds. Simulation, Handbooks of Operations Research and Management Science, vol. 13 (North-Holland Publishing, Amsterdam), 225–258.Google Scholar
Dangauthier P, Herbrich R, Minka T, Graepel T (2007) TrueSkill through time: Revisiting the history of chess. Platt JC, Koller D, Singer Y, Roweis S, eds. Advances in Neural Information Processing Systems, vol. 20 (Curran Associates, Red Hook, NY), 337–344.Google Scholar
Das S, Magdon-Ismail M (2009) Adapting to a market shock: Optimal sequential market-making. Koller D, Schuurmans D, Bengio Y, Bottou L, eds. Advances in Neural Information Processing Systems, vol. 21 (Curran Associates, Red Hook, NY), 361–368.Google Scholar
DeGroot MH (1970) Optimal Statistical Decisions (John Wiley & Sons, Hoboken, NJ).Google Scholar
García-Fernández ÁF, Svensson L (2015) Gaussian MAP filtering using Kalman optimization. IEEE Trans. Automatic Control 60(5):1336–1349.Crossref, Google Scholar
Gelman AB, Carlin JB, Stern HS, Rubin DB (2004) Bayesian Data Analysis, 2nd ed. (CRC Press, Boca Raton, FL).Google Scholar
Gupta AK, Nagar DK (2000) Matrix Variate Distributions (CRC Press, Boca Raton, FL).Google Scholar
Gutin E, Farias V (2016) Optimistic Gittins indices. Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 29 (Curran Associates, Red Hook, NY), 3153–3161.Google Scholar
Haario H, Saksman E, Tamminen J (2001) An adaptive Metropolis algorithm. Bernoulli 7(2):223–242.Crossref, Google Scholar
Herbrich R, Minka T, Graepel T (2006) TrueSkillTM: A Bayesian skill rating system. Schölkopf B, Platt JC, Hoffman T, eds. Advances in Neural Information Processing Systems, vol. 19 (MIT Press, Cambridge, MA), 569–576.Google Scholar
Hoffman MD, Blei DM, Wang C, Paisley J (2013) Stochastic variational inference. J. Machine Learn. Res. 14(1):1303–1347.Google Scholar
Hong LJ, Nelson BL (2009) A brief introduction to optimization via simulation. Rosetti M, Hill R, Johansson B, Dunkin A, Ingalls R, eds. Proc. 2009 Winter Simulation Conf. (IEEE, Piscataway, NJ), 75–85.Crossref, Google Scholar
Jaakkola TS, Jordan MI (2000) Bayesian parameter estimation via variational methods. Statist. Comput. 10(1):25–37.Crossref, Google Scholar
Jaakkola TS, Jordan MI, Singh SP (1994) On the convergence of stochastic iterative dynamic programming algorithms. Neural Comput. 6(6):1185–1201.Crossref, Google Scholar
Jiang H, Shanbhag UV (2016) On the solution of stochastic optimization and variational problems in imperfect information regimes. SIAM J. Optim. 26(4):2394–2429.Crossref, Google Scholar
Jiang H, Xu H (2008) Stochastic approximation approaches to the stochastic variational inequality problem. IEEE Trans. Automatic Control 53(6):1462–1475.Crossref, Google Scholar
Keizers JM, Bertrand JWM, Wessels J (2003) Diagnosing order planning performance at a navy maintenance and repair organization, using logistic regression. Production Oper. Management 12(4):445–463.Crossref, Google Scholar
Koshal J, Nedić A, Shanbhag UV (2013) Regularized iterative stochastic approximation methods for stochastic variational inequality problems. IEEE Trans. Automatic Control 58(3):594–609.Crossref, Google Scholar
Kushner HJ, Yin G (2003) Stochastic Approximation and Recursive Algorithms and Applications, 2nd ed. (Springer, New York).Google Scholar
Lai TL (2003) Stochastic approximation. Ann. Statist. 31(2):391–406.Crossref, Google Scholar
Marin JM, Pudlo P, Robert CP, Ryder RJ (2012) Approximate Bayesian computational methods. Statist. Comput. 22(6):1167–1180.Crossref, Google Scholar
Minka TP (2001) Expectation propagation for approximate Bayesian inference. Breese J, Koller D, eds. Proc. 17th Conf. Uncertainty Artificial Intelligence (Morgan Kaufmann, San Francisco), 362–369.Google Scholar
Opper M (1998) A Bayesian approach to on-line learning. Saad D, ed. On-Line Learning in Neural Networks (Cambridge University Press, Cambridge, UK), 363–378.Google Scholar
Pasupathy R, Kim S (2011) The stochastic root-finding problem: Overview, solutions, and open questions. ACM Trans. Model. Comput. Simulation 21(3):19:1–19:23.Crossref, Google Scholar
Petruzzi NC, Dada M (1999) Pricing and the newsvendor problem: A review with extensions. Oper. Res. 47(2):183–194.Link, Google Scholar
Plagnol V, Tavaré S (2004) Approximate Bayesian computation and MCMC. Niederreiter H, ed. Monte Carlo and Quasi-Monte Carlo Methods (Springer, New York), 99–113.Google Scholar
Powell WB (2011) Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2nd ed. (John Wiley & Sons, Hoboken, NJ).Crossref, Google Scholar
Powell WB, Ryzhov IO (2012) Optimal Learning (John Wiley & Sons, Hoboken, NJ).Crossref, Google Scholar
Powell WB, George A, Simão H, Scott WR, Lamont A, Stewart J (2012) SMART: A stochastic multiscale model for the analysis of energy resources, technology, and policy. INFORMS J. Comput. 24(4):665–682.Link, Google Scholar
Puterman ML (1994) Markov Decision Processes (John Wiley & Sons, New York).Crossref, Google Scholar
Qu H, Ryzhov IO, Fu MC, Ding Z (2015) Sequential selection with unknown correlation structures. Oper. Res. 63(4):931–948.Link, Google Scholar
Ribeiro C, Szepesvári C (1996) Q-learning combined with spreading: Convergence and results. Proc. 1996 ISRF-IEE Internat. Conf. Intelligent Cognitive Systems, Tehran, Iran, 32–36.Google Scholar
Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Statist. 22(3):400–407.Crossref, Google Scholar
Russo D, Van Roy B (2014) Learning to optimize via posterior sampling. Math. Oper. Res. 39(4):1221–1243.Link, Google Scholar
Ryzhov IO (2015) Approximate Bayesian inference for simulation and optimization. Defourny B, Terlaky T, eds. Modeling and Optimization: Theory and Applications (Springer International, Cham, Switzerland), 1–28.Crossref, Google Scholar
Ryzhov IO (2016) On the convergence rates of expected improvement methods. Oper. Res. 64(6):1515–1528.Link, Google Scholar
Ryzhov IO, Powell WB (2010) Approximate dynamic programming with correlated Bayesian beliefs. Viswanath P, Meyn S, eds. Proc. 48th Annual Allerton Conf. Comm. Control Comput. (IEEE, Piscataway, NJ), 1360–1367.Crossref, Google Scholar
Simão HP, George A, Powell WB, Gifford T, Nienow J, Day J (2010) Approximate dynamic programming captures fleet operations for Schneider National. Interfaces 40(5):342–352.Link, Google Scholar
Sohn SY, Kim HS (2007) Random effects logistic regression model for default prediction of technology credit guarantee fund. Eur. J. Oper. Res. 183(1):472–478.Crossref, Google Scholar
Spiegelhalter DJ, Lauritzen SL (1990) Sequential updating of conditional probabilities on directed graphical structures. Networks 20(5):579–605.Crossref, Google Scholar
Sunnåker M, Busetto AG, Numminen E, Corander J, Foll M, Dessimoz C (2013) Approximate Bayesian computation. PLoS Comput. Biol. 9(1):e1002803.Crossref, Google Scholar
Szepesvári C (1997) The asymptotic convergence rate of Q-learning. Jordan MI, Kearns MJ, Solla SA, eds. Advances in Neural Information Processing Systems, vol. 10 (MIT Press, Cambridge, MA), 1064–1070.Google Scholar
Tsitsiklis JN (1994) Asynchronous stochastic approximation and Q-learning. Machine Learn. 16(3):185–202.Crossref, Google Scholar
Wang C, Blei DM (2013) Variational inference in nonconjugate models. J. Machine Learn. Res. 14(1):1005–1031.Google Scholar
Wang Y, Blei DM (2019) Frequentist consistency of variational Bayes. J. Amer. Statist. Assoc. Forthcoming.Google Scholar
Watkins CJCH, Dayan P (1992) Q-Learning. Machine Learn. 8(3–4):279–292.Crossref, Google Scholar
Wen Z, Van Roy B (2013) Efficient exploration and value function generalization in deterministic systems. Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 26 (Curran Associates, Red Hook, NY), 3021–3029.Google Scholar
Yousefian F, Nedić A, Shanbhag UV (2012) On stochastic gradient and subgradient methods with adaptive steplength sequences. Automatica 48(1):56–67.Crossref, Google Scholar
Zhang Q, Song Y (2015) Simulation selection for empirical model comparison. Yilmaz L, Chan WKV, Moon I, Roeder TMK, Macal C, Rossetti MD, eds. Proc. 2015 Winter Simulation Conf. (IEEE, Piscataway, NJ), 3777–3788.Crossref, Google Scholar
Zhang Q, Song Y (2017) Moment-matching-based conjugacy approximation for Bayesian ranking and selection. ACM Trans. Model. Comput. Simulation 27(4):26:1–26:23.Crossref, Google Scholar

Volume 68, Issue 1

January-February 2020

Pages 1-307, C2-C3

Article Information

Supplemental Material

Metrics

Information

Received:January 09, 2017
Accepted:January 07, 2019
Published Online:January 02, 2020

Cite as

Ye Chen, Ilya O. Ryzhov (2020) Technical Note—Consistency Analysis of Sequential Learning Under Approximate Bayesian Inference. Operations Research 68(1):295-307.

https://doi.org/10.1287/opre.2019.1850

Keywords

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Technical Note—Consistency Analysis of Sequential Learning Under Approximate Bayesian Inference

References

Volume 68, Issue 1

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News