Variance Regularization in Sequential Bayesian Optimization
Published Online: 14 Apr 2020. https://doi.org/10.1287/moor.2019.1019
References
- [1] (2013) Bayesian dynamic pricing in queuing systems with unknown delay cost characteristics. Manufacturing Service Oper. Management 15(2):292–304.
- [2] (1972) Real Analysis and Probability (Academic Press, New York).
- [3] (2018) Machine learning and portfolio optimization. Management Sci. 64(3):1136–1154.
- [4] (2019) Thompson sampling for stochastic control: The continuous parameter case. IEEE Trans. Automatic Control 64(10):4137–4152.
- [5] (2009) Optimal consumption and portfolio decisions with partially observed real prices. Math. Finance 19(2):215–236.
- [6] (1999) Nonlinear Programming (Athena Scientific, Nashua, NH).
- [7] (2012) Dynamic Programming and Optimal Control Volume II: Approximate Dynamic Programming (Athena Scientific, Nashua, NH).
- [8] (1978) Stochastic Optimal Control: The Discrete Time Case (Academic Press, New York).
- [9] (2004) Convex Optimization (Cambridge University Press, Cambridge, UK).
- [10] (1996) Linear least-squares algorithms for temporal difference learning. Machine Learning 22(1–3):33–57.
- [11] (2002) Optimal learning and experimentation in bandit problems. J. Econom. Dynamics Control 27(1):87–108.
- [12] (2019) Opaque bank assets and optimal equity capital. J. Econom. Dynamics Control 100:369–394.
- [13] (1977) A converse of Taylor’s theorem for functions on Banach spaces. Proc. Amer. Math. Soc. 65(2):265–273.
- [14] (2002) An adaptive Bayesian replacement policy with minimal repair. Oper. Res. 50(3):552–558.
- [15] (2019) Variance-based regularization with convex objectives. J. Machine Learn. Res. 20(68):1–55.
- [16] (2007) Convergence rates of posterior distributions for non-i.i.d. observations. Ann. Statist. 35(1):192–223.
- [17] (2018) Robust empirical optimization is almost the same as mean-variance optimization. Oper. Res. Lett. 46(4):448–452.
- [18] (2017) Calibration of distributionally robust empirical optimization models. Preprint submitted November 17, https://arxiv.org/abs/1711.06565.
- [19] (2013) Robust portfolio techniques for mitigating the fragility of CVaR minimization and generalization to coherent risk measures. Quant. Finance 13(10):1621–1635.
- [20] (2012) Bayesian dynamic pricing policies: Learning and earning under a binary prior distribution. Management Sci. 58(3):570–586.
- [21] (1989) Adaptive Markov Control Processes (Springer-Verlag, New York).
- [22] (2015) Demand estimation and ordering under censoring: Stock-out timing is (almost) all you need. Oper. Res. 63(1):134–150.
- [23] (2008) The demand for more information: More heat than light. J. Econom. Theory 138(1):21–50.
- [24] (2019) Can individual investors time bubbles? Working paper, National University of Singapore, Singapore.
- [25] (2008) Optimal electoral timing: Exercise wisely and you may live longer. Rev. Econom. Stud. 75(2):597–628.
- [26] (2014) Optimal dynamic pricing with demand model uncertainty: A squared-coefficient-of-variation rule for learning and earning. Working paper, Duke University, Durham, NC.
- [27] (2019) Dynamic selling mechanisms for product differentiation and learning. Oper. Res. 67(4):1069–1089.
- [28] (2016) Robust control of partially observable failing systems. Oper. Res. 64(4):999–1014.
- [29] (2017) Thompson sampling for stochastic control: The finite parameter case. IEEE Trans. Automatic Control 62(12):6415–6422.
- [30] (2016) Robust multi-armed bandit problems. Management Sci. 62(1):264–285.
- [31] (2019) Approximating the Gittins index for Bayesian bandits. Working paper, University of British Columbia, Vancouver.
- [32] (2013) Joint optimization of sampling and control of partially observable failing systems. Oper. Res. 61(3):777–790.
- [33] (2016) Robust sensitivity analysis for stochastic systems. Math. Oper. Res. 41(4):1248–1275.
- [34] (2003) Least-squares policy evaluation algorithms with linear function approximation. J. Discrete Event Systems 13(1/2):79–110.
- [35] (1975) Bayesian dynamic programming. Adv. Appl. Probab. 7(2):330–348.
- [36] (2013) The fundamental risk quadrangle in risk management, optimization and statistical estimation. Surv. Oper. Res. Management Sci. 18(S1–S2):33–53.
- [37] (2014) Learning to optimize via posterior sampling. Math. Oper. Res. 39(4):1221–1243.
- [38] (2012) The knowledge gradient algorithm for a general class of online learning problems. Oper. Res. 60(1):180–195.
- [39] (2001) Rates of convergence of posterior distributions. Ann. Statist. 29(3):687–714.
- [40] (1988) Learning to predict by the methods of temporal differences. Machine Learn. 3(1):9–44.
- [41] (1996) Regression shrinkage and selection via the Lasso. J. Royal Statist. Soc. B 58(1):267–288.
- [42] (1997) An analysis of temporal-difference learning with function approximation. IEEE Trans. Automatic Control 42(5):674–690.
- [43] (2004) New approaches to Bayesian consistency. Ann. Statist. 32(5):2028–2043.
- [44] (2000) Probability via Expectation (Springer-Verlag, New York).
- [45] (2010) Robust regression and Lasso. IEEE Trans. Inform. Theory 56(7):3561–3574.

