Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures
Published Online:13 Nov 2017https://doi.org/10.1287/moor.2017.0872
References
- (2011) Dynamic risk measures. Di Nunno, G, Øksendal, B, eds. Advanced Mathematical Methods for Finance (Springer, Berlin), 1–34.Crossref, Google Scholar
- (1995) Stochastic gradient optimization of importance sampling for the efficient simulation of digital communication systems. IEEE Trans. Comm. 43(12):2975–2985.Crossref, Google Scholar
- (1999) Coherent measures of risk. Math. Finance 9(3):203–228.Crossref, Google Scholar
- (2006) Coherent multiperiod risk adjusted values and Bellman’s principle. Ann. Oper. Res. 152(1):5–22.Crossref, Google Scholar
- (2011) Reinforcement learning with a near optimal rate of convergence. Technical Report, Radboud University Nijmegen, Netherlands.Google Scholar
- (2009) Computing VaR and CVaR using stochastic approximation and adaptive unconstrained importance sampling. Monte Carlo Methods Appl. 15(3):173–210.Crossref, Google Scholar
- (2014) Beyond value-at-risk: GlueVaR distortion risk measures. Risk Anal. 34(1):121–134.Crossref, Google Scholar
- (1998) Optimal control of execution costs. J. Financial Markets 1(1):1–50.Crossref, Google Scholar
- (1996) Neuro-Dynamic Programming (Athena Scientific, Belmont, MA).Google Scholar
- (2006) Time consistent dynamic risk measures. Math. Methods Oper. Res. 63(1):169–186.Crossref, Google Scholar
- (2004) Convex Optimization (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
- (1992) Probability (SIAM, Philadelphia).Crossref, Google Scholar
- (2004) Introduction to Rare Event Simulation (Springer, New York).Crossref, Google Scholar
- (2005) Extreme value theory and extremely large electricity price changes. Internat. Rev. Econom. Finance 14(1):41–55.Crossref, Google Scholar
- (2014) Computational methods for risk-averse undiscounted transient Markov models. Oper. Res. 62(2):401–417.Link, Google Scholar
- (2009) Time-inconsistency of VaR and time-consistent alternatives. Finance Res. Lett. 6(1):40–46.Crossref, Google Scholar
- (2006) Dynamic monetary risk measures for bounded discrete-time processes. Electronic J. Probab. 11(3):57–106.Crossref, Google Scholar
- (2014) Algorithms for CVaR optimization in MDPs. Advances in Neural Information Processing Systems (MIT Press, Cambridge, MA), 3509–3517.Google Scholar
- (1954) On a stochastic approximation method. Ann. Math. Statist. 25(3):463–483.Crossref, Google Scholar
- (2005) Subadditivity reexamined: The case for value-at-risk. Discussion paper, 549. Financial Markets Group, London School of Economics and Political Science, London, UK.Google Scholar
- (2006) Can a coherent risk measure be too subadditive? J. Risk Insurance 75(2):365–386.Crossref, Google Scholar
- (2006) After VaR: The theory, estimation, and insurance applications of quantile-based risk measures. J. Risk Insurance 73(2):193–229.Crossref, Google Scholar
- (1997) An overview of value at risk. J. Derivatives 4(3):7–49.Crossref, Google Scholar
- (2010) Quantile estimation with adaptive importance sampling. Ann. Statist. 38(2):1244–1278.Crossref, Google Scholar
- (2010) A dynamic model for the failure replacement of aging high-voltage transformers. Energy Systems 1(1):31–59.Crossref, Google Scholar
- (2004) Learning rates for Q-learning. J. Machine Learn. Res. 5:1–25.Google Scholar
- (2004) Dynamic convex risk measures. Szegö, Risk Measures for the 21st Century (John Wiley & Sons, Chichester, UK), 227–248.Google Scholar
- (2006) Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming. Machine Learn. 65(1):167–198.Crossref, Google Scholar
- (1996) Rare-event simulation for multistage production-inventory systems. Management Sci. 42(9):1292–1307.Link, Google Scholar
- (2007) The limits of diversification when losses may be large. J. Banking and Finance 31(8):2551–2569.Crossref, Google Scholar
- (1994) On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation 6:1185–1201.Crossref, Google Scholar
- (2015) An approximate dynamic programming algorithm for monotone value functions. Oper. Res. 63(6): 1489–1511.Link, Google Scholar
- (2015) Optimal hour-ahead bidding in the real-time electricity market with battery storage using approximate dynamic programming. INFORMS J. Comput. 27(3):525–543.Link, Google Scholar
- (2009) Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4):1574–1609.Crossref, Google Scholar
- (2011) On the convergence of a stochastic approximation procedure for estimating the quantile criterion in the case of a discontinuous distribution function. Automation Remote Control 72(2):283–288.Crossref, Google Scholar
- (1999) Finite-sample convergence rates for Q-learning and indirect algorithms. Advances in Neural Information Processing Systems, 996–1002.Google Scholar
- (2011) An hour-ahead prediction model for heavy-tailed spot prices. Energy Econom. 33(6):1252–1266.Crossref, Google Scholar
- (2002) The sample average approximation method for stochastic discrete optimization. SIAM J. Optimization 12(2):479–502.Crossref, Google Scholar
- (2015) Evaluating policies in risk-averse multi-stage stochastic programming. Math. Programming 152(1):275–300.Crossref, Google Scholar
- (2003) Stochastic Approximation and Recursive Algorithms and Applications, Vol. 35 (Springer, New York).Google Scholar
- (2013) Optimizing trading decisions for hydro storage systems using approximate dual dynamic programming. Oper. Res. 61(4):810–823.Link, Google Scholar
- (2009) An optimal approximate dynamic programming algorithm for the lagged asset acquisition problem. Math. Oper. Res. 34(1):210–237.Link, Google Scholar
- (2008) Risk Management in Stochastic Integer Programming (Springer Fachmedien, Wiesbaden, Germany).Google Scholar
- (1991) Multi-stage stochastic optimization applied to energy planning. Math. Programming 52(1):359–375.Crossref, Google Scholar
- (1996) Optimization of Stochastic Models (Kluwer Academic Publishers, New York).Crossref, Google Scholar
- (2005) Measuring risk for income streams. Computational Optim. Appl. 32(1–2):161–178.Crossref, Google Scholar
- (2012) Dynamic sampling algorithms for multi-stage stochastic programs with risk aversion. Eur. J. Oper. Res. 218(2):470–483.Crossref, Google Scholar
- (2013) On solving multistage stochastic programs with coherent risk measures. Oper. Res. 61(4):957–970.Link, Google Scholar
- (2011) Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2nd ed. (John Wiley & Sons, New York).Crossref, Google Scholar
- (2004) Learning algorithms for separable approximations of discrete stochastic optimization problems. Math. Oper. Res. 29(4):814–836.Link, Google Scholar
- (2014) Markov Decision Processes: Discrete Stochastic Dynamic Programming (Wiley, New York).Google Scholar
- (2012) Making gradient descent optimal for strongly convex stochastic optimization. Proc. 29th Internat. Conf. Machine Learn. ICML ’12 (Omnipress, Madison, WI), 449–456.Google Scholar
- (2004) Dynamic coherent risk measures. Stochastic Processes and Their Appl. 112(2):185–200.Crossref, Google Scholar
- (1951) A stochastic approximation method. The Annals of Mathematical Statistics 22(3):400–407.Crossref, Google Scholar
- (1971) A convergence theorem for nonnegative almost supermartingales and some applications. Rustagi JS, ed. Optimizing Methods in Statistics (Academic Press, New York), 233–257.Google Scholar
- (2000) Optimization of conditional value-at-risk. J. Risk 2:21–41.Crossref, Google Scholar
- (2002) Conditional value-at-risk for general loss distributions. J. Banking and Finance 26(7):1443–1471.Crossref, Google Scholar
- (1999) The simulated entropy method for combinatorial and continuous optimization. Methodology Comput. Appl. Probab. 1(2):127–190.Crossref, Google Scholar
- (2014) Time consistency and risk averse dynamic decision models: Definition, interpretation and practical consequences. Eur. J. Oper. Res. 234(3):743–750.Crossref, Google Scholar
- (2010) Risk-averse dynamic programming for Markov decision processes. Math. Programming 125(2):235–261.Crossref, Google Scholar
- (2006) Conditional risk mappings. Math. Oper. Res. 31(3):544–561.Link, Google Scholar
- (2006) Optimization of convex risk functions. Math. Oper. Res. 31(3):433–452.Link, Google Scholar
- (2015) Adaptive importance sampling via stochastic convex programming. Preprint arXiv:1412.4845.Google Scholar
- (2011) Information collection on a graph. Oper. Res. 59(1):188–201.Link, Google Scholar
- (2015) A new optimal stepsize for approximate dynamic programming. IEEE Trans. Automatic Control 60(3):743–758.Crossref, Google Scholar
- (2013) No more pesky learning rates. J. Machine Learn. Res. 28(2):343–351.Google Scholar
- (2010) Distortion risk measures in portfolio optimization. Handbook of Portfolio Construction (Springer, New York), 649–673.Crossref, Google Scholar
- (2009) On a time consistency concept in risk averse multistage stochastic programming. Oper. Res. Lett. 37(3):143–147.Crossref, Google Scholar
- (2013) Risk neutral and risk averse stochastic dual dynamic programming method. Eur. J. Oper. Res. 224(2):375–391.Crossref, Google Scholar
- (1996) Generalized Markov decision processes: Dynamic programming and reinforcement learning algorithms. Technical Report, Bolyai Institute of Mathematics, Hungary.Google Scholar
- (1983) A space-efficient recursive procedure for estimating a quantile of an unknown distribution. SIAM J. Scientific Statist. Comput. 4(4):706–711.Crossref, Google Scholar
- (1994) Asynchronous stochastic approximation and Q-learning. Machine Learn. 16(3):185–202.Crossref, Google Scholar
- (1992) Q-learning. Machine Learn. 8(3–4):279–292.Crossref, Google Scholar
- (1980) Multi-armed bandits and the Gittins Index. J. Roy. Statist. Soc. Ser. B (Methodological) 42(2):143–149.Google Scholar
- (2014) A stochastic dynamic programming model for co-optimization of distributed energy storage. Energy Systems 5(3):475–505.Crossref, Google Scholar

