A Learning Algorithm for Risk-Sensitive Cost
Published Online: 17 Oct 2008
https://doi.org/10.1287/moor.1080.0324

