Deep Learning for High-Dimensional Continuous-Time Stochastic Optimal Control Without Explicit Solution

