Stochastic Recursive Inclusions in Two Timescales with Nonadditive Iterate-Dependent Markov Noise

Published Online: https://doi.org/10.1287/moor.2019.1037

References

  • [1] Andrieu C, Moulines É, Priouret P (2005) Stability of stochastic approximation under verifiable conditions. SIAM J. Control Optim. 44(1):283–312.
  • [2] Aubin JP, Cellina A (2012) Differential Inclusions: Set-Valued Maps and Viability Theory, vol. 264 (Springer Science & Business Media, New York).
  • [3] Benaim M (1996) A dynamical system approach to stochastic approximations. SIAM J. Control Optim. 34(2):437–472.
  • [4] Benaïm M, Hofbauer J, Sorin S (2005) Stochastic approximations and differential inclusions. SIAM J. Control Optim. 44(1):328–348.
  • [5] Benzi M, Golub GH, Liesen J (2005) Numerical solution of saddle point problems. Acta Numerica 14:1–137.
  • [6] Bertsekas DP (2009) Convex Optimization Theory (Athena Scientific, Belmont, MA).
  • [7] Bhatnagar S, Prashanth L (2015) Simultaneous perturbation Newton algorithms for simulation optimization. J. Optim. Theory Appl. 164(2):621–643.
  • [8] Borkar VS (1989) Optimal Control of Diffusion Processes. Pitman Lecture Notes in Mathematics, vol. 203 (Longman Scientific and Technical, Harlow, UK).
  • [9] Borkar VS (1997) Stochastic approximation with two time scales. Systems Control Lett. 29(5):291–294.
  • [10] Borkar VS (2006) Stochastic approximation with controlled Markov noise. Systems Control Lett. 55(2):139–145.
  • [11] Borkar VS (2008) Stochastic Approximation: A Dynamical Systems Viewpoint (Cambridge University Press, Cambridge, UK).
  • [12] Borkar VS (2012) Probability Theory: An Advanced Course (Springer Science & Business Media, New York).
  • [13] Borkar VS, Meyn SP (2000) The ODE method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control Optim. 38(2):447–469.
  • [14] Chen HF, Guo L, Gao AJ (1987) Convergence and robustness of the Robbins-Monro algorithm truncated at randomly varying bounds. Stochastic Processes Their Appl. 27:217–231.
  • [15] Chen HF, Yunmin Z (1986) Stochastic approximation procedures with randomly varying truncations. Sci. China Ser. A: Math., Phys., Astron., Tech. Sci. 29(9):914–926.
  • [16] Derevitskii D, Fradkov AL (1974) Two models analyzing the dynamics of adaptation algorithms. Avtomatika i Telemekhanika (1):67–75.
  • [17] Duflo M (2013) Random Iterative Models, vol. 34 (Springer Science & Business Media, New York).
  • [18] Fort G, Moulines E, Schreck A, Vihola M (2016) Convergence of Markovian stochastic approximation with discontinuous dynamics. SIAM J. Control Optim. 54(2):866–893.
  • [19] Iusem A, Jofré A, Oliveira RI, Thompson P (2017) Extragradient method with variance reduction for stochastic variational inequalities. SIAM J. Optim. 27(2):686–724.
  • [20] Jiang H, Xu H (2008) Stochastic approximation approaches to the stochastic variational inequality problem. IEEE Trans. Automatic Control 53(6):1462–1475.
  • [21] Karmakar P, Bhatnagar S (2017) Two time-scale stochastic approximation with controlled Markov noise and off-policy temporal-difference learning. Math. Oper. Res. 43(1):130–151.
  • [22] Kloeden P, Kozyakin V (2000) The inflation of attractors and their discretization: The autonomous case. Nonlinear Anal.: Theory Methods Appl. 40(1–8):333–344.
  • [23] Koshal J, Nedic A, Shanbhag UV (2013) Regularized iterative stochastic approximation methods for stochastic variational inequality problems. IEEE Trans. Automatic Control 58(3):594–609.
  • [24] Kumaresan S (2005) Topology of Metric Spaces (Alpha Science International, Oxford, UK).
  • [25] Kushner H, Yin G (2003) Stochastic Approximation and Recursive Algorithms and Applications (Springer, New York).
  • [26] Li D, Zhang X (2002) On dynamical properties of general dynamical systems and differential inclusions. J. Math. Anal. Appl. 274(2):705–724.
  • [27] Li S, Ogura Y, Kreinovich V (2013) Limit Theorems and Applications of Set-Valued and Fuzzy Set-Valued Random Variables, vol. 43 (Springer Science & Business Media, New York).
  • [28] Metivier M, Priouret P (1984) Applications of a Kushner and Clark lemma to general classes of stochastic algorithms. IEEE Trans. Inform. Theory 30(2):140–151.
  • [29] Meyn SP, Tweedie RL (2012) Markov Chains and Stochastic Stability (Springer Science & Business Media, New York).
  • [30] Milgrom P, Segal I (2002) Envelope theorems for arbitrary choice sets. Econometrica 70(2):583–601.
  • [31] Nagurney A, Zhang D (2012) Projected Dynamical Systems and Variational Inequalities with Applications, vol. 2 (Kluwer Academic Publishers, Norwell, MA).
  • [32] Nedić A, Ozdaglar A (2009) Subgradient methods for saddle-point problems. J. Optim. Theory Appl. 142(1):205–228.
  • [33] Parthasarathy KR (1967) Probability Measures on Metric Spaces, vol. 352 (American Mathematical Society, Providence, RI).
  • [34] Perkins S, Leslie DS (2012) Asynchronous stochastic approximation with differential inclusions. Stochastic Systems 2(2):409–446.
  • [35] Ramaswamy A, Bhatnagar S (2016) A generalization of the Borkar-Meyn theorem for stochastic recursive inclusions. Math. Oper. Res. 42(3):648–661.
  • [36] Ramaswamy A, Bhatnagar S (2016) Stochastic recursive inclusion in two timescales with an application to the Lagrangian dual problem. Stochastics 88(8):1173–1187.
  • [37] Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Statist. 22(3):400–407.
  • [38] Rudin W (1976) Principles of Mathematical Analysis. International Series in Pure and Applied Mathematics (McGraw-Hill, New York).
  • [39] Spall JC (1992) Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Automatic Control 37(3):332–341.
  • [40] Sutton RS, Maei HR, Precup D, Bhatnagar S, Silver D, Szepesvári C, Wiewiora E (2009) Fast gradient-descent methods for temporal-difference learning with linear function approximation. Proc. 26th Annual Internat. Conf. Machine Learning (ACM, New York), 993–1000.
  • [41] Yaji VG, Bhatnagar S (2018) Stochastic recursive inclusions with non-additive iterate-dependent Markov noise. Stochastics 90(3):330–363.
  • [42] Yaji VG, Bhatnagar S (2020) Analysis of stochastic approximation schemes with set-valued maps in the absence of a stability guarantee and their stabilization. IEEE Trans. Automatic Control 65(3):1100–1115.