On Near Optimality of the Set of Finite-State Controllers for Average Cost POMDP
Published Online: 1 Feb 2008
https://doi.org/10.1287/moor.1070.0279

