The Simplex Method is Strongly Polynomial for Deterministic Markov Decision Processes
Published Online:10 Feb 2015https://doi.org/10.1287/moor.2014.0699
References
- (1957) Dynamic Programming (Princeton University Press, Princeton, NJ).Google Scholar
- (1996) Dynamic Programming and Optimal Control (Athena Scientific, Nashua, NH).Google Scholar
- (1960) Les problèmes de décisions sequentielles. Cahiers du Centre d’Etudes de Recherche Opérationelle 2:161–179.Google Scholar
- (1960) Sur un problème de production et de stockage dans l’aléatoire. Revue Française de Recherche Opérationelle 14:3–16.Google Scholar
- (2010) Exponential lower bounds for policy iteration. Automata, Languages and Programming, Lecture Notes in Computer Science, Vol. 6199 (Springer, Berlin), 551–562.Crossref, Google Scholar
- (2009) An exponential lower bound for the parity game strategy improvement algorithm as we know it. Pitts A, ed. Proc. 24th ACM/IEEE Sympos. Logic in Computer Science, LICS ’09 (IEEE, Piscataway, NJ), 145–156.Crossref, Google Scholar
- (2011) A subexponential lower bound for Zadeh’s pivoting rule for solving linear programs and games. Integer Programming and Combinatoral Optimization, Lecture Notes in Computer Science, Vol. 6655 (Springer, Berlin), 192–206.Crossref, Google Scholar
- (2011) Subexponential lower bounds for randomized pivoting rules for the simplex algorithm. Fortnow L, Vadhan SP, eds. Proc. 43rd ACM Sympos. Theory of Comput., STOC ’11 (ACM, New York), 283–292.Crossref, Google Scholar
- (2014) Dantzig’s pivoting rule for shortest paths, deterministic MDPs, and minimum cost to time ratio cycles. Chekuri C, ed. Proc. 25th ACM-SIAM Sympos. Discrete Algorithms, SODA ’14 (SIAM, Philadelphia), 847–860.Crossref, Google Scholar
- (2013) Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor. J. ACM 60(1):1:1–1:16.Crossref, Google Scholar
- (2010) Lower bounds for Howard’s algorithm for finding minimum mean-cost cycles. Cheong O, Chwa K-Y, Park K, eds. Algorithms and Computation, Lecture Notes in Computer Science, Vol. 6506 (Springer, Berlin), 415–426.Crossref, Google Scholar
- (1994) Simple and fast algorithms for linear and integer programs with two variables per inequality. SIAM J. Comput. 23(6):1179–1192.Crossref, Google Scholar
- (1960) Dynamic Programming and Markov Decision Processes (MIT, Cambridge, MA).Google Scholar
- (1995) On the complexity of solving Markov decision problems. Besnard P, Hanks S, eds. Proc. 11th Uncertainty in Artificial Intelligence, UAI ’95 (Morgan Kaufmann, San Francisco), 394–402.Google Scholar
- (2002) On policy iteration as a Newton’s method and polynomial policy iteration algorithms. Dechter R, Sutton RS, eds. Proc. 18th Nat. Conf. Artificial Intelligence, AAAI ’02 (AAAI Press, Palo Alto, CA), 273–278.Google Scholar
- (2010) Discounted deterministic Markov decision processes and discounted all-pairs shortest paths. ACM Trans. Algorithms 6(2):33:1–33:25.Crossref, Google Scholar
- (1960) Linear programming and sequential decisions. Management Sci. 6(3):259–267.Link, Google Scholar
- (1999) On the complexity of policy iteration. Laskey KB, Prade H, eds. Proc. 15th Uncertainty in Artificial Intelligence, UAI ’99 (Morgan Kaufmann, San Francisco), 401–408.Google Scholar
- (1994) On the complexity of the policy improvement algorithm for Markov decision processes. ORSA J. Comput. 6(2):188–192.Link, Google Scholar
- (1987) The complexity of Markov decision processes. Math. Oper. Res. 12(3):441–450.Link, Google Scholar
- (2013) The simplex method is strongly polynomial for deterministic Markov decision processes. Khanna S, ed. Proc. 24th ACM-SIAM Sympos. Discrete Algorithms, SODA ’13 (SIAM, Philadelphia), 1465–1473.Crossref, Google Scholar
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming (John Wiley & Sons, New York).Crossref, Google Scholar
- (2005) A new complexity result on solving the Markov decision problem. Math. Oper. Res. 30(3):733–749.Link, Google Scholar
- (2011) The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Math. Oper. Res. 36(4):593–603.Link, Google Scholar

