The Simplex Method is Strongly Polynomial for Deterministic Markov Decision Processes

Published Online:https://doi.org/10.1287/moor.2014.0699

References

  • Bellman RE (1957) Dynamic Programming (Princeton University Press, Princeton, NJ).Google Scholar
  • Bertsekas DP (1996) Dynamic Programming and Optimal Control (Athena Scientific, Nashua, NH).Google Scholar
  • de Ghellinck GT (1960) Les problèmes de décisions sequentielles. Cahiers du Centre d’Etudes de Recherche Opérationelle 2:161–179.Google Scholar
  • D’Epenoux F (1960) Sur un problème de production et de stockage dans l’aléatoire. Revue Française de Recherche Opérationelle 14:3–16.Google Scholar
  • Fearnley J (2010) Exponential lower bounds for policy iteration. Automata, Languages and Programming, Lecture Notes in Computer Science, Vol. 6199 (Springer, Berlin), 551–562.CrossrefGoogle Scholar
  • Friedmann O (2009) An exponential lower bound for the parity game strategy improvement algorithm as we know it. Pitts A, ed. Proc. 24th ACM/IEEE Sympos. Logic in Computer Science, LICS ’09 (IEEE, Piscataway, NJ), 145–156.CrossrefGoogle Scholar
  • Friedmann O (2011) A subexponential lower bound for Zadeh’s pivoting rule for solving linear programs and games. Integer Programming and Combinatoral Optimization, Lecture Notes in Computer Science, Vol. 6655 (Springer, Berlin), 192–206.CrossrefGoogle Scholar
  • Friedmann O, Hansen TD, Zwick U (2011) Subexponential lower bounds for randomized pivoting rules for the simplex algorithm. Fortnow L, Vadhan SP, eds. Proc. 43rd ACM Sympos. Theory of Comput., STOC ’11 (ACM, New York), 283–292.CrossrefGoogle Scholar
  • Hansen TD, Kaplan H, Zwick U (2014) Dantzig’s pivoting rule for shortest paths, deterministic MDPs, and minimum cost to time ratio cycles. Chekuri C, ed. Proc. 25th ACM-SIAM Sympos. Discrete Algorithms, SODA ’14 (SIAM, Philadelphia), 847–860.CrossrefGoogle Scholar
  • Hansen TD, Miltersen PB, Zwick U (2013) Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor. J. ACM 60(1):1:1–1:16.CrossrefGoogle Scholar
  • Hansen TD, Zwick U (2010) Lower bounds for Howard’s algorithm for finding minimum mean-cost cycles. Cheong O, Chwa K-Y, Park K, eds. Algorithms and Computation, Lecture Notes in Computer Science, Vol. 6506 (Springer, Berlin), 415–426.CrossrefGoogle Scholar
  • Hochbaum DS, Naor J(S) (1994) Simple and fast algorithms for linear and integer programs with two variables per inequality. SIAM J. Comput. 23(6):1179–1192.CrossrefGoogle Scholar
  • Howard R (1960) Dynamic Programming and Markov Decision Processes (MIT, Cambridge, MA).Google Scholar
  • Littman ML, Dean TL, Kaelbling LP (1995) On the complexity of solving Markov decision problems. Besnard P, Hanks S, eds. Proc. 11th Uncertainty in Artificial Intelligence, UAI ’95 (Morgan Kaufmann, San Francisco), 394–402.Google Scholar
  • Madani O (2002) On policy iteration as a Newton’s method and polynomial policy iteration algorithms. Dechter R, Sutton RS, eds. Proc. 18th Nat. Conf. Artificial Intelligence, AAAI ’02 (AAAI Press, Palo Alto, CA), 273–278.Google Scholar
  • Madani O, Thorup M, Zwick U (2010) Discounted deterministic Markov decision processes and discounted all-pairs shortest paths. ACM Trans. Algorithms 6(2):33:1–33:25.CrossrefGoogle Scholar
  • Manne AS (1960) Linear programming and sequential decisions. Management Sci. 6(3):259–267.LinkGoogle Scholar
  • Mansour Y, Singh S (1999) On the complexity of policy iteration. Laskey KB, Prade H, eds. Proc. 15th Uncertainty in Artificial Intelligence, UAI ’99 (Morgan Kaufmann, San Francisco), 401–408.Google Scholar
  • Melekopoglou M, Condon A (1994) On the complexity of the policy improvement algorithm for Markov decision processes. ORSA J. Comput. 6(2):188–192.LinkGoogle Scholar
  • Papadimitriou C, Tsitsiklis JN (1987) The complexity of Markov decision processes. Math. Oper. Res. 12(3):441–450.LinkGoogle Scholar
  • Post I, Ye Y (2013) The simplex method is strongly polynomial for deterministic Markov decision processes. Khanna S, ed. Proc. 24th ACM-SIAM Sympos. Discrete Algorithms, SODA ’13 (SIAM, Philadelphia), 1465–1473.CrossrefGoogle Scholar
  • Puterman ML (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming (John Wiley & Sons, New York).CrossrefGoogle Scholar
  • Ye Y (2005) A new complexity result on solving the Markov decision problem. Math. Oper. Res. 30(3):733–749.LinkGoogle Scholar
  • Ye Y (2011) The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Math. Oper. Res. 36(4):593–603.LinkGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.