Provably Near-Optimal Approximation Schemes for Implicit Stochastic and Sample-Based Dynamic Programs

Published Online:https://doi.org/10.1287/ijoc.2019.0926

References

  • Arrow K, Harris T, Marschak J (1951) Optimal inventory policy. Econometrica 19:250–272.CrossrefGoogle Scholar
  • Bellman R, Dreyfus S (1962) Applied Dynamic Programming (Princeton University Press, Princeton, NJ).CrossrefGoogle Scholar
  • Bernstein S (1946) The Theory of Probabilities (Gastehizdat, Moscow).Google Scholar
  • Bertsekas D (2005) Dynamic Programming and Optimal Control, vol. 1, 3rd ed. (Athena Scientific, Belmont, MA).Google Scholar
  • Boutilier C, Dearden C, Goldszmidt M (1995) Exploiting structure in policy construction. Proc. 14th Internat. Joint Conf. Artificial Intelligence, vol. 14 (ACM, New York), 1104–1113.Google Scholar
  • Cheung W, Simchi-Levi D (2019) Sampling-based approximation schemes for capacitated stochastic inventory control models. Math. Oper. Res. 44(2):668–692.LinkGoogle Scholar
  • Devroye L, Lugosi G (2001) Combinatorial Methods in Density Estimation (Springer, New York).CrossrefGoogle Scholar
  • Halman N (2015) Approximating convex functions via non-convex oracles under the relative noise model. Discrete Optim. 16:1–16.CrossrefGoogle Scholar
  • Halman N, Nannicini G (2019) Fully polynomial time (Σ, Π)-approximation schemes for continuous nonlinear newsvendor and continuous stochastic dynamic programs. Accessed August 1, 2019, http://www.optimization-online.org/DB_HTML/2016/11/5726.html.Google Scholar
  • Halman N, Nannicini G, Orlin J ( 2015) A computationally efficient FPTAS for convex stochastic dynamic programs. SIAM J. Optim. 25(1):317–350.CrossrefGoogle Scholar
  • Halman N, Orlin JB, Simchi-Levi D (2012) Approximating the nonlinear newsvendor and single-item stochastic lot-sizing problems when data are given by an oracle. Oper. Res. 60(2):429–446.LinkGoogle Scholar
  • Halman N, Klabjan D, Li CL, Orlin J, Simchi-Levi D (2014) Fully polynomial time approximation schemes for stochastic dynamic programs. SIAM J. Discrete Math. 28(4):1725–1796.CrossrefGoogle Scholar
  • Halman N, Klabjan D, Mostagir M, Orlin J, Simchi-Levi D (2009) A fully polynomial time approximation scheme for single-item stochastic inventory control with discrete demand. Math. Oper. Res. 34(3):674–685.LinkGoogle Scholar
  • Kakade S (2003) On the sample complexity of reinforcement learning. PhD thesis, University College London, London.Google Scholar
  • Kearns M, Singh S (1999) Finite-sample convergence rates for q-learning and indirect algorithms. Advances Neural Inform. Processing Systems, vol. 11 (ACM, New York), 996–1002.Google Scholar
  • Kearns M, Mansour Y, Ng A (2002) A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Machine Learn. 49(2–3):193–208.CrossrefGoogle Scholar
  • Khouja M (1999) The single-period (news-vendor) problem: Literature review and suggestions for future research. Omega 27(5):537–553.CrossrefGoogle Scholar
  • Koller D, Parr R (1999) Computing factored value functions for policies in structured MDPs. Proc. 16th Internat. Joint Conf. Artificial Intelligence, vol. 2 (Morgan Kaufmann, San Francisco), 1332–1339.Google Scholar
  • Kovalyov M, Kubiak W (2012) A generic FPTAS for partition type optimisation problems. Internat. J. Planning Scheduling 1(3):209–233.Google Scholar
  • Levi R, Perakis G, Uichanco J (2015) The data-driven newsvendor problem: New bounds and insights. Oper. Res. 63(6):1294–1306.LinkGoogle Scholar
  • Levi R, Roundy R, Shmoys D (2007) Provably near-optimal sampling-based policies for stochastic inventory control models. Math. Oper. Res. 32(4):821–839.LinkGoogle Scholar
  • Meuleau N, Hauskrecht M, Kim K, Peshkin L, Kaelbling L, Dean T, Boutilier C (1998) Solving very large weakly coupled Markov decision processes. Proc. 15th Internat. Joint Conf. Artificial Intelligence (American Association for Artificial Intelligence, Menlo Park, CA), 165–172.Google Scholar
  • Mittal S, Schultz A (2013) A general framework for designing approximation schemes for combinatorial optimization problems with many objectives combined into one. Oper. Res. 61(2):386–397.LinkGoogle Scholar
  • Porteus E (1990) Stochastic inventory theory. Heyman D, Sobel M, eds. Handbook in OR & MS, vol. 2 (North-Holland, Amsterdam), 605–652.Google Scholar
  • Pruhs K, Woeginger G (2007) Approximation schemes for a class of subset selection problems. Theoret. Comput. Sci. 382(2):151–156.CrossrefGoogle Scholar
  • Qin Y, Wang R, Vakharia A, Chen Y, Seref M (2011) The newsvendor problem: Review and directions for future research. Eur. J. Oper. Res. 213(2):361–374.CrossrefGoogle Scholar
  • Shapiro A (2003) Monte Carlo sampling methods. Ruszczyński A, Shapiro A, eds. Stochastic Programming, Handbooks in Operations Research and Management Science, vol. 10 (Elsevier, Amsterdam), 353–425.CrossrefGoogle Scholar
  • Shmoys D, Swamy C (2006) An approximation scheme for stochastic linear programming and its application to stochastic integer programs. J. ACM 53(6):978–1012.CrossrefGoogle Scholar
  • Simchi-Levi D, Chen X, Bramel J (2014) The Logic of Logistics: Theory, Algorithms, and Applications for Logistics Management, 3rd ed. (Springer-Verlag, New York).CrossrefGoogle Scholar
  • Woeginger G (2000) When does a dynamic programming formulation guarantee the existence of a fully polynomial time approximation scheme (FPTAS)? INFORMS J. Comput. 12(1):57–74.LinkGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.