Information Relaxation Bounds for Infinite Horizon Markov Decision Processes

Published Online:https://doi.org/10.1287/opre.2017.1631

References

  • Adelman D, Mersereau AJ (2008) Relaxations of weakly coupled stochastic dynamic programs. Oper. Res. 56(3):712–727.LinkGoogle Scholar
  • Andersen L, Broadie M (2004) Primal-dual simulation algorithm for pricing multidimensional American options. Management Sci. 50(9):1222–1234.LinkGoogle Scholar
  • Ansell PS, Glazebrook KD, Niño-Mora J, O’Keeffe M (2003) Whittle’s index policy for a multi-class queueing system with convex holding costs. Math. Methods Oper. Res. 57(1):21–39.CrossrefGoogle Scholar
  • Bertsekas DP, Shreve SE (1996) Stochastic Optimal Control: The Discrete-Time Case (Athena Scientific, Belmont, MA).Google Scholar
  • Brown DB, Smith JE (2011) Dynamic portfolio optimization with transaction costs: Heuristics and dual bounds. Management Sci. 57(10):1752–1770.LinkGoogle Scholar
  • Brown DB, Smith JE (2014) Information relaxations, duality, and convex stochastic dynamic programs. Oper. Res. 62(6):1394–1415.LinkGoogle Scholar
  • Brown DB, Smith JE, Sun P (2010) Information relaxations and duality in stochastic dynamic programs. Oper. Res. 58(4, Part 1):785–801.LinkGoogle Scholar
  • Chen N, Glasserman P (2007) Additive and multiplicative duals for American option pricing. Finance Stochastics 11(2):153–179.CrossrefGoogle Scholar
  • Cox DR, Smith WL (1961) Queues (Methuen, London).Google Scholar
  • de Farias DP, Van Roy B (2003) The linear programming approach to approximate dynamic programming. Oper. Res. 51(6):850–865.LinkGoogle Scholar
  • Desai VV, Farias VF, Moallemi CC (2011) Bounds for Markov decision processes. Lewis FL, Liu D, eds. Reinforcement Learning and Approximate Dynamic Programming for Feedback Control (IEEE Press, Piscataway, NJ), 452–473.Google Scholar
  • Desai VV, Farias VF, Moallemi CC (2012) Pathwise optimization for optimal stopping problems. Management Sci. 58(12):2292–2308.LinkGoogle Scholar
  • Devalkar S, Anupindi R, Sinha A (2011) Integrated optimization of procurement, processing, and trade of commodities. Oper. Res. 59(6):1369–1381.LinkGoogle Scholar
  • Feinberg EA (2011) Total expected discounted reward MDPs: Existence of optimal policies. Cochran JJ, ed. Wiley Encyclopedia of Operations Research and Management Science (John Wiley & Sons, Hoboken, NJ).CrossrefGoogle Scholar
  • Feinberg EA, Huang J (2015) On the reduction of total-cost and average-cost MDPs to discounted MDPs. Preprint arXiv:1507.00664.Google Scholar
  • Glasserman P (2004) Monte Carlo Methods in Financial Engineering (Springer-Verlag, New York).CrossrefGoogle Scholar
  • Graves SC (1999) A single-item inventory model for a nonstationary demand process. Manufacturing Service Oper. Management 1(1):50–61.LinkGoogle Scholar
  • Harrison JM (1975) Dynamic scheduling of a multiclass queue: Discount optimality. Oper. Res. 23(2):270–281.LinkGoogle Scholar
  • Haugh MB, Kogan L (2004) Pricing American options: A duality approach. Oper. Res. 52(2):258–270.LinkGoogle Scholar
  • Haugh MB, Wang C (2014a) Dynamic portfolio execution and information relaxations. SIAM J. Financial Math. 5(1):316–359.CrossrefGoogle Scholar
  • Haugh MB, Wang C (2014b) Information relaxations and dynamic zero-sum games. Working paper, Columbia University, New York.Google Scholar
  • Haugh MB, Iyengar G, Wang C (2016) Tax-aware dynamic asset allocation. Oper. Res. 64(4):849–866.LinkGoogle Scholar
  • Hawkins J (2003) A Lagrangian decomposition approach to weakly coupled dynamic optimization problems and its applications. Ph.D. thesis, Operations Research Center, Massachusetts Institute of Technology, Cambridge.Google Scholar
  • Henderson SG, Glynn PW (2002) Approximating martingales for variance reduction in Markov process simulation. Math. Oper. Res. 27(2):253–271.LinkGoogle Scholar
  • Johnson G, Thompson H (1975) Optimality of myopic inventory policies for certain dependent demand processes. Management Sci. 21(11):1303–1307.LinkGoogle Scholar
  • Kim MJ, Lim AEB (2016) Robust multiarmed bandit problems. Management Sci. 62(1):264–285.LinkGoogle Scholar
  • Kogan L, Mitra I (2013) Accuracy verification for numerical solutions of equilibrium models. Working paper, Massachusetts Institute of Technology, Cambridge.Google Scholar
  • Lai G, Margot F, Secomandi N (2010) An approximate dynamic programming approach to benchmark practice-based heuristics for natural gas storage valuation. Oper. Res. 58(3):564–582.LinkGoogle Scholar
  • Lu X, Song J-S, Regan A (2006) Inventory planning with forecast updates: Approximate solutions and cost error bounds. Oper. Res. 54(6):1079–1097.LinkGoogle Scholar
  • Niño-Mora J (2006) Marginal productivity index policies for scheduling a multiclass delay-/loss-sensitive queue. Queueing Syst. 54(4):281–312.CrossrefGoogle Scholar
  • Puterman ML (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming (John Wiley & Sons, New York).CrossrefGoogle Scholar
  • Rogers LCG (2002) Monte Carlo valuation of American options. Math. Finance 12(3):271–286.CrossrefGoogle Scholar
  • Rogers LCG (2007) Pathwise stochastic optimal control. SIAM J. Control Optim. 46(3):1116–1132.CrossrefGoogle Scholar
  • Rudin W (1987) Real and Complex Analysis (McGraw-Hill, New York).Google Scholar
  • Van Mieghem JA (1995) Dynamic scheduling with convex delay costs: The generalized cμ rule. Ann. Appl. Probab. 5(3):809–833.CrossrefGoogle Scholar
  • Veatch MH, Wein LM (1996) Scheduling a make-to-stock queue: Index policies and hedging points. Oper. Res. 44(4):634–647.LinkGoogle Scholar
  • Veinott AF Jr (1965) Optimal policy for a multi-product, dynamic, nonstationary inventory system. Management Sci. 12(3):206–222.LinkGoogle Scholar
  • Veinott AF Jr (1969) Discrete dynamic programming with sensitive discount optimality criteria. Ann. Math. Statist. 40(5):1635–1660.CrossrefGoogle Scholar
  • Whittle P (1988) Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A:287–298.CrossrefGoogle Scholar
  • Williams D (1991) Probability with Martingales (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Ye F, Zhu H, Zhou E (2014) Weakly coupled dynamic program: Information and Lagrangian relaxations. Working paper, Georgia Institute of Technology, Atlanta.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.