Information Relaxation Bounds for Infinite Horizon Markov Decision Processes
Published Online:2 Aug 2017https://doi.org/10.1287/opre.2017.1631
References
- (2008) Relaxations of weakly coupled stochastic dynamic programs. Oper. Res. 56(3):712–727.Link, Google Scholar
- (2004) Primal-dual simulation algorithm for pricing multidimensional American options. Management Sci. 50(9):1222–1234.Link, Google Scholar
- (2003) Whittle’s index policy for a multi-class queueing system with convex holding costs. Math. Methods Oper. Res. 57(1):21–39.Crossref, Google Scholar
- (1996) Stochastic Optimal Control: The Discrete-Time Case (Athena Scientific, Belmont, MA).Google Scholar
- (2011) Dynamic portfolio optimization with transaction costs: Heuristics and dual bounds. Management Sci. 57(10):1752–1770.Link, Google Scholar
- (2014) Information relaxations, duality, and convex stochastic dynamic programs. Oper. Res. 62(6):1394–1415.Link, Google Scholar
- (2010) Information relaxations and duality in stochastic dynamic programs. Oper. Res. 58(4, Part 1):785–801.Link, Google Scholar
- (2007) Additive and multiplicative duals for American option pricing. Finance Stochastics 11(2):153–179.Crossref, Google Scholar
- (1961) Queues (Methuen, London).Google Scholar
- (2003) The linear programming approach to approximate dynamic programming. Oper. Res. 51(6):850–865.Link, Google Scholar
- (2011) Bounds for Markov decision processes. Lewis FL, Liu D, eds. Reinforcement Learning and Approximate Dynamic Programming for Feedback Control (IEEE Press, Piscataway, NJ), 452–473.Google Scholar
- (2012) Pathwise optimization for optimal stopping problems. Management Sci. 58(12):2292–2308.Link, Google Scholar
- (2011) Integrated optimization of procurement, processing, and trade of commodities. Oper. Res. 59(6):1369–1381.Link, Google Scholar
- (2011) Total expected discounted reward MDPs: Existence of optimal policies. Cochran JJ, ed. Wiley Encyclopedia of Operations Research and Management Science (John Wiley & Sons, Hoboken, NJ).Crossref, Google Scholar
- (2015) On the reduction of total-cost and average-cost MDPs to discounted MDPs. Preprint arXiv:1507.00664.Google Scholar
- (2004) Monte Carlo Methods in Financial Engineering (Springer-Verlag, New York).Crossref, Google Scholar
- (1999) A single-item inventory model for a nonstationary demand process. Manufacturing Service Oper. Management 1(1):50–61.Link, Google Scholar
- (1975) Dynamic scheduling of a multiclass queue: Discount optimality. Oper. Res. 23(2):270–281.Link, Google Scholar
- (2004) Pricing American options: A duality approach. Oper. Res. 52(2):258–270.Link, Google Scholar
- (2014a) Dynamic portfolio execution and information relaxations. SIAM J. Financial Math. 5(1):316–359.Crossref, Google Scholar
- (2014b) Information relaxations and dynamic zero-sum games. Working paper, Columbia University, New York.Google Scholar
- (2016) Tax-aware dynamic asset allocation. Oper. Res. 64(4):849–866.Link, Google Scholar
- (2003) A Lagrangian decomposition approach to weakly coupled dynamic optimization problems and its applications. Ph.D. thesis, Operations Research Center, Massachusetts Institute of Technology, Cambridge.Google Scholar
- (2002) Approximating martingales for variance reduction in Markov process simulation. Math. Oper. Res. 27(2):253–271.Link, Google Scholar
- (1975) Optimality of myopic inventory policies for certain dependent demand processes. Management Sci. 21(11):1303–1307.Link, Google Scholar
- (2016) Robust multiarmed bandit problems. Management Sci. 62(1):264–285.Link, Google Scholar
- (2013) Accuracy verification for numerical solutions of equilibrium models. Working paper, Massachusetts Institute of Technology, Cambridge.Google Scholar
- (2010) An approximate dynamic programming approach to benchmark practice-based heuristics for natural gas storage valuation. Oper. Res. 58(3):564–582.Link, Google Scholar
- (2006) Inventory planning with forecast updates: Approximate solutions and cost error bounds. Oper. Res. 54(6):1079–1097.Link, Google Scholar
- (2006) Marginal productivity index policies for scheduling a multiclass delay-/loss-sensitive queue. Queueing Syst. 54(4):281–312.Crossref, Google Scholar
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming (John Wiley & Sons, New York).Crossref, Google Scholar
- (2002) Monte Carlo valuation of American options. Math. Finance 12(3):271–286.Crossref, Google Scholar
- (2007) Pathwise stochastic optimal control. SIAM J. Control Optim. 46(3):1116–1132.Crossref, Google Scholar
- (1987) Real and Complex Analysis (McGraw-Hill, New York).Google Scholar
- (1995) Dynamic scheduling with convex delay costs: The generalized cμ rule. Ann. Appl. Probab. 5(3):809–833.Crossref, Google Scholar
- (1996) Scheduling a make-to-stock queue: Index policies and hedging points. Oper. Res. 44(4):634–647.Link, Google Scholar
- (1965) Optimal policy for a multi-product, dynamic, nonstationary inventory system. Management Sci. 12(3):206–222.Link, Google Scholar
- (1969) Discrete dynamic programming with sensitive discount optimality criteria. Ann. Math. Statist. 40(5):1635–1660.Crossref, Google Scholar
- (1988) Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A:287–298.Crossref, Google Scholar
- (1991) Probability with Martingales (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
- (2014) Weakly coupled dynamic program: Information and Lagrangian relaxations. Working paper, Georgia Institute of Technology, Atlanta.Google Scholar

