Self-Adapting Network Relaxations for Weakly Coupled Markov Decision Processes

Published Online: https://doi.org/10.1287/mnsc.2022.01108

References

  • Adelman D (2004) A price-directed approach to stochastic inventory/routing. Oper. Res. 52(4):499–514.
  • Adelman D, Mersereau AJ (2008) Relaxations of weakly coupled stochastic dynamic programs. Oper. Res. 56(3):712–727.
  • Balseiro SR, Gur Y (2019) Learning in repeated auctions with budgets: Regret minimization and equilibrium. Management Sci. 65(9):3952–3968.
  • Bertsekas DP (2012) Dynamic Programming and Optimal Control: Approximate Dynamic Programming, 4th ed., vol. 2 (Athena Scientific, Nashua, NH).
  • Bertsekas DP (2017) Dynamic Programming and Optimal Control, 4th ed., vol. 1 (Athena Scientific, Nashua, NH).
  • Bertsimas D, Mersereau AJ (2007) A learning approach for interactive marketing to a customer segment. Oper. Res. 55(6):1120–1135.
  • Bertsimas D, Mišić VV (2016) Decomposable Markov decision processes: A fluid optimization approach. Oper. Res. 64(6):1537–1555.
  • Brown DB, Smith JE (2020) Index policies and performance bounds for dynamic selection problems. Management Sci. 66(7):3029–3050.
  • Brown DB, Zhang J (2022) Technical note—On the strength of relaxations of weakly coupled stochastic dynamic programs. Oper. Res. 71(6):2374–2389.
  • Caro F, Gallien J (2007) Dynamic assortment with demand learning for seasonal consumer goods. Management Sci. 53(2):276–292.
  • D’Aeth JC, Ghosal S, Grimm F, Haw D, Koca E, Lau K, Liu H, et al. (2023) Optimal hospital care scheduling during the SARS-CoV-2 pandemic. Management Sci. 69(10):5923–5947.
  • de Farias DP, Van Roy B (2003) The linear programming approach to approximate dynamic programming. Oper. Res. 51(6):850–865.
  • de Jonge B, Scarf PA (2020) A review on maintenance optimization. Eur. J. Oper. Res. 285(3):805–824.
  • Gurobi Optimization LLC (2022) Gurobi optimizer reference manual. https://www.gurobi.com/.
  • Hawkins JT (2003) A Langrangian decomposition approach to weakly coupled dynamic optimization problems and its applications. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA.
  • Hu W, Frazier P (2017) An asymptotically optimal index policy for finite-horizon restless bandits. Preprint, submitted July 1, https://arxiv.org/abs/1707.00205.
  • Kleywegt AJ, Nori VS, Savelsbergh MW (2004) Dynamic programming approximations for a stochastic inventory routing problem. Transportation Sci. 38(1):42–70.
  • Knuth D (2011) The Art of Computer Programming, vol. 4A. Combinatorial Algorithms, Part 1 (Addison-Wesley, Upper Saddle River, NJ), 202–208.
  • Lemaréchal C (2007) The omnipresence of Lagrange. Ann. Oper. Res. 153(1):9–27.
  • Mahimkar A, Ge Z, Zhang W, Qiu L, Qureshi MA (2019) Scheduler for upgrading access point devices efficiently. US Patent 10374888.
  • Marklund J, Rosling K (2012) Lower bounds and heuristics for supply chain stock allocation. Oper. Res. 60(1):92–105.
  • Meer K, Rautenbach D (2009) On the OBDD size for graphs of bounded tree- and clique-width. Discrete Math. 309(4):843–851.
  • Merizalde Y, Hernández-Callejo L, Duque-Perez O, Alonso-Gómez V (2019) Maintenance models applied to wind turbines. A comprehensive overview. Energies 12(2):225.
  • Nambiar M, Simchi-Levi D, Wang H (2021) Dynamic inventory allocation with demand learning for seasonal goods. Production Oper. Management 30(3):750–765.
  • Papadaki KP, Powell WB (2003) An adaptive dynamic programming algorithm for a stochastic multiproduct batch dispatch problem. Naval Res. Logist. 50(7):742–769.
  • Schweitzer PJ, Seidmann A (1985) Generalized polynomial approximations in Markovian decision processes. J. Math. Anal. Appl. 110(2):568–582.
  • Slivkins A (2019) Introduction to multi-armed bandits. Preprint, submitted April 15, https://arxiv.org/abs/1904.07272.
  • van Noortwijk J (2009) A survey of the application of gamma processes in maintenance. Reliability Engrg. System Safety 94(1):2–21.
  • Whittle P (1988) Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25(A):287–298.
  • Wolsey LA, Nemhauser GL (1999) Integer and Combinatorial Optimization (John Wiley & Sons, Hoboken, NJ).