Simulation-Based Approximate Policy Iteration with Generalized Logistic Functions

Antoine Sauré
Sauder School of Business, University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada

Jonathan Patrick
Telfer School of Management, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada

Martin L. Puterman
Sauder School of Business, University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada

Published Online: 28 Sep 2015
https://doi.org/10.1287/ijoc.2015.0645

References

Adelman D (2004) A price-directed approach to stochastic inventory/routing. Oper. Res. 52:499–514.
Adelman D, Klabjan D (2012) Computing near-optimal policies in generalized joint replenishment. INFORMS J. Comput. 24:148–164.
Adelman D, Mersereau AJ (2008) Relaxations of weakly coupled stochastic dynamic programs. Oper. Res. 56:712–727.
Astaraky D, Patrick J (2015) A simulation based approximate dynamic programming approach to multi-class, multi-resource surgical scheduling. Eur. J. Oper. Res. 245:309–319.
Bertsekas D (2011) Dynamic Programming and Optimal Control, Vol. II, 3rd ed. (Athena Scientific, Belmont, MA).
Bertsekas D, Tsitsiklis J (1996) Neuro-Dynamic Programming (Athena Scientific, Belmont, MA).
de Farias DP, Van Roy B (2003) The linear programming approach to approximate dynamic programming. Oper. Res. 51:850–865.
de Farias DP, Van Roy B (2004) On constraint sampling in the linear programming approach to approximate dynamic programming. Math. Oper. Res. 29:462–478.
Desai V, Farias V, Moallemi C (2009) A smoothed approximate linear program. Adv. Neural Inform 22:459–467.
Enders J, Powell W, Egan D (2010) Robust policies for the transformer acquisition and allocation problem. Energy Sys. 1:245–272.
Erdelyi A, Topaloglu H (2009) Computing protection level policies for dynamic capacity allocation problems by using stochastic approximation methods. IIE Trans. 41:498–510.
Frantzeskakis LF, Powell WB (1990) A successive linear approximation procedure for stochastic, dynamic vehicle allocation problems. Transportation Sci. 24:40–57.
GAMS (2011) GAMS—The solver manuals. Technical report, GAMS Development Corporation, Washington, DC.
Geramifard A, Walsh T, Tellex S, Chowdhary G, Roy N, How J (2013) A tutorial on linear function approximators for dynamic programming and reinforcement learning. Foundations Trends Machine Learn. 6:375–451.
Gocgun Y, Puterman M (2014) Dynamic scheduling with due dates and time windows: An application to chemotherapy patient appointment booking. Health Care Management Sci. 17:60–76.
Godfrey GA, Powell WB (2002) An adaptive dynamic programming algorithm for dynamic fleet management, I: Single period travel times. Transportation Sci. 36:21–39.
Gosavi A, Bandla N, Das T (2002) A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking. IIE Trans. 34:729–742.
Haykin S (2009) Neural Networks and Learning Machines (Pearson Education, Upper Saddle River, NJ).
Hing M, Van Harten A, Schuur P (2007) Reinforcement learning versus heuristics for order acceptance on a single resource. J. Heuristics 13:167–187.
Lam S, Lee L, Tang L (2007) An approximate dynamic programming approach for the empty container allocation problem. Transportation Res. Part C 15:265–277.
Marbach P, Mihatsch O, Tsitsiklis J (2000) Call admission control and routing in integrated services networks using neuro-dynamic programming. IEEE J. Sel. Area Comm. 18:197–208.
Maxwell MS, Henderson SG, Topaloglu H (2013) Tuning approximate dynamic programming policies for ambulance redeployment via direct search. Stochastic Systems 3:322–361.
Maxwell MS, Restrepo M, Henderson SG, Topaloglu H (2010) Approximate dynamic programming for ambulance redeployment. INFORMS J. Comput. 22:266–281.
Patrick J, Puterman ML, Queyranne M (2008) Dynamic multipriority patient scheduling for a diagnostic resource. Oper. Res. 56:1507–1525.
Powell WB (1987) An operational planning model for the dynamic vehicle allocation problem with uncertain demands. Transportation Res. Part B 21:217–232.
Powell WB (2011) Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley-Interscience, Hoboken, NJ).
Powell WB (2012) Perspectives of approximate dynamic programming. Ann. Oper. Res., ePub ahead of print February 7, http://link.springer.com/article/10.1007%2Fs10479-012-1077-6.
Powell WB, George A, Bouzaiene-Ayari B, Simão H (2005) Approximate dynamic programming for high dimensional resource allocation problems. Proc. 2005 IEEE Internat. Joint Conference Neural Networks, Montreal, 2989–2994.
Sauré A, Patrick J, Tyldesley S, Puterman M (2012) Dynamic multi-appointment patient scheduling for radiation therapy. Eur. J. Oper. Res. 223:573–584.
Schmid V (2012) Solving the dynamic ambulance relocation and dispatching problem using approximate dynamic programming. Eur. J. Oper. Res. 219:611–621.
Schütz H, Kolisch R (2012) Approximate dynamic programming for capacity allocation in the service industry. Eur. J. Oper. Res. 218:239–250.
Simão H, Powell W (2009) Approximate dynamic programming for management of high-value spare parts. J. Manufacturing Tech. Management 20:147–160.
Simão HP, Day J, George AP, Gifford T, Nienow J, Powell WB (2009) An approximate dynamic programming algorithm for large-scale fleet management: A case application. Transportation Sci. 43:178–197.
Simão HP, George A, Powell WB, Gifford T, Nienow J, Day J (2010) Approximate dynamic programming captures fleet operations for Schneider National. Interfaces 40:342–352.
Sutton R, Barto A (1998) Reinforcement Learning: An Introduction (MIT Press, Cambridge, MA).
Van Roy B, Bertsekas D, Lee Y, Tsitsiklis J (1997) A neuro-dynamic programming approach to retailer inventory management. Proc. 36th IEEE Conf. Decision Control, San Diego, 4052–4057.
Zhang D, Adelman D (2009) An approximate dynamic programming approach to network revenue management with customer choice. Transportation Sci. 43:381–394.

INFORMS Journal on Computing

Volume 27, Issue 3

Summer 2015

Pages 431-578

Information

Received: December 01, 2013
Accepted: January 01, 2015
Published Online: September 28, 2015

Cite as

Antoine Sauré, Jonathan Patrick, Martin L. Puterman (2015) Simulation-Based Approximate Policy Iteration with Generalized Logistic Functions. INFORMS Journal on Computing 27(3):579-595.

https://doi.org/10.1287/ijoc.2015.0645
