Gamifying the Vehicle Routing Problem with Stochastic Requests

Nicholas D. Kullman
[email protected]
https://orcid.org/0000-0002-0389-9395
Laboratoire d’Informatique Fondamentale et Appliquée, Université de Tours, 37200 Tours, France

Nikita Dudorov
[email protected]
https://orcid.org/0009-0001-9776-7077
École Polytechnique, 91120 Palaiseau, France

Martin Cousineau
[email protected]
https://orcid.org/0000-0001-9184-3553
Department of Operations Management and Logistics, HEC Montréal, Montreal, Quebec H3T 2A7, Canada

Jorge E. Mendoza
Corresponding Author
[email protected]
https://orcid.org/0000-0003-2473-2655
Department of Operations Management and Logistics, HEC Montréal, Montreal, Quebec H3T 2A7, Canada

Justin C. Goodson
[email protected]
https://orcid.org/0000-0002-9680-0672
Department of Operations and Information Technology Management, Richard A. Chaifetz School of Business, Saint Louis University, St. Louis, Missouri 63108

Published Online: 16 Oct 2025
https://doi.org/10.1287/ijoc.2024.0838


Articles In Advance

Information

Received: July 05, 2024
Accepted: September 04, 2025
Published Online: October 16, 2025

Cite as

Nicholas D. Kullman, Nikita Dudorov, Martin Cousineau, Jorge E. Mendoza, Justin C. Goodson (2025) Gamifying the Vehicle Routing Problem with Stochastic Requests. INFORMS Journal on Computing 0(0).

https://doi.org/10.1287/ijoc.2024.0838

Acknowledgments

The authors especially thank Clément Grodecoeur for his efforts in the development and prototyping of an early VRPSR game.
