Multiagent Environments for Vehicle Routing Problems
Published Online: 23 Jun 2026 https://doi.org/10.1287/ijoc.2025.1211
- (2026) Multiagent environments for vehicle routing problems. https://doi.org/10.1287/ijoc.2025.1211.cd, https://github.com/INFORMSJoC/2025.1211.

