Data-Driven Optimization for Meal Delivery: A Reinforcement Learning Approach for Order-Courier Assignment and Routing at Meituan

Ramón Auad (Corresponding Author)
[email protected]
https://orcid.org/0000-0001-6978-2627
Amazon.com, Bellevue, Washington 98004; and Universidad Católica del Norte, Antofagasta 1240000, Chile

Felipe Lagos
[email protected]
https://orcid.org/0000-0001-5129-8896
Faculty of Engineering and Sciences, Universidad Adolfo Ibáñez, Santiago de Chile 7941169, Chile

Tomás Lagos
[email protected]
https://orcid.org/0000-0003-1457-4665
Discipline of Business Analytics, The University of Sydney, Sydney, New South Wales 2006, Australia

Published Online: 27 Apr 2026
https://doi.org/10.1287/trsc.2025.0129

Received: April 01, 2025
Accepted: January 21, 2026
Published Online: April 27, 2026

Cite as

Ramón Auad, Felipe Lagos, Tomás Lagos (2026) Data-Driven Optimization for Meal Delivery: A Reinforcement Learning Approach for Order-Courier Assignment and Routing at Meituan. Transportation Science 0(0).

https://doi.org/10.1287/trsc.2025.0129
