Ride-Hailing Order Dispatching at DiDi via Reinforcement Learning
Published Online: 24 Sep 2020. https://doi.org/10.1287/inte.2020.1047

