Forecasting Sparse Movement Speed of Urban Road Networks with Nonstationary Temporal Matrix Factorization
Abstract
Movement speed data from urban road networks, computed from ridesharing vehicle or taxi trajectories, are often high-dimensional, sparse, and nonstationary (e.g., exhibiting seasonality). These characteristics pose challenges for developing scalable and efficient data-driven solutions for traffic flow estimation and forecasting using machine learning techniques. To address these challenges, we propose a Nonstationary Temporal Matrix Factorization (NoTMF) model that leverages matrix factorization to project high-dimensional and sparse movement speed data into low-dimensional latent spaces. This yields a concise formulation: the product of a spatial factor matrix and a temporal factor matrix. To characterize the temporal correlations, NoTMF imposes a latent equation on the seasonally differenced temporal factors using higher-order vector autoregression (VAR). This approach not only preserves the low-rank structure of sparse movement speed data but also maintains consistent temporal dynamics, including seasonality information. The learning process for NoTMF involves optimizing the spatial and temporal factor matrices along with a collection of VAR coefficient matrices. To solve this problem efficiently, we introduce an alternating minimization framework in which the challenging step of estimating the temporal factor matrix is handled with the conjugate gradient method, as this subproblem involves both partially observed matrix factorization and seasonally differenced VAR. To evaluate the forecasting performance of NoTMF, we conduct extensive experiments on Uber movement speed data sets, which are estimated from ridesharing vehicle trajectories. These data sets contain a large proportion of missing values due to insufficient ridesharing vehicles on the urban road network. Despite the presence of missing data, NoTMF demonstrates superior forecasting accuracy and effectiveness compared with baseline models. Moreover, as the seasonality of movement speed data is of great concern, the experimental results highlight the significance of addressing the nonstationarity of movement speed data.
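To make the structure described above concrete, the following is a minimal sketch of one plausible form of such an objective: a squared reconstruction error on the observed entries plus a squared VAR residual on the seasonally differenced temporal factors. The function name `notmf_objective`, the argument names, the weight `gamma`, and the array layouts are all illustrative assumptions rather than the notation of the full paper, and regularization terms are omitted.

```python
import numpy as np

def notmf_objective(Y, mask, W, X, A, season, gamma=1.0):
    """Illustrative objective (assumed form, not the paper's exact formulation).

    Y      : (N, T) speed matrix; missing entries may hold any value, e.g. NaN
    mask   : (N, T) boolean array, True where Y is observed
    W      : (R, N) spatial factor matrix
    X      : (R, T) temporal factor matrix
    A      : (d, R, R) VAR coefficient matrices for lags 1..d
    season : seasonal period m used for differencing
    gamma  : hypothetical weight on the temporal (VAR) term
    """
    # Reconstruction error, counted on observed entries only.
    resid = np.where(mask, Y - W.T @ X, 0.0)
    fit = 0.5 * np.sum(resid ** 2)

    # Seasonal differencing of the temporal factors: x_t - x_{t-m}.
    D = X[:, season:] - X[:, :-season]
    L = D.shape[1]
    d = A.shape[0]

    # Order-d VAR prediction of each differenced column from its d predecessors.
    pred = sum(A[k] @ D[:, d - 1 - k : L - 1 - k] for k in range(d))
    var_resid = D[:, d:] - pred

    return fit + 0.5 * gamma * np.sum(var_resid ** 2)
```

In an alternating minimization scheme of the kind described, a function like this would be minimized over `W`, `X`, and `A` in turn while the other two are held fixed. The update for `X` is the hard one because `X` appears in both terms.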
Funding: This work was supported by the Institute for Data Valorisation (IVADO); the Fonds de recherche du Québec – Nature et technologies (FRQNT); and the Interuniversity Research Centre on Enterprise Networks, Logistics and Transportation (CIRRELT).

