Wasserstein Distributionally Robust Shallow Convex Neural Networks
Abstract
In this work, we propose Wasserstein distributionally robust shallow convex neural networks (WaDiRo-SCNNs) to provide reliable nonlinear predictions when subject to adverse and corrupted data sets. Our approach is based on the reformulation of a new convex training program for rectified linear unit–based shallow neural networks, which allows us to cast the problem into the order-1 Wasserstein distributionally robust optimization framework. Our training procedure is conservative, has low stochasticity, is solvable with open-source solvers, and is scalable to large industrial deployments. We provide out-of-sample performance guarantees, show that hard convex physical constraints can be enforced in the training program, and propose a mixed-integer convex posttraining verification program to evaluate model stability. WaDiRo-SCNN aims to make neural networks safer for critical applications, such as in the energy sector. Finally, we numerically demonstrate our model’s performance through both a synthetic experiment and a real-world power system application, namely, the prediction of hourly energy consumption in nonresidential buildings within the context of virtual power plants. We also evaluate the model’s stability across standard regression benchmark data sets. The experimental results showcase the strengths of the proposed model.
Funding: This work was possible thanks to funding from the Fonds de recherche du Québec, the Natural Sciences and Engineering Research Council of Canada, Mitacs Accelerate, and Hilo by Hydro-Québec [Grants RGPIN-2023-04235, IT35303, and IT38517; Scholarships CGRS-M and B1X].

