Wasserstein Distributionally Robust Shallow Convex Neural Networks

Julien Pallage (Corresponding Author)
[email protected]
https://orcid.org/0009-0001-1689-3021
Department of Electrical Engineering, Polytechnique Montréal, Montréal, Québec H3T 0A3, Canada; and GERAD, Montréal, Québec H3T 2A7, Canada; and Mila, Montréal, Québec H2S 3H1, Canada

Antoine Lesage-Landry
[email protected]
https://orcid.org/0000-0001-9652-6557
Department of Electrical Engineering, Polytechnique Montréal, Montréal, Québec H3T 0A3, Canada; and GERAD, Montréal, Québec H3T 2A7, Canada; and Mila, Montréal, Québec H2S 3H1, Canada

Published Online: 26 Aug 2025
https://doi.org/10.1287/ijoo.2024.0048

Abstract

In this work, we propose Wasserstein distributionally robust shallow convex neural networks (WaDiRo-SCNNs) to provide reliable nonlinear predictions when subject to adverse and corrupted data sets. Our approach is based on the reformulation of a new convex training program for rectified linear unit–based shallow neural networks, and this allows us to cast the problem into the order-1 Wasserstein distributionally robust optimization framework. Our training procedure is conservative, has low stochasticity, is solvable with open-source solvers, and is scalable to large industrial deployments. We provide out-of-sample performance guarantees, show that hard convex physical constraints can be enforced in the training program, and propose a mixed-integer convex posttraining verification program to evaluate model stability. WaDiRo-SCNN aims to make neural networks safer for critical applications, such as in the energy sector. Finally, we numerically demonstrate our model’s performance through both a synthetic experiment and a real-world power system application, namely, the prediction of hourly energy consumption in nonresidential buildings within the context of virtual power plants, and evaluate its stability across standard regression benchmark data sets. The experimental results are convincing and showcase the strengths of the proposed model.
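
As a small illustration of the order-1 Wasserstein distance that underlies the framework named in the abstract, the sketch below computes it between two one-dimensional empirical distributions with the same number of equally weighted samples. In that special case the distance equals the mean absolute difference between the sorted samples. The function name `wasserstein1_empirical` and the toy data are assumptions made for illustration only; this is not the authors' training program, which the abstract describes only at a high level.

```python
def wasserstein1_empirical(xs, ys):
    """Order-1 Wasserstein distance between two 1-D empirical distributions.

    Assumes both samples have the same size and uniform weights, in which
    case the optimal transport plan matches the sorted samples one to one.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # Pair the i-th smallest value of xs with the i-th smallest value of ys.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Hypothetical toy data: one corrupted measurement shifts the distribution.
clean = [0.0, 1.0, 2.0, 3.0]
corrupted = [0.0, 1.0, 2.0, 11.0]

distance = wasserstein1_empirical(clean, corrupted)
print(distance)
```

A distributionally robust formulation of this kind guards against every distribution within a chosen Wasserstein radius of the empirical one, so a distance like the one above indicates how large that radius would need to be to cover a given corruption.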

Funding: This work was possible thanks to funding from the Fonds de recherche du Québec, the Natural Sciences and Engineering Research Council of Canada, Mitacs Accelerate, and Hilo by Hydro-Québec [Grants RGPIN-2023-04235, IT35303, and IT38517; Scholarships CGRS-M and B1X].

INFORMS Journal on Optimization

Volume 8, Issue 1

Winter 2026

Pages 1-93, ii

Article Information

Received: August 09, 2024
Accepted: July 19, 2025
Published Online: August 26, 2025

Cite as

Julien Pallage, Antoine Lesage-Landry (2025) Wasserstein Distributionally Robust Shallow Convex Neural Networks. INFORMS Journal on Optimization 8(1):61-93.

https://doi.org/10.1287/ijoo.2024.0048

Acknowledgments

Special thanks to Salma Naccache and Bertrand Scherrer for their active support and enriching discussions, and to Steve Boursiquot, Ahmed Abdellatif, and Odile Noël from Hilo for making this project possible.
