Nash Equilibria, Regularization, and Computation in Optimal Transport-Based Distributionally Robust Optimization

