Sinkhorn Distributionally Robust Optimization

Jie Wang
https://orcid.org/0000-0001-8623-4622
School of Artificial Intelligence, School of Data Science, The Chinese University of Hong Kong, Shenzhen 518172, China
Rui Gao
Corresponding Author
https://orcid.org/0000-0003-0145-8577
Department of Information, Risk, and Operations Management, University of Texas at Austin, Austin, Texas 78712
Yao Xie
https://orcid.org/0000-0001-6777-2951
School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332

Published Online: 15 Oct 2025
https://doi.org/10.1287/opre.2023.0294


Volume 74, Issue 3

May-June 2026

Pages v-x, 1153-1728, iii-iv

Information

Received: September 24, 2021
Accepted: August 27, 2025
Published Online: October 15, 2025

Cite as

Jie Wang, Rui Gao, Yao Xie (2025) Sinkhorn Distributionally Robust Optimization. Operations Research 74(3):1581-1603.

https://doi.org/10.1287/opre.2023.0294

Acknowledgments

The authors thank the referees and the editorial team for extensive feedback in improving this manuscript.
