Decision Making with Side Information: A Causal Transport Robust Approach

Published Online: https://doi.org/10.1287/opre.2024.0997

References

  • Acciaio B, Backhoff-Veraguas J, Carmona R (2019) Extended mean field control problems: Stochastic maximum principle and transport perspective. SIAM J. Control Optim. 57(6):3666–3693.
  • Acciaio B, Backhoff-Veraguas J, Zalashko A (2020) Causal optimal transport and its links to enlargement of filtrations and continuous-time stochastic optimization. Stochastic Processes Appl. 130(5):2918–2953.
  • Analui B, Pflug GC (2014) On distributionally robust multiperiod stochastic optimization. Comput. Management Sci. 11(3):197–220.
  • Backhoff J, Beiglböck M, Lin Y, Zalashko A (2017) Causal transport in discrete time and applications. SIAM J. Optim. 27(4):2528–2562.
  • Ban GY, Rudin C (2019) The big data newsvendor: Practical insights from machine learning. Oper. Res. 67(1):90–108.
  • Ban GY, Gallien J, Mersereau AJ (2019) Dynamic procurement of new products with covariate information: The residual tree method. Manufacturing Service Oper. Management 21(4):798–815.
  • Bartl D, Wiesel J (2023) Sensitivity of multiperiod optimization problems with respect to the adapted Wasserstein distance. SIAM J. Financial Math. 14(2):704–720.
  • Basciftci B, Ahmed S, Shen S (2021) Distributionally robust facility location problem under decision-dependent stochastic demand. Eur. J. Oper. Res. 292(2):548–561.
  • Bayraksan G, Love DK (2015) Data-driven stochastic programming using phi-divergences. Aleman DM, Thiele AC, eds. The Operations Research Revolution: INFORMS TutORials in Operations Research (INFORMS, Catonsville, MD), 1–19.
  • Bazier-Matte T, Delage E (2020) Generalization bounds for regularized portfolio selection with market side information. INFOR: Inform. Systems Oper. Res. 58(2):374–401.
  • Bertsimas D, Georghiou A (2015) Design of near optimal decision rules in multistage adaptive mixed-integer optimization. Oper. Res. 63(3):610–627.
  • Bertsimas D, Goyal V (2012) On the power and limitations of affine policies in two-stage adaptive optimization. Math. Programming 134(2):491–531.
  • Bertsimas D, Kallus N (2020) From predictive to prescriptive analytics. Management Sci. 66(3):1025–1044.
  • Bertsimas D, Koduri N (2022) Data-driven optimization: A reproducing kernel Hilbert space approach. Oper. Res. 70(1):454–471.
  • Bertsimas D, McCord C (2019) From predictions to prescriptions in multistage optimization problems. Preprint, submitted April 26, https://arxiv.org/abs/1904.11637.
  • Bertsimas D, Van Parys B (2022) Bootstrap robust prescriptive analytics. Math. Programming 195:39–78.
  • Bertsimas D, Iancu DA, Parrilo PA (2010) Optimality of affine policies in multistage robust optimization. Math. Oper. Res. 35(2):363–394.
  • Bertsimas D, Iancu DA, Parrilo PA (2011) A hierarchy of near-optimal policies for multistage adaptive optimization. IEEE Trans. Automatic Control 56(12):2809–2824.
  • Bertsimas D, McCord C, Sturt B (2023) Dynamic optimization with side information. Eur. J. Oper. Res. 304(2):634–651.
  • Blanchet J, Murthy K (2019) Quantifying distributional model risk via optimal transport. Math. Oper. Res. 44(2):565–600.
  • Blanchet J, Kang Y, Murthy K (2019) Robust Wasserstein profile inference and applications to machine learning. J. Appl. Probability 56(3):830–857.
  • Brandt MW, Santa-Clara P, Valkanov R (2009) Parametric portfolio policies: Exploiting characteristics in the cross-section of equity returns. Rev. Financial Stud. 22(9):3411–3447.
  • Cao J, Gao R (2021) Contextual decision-making under parametric uncertainty and data-driven optimistic optimization. Optimization Online (October 16), https://optimization-online.org/wp-content/uploads/2021/10/Contextual_optimization-1.pdf.
  • Carmeli C, De Vito E, Toigo A, Umanità V (2010) Vector valued reproducing kernel Hilbert spaces and universality. Anal. Appl. (Singapore) 8(01):19–61.
  • Chen X, Sim M, Sun P, Zhang J (2008) A linear decision-based approximation approach to stochastic programming. Oper. Res. 56(2):344–357.
  • Chenreddy AR, Bandi N, Delage E (2022) Data-driven conditional robust optimization. Adv. Neural Inform. Processing Systems 35:9525–9537.
  • El Balghiti O, Elmachtoub AN, Grigas P, Tewari A (2019) Generalization bounds in the predict-then-optimize framework. Adv. Neural Inform. Processing Systems 32:14412–14421.
  • El Housni O, Goyal V (2021) On the optimality of affine policies for budgeted uncertainty sets. Math. Oper. Res. 46(2):674–711.
  • Elmachtoub AN, Grigas P (2022) Smart “predict, then optimize”. Management Sci. 68(1):9–26.
  • Elmachtoub A, Liang JCN, McNellis R (2020) Decision trees for decision-making under the predict-then-optimize framework. Daumé H III, Singh A, eds. Proc. 37th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 119 (ML Research Press, Cambridge, MA), 2858–2867.
  • Esfahani PM, Kuhn D (2018) Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Math. Programming 171(1):115–166.
  • Esteban-Pérez A, Morales JM (2022) Distributionally robust stochastic programs with side information based on trimmings. Math. Programming 195(1–2):1069–1105.
  • Estes A (2021) Slow rates of convergence in optimization with side information. Preprint, submitted March 15, https://doi.org/10.2139/ssrn.3803427.
  • Feng X, He X, Jiao Y, Kang L, Wang C (2024) Deep nonparametric quantile regression under covariate shift. J. Machine Learn. Res. 25(385):1–50.
  • Gao R (2023) Finite-sample guarantees for Wasserstein distributionally robust optimization: Breaking the curse of dimensionality. Oper. Res. 71(6):2291–2306.
  • Gao R, Kleywegt A (2023) Distributionally robust stochastic optimization with Wasserstein distance. Math. Oper. Res. 48(2):603–655.
  • Gao R, Arora R, Huang Y (2024a) Data-driven multistage distributionally robust linear optimization with nested distance. Preprint, submitted July 23, https://arxiv.org/abs/2407.16346.
  • Gao R, Chen X, Kleywegt AJ (2024b) Wasserstein distributionally robust optimization and variation regularization. Oper. Res. 72(3):1177–1191.
  • Genpact (2019) Food demand forecasting dataset. Kaggle, https://www.kaggle.com/datasets/kannanaikkal/food-demand-forecasting.
  • Georghiou A, Tsoukalas A, Wiesemann W (2025) On the optimality of affine decision rules in distributionally robust optimization. Management Sci. 72(2):1456–1471.
  • Hanasusanto GA, Kuhn D (2013) Robust data-driven dynamic programming. Adv. Neural Inform. Processing Systems 26.
  • Hanasusanto GA, Kuhn D, Wiesemann W (2015) K-adaptability in two-stage robust binary programming. Oper. Res. 63(4):877–891.
  • Hanasusanto GA, Kuhn D, Wiesemann W (2016) K-adaptability in two-stage distributionally robust binary programming. Oper. Res. Lett. 44(1):6–11.
  • Hannah L, Powell W, Blei D (2010) Nonparametric density estimation for stochastic optimization with an observable state variable. Adv. Neural Inform. Processing Systems 23:820–828.
  • Ho-Nguyen N, Kılınç-Karzan F (2022) Risk guarantees for end-to-end prediction and optimization processes. Management Sci. 68(12):8680–8698.
  • Hu Y, Kallus N, Mao X (2022) Fast rates for contextual linear optimization. Management Sci. 68(6):4236–4245.
  • Hu Y, Wang J, Xie Y, Krause A, Kuhn D (2024) Contextual stochastic bilevel optimization. Adv. Neural Inform. Processing Systems 36.
  • Iancu DA, Sharma M, Sviridenko M (2013) Supermodularity and affine policies in dynamic robust optimization. Oper. Res. 61(4):941–956.
  • Jacod J (1980) Weak and strong solutions of stochastic differential equations. Stochastics 3(1–4):171–191.
  • Jiang Y (2024) Duality of causal distributionally robust optimization: The discrete-time case. Preprint, submitted January 29, https://arxiv.org/abs/2401.16556.
  • Kallus N, Mao X (2023) Stochastic optimization forests. Management Sci. 69(4):1975–1994.
  • Kannan R, Bayraksan G, Luedtke JR (2024) Residuals-based distributionally robust optimization with covariate information. Math. Programming 207:369–425.
  • Kannan R, Bayraksan G, Luedtke JR (2025) Data-driven sample average approximation with covariate information. Oper. Res. 73(6):3245–3259.
  • Kuhn D, Esfahani PM, Nguyen VA, Shafieezadeh-Abadeh S (2019) Wasserstein distributionally robust optimization: Theory and applications in machine learning. Netessine S, ed. Operations Research & Management Science in the Age of Analytics. INFORMS TutORials in Operations Research (INFORMS, Catonsville, MD), 130–166.
  • Kurtz T (2014) Weak and strong solutions of general stochastic models. Electronic Comm. Probability 19:1–16.
  • Lassalle R (2018) Causal transference plans and their Monge-Kantorovich problems. Stochastic Anal. Appl. 36(3):452–484.
  • Liu M, Qi M, Shen ZJM (2021) End-to-end deep learning for inventory management with fixed ordering cost and its theoretical analysis. Preprint, submitted July 19, https://doi.org/10.2139/ssrn.3888897.
  • Liyanage LH, Shanthikumar JG (2005) A practical inventory control policy using operational statistics. Oper. Res. Lett. 33(4):341–348.
  • Loke GG, Tang Q, Xiao Y (2020) Decision-driven regularization: A blended model for predict-then-optimize. Preprint, submitted June 17, https://doi.org/10.2139/ssrn.3623006.
  • Muñoz MA, Pineda S, Morales JM (2022) A bilevel framework for decision-making under uncertainty with contextual information. Omega (Westport) 108:102575.
  • Nguyen VA, Zhang F, Wang S, Blanchet J, Delage E, Ye Y (2025) Robustifying conditional portfolio decisions via optimal transport. Oper. Res. 73(5):2801–2829.
  • Oroojlooyjadid A, Snyder LV, Takáč M (2020) Applying deep learning to the newsvendor problem. IISE Trans. 52(4):444–463.
  • Perakis G, Sim M, Tang Q, Xiong P (2023) Robust pricing and production with information partitioning and adaptation. Management Sci. 69(3):1398–1419.
  • Pflug GC (2010) Version-independence and nested distributions in multistage stochastic optimization. SIAM J. Optim. 20(3):1406–1420.
  • Pflug GC, Pichler A (2012) A distance for multistage stochastic optimization models. SIAM J. Optim. 22(1):1–23.
  • Pflug GC, Pichler A (2014) Multistage Stochastic Optimization (Springer, Berlin).
  • Pflug GC, Pichler A (2015) Dynamic generation of scenario trees. Comput. Optim. Appl. 62(3):641–668.
  • Pflug GC, Pichler A (2016) From empirical observations to tree models for stochastic optimization: Convergence properties. SIAM J. Optim. 26(3):1715–1740.
  • Pflug G, Wozabal D (2007) Ambiguity in portfolio selection. Quant. Finance 7(4):435–442.
  • Pichler A, Shapiro A (2021) Mathematical foundations of distributionally robust multistage optimization. SIAM J. Optim. 31(4):3044–3067.
  • Postek K, Hertog D (2016) Multistage adjustable robust mixed-integer optimization via iterative splitting of the uncertainty set. INFORMS J. Comput. 28(3):553–574.
  • Qi M, Shen ZJ (2022) Integrating prediction/estimation and optimization with applications in operations management. Chou MC, Gibson H, Staats BR, eds. Tutorials in Operations Research: Emerging and Impactful Topics in Operations (INFORMS, Catonsville, MD), 36–58.
  • Qi M, Grigas P, Shen ZJ (2025) Integrated conditional estimation-optimization. Oper. Res., ePub ahead of print October 28, https://doi.org/10.1287/opre.2023.0427.
  • Qi M, Shen ZJ, Zheng Z (2024) Learning newsvendor problems with intertemporal dependence and moderate non-stationarities. Production Oper. Management 33(5):1196–1213.
  • Qi M, Shi Y, Qi Y, Ma C, Yuan R, Wu D, Shen ZJ (2023) A practical end-to-end inventory management model with deep learning. Management Sci. 69(2):759–773.
  • Rahimian H, Bayraksan G, Homem-de-Mello T (2019) Controlling risk and demand ambiguity in newsvendor models. Eur. J. Oper. Res. 279(3):854–868.
  • Rüschendorf L (1985) The Wasserstein distance and approximation theorems. Probability Theory Related Fields 70(1):117–129.
  • Rychener Y, Kuhn D, Sutter T (2023) End-to-end learning for stochastic optimization: A Bayesian perspective. Krause A, Brunskill E, Cho K, Engelhardt B, Sabato S, Scarlett J, eds. Proc. 40th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 202 (ML Research Press, Cambridge, MA), 29455–29472.
  • Sadana U, Chenreddy A, Delage E, Forel A, Frejinger E, Vidal T (2025) A survey of contextual optimization methods for decision-making under uncertainty. Eur. J. Oper. Res. 320(2):271–289.
  • Shafieezadeh-Abadeh S, Kuhn D, Esfahani PM (2019) Regularization via mass transportation. J. Machine Learn. Res. 20(103):1–68.
  • Shapiro A, Dentcheva D, Ruszczyński A (2014) Lectures on Stochastic Programming: Modeling and Theory (SIAM, Philadelphia).
  • Shen Y, Xu P, Zavlanos M (2024) Wasserstein distributionally robust policy evaluation and learning for contextual bandits. Trans. Machine Learn. Res.
  • Sturt B (2023) A nonparametric algorithm for optimal stopping based on robust optimization. Oper. Res. 71(5):1530–1557.
  • Subramanyam A, Gounaris CE, Wiesemann W (2019) K-adaptability in two-stage mixed-integer robust optimization. Math. Programming Comput. 1–32.
  • Toktay LB, Wein LM (2001) Analysis of a forecasting-production-inventory system with stationary demand. Management Sci. 47(9):1268–1281.
  • Tulabandhula T, Rudin C (2013) Machine learning with operational costs. J. Machine Learn. Res. 14:1989–2028.
  • Van Parys B, Bennouna MA (2022) Robust two-stage optimization with covariate data. Optimization Online (October 24), https://optimization-online.org/wp-content/uploads/2022/10/main-3.pdf.
  • Van Parys BP, Esfahani PM, Kuhn D (2021) From data to decisions: Distributionally robust optimization is optimal. Management Sci. 67(6):3387–3402.
  • Vayanos P, Georghiou A, Yu H (2025) Robust optimization with decision-dependent information discovery. Management Sci. 72(2):1509–1528.
  • Wang Y, Srivastava PR, Hanasusanto GA, Ho CP (2026) On data-driven prescriptive analytics with side information: A regularized Nadaraya–Watson approach. Manufacturing & Service Operations Management, ePub ahead of print January 5, https://doi.org/10.1287/msom.2024.0997.
  • Wozabal D (2012) A framework for optimization under ambiguity. Ann. Oper. Res. 193(1):21–47.
  • Xu T, Wenliang LK, Munn M, Acciaio B (2020) COT-GAN: Generating sequential data via causal optimal transport. Adv. Neural Inform. Processing Systems 33:8798–8809.
  • Yamada T, Watanabe S (1971) On the uniqueness of solutions of stochastic differential equations. J. Math. Kyoto University 11(1):155–167.
  • Yu X, Shen S (2022) Multistage distributionally robust mixed-integer programming with decision-dependent moment-based ambiguity sets. Math. Programming 196(1):1025–1064.
  • Zhang L, Yang J, Gao R (2024) Optimal robust policy for feature-based newsvendor. Management Sci. 70(4):2315–2329.
  • Zhang L, Yang J, Gao R (2025) A short and general duality proof for Wasserstein distributionally robust optimization. Oper. Res. 73(4):2146–2155.
  • Zhu K, Thonemann UW (2004) An adaptive forecasting algorithm and inventory policy for products with short life cycles. Naval Res. Logist. 51(5):633–653.
  • Zhu T, Xie J, Sim M (2022) Joint estimation and robustness optimization. Management Sci. 68(3):1659–1677.