Generalization Bounds in the Predict-Then-Optimize Framework
Published Online:16 Nov 2022https://doi.org/10.1287/moor.2022.1330
References
- [1] (2017) Optnet: Differentiable optimization as a layer in neural networks. Internat. Conf. Machine Learn. (PMLR), 136–145.Google Scholar
- [2] (2019) The big data newsvendor: Practical insights from machine learning. Oper. Res. 67(1):90–108.Link, Google Scholar
- [3] (2002) Rademacher and Gaussian complexities: Risk bounds and structural results. J. Machine Learn. Res. 3:463–482.Google Scholar
- [4] (2020) Learning with differentiable perturbed optimizers. Adv. Neural Inform. Processing Systems 33:9508–9519.Google Scholar
- [5] (2020) From predictive to prescriptive analytics. Management Sci. 66(3):1025–1044.Link, Google Scholar
- [6] (2021) Data-driven optimization for last-mile delivery. Complex Intelligent Systems 1–14.Google Scholar
- [7] (2014) Optimal learners for multiclass problems. Conf. Learn. Theory 287–316.Google Scholar
- [8] (2015) Multiclass learnability and the ERM principle. J. Machine Learn. Res. 16(1):2377–2404.Google Scholar
- [9] (2019) Predict+optimise with ranking objectives: Exhaustively learning linear functions. IJCAI-19, 1078–1085.Google Scholar
- [10] (2020) Dynamic programming for predict+optimise. Proc. 34th AAAI Conf. Artificial Intelligence, 1444–1451.Google Scholar
- [11] (2017) Task-based end-to-end model learning in stochastic optimization. Adv. Neural Inform. Processing Systems 31:5484–5494.Google Scholar
- [12] (2019) Generalization bounds in the predict-then-optimize framework. Proc. 33rd Internat. Conf. Neural Inform. Processing Systems, vol. 32, 14412–14421.Google Scholar
- [13] (2022) Smart “predict, then optimize.” Management Sci. 68(1):9–26.Link, Google Scholar
- [14] (2020) Decision trees for decision-making under the predict-then-optimize framework. Internat. Conf. Machine Learn. (PMLR), 2858–2867.Google Scholar
- [15] (2020) MIPaaL: Mixed integer program as a layer. Proc. Conf. AAAI Artificial Intelligence, vol. 34, 1504–1511.Google Scholar
- [16] (2019) ℓ∞ vector contraction for Rademacher complexity. Preprint, submitted November 15, https://arxiv.org/abs/1911.06468.Google Scholar
- [17] (2015) Faster rates for the Frank-Wolfe method over strongly-convex sets. Internat. Conf. Machine Learn. (PMLR), 541–549.Google Scholar
- [18] (2007) VC theory of large margin multi-category classifiers. J. Machine Learn. Res. 8:2551–2594.Google Scholar
- [19] (2019) On data-driven prescriptive analytics with side information: A regularized Nadaraya–Watson approach. Accessed June 1, 2022, http://www.optimization-online.org/DBFILE/2019/01/7043.pdf.Google Scholar
- [20] (2022) Risk guarantees for end-to-end prediction and optimization processes. Management Sci. 68(12):8680–8698.Google Scholar
- [21] (2022) Fast rates for contextual linear optimization. Management Sci. 68(6):4236–4245.Link, Google Scholar
- [22] (2010) Generalized power method for sparse principal component analysis. J. Machine Learn. Res. 11:517–553.Google Scholar
- [23] (2012) Regularization techniques for learning with matrices. J. Machine Learn. Res. 13:1865–1890.Google Scholar
- [24] (2008) On the complexity of linear prediction: Risk bounds, margin bounds, and regularization. Adv. Neural Inform. Processing Systems 21:793–800.Google Scholar
- [25] (2022) Stochastic optimization forests. Management Sci. ePub ahead of print June 28, https://doi.org/10.1287/mnsc.2022.4458.Google Scholar
- [26] (2009) Directed regression. Adv. Neural Inform. Processing Systems 22:889–897.Google Scholar
- [27] (2002) Empirical margin distributions and bounding the generalization error of combined classifiers. Ann. Statist. 30(1):1–50.Crossref, Google Scholar
- [28] (2001) Some new bounds on the generalization error of combined classifiers. Adv. Neural Inform. Processing Systems 13:245–251.Google Scholar
- [29] (2021) End-to-end constrained optimization learning: A survey. 30th International Joint Conference on Artificial Intelligence, IJCAI, 4475–4482.Google Scholar
- [30] (2015) Multi-class SVMs: From tighter data-dependent generalization bounds to novel algorithms. Adv. Neural Inform. Processing Systems 28:2035–2043.Google Scholar
- [31] (2018) Multi-class learning: From theory to algorithm. Adv. Neural Inform. Processing Systems 31:1586–1595.Google Scholar
- [32] (2022) Decision-driven regularization: A blended model for predict-then-optimize. Preprint, submitted June 17, 2020, https://dx.doi.org/10.2139/ssrn.3623006.Google Scholar
- [33] (2020) Interior point solving for LP-based prediction+optimisation. Adv. Neural Inform. Processing Systems 33:7272–7282.Google Scholar
- [34] (2020) Smart predict-and-optimize for hard combinatorial optimization problems. Proc. AAAI Conf. Artificial Intelligence, vol. 34, 1603–1610.Google Scholar
- [35] (2016) A vector-contraction inequality for Rademacher complexities. Internat. Conf. Algorithmic Learn. Theory (Springer, Berlin), 3–17.Google Scholar
- [36] (2018) Foundations of Machine Learning, 2nd ed. (MIT Press, Cambridge, MA).Google Scholar
- [37] (1989) On learning sets and functions. Machine Learn. 4(1):67–97.Crossref, Google Scholar
- [38] (2019) Differentiation of blackbox combinatorial solvers. Internat. Conf. Learn. Representations.Google Scholar
- [39] (2009) Variational Analysis, vol. 317 (Springer Science & Business Media, Berlin).Google Scholar
- [40] (2014) Understanding Machine Learning: From Theory to Algorithms (Cambridge University Press, Cambridge).Crossref, Google Scholar
- [41] (2015) Statistical Learning with Sparsity: The Lasso and Generalizations (Chapman and Hall/CRC, Boca Raton).Google Scholar
- [42] (1983) Strong and weak convexity of sets and functions. Math. Oper. Res. 8(2):231–259.Link, Google Scholar
- [43] (2020) Automatically learning compact quality-aware surrogates for optimization problems. Adv. Neural Inform. Processing Systems 33:9586–9596.Google Scholar
- [44] (2019) Melding the data-decisions pipeline: Decision-focused learning for combinatorial optimization. Proc. Conf. AAAI Artificial Intelligence 33:1658–1665.Google Scholar
- [45] (2019) End to end learning and optimization on graphs. Adv. Neural Inform. Processing Systems 33:4674–4685.Google Scholar
- [46] (2020) A semi-“smart predict then optimize”(semi-SPO) method for efficient ship inspection. Transportation Res. Part B Methodological 142:100–125.Crossref, Google Scholar
- [47] (2019) A vector-contraction inequality for Rademacher complexities using p-stable variables. Preprint, submitted December 20, https://arxiv.org/abs/1912.10136.Google Scholar
- [48] (2012) Lectures on Polytopes, vol. 152 (Springer Science & Business Media, Berlin).Google Scholar

