Risk Guarantees for End-to-End Prediction and Optimization Processes

Published Online: https://doi.org/10.1287/mnsc.2022.4321

References

  • Ban GY, Rudin C (2019) The big data newsvendor: Practical insights from machine learning. Oper. Res. 67(1):90–108.
  • Bartlett PL, Jordan MI, McAuliffe JD (2006) Convexity, classification, and risk bounds. J. Amer. Statist. Assoc. 101(473):138–156.
  • Bengio Y (1997) Using a financial training criterion rather than a prediction criterion. Internat. J. Neural Systems 8(04):433–443.
  • Bertsimas D, Kallus N (2020) From predictive to prescriptive analytics. Management Sci. 66(3):1025–1044.
  • Bertsimas D, Van Parys B (2021) Bootstrap robust prescriptive analytics. Math. Programming, ePub ahead of print June 25, https://doi.org/10.1007/s10107-021-01679-2.
  • Bousquet O, Boucheron S, Lugosi G (2004) Introduction to statistical learning theory. Bousquet O, von Luxburg U, Rätsch G, eds. Advanced Lectures on Machine Learning, Lecture Notes in Computer Science, vol. 3176 (Springer, Berlin), 169–207.
  • Donti P, Amos B, Kolter JZ (2017) Task-based end-to-end model learning in stochastic optimization. Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Red Hook, NY), 5484–5494.
  • Duchi J, Khosravi K, Ruan F (2018) Multiclass classification, information, divergence and surrogate risk. Ann. Statist. 46(6B):3246–3275.
  • Elmachtoub AN, Grigas P (2022) Smart “predict, then optimize.” Management Sci. 68(1):9–26.
  • Fama EF, French KR (1992) The cross-section of expected stock returns. J. Finance 47(2):427–465.
  • Goh CY, Jaillet P (2016) Structured prediction by conditional risk minimization. Technical report, https://arxiv.org/abs/1611.07096.
  • Hanasusanto GA, Kuhn D (2013) Robust data-driven dynamic programming. Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 26 (Curran Associates, Red Hook, NY), 827–835.
  • Hannah L, Powell W, Blei DM (2010) Nonparametric density estimation for stochastic optimization with an observable state variable. Lafferty JD, Williams CKI, Shawe-Taylor J, Zemel RS, Culotta A, eds. Advances in Neural Information Processing Systems, vol. 23 (Curran Associates, Red Hook, NY), 820–828.
  • Kao Y, Roy BV, Yan X (2009) Directed regression. Bengio Y, Schuurmans D, Lafferty JD, Williams CKI, Culotta A, eds. Advances in Neural Information Processing Systems, vol. 22 (Curran Associates, Red Hook, NY), 889–897.
  • Lin Y (2004) A note on margin-based loss functions in classification. Statist. Probab. Lett. 68(1):73–82.
  • Liyanage LH, Shanthikumar JG (2005) A practical inventory control policy using operational statistics. Oper. Res. Lett. 33(4):341–348.
  • Osokin A, Bach F, Lacoste-Julien S (2017) On structured prediction theory with calibrated convex surrogate losses. Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Red Hook, NY), 302–313.
  • Srivastava PR, Wang Y, Hanasusanto GA, Ho CP (2019) On data-driven prescriptive analytics with side information: A regularized Nadaraya-Watson approach. Technical report, http://www.optimization-online.org/DB_HTML/2019/01/7043.html.
  • Steinwart I (2002a) On the influence of the kernel on the consistency of support vector machines. J. Machine Learn. Res. 2:67–93.
  • Steinwart I (2002b) Support vector machines are universally consistent. J. Complexity 18(3):768–791.
  • Steinwart I (2005) Consistency of support vector machines and other regularized kernel classifiers. IEEE Trans. Inform. Theory 51(1):128–142.
  • Steinwart I (2007) How to compare different loss functions and their risks. Constructive Approximation 26(2):225–287.
  • Tewari A, Bartlett PL (2007) On the consistency of multiclass classification methods. J. Machine Learn. Res. 8(36):1007–1025.
  • Zhang T (2004a) Statistical analysis of some multi-category large margin classification methods. J. Machine Learn. Res. 5(October):1225–1251.
  • Zhang T (2004b) Statistical behavior and consistency of classification methods based on convex risk minimization. Ann. Statist. 32(1):56–85.