Risk-Based Robust Statistical Learning by Stochastic Difference-of-Convex Value-Function Optimization

Published Online: https://doi.org/10.1287/opre.2021.2248

References

  • Aravkin A, Davis D (2020) Trimmed statistical estimation via variance reduction. Math. Oper. Res. 45(1):292–322.
  • Bartlett PL, Mendelson S (2002) Rademacher and Gaussian complexities: Risk bounds and structural results. J. Machine Learn. Res. 3(Nov):463–482.
  • Ben-Tal A, Teboulle M (2007) An old-new concept of convex risk measures: The optimized certainty equivalent. Math. Finance 17(3):449–476.
  • Cui Y, Pang JS (2021) Modern Nonconvex Nondifferentiable Optimization (SIAM Publications, Philadelphia).
  • Cui Y, Pang JS, Sen B (2018) Composite difference-max programs for modern statistical estimation problems. SIAM J. Optim. 28(4):3344–3374.
  • Curi S, Levy KY, Jegelka S, Krause A (2020) Adaptive sampling for stochastic risk-averse learning. 34th Conf. on Neural Inform. Processing Systems (NeurIPS 2020), Vancouver.
  • Ermoliev YM, Norkin VI (2013) Sample average approximation method for compound stochastic optimization problems. SIAM J. Optim. 23(4):2231–2263.
  • Fan Y, Lyu S, Ying Y, Hu B-G (2017) Learning with average top-k loss. 31st Conf. on Neural Inform. Processing Systems (NIPS 2017), Long Beach, CA.
  • Fujiwara S, Takeda A, Kanamori T (2017) DC algorithm for extended robust support vector machine. Neural Comput. 29(5):1406–1438.
  • Giloni A, Padberg M (2002) Least trimmed squares regression, least median squares regression, and mathematical programming. Math. Comput. Model. 35(9–10):1043–1060.
  • Holland PW, Welsch RE (1977) Robust regression using iteratively reweighted least-squares. Comm. Statist. Theory Methods 6(9):813–827.
  • Horst R, Thoai NV (1999) DC programming: Overview. J. Optim. Theory Appl. 103(1):1–43.
  • Huber PJ (1973) Robust regression: Asymptotics, conjectures and Monte Carlo. Ann. Statist. 1(5):799–821.
  • Jin C, Liu LT, Ge R, Jordan MI (2018) On the local minima of the empirical risk. 32nd Conf. on Neural Inform. Processing Systems (NeurIPS 2018), Montreal, Canada.
  • Kanamori T, Fujiwara S, Takeda A (2017) Breakdown point of robust support vector machines. Entropy 19(2):83.
  • Le Thi HA, Pham Dinh T (2018) DC programming and DCA: Thirty years of developments. Math. Programming 169(1):5–68.
  • Le Thi HA, Van Ngai H, Pham Dinh T (2019) Stochastic difference-of-convex algorithms for solving nonconvex optimization problems. Preprint, submitted November 11, https://arxiv.org/abs/1911.04334v1.
  • Lecué G, Lerasle M, Mathieu T (2020) Robust classification via MOM minimization. Machine Learn. 109(8):1635–1665.
  • Leung DHY (2005) Cross-validation in nonparametric regression with outliers. Ann. Statist. 33(5):2291–2310.
  • Liu J, Cui Y, Pang J-S (2022) Solving nonsmooth nonconvex compound stochastic programs with applications to risk measure minimization. Math. Oper. Res. Forthcoming.
  • Lu Z, Zhou Z, Sun Z (2019) Enhanced proximal DC algorithms with extrapolation for a class of structured nonsmooth DC minimization. Math. Programming 176(1–2):369–401.
  • Mafusalov A, Uryasev S (2016) CVaR (superquantile) norm: Stochastic case. Eur. J. Oper. Res. 249(1):200–208.
  • Nouiehed M, Pang JS, Razaviyayn M (2019) On the pervasiveness of difference-convexity in optimization and statistics. Math. Programming 174(1–2):195–222.
  • Pang JS, Razaviyayn M, Alvarado A (2016) Computing B-stationary points of nonsmooth DC programs. Math. Oper. Res. 42(1):95–118.
  • Pflug GC, Ruszczyński A (2005) Measuring risk for income streams. Comput. Optim. Appl. 32(1–2):161–178.
  • Pham Dinh T, Le Thi HA (1997) Convex analysis approach to DC programming: Theory, algorithms and applications. Acta Math. Vietnam 22(1):289–355.
  • Robbins H, Siegmund D (1971) A convergence theorem for non negative almost supermartingales and some applications. Rustagi JS, ed. Optimizing Methods in Statistics (Academic Press, New York), 233–257.
  • Rockafellar RT, Uryasev S (2000) Optimization of conditional value-at-risk. J. Risk 2(7):21–42.
  • Rockafellar RT, Uryasev S (2002) Conditional value-at-risk for general loss distributions. J. Banking Finance 26(7):1443–1471.
  • Ronchetti E, Field C, Blanchard W (1997) Robust linear model selection by cross-validation. J. Amer. Statist. Assoc. 92(439):1017–1023.
  • Rousseeuw PJ (1984) Least median of squares regression. J. Amer. Statist. Assoc. 79(388):871–880.
  • Rousseeuw PJ, Leroy AM (2005) Robust Regression and Outlier Detection, vol. 589 (John Wiley & Sons, New York).
  • Rousseeuw PJ, Van Driessen K (2006) Computing LTS regression for large data sets. Data Mining Knowledge Discovery 12(1):29–45.
  • Shapiro A, Dentcheva D, Ruszczyński A (2009) Lectures on Stochastic Programming: Modeling and Theory (SIAM Publications, Philadelphia).
  • Shen Y, Sanghavi S (2019) Learning with bad training data via iterative trimmed loss minimization. Proc. 36th Internat. Conf. on Machine Learn., Long Beach, CA (PMLR), 97.
  • Takeda A, Kanamori T (2009) A robust approach based on conditional value-at-risk measure to statistical learning problems. Eur. J. Oper. Res. 198(1):287–296.
  • Tsyurmasto P, Uryasev S, Gotoh J (2013) Support vector classification with positive homogeneous risk functionals. Research report 2013-4, ISE Department, University of Florida, Gainesville, FL.
  • Wu Y, Liu Y (2007) Robust truncated hinge loss support vector machines. J. Amer. Statist. Assoc. 102(479):974–983.
  • Zabarankin M, Uryasev S (2014) Portfolio safeguard case studies. Statistical Decision Problems: Selected Concepts and Portfolio Safeguard Case Studies, Springer Optimization and Its Applications, vol. 85 (Springer, New York), 133–240.
  • Zhang C, Pham M, Fu S, Liu Y (2018) Robust multicategory support vector machines using difference convex algorithm. Math. Programming 169(1):277–305.