Convex and Nonconvex Risk-Based Linear Regression at Scale
References
- (2013) Sparse least trimmed squares regression for analyzing high-dimensional large data sets. Ann. Appl. Statist. 7(1):226–248.
- (2009) Curriculum learning. Proc. 26th Annual Internat. Conf. Machine Learn. (ACM, New York), 41–48.
- (2003) Convex Analysis and Optimization (Athena Scientific, Belmont, MA).
- (1997) Matrix Analysis (Springer-Verlag, New York).
- (1984) An O(n) algorithm for quadratic knapsack problems. Oper. Res. Lett. 3(3):163–166.
- (2021) Augmented Lagrangian methods for convex optimization problems. J. Oper. Res. Soc. China 10(2):305–342.
- (2012) Safe feature elimination for the lasso and sparse supervised learning problems. Pacific J. Optim. 8(4):667–698.
- (2007) Finite-Dimensional Variational Inequalities and Complementarity Problems (Springer Science & Business Media, New York).
- (2008) Sure independence screening for ultrahigh dimensional feature space. J. Roy. Statist. Soc. B 70(5):849–911.
- (1951) Maximum properties and inequalities for the eigenvalues of completely continuous operators. Proc. Natl. Acad. Sci. USA 37(11):760–766.
- (2015) Statistical Learning with Sparsity: The Lasso and Generalizations (CRC Press, Boca Raton, FL).
- (1974) Validation of subgradient optimization. Math. Programming 6(1):62–88.
- (1980) A polynomially bounded algorithm for a singly constrained quadratic program. Math. Programming 18(1):338–343.
- (1984) Generalized Hessian matrix and second-order optimality conditions for problems with C^{1,1} data. Appl. Math. Optim. 11(1):43–56.
- (2013) Matrix Analysis, 2nd ed. (Cambridge University Press, Cambridge, UK).
- (2010) Predicting execution time of computer programs using sparse polynomial regression. Lafferty J, Williams C, Shawe-Taylor J, Zemel R, Culotta A, eds. Adv. Neural Inform. Processing Systems, vol. 23 (NeurIPS, San Diego), 883–891.
- (2020) Large-scale methods for distributionally robust optimization. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Adv. Neural Inform. Processing Systems, vol. 33 (NeurIPS, San Diego), 8847–8860.
- (2018) On efficiently solving the subproblems of a level-set method for fused lasso problems. SIAM J. Optim. 28(2):1842–1866.
- (2020) An asymptotically superlinearly convergent semismooth Newton augmented Lagrangian method for linear programming. SIAM J. Optim. 30(3):2410–2440.
- (2020) Adaptive sieving with PPDNA: Generating solution paths of exclusive lasso models. Preprint, submitted September 18, https://arxiv.org/abs/2009.08719.
- (2020) Average top-k aggregate loss for supervised learning. IEEE Trans. Pattern Anal. Machine Intelligence 44(1):76–86.
- (2021) Screening rules and its complexity for active set identification. J. Convex Anal. 28(4):1053–1072.
- (2015) Gap safe screening rules for sparse multi-task and multi-class models. Cortes C, Lawrence N, Lee D, Sugiyama M, Garnett R, eds. Adv. Neural Inform. Processing Systems, vol. 28 (NeurIPS, San Diego), 811–819.
- (2000) Iterative Solution of Nonlinear Equations in Several Variables, Classics in Applied Mathematics, vol. 30 (SIAM, Philadelphia).
- (1993) Optimality conditions and duality theory for minimizing sums of the largest eigenvalues of symmetric matrices. Math. Programming 62(1):321–357.
- (1990) An algorithm for a singly constrained class of quadratic programs subject to upper and lower bounds. Math. Programming 46(1):321–328.
- (1976) Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2):97–116.
- (2000) Optimization of conditional value-at-risk. J. Risk 2(3):21–41.
- (1998) Variational Analysis (Springer-Verlag, Berlin).
- (1984) Least median of squares regression. J. Amer. Statist. Assoc. 79(388):871–880.
- (2016) Simultaneous safe screening of features and samples in doubly sparse modeling. Balcan MF, Weinberger KQ, eds. Proc. 33rd Internat. Conf. Machine Learn., vol. 48 (ICML, San Diego), 1577–1586.
- (2016) Training region-based object detectors with online hard example mining. Proc. IEEE Conf. Comput. Vision Pattern Recognition, 761–769.
- (1986) On monotropic piecewise quadratic programming. Unpublished PhD thesis, Department of Mathematics, University of Washington, Seattle.
- (2020) A sparse semismooth Newton based proximal majorization-minimization algorithm for nonconvex square-root-loss regression problems. J. Machine Learn. Res. 21(226):1–38.
- (1997) Convex analysis approach to DC programming: Theory, algorithms and applications. Acta Mathematica Vietnamica 22(1):289–355.
- (2012) Strong rules for discarding predictors in lasso-type problems. J. Roy. Statist. Soc. B 74(2):245–266.
- (2007) Robust regression shrinkage and consistent variable selection through the LAD-lasso. J. Bus. Econom. Statist. 25(3):347–355.
- (2014) A safe screening rule for sparse logistic regression. Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ, eds. Adv. Neural Inform. Processing Systems, vol. 27 (NeurIPS, San Diego), 1053–1061.
- (1992) Linear best approximation using a class of polyhedral norms. Numer. Algorithms 2(3):321–336.
- (2014) On the Moreau-Yosida regularization of the vector k-norm related functions. SIAM J. Optim. 24(2):766–794.
- (2022) Convex and nonconvex risk-based linear regression at scale. URL http://dx.doi.org/10.5281/zenodo.7483279, available at https://github.com/INFORMSJoC/2022.0012.
- (2010) Robust regression and lasso. IEEE Trans. Inform. Theory 56(7):3561–3574.
- (2010) A Newton-CG augmented Lagrangian method for semidefinite programming. SIAM J. Optim. 20(4):1737–1765.

