Trimmed Statistical Estimation via Variance Reduction

Published Online: https://doi.org/10.1287/moor.2019.0992

References

  • [1] Abdel-Aziz Y, Karara H, Hauck M (2015) Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry. Photogrammetric Engrg. Remote Sensing 81(2):103–107.
  • [2] Alfons A, Croux C, Gelper S (2013) Sparse least trimmed squares regression for analyzing high-dimensional large data sets. Ann. Appl. Statist. 7(1):226–248.
  • [3] Aravkin A, Drusvyatskiy D, van Leeuwen T (2016) Variable projection without smoothness. Working paper, University of Washington, Seattle.
  • [4] Bauschke HH, Combettes PL (2011) Convex Analysis and Monotone Operator Theory in Hilbert Spaces, vol. 408 (Springer, New York).
  • [5] Bolte J, Daniilidis A, Lewis A (2007) The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4):1205–1223.
  • [6] Bolte J, Sabach S, Teboulle M (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1–2):459–494.
  • [7] Bolte J, Daniilidis A, Lewis A, Shiota M (2007) Clarke subgradients of stratifiable functions. SIAM J. Optim. 18(2):556–572.
  • [8] Bradley PS, Mangasarian OL, Street WN (1997) Clustering via concave minimization. Mozer MC, Jordan MI, Petsche T, eds. Advances in Neural Information Processing Systems, vol. 9 (Curran Associates, Red Hook, NY), 368–374.
  • [9] Davis D (2016) The asynchronous PALM algorithm for nonsmooth nonconvex problems. Working paper, Cornell University, Ithaca, NY.
  • [10] Davis D, Drusvyatskiy D (2018) Stochastic subgradient method converges at the rate O(k^{-1/4}) on weakly convex functions. Working paper, Cornell University, Ithaca, NY.
  • [11] Davis D, Edmunds B, Udell M (2016) The sound of APALM clapping: Faster nonsmooth nonconvex optimization with stochastic asynchronous PALM. Lee DD, Sugiyama MM, Luxburg UV, Guyon I, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 29 (Curran Associates, Red Hook, NY), 226–234.
  • [12] Defazio A, Bach F, Lacoste-Julien S (2014) SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates, Red Hook, NY), 1646–1654.
  • [13] Drusvyatskiy D (2018) The proximal point method revisited. SIAG/OPT Views News 26(1):1–8.
  • [14] Drusvyatskiy D, Lewis AS (2018) Error bounds, quadratic growth, and linear convergence of proximal methods. Math. Oper. Res. 43(3):919–948.
  • [15] Drusvyatskiy D, Paquette C (2018) Variational analysis of spectral functions simplified. J. Convex Anal. 25(1):119–134.
  • [16] Drusvyatskiy D, Ioffe AD, Lewis AS (2016) Nonsmooth optimization using Taylor-like models: Error bounds, convergence, and termination criteria. Working paper, University of Washington, Seattle.
  • [17] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96(456):1348–1360.
  • [18] Fischler MA, Bolles RC (1981) Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Comm. ACM 24(6):381–395.
  • [19] Gao HY, Bruce AG (1997) WaveShrink with firm shrinkage. Statist. Sinica 7(4):855–874.
  • [20] Ghadimi S, Lan G, Zhang H (2016) Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Math. Programming 155(1–2):267–305.
  • [21] Hare W, Sagastizábal C (2009) Computing proximal points of nonconvex functions. Math. Programming 116(1):221–258.
  • [22] Hartley R, Zisserman A (2003) Multiple View Geometry in Computer Vision (Cambridge University Press, Cambridge, UK).
  • [23] Huber PJ (2004) Robust Statistics (John Wiley & Sons, Berlin).
  • [24] Hunter JD (2007) Matplotlib: A 2D graphics environment. Comput. Sci. Engrg. 9(3):90–95.
  • [25] Lange KL, Little RJA, Taylor JMG (1989) Robust statistical modeling using the t distribution. J. Amer. Statist. Assoc. 84(408):881–896.
  • [26] LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc. IEEE 86(11):2278–2324.
  • [27] Lewis AS (1999) Nonsmooth analysis of eigenvalues. Math. Programming 84(1):1–24.
  • [28] Lowe DG (1999) Object recognition from local scale-invariant features. Proc. 7th IEEE Internat. Conf. Comput. Vision, vol. 2 (IEEE, Piscataway, NJ), 1150–1157.
  • [29] Luo ZQ, Tseng P (1993) Error bounds and convergence analysis of feasible descent methods: A general approach. Ann. Oper. Res. 46(1):157–178.
  • [30] Ma Y, Soatto S, Kosecka J, Sastry SS (2012) An Invitation to 3-D Vision: From Images to Geometric Models, vol. 26 (Springer Science & Business Media).
  • [31] Mangasarian OL (2007) Absolute value equation solution via concave minimization. Optim. Lett. 1(1):3–8.
  • [32] Maronna RA, Martin D, Yohai VJ (2006) Robust Statistics, Wiley Series in Probability and Statistics (Wiley, Berlin).
  • [33] Mount DM, Netanyahu NS, Piatko CD, Silverman R, Wu AY (2014) On the least trimmed squares estimator. Algorithmica 69(1):148–183.
  • [34] Nesterov Y (2004) Introductory Lectures on Convex Optimization: A Basic Course. Applied Optimization (Kluwer Academic Publications, Boston).
  • [35] Neykov NM, Müller CH (2003) Breakdown point and computation of trimmed likelihood estimators in generalized linear models. Dutter R, Filzmoser P, Gather U, Rousseeuw PJ, eds. Developments in Robust Statistics (Springer, Berlin), 277–286.
  • [36] R Development Core Team (2008) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna).
  • [37] Reddi SJ, Sra S, Póczos B, Smola AJ (2016) Proximal stochastic methods for nonsmooth nonconvex finite-sum optimization. Lee DD, Sugiyama MM, Luxburg UV, Guyon I, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 29 (Curran Associates, Red Hook, NY), 1145–1153.
  • [38] Robbins H, Siegmund D (1985) A convergence theorem for non negative almost supermartingales and some applications. Herbert Robbins Selected Papers (Springer-Verlag, New York), 111–135.
  • [39] Rockafellar RT, Wets RJB (1998) Variational Analysis, vol. 317 (Springer-Verlag, Berlin).
  • [40] Rousseeuw PJ (1984) Least median of squares regression. J. Amer. Statist. Assoc. 79(388):871–880.
  • [41] Rousseeuw PJ (1985) Multivariate estimation with high breakdown point. Math. Statist. Appl. 8(37):283–297.
  • [42] Rousseeuw PJ, Van Driessen K (2006) Computing LTS regression for large data sets. Data Mining Knowledge Discovery 12(1):29–45.
  • [43] Ruppert D, Carroll RJ (1980) Trimmed least squares estimation in the linear model. J. Amer. Statist. Assoc. 75(372):828–838.
  • [44] Vedaldi A, Fulkerson B (2008) VLFeat: An open and portable library of computer vision algorithms. Accessed June 25, 2019, http://www.vlfeat.org/.
  • [45] Xiao L, Zhang T (2014) A proximal stochastic gradient method with progressive variance reduction. SIAM J. Optim. 24(4):2057–2075.
  • [46] Yang E, Lozano A (2015) Robust Gaussian graphical modeling with the trimmed graphical lasso. Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 28 (Curran Associates, Red Hook, NY), 2602–2610.
  • [47] Yang E, Lozano AC, Aravkin A (2018) A general family of trimmed estimators for robust high-dimensional data analysis. Electronic J. Statist. 12(2):3519–3553.
  • [48] Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 38(2):894–942.