One-Step Estimation with Scaled Proximal Methods

Published Online: https://doi.org/10.1287/moor.2021.1212

References

  • [1] Antoniadis A, Gijbels I, Nikolova M (2011) Penalized likelihood regression for generalized linear models with non-quadratic penalties. Ann. Inst. Statist. Math. 63(3):585–615.
  • [2] Bassett R, Deride J (2019) Maximum a posteriori estimators as a limit of Bayes estimators. Math. Programming 174(1–2):129–144.
  • [3] Bauschke HH, Combettes PL (2011) Convex Analysis and Monotone Operator Theory in Hilbert Spaces (Springer, New York).
  • [4] Beck A (2017) First-Order Methods in Optimization (SIAM, Philadelphia).
  • [5] Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183–202.
  • [6] Beck A, Teboulle M (2012) Smoothing and first order methods: A unified framework. SIAM J. Optim. 22(2):557–580.
  • [7] Becker S, Fadili J (2012) A quasi-Newton proximal splitting method. Adv. Neural Inform. Processing Systems 25:2618–2626.
  • [8] Bickel PJ (1975) One-step Huber estimates in the linear model. J. Amer. Statist. Assoc. 70(350):428–434.
  • [9] Boucheron S, Massart P (2011) A high-dimensional Wilks phenomenon. Probab. Theory Related Fields 150(3–4):405–433.
  • [10] Burke JV, Hoheisel T (2013) Epi-convergent smoothing with applications to convex composite functions. SIAM J. Optim. 23(3):1457–1479.
  • [11] Burke JV, Hoheisel T (2017) Epi-convergence properties of smoothing by infimal convolution. Set-Valued Variational Anal. 25(1):1–23.
  • [12] Dennis JE, Moré JJ (1974) A characterization of superlinear convergence and its application to quasi-Newton methods. Math. Comput. 28(126):549–560.
  • [13] Eggermont PPB, LaRiccia VN (2001) Maximum Penalized Likelihood Estimation (Springer, New York).
  • [14] Fan J, Chen J (1999) One-step local quasi-likelihood estimation. J. Royal Statist. Soc. Ser. B. Statist. Methodology 61(4):927–943.
  • [15] Ferguson TS (2017) A Course in Large Sample Theory (Routledge, Boca Raton, FL).
  • [16] Friedlander MP, Goh G (2017) Efficient evaluation of scaled proximal operators. Electronic Trans. Numer. Anal. 46:1–22.
  • [17] Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J. Statist. Software 33(1):1–22.
  • [18] Golub GH, Wilkinson JH (1966) Note on the iterative refinement of least squares solution. Numer. Math. 9(2):139–148.
  • [19] Hare W, Sagastizábal C (2009) Computing proximal points of nonconvex functions. Math. Programming 116(1–2):221–258.
  • [20] Hazan E, Levy KY, Shalev-Shwartz S (2016) On graduated optimization for stochastic non-convex problems. Proc. 33rd Internat. Conf. Machine Learning (JMLR.org), 1833–1841.
  • [21] Huang C, Huo X (2019) A distributed one-step estimator. Math. Programming 174(1–2):41–76.
  • [22] Ionides E (2005) Maximum smoothed likelihood estimation. Statist. Sinica 15(4):1003–1014.
  • [23] Kanzow C, Lechner T (2021) Globalized inexact proximal Newton-type methods for nonconvex composite functions. Comput. Optim. Appl. 78(2):377–410.
  • [24] Le Cam L (1960) Locally asymptotically normal families of distributions. Univ. California Publ. Statist. 3:37–98.
  • [25] Le Cam L (1970) On the assumptions used to prove asymptotic normality of maximum likelihood estimates. Ann. Math. Statist. 41(3):802–828.
  • [26] Le Cam L (1972) Théorie asymptotique de la décision statistique (Presses de l’Université de Montréal, Montréal).
  • [27] Le Cam L (2012) Asymptotic Methods in Statistical Decision Theory (Springer, New York).
  • [28] Lee JD, Sun Y, Saunders MA (2014) Proximal Newton-type methods for minimizing composite functions. SIAM J. Optim. 24(3):1420–1443.
  • [29] Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: Densification and shrinking diameters. ACM Trans. Knowledge Discovery Data 1(1):2-es.
  • [30] Luenberger DG, Ye Y (1984) Linear and Nonlinear Programming (Springer, Cham, Switzerland).
  • [31] Mattingley J, Boyd S (2012) CVXGEN: A code generator for embedded convex optimization. Optim. Eng. 13(1):1–27.
  • [32] Milzarek A (2016) Numerical methods and second order theory for nonsmooth problems. Unpublished doctoral dissertation, Technische Universität München.
  • [33] Mobahi H, Fisher JW (2015) On the link between Gaussian homotopy continuation and convex envelopes. Tai X-C, Bae E, Chan TF, Lysaker M, eds. Internat. Workshop Energy Minimization Methods Comput. Vision Pattern Recognition (Springer, Cham, Switzerland), 43–56.
  • [34] Parikh N, Boyd S (2014) Proximal algorithms. Foundations Trends Optim. 1(3):127–239.
  • [35] Pollard D (1997) Another look at differentiability in quadratic mean. Pollard D, Torgersen E, Yang GL, eds. Festschrift for Lucien Le Cam (Springer, New York), 305–314.
  • [36] Polson NG, Scott JG, Willard BT (2015) Proximal algorithms in statistics and machine learning. Statist. Sci. 30(4):559–581.
  • [37] Rockafellar RT, Wets RJB (2009) Variational Analysis (Springer, Berlin).
  • [38] Scheinberg K, Tang X (2016) Practical inexact proximal quasi-Newton method with global complexity analysis. Math. Programming 160(1–2):495–529.
  • [39] Spokoiny V (2012) Parametric estimation. Finite sample theory. Ann. Statist. 40(6):2877–2909.
  • [40] Taddy M (2017) One-step estimator paths for concave regularization. J. Comput. Graphical Statist. 26(3):525–536.
  • [41] Tseng P, Yun S (2009) A coordinate gradient descent method for nonsmooth separable minimization. Math. Programming 117(1):387–423.
  • [42] Van der Vaart AW (2000) Asymptotic Statistics (Cambridge University Press, Cambridge, UK).
  • [43] Xu M, Ye JJ, Zhang L (2015) Smoothing SQP methods for solving degenerate nonsmooth constrained optimization problems with applications to bilevel programs. SIAM J. Optim. 25(3):1388–1410.
  • [44] Zou H, Li R (2008) One-step sparse estimates in nonconcave penalized likelihood models. Ann. Statist. 36(4):1509–1533.