On the Efficient Implementation of the Matrix Exponentiated Gradient Algorithm for Low-Rank Matrix Optimization

Published Online:https://doi.org/10.1287/moor.2022.1332

References

  • [1] Allen-Zhu Z, Li Y (2017) Follow the compressed leader: Faster online learning of eigenvectors and faster MMWU. Doina P, Yee WT, eds. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (JMLR), 116–125.Google Scholar
  • [2] Allen-Zhu Z, Lee YT, Orecchia L (2016) Using optimization to obtain a width-independent, parallel, simpler, and faster positive SDP solver. Krauthgamer R, ed. Proc. 27th Annual ACM-SIAM Sympos. Discrete Algorithms (Society for Industrial and Applied Mathematics, Philadelphia), 1824–1831.Google Scholar
  • [3] Alman J, Williams VV (2021) A refined laser method and faster matrix multiplication. Marx D, ed. Proc. 2021 ACM-SIAM Sympos. Discrete Algorithms (Society for Industrial and Applied Mathematics, Philadelphia), 522–539.Google Scholar
  • [4] Beck A (2017) First-Order Methods in Optimization (Society for Industrial and Applied Mathematics, Philadelphia).CrossrefGoogle Scholar
  • [5] Beck A, Teboulle M (2003) Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett. 31(3):167–175.CrossrefGoogle Scholar
  • [6] Bubeck S (2015) Convex optimization: Algorithms and complexity. Foundations Trends Machine Learn. 8(3-4):231–357.CrossrefGoogle Scholar
  • [7] Candès EJ, Recht B (2009) Exact matrix completion via convex optimization. Foundations Comput. Math. 9(6):717–772.CrossrefGoogle Scholar
  • [8] Candès EJ, Eldar YC, Strohmer T, Voroninski V (2015) Phase retrieval via matrix completion. SIAM Rev. 57(2):225–251.CrossrefGoogle Scholar
  • [9] Candès EJ, Li X, Ma Y, Wright J (2009) Robust principal component analysis? J. ACM 58(3):1–37.CrossrefGoogle Scholar
  • [10] Carmon Y, Duchi JC, Sidford A, Tian K (2019) A Rank-1 sketch for matrix multiplicative weights (COLT). Alina B, Daniel H, eds. Proc. 32nd Conf. Learn. Theory, vol. 99, Proceedings of Machine Learning Research Series (PMLR), 589–623.Google Scholar
  • [11] Chung F, Lu L (2006) Concentration inequalities and martingale inequalities: A survey. Internet Math. 3(1):79–127.CrossrefGoogle Scholar
  • [12] Ding L, Fei Y, Xu Q, Yang C (2020) Spectral Frank-Wolfe algorithm: Strict complementarity and linear convergence (ICML). Daumé III, Aarti S, eds. Proc. 37th Internat. Conf. Machine Learn., vol, 119, Proceedings of Machine Learning Research Series (PMLR), 2535–2544.Google Scholar
  • [13] Drusvyatskiy D, Lewis AS (2018) Error bounds, quadratic growth, and linear convergence of proximal methods. Math. Oper. Res. 43(3):919–948.LinkGoogle Scholar
  • [14] Garber D (2019) On the convergence of projected-gradient methods with low-rank projections for smooth convex minimization over trace-norm balls and related problems. SIAM J. Optim. 31(1):727–753.CrossrefGoogle Scholar
  • [15] Garber D (2020) On the convergence of stochastic gradient descent with low-rank projections for convex low-rank matrix problems. Abernethy JD, Agarwal S, eds. Conf. Learn. Theory (COLT), vol. 125, Proceedings of Machine Learning Research Series (PMLR), 1666–1681.Google Scholar
  • [16] Garber D (2022) Linear convergence of Frank-Wolfe for rank-one matrix recovery without strong convexity. Math. Programming A. https://doi.org/10.1007/s10107-022-01821-8.Google Scholar
  • [17] Ge R, Lee JD, Ma T (2016) Matrix completion has no spurious local minimum. Lee D, Sugiyama M, Luxburg U, Guyon I, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 29 (Curran Associates Inc., Red Hook, NY), 2973–2981.Google Scholar
  • [18] Golub GH, Van Loan CF (2013) Matrix Computations (JHU Press, Baltimore, MD).CrossrefGoogle Scholar
  • [19] Horn RA, Johnson CR (1994) Topics in Matrix Analysis (Cambridge University Press, Cambridge, UK).Google Scholar
  • [20] Jaggi M, Sulovský M (2010) A simple algorithm for nuclear norm regularized problems. Fürnkranz J, Joachims T, eds. Proc. 27th Internat. Conf. Internat. Conf. Machine Learn. (Omnipress, Madison, WI), 471–478.Google Scholar
  • [21] Jia Z, Stewart G (2001) An analysis of the Rayleigh–Ritz method for approximating eigenspaces. Math. Comput. 70(234):637–647.CrossrefGoogle Scholar
  • [22] Marshall AW, Olkin I, Arnold BC (2011) Inequalities: Theory of Majorization and Its Applications, vol. 143, 2nd ed. (Springer, New York).CrossrefGoogle Scholar
  • [23] Mu C, Zhang Y, Wright J, Goldfarb D (2016) Scalable robust matrix recovery: Frank-Wolfe meets proximal methods. SIAM J. Sci. Comput. 38(5):A3291–A3317.CrossrefGoogle Scholar
  • [24] Musco C, Musco C (2015) Randomized block Krylov methods for stronger and faster approximate singular value decomposition. Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R, eds. Proc. 28th Internat. Conf. Neural Inform. Processing Systems, vol. 1 (MIT Press, Cambridge, MA), 1396–1404.Google Scholar
  • [25] Musco C, Musco C, Sidford A (2018) Stability of the Lanczos method for matrix function approximation. Czumaj A, ed. Proc. 29th Annual ACM-SIAM Sympos. Discrete Algorithms, SODA’18 (Society for Industrial and Applied Mathematics, Philadelphia), 1605–1624.Google Scholar
  • [26] Netrapalli P, Jain P, Sanghavi S (2013) Phase retrieval using alternating minimization. Burges CJ, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Adv. Neural Inform. Processing Systems, vol. 26 (Curran Associates, Inc., Red Hook, NY), 2796–2804.Google Scholar
  • [27] Netrapalli P, Niranjan U, Sanghavi S, Anandkumar A, Jain P (2014) Non-convex robust PCA. Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ, eds. Adv. Neural Inform. Processing Systems, vol. 27 (Curran Associates, Inc., Red Hook, NY), 1107–1115.Google Scholar
  • [28] Peng R, Tangwongsan K (2012) Faster and simpler width-independent parallel algorithms for positive semidefinite programming. Blelloch GE, Herlihy M, eds. Proc. 24th Annual ACM Sympos. Parallelism Algorithms Architectures (Association for Computing Machinery, New York), 101–108.Google Scholar
  • [29] Recht B (2011) A simpler approach to matrix completion. J. Machine Learn. Res. 12:3413–3430.Google Scholar
  • [30] Saad Y (2011) Numerical Methods for Large Eigenvalue Problems, rev. ed. (Manchester University Press, Manchester, UK).CrossrefGoogle Scholar
  • [31] Thompson CJ (1965) Inequality with applications in statistical mechanics. J. Math. Phys. 6(11):1812–1813.CrossrefGoogle Scholar
  • [32] Tropp JA (2012) User-friendly tail bounds for sums of random matrices. Foundations Comput. Math. 12(4):389–434.CrossrefGoogle Scholar
  • [33] Tsuda K, Rätsch G, Warmuth MK (2005) Matrix exponentiated gradient updates for on-line learning and Bregman projection. J. Machine Learn. Res. 6(34):995–1018.Google Scholar
  • [34] Wright J, Ganesh A, Rao S, Peng Y, Ma Y (2009) Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. Bengio Y, Schuurmans D, Lafferty J, Williams C, Culotta A, eds. Adv. Neural Inform. Processing Systems, vol. 22 (Curran Associates, Inc., Red Hook, NY), 2080–2088.Google Scholar
  • [35] Yi P, Park D, Chen Y, Caramanis C (2016) Fast algorithms for robust PCA via gradient descent. Lee DD, von Luxburg U, eds. Adv. Neural Inform. Processing Systems, vol. 30 (Curran Associates Inc., Red Hook, NY), 4159–4167.Google Scholar
  • [36] Yu YL (2013) The strong convexity of von Neumann’s entropy. Unpublished manuscript.Google Scholar
  • [37] Yurtsever A, Udell M, Tropp JA, Cevher V (2017) Sketchy decisions: Convex low-rank matrix optimization with optimal storage. Sangal R, Mehta H, Bagga RK, eds. Proc. 20th Internat. Conf. Artificial Intelligence Statist., vol. 54 (Morgan Kaufmann Publishers Inc., San Francisco), 1188–1196.Google Scholar
  • [38] Yurtsever A, Tropp J, Fercoq O, Udell M, Cevher V (2021) Scalable semidefinite programming. SIAM J. Math. Data Sci. 3(1):171–200.CrossrefGoogle Scholar
  • [39] Zhou Z, So AMC (2017) A unified approach to error bounds for structured convex optimization problems. Math. Programming 165(2):689–728.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.