On the Efficient Implementation of the Matrix Exponentiated Gradient Algorithm for Low-Rank Matrix Optimization
Published Online:7 Dec 2022https://doi.org/10.1287/moor.2022.1332
References
- [1] (2017) Follow the compressed leader: Faster online learning of eigenvectors and faster MMWU. Doina P, Yee WT, eds. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (JMLR), 116–125.Google Scholar
- [2] (2016) Using optimization to obtain a width-independent, parallel, simpler, and faster positive SDP solver. Krauthgamer R, ed. Proc. 27th Annual ACM-SIAM Sympos. Discrete Algorithms (Society for Industrial and Applied Mathematics, Philadelphia), 1824–1831.Google Scholar
- [3] (2021) A refined laser method and faster matrix multiplication. Marx D, ed. Proc. 2021 ACM-SIAM Sympos. Discrete Algorithms (Society for Industrial and Applied Mathematics, Philadelphia), 522–539.Google Scholar
- [4] (2017) First-Order Methods in Optimization (Society for Industrial and Applied Mathematics, Philadelphia).Crossref, Google Scholar
- [5] (2003) Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett. 31(3):167–175.Crossref, Google Scholar
- [6] (2015) Convex optimization: Algorithms and complexity. Foundations Trends Machine Learn. 8(3-4):231–357.Crossref, Google Scholar
- [7] (2009) Exact matrix completion via convex optimization. Foundations Comput. Math. 9(6):717–772.Crossref, Google Scholar
- [8] (2015) Phase retrieval via matrix completion. SIAM Rev. 57(2):225–251.Crossref, Google Scholar
- [9] (2009) Robust principal component analysis? J. ACM 58(3):1–37.Crossref, Google Scholar
- [10] (2019) A Rank-1 sketch for matrix multiplicative weights (COLT). Alina B, Daniel H, eds. Proc. 32nd Conf. Learn. Theory, vol. 99, Proceedings of Machine Learning Research Series (PMLR), 589–623.Google Scholar
- [11] (2006) Concentration inequalities and martingale inequalities: A survey. Internet Math. 3(1):79–127.Crossref, Google Scholar
- [12] (2020) Spectral Frank-Wolfe algorithm: Strict complementarity and linear convergence (ICML). Daumé III, Aarti S, eds. Proc. 37th Internat. Conf. Machine Learn., vol, 119, Proceedings of Machine Learning Research Series (PMLR), 2535–2544.Google Scholar
- [13] (2018) Error bounds, quadratic growth, and linear convergence of proximal methods. Math. Oper. Res. 43(3):919–948.Link, Google Scholar
- [14] (2019) On the convergence of projected-gradient methods with low-rank projections for smooth convex minimization over trace-norm balls and related problems. SIAM J. Optim. 31(1):727–753.Crossref, Google Scholar
- [15] (2020) On the convergence of stochastic gradient descent with low-rank projections for convex low-rank matrix problems. Abernethy JD, Agarwal S, eds. Conf. Learn. Theory (COLT), vol. 125, Proceedings of Machine Learning Research Series (PMLR), 1666–1681.Google Scholar
- [16] Garber D (2022) Linear convergence of Frank-Wolfe for rank-one matrix recovery without strong convexity. Math. Programming A. https://doi.org/10.1007/s10107-022-01821-8.Google Scholar
- [17] (2016) Matrix completion has no spurious local minimum. Lee D, Sugiyama M, Luxburg U, Guyon I, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 29 (Curran Associates Inc., Red Hook, NY), 2973–2981.Google Scholar
- [18] (2013) Matrix Computations (JHU Press, Baltimore, MD).Crossref, Google Scholar
- [19] (1994) Topics in Matrix Analysis (Cambridge University Press, Cambridge, UK).Google Scholar
- [20] (2010) A simple algorithm for nuclear norm regularized problems. Fürnkranz J, Joachims T, eds. Proc. 27th Internat. Conf. Internat. Conf. Machine Learn. (Omnipress, Madison, WI), 471–478.Google Scholar
- [21] (2001) An analysis of the Rayleigh–Ritz method for approximating eigenspaces. Math. Comput. 70(234):637–647.Crossref, Google Scholar
- [22] (2011) Inequalities: Theory of Majorization and Its Applications, vol. 143, 2nd ed. (Springer, New York).Crossref, Google Scholar
- [23] (2016) Scalable robust matrix recovery: Frank-Wolfe meets proximal methods. SIAM J. Sci. Comput. 38(5):A3291–A3317.Crossref, Google Scholar
- [24] (2015) Randomized block Krylov methods for stronger and faster approximate singular value decomposition. Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R, eds. Proc. 28th Internat. Conf. Neural Inform. Processing Systems, vol. 1 (MIT Press, Cambridge, MA), 1396–1404.Google Scholar
- [25] (2018) Stability of the Lanczos method for matrix function approximation. Czumaj A, ed. Proc. 29th Annual ACM-SIAM Sympos. Discrete Algorithms, SODA’18 (Society for Industrial and Applied Mathematics, Philadelphia), 1605–1624.Google Scholar
- [26] (2013) Phase retrieval using alternating minimization. Burges CJ, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Adv. Neural Inform. Processing Systems, vol. 26 (Curran Associates, Inc., Red Hook, NY), 2796–2804.Google Scholar
- [27] (2014) Non-convex robust PCA. Ghahramani Z, Welling M, Cortes C, Lawrence N, Weinberger KQ, eds. Adv. Neural Inform. Processing Systems, vol. 27 (Curran Associates, Inc., Red Hook, NY), 1107–1115.Google Scholar
- [28] (2012) Faster and simpler width-independent parallel algorithms for positive semidefinite programming. Blelloch GE, Herlihy M, eds. Proc. 24th Annual ACM Sympos. Parallelism Algorithms Architectures (Association for Computing Machinery, New York), 101–108.Google Scholar
- [29] (2011) A simpler approach to matrix completion. J. Machine Learn. Res. 12:3413–3430.Google Scholar
- [30] (2011) Numerical Methods for Large Eigenvalue Problems, rev. ed. (Manchester University Press, Manchester, UK).Crossref, Google Scholar
- [31] (1965) Inequality with applications in statistical mechanics. J. Math. Phys. 6(11):1812–1813.Crossref, Google Scholar
- [32] (2012) User-friendly tail bounds for sums of random matrices. Foundations Comput. Math. 12(4):389–434.Crossref, Google Scholar
- [33] (2005) Matrix exponentiated gradient updates for on-line learning and Bregman projection. J. Machine Learn. Res. 6(34):995–1018.Google Scholar
- [34] (2009) Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. Bengio Y, Schuurmans D, Lafferty J, Williams C, Culotta A, eds. Adv. Neural Inform. Processing Systems, vol. 22 (Curran Associates, Inc., Red Hook, NY), 2080–2088.Google Scholar
- [35] (2016) Fast algorithms for robust PCA via gradient descent. Lee DD, von Luxburg U, eds. Adv. Neural Inform. Processing Systems, vol. 30 (Curran Associates Inc., Red Hook, NY), 4159–4167.Google Scholar
- [36] (2013) The strong convexity of von Neumann’s entropy. Unpublished manuscript.Google Scholar
- [37] (2017) Sketchy decisions: Convex low-rank matrix optimization with optimal storage. Sangal R, Mehta H, Bagga RK, eds. Proc. 20th Internat. Conf. Artificial Intelligence Statist., vol. 54 (Morgan Kaufmann Publishers Inc., San Francisco), 1188–1196.Google Scholar
- [38] (2021) Scalable semidefinite programming. SIAM J. Math. Data Sci. 3(1):171–200.Crossref, Google Scholar
- [39] (2017) A unified approach to error bounds for structured convex optimization problems. Math. Programming 165(2):689–728.Crossref, Google Scholar

