Generic Linear Convergence for Algorithms of Nonlinear Least Squares over Smooth Varieties

Shenglong Hu
Corresponding Author
Shenglong Hu
[email protected]
https://orcid.org/0009-0004-2963-1065
College of Science, National University of Defense Technology, Changsha, Hunan 410072, China
Search for more papers by this author
,
Ke Ye
Ke Ye
[email protected]
https://orcid.org/0000-0002-3193-7264
State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
Search for more papers by this author

Shenglong Hu

Corresponding Author

Shenglong Hu

[email protected]

https://orcid.org/0009-0004-2963-1065

College of Science, National University of Defense Technology, Changsha, Hunan 410072, China

Search for more papers by this author

Ke Ye

[email protected]

https://orcid.org/0000-0002-3193-7264

State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

Search for more papers by this author

Published Online:28 May 2026https://doi.org/10.1287/moor.2025.1060

References

[1] Absil PA, Mahony R, Sepulchre R (2008) Optimization Algorithms on Matrix Manifolds (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
[2] Alizadeh F, Haeberly JPA, Overton ML (1997) Complementarity and nondegeneracy in semidefinite programming. Math. Programming 77(1):111–128.Crossref, Google Scholar
[3] Attouch H, Bolte J, Svaiter BF (2013) Convergence of descent methods for semi-algebraic and tame problems: Proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Programming 137(1–2):91–129.Crossref, Google Scholar
[4] Bates DM, Watts DG (1988) Nonlinear Regression Analysis and Its Applications, Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics (John Wiley & Sons, Inc., New York).Crossref, Google Scholar
[5] Bertsekas DP (1982) Constrained Optimization and Lagrange Multiplier Methods, Computer Science and Applied Mathematics (Academic Press, Inc. (Harcourt Brace Jovanovich), New York; London).Google Scholar
[6] Bertsekas DP (1999) Nonlinear Programming, Athena Scientific Optimization and Computation Series, 2nd ed. (Athena Scientific, Belmont, MA).Google Scholar
[7] Björck Å (1996) Numerical Methods for Least Squares Problems (SIAM, Philadelphia).Crossref, Google Scholar
[8] Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60(2):223–311.Crossref, Google Scholar
[9] Breiding P, Vannieuwenhoven N (2018) The condition number of joint decompositions. SIAM J. Matrix Anal. Appl. 39(1):287–309.Crossref, Google Scholar
[10] Breiding P, Vannieuwenhoven N (2018) A Riemannian trust region method for the canonical tensor rank approximation problem. SIAM J. Optim. 28(3):2435–2465.Crossref, Google Scholar
[11] Breiding P, Gesmundo F, Michał Ek M, Vannieuwenhoven N (2023) Algebraic compressed sensing. Appl. Comput. Harmonic Anal. 65:374–406.Crossref, Google Scholar
[12] Chan S, Lawrence P (1988) General inverse kinematics with the error damped pseudoinverse. Liégeois A, ed. Proc. 1988 IEEE Internat. Conf. Robotics Automation, vol. 2 (IEEE Computer Society Press, Philadelphia), 834–839.Google Scholar
[13] Chen J, Saad Y (2008/09) On the tensor SVD and the optimal low rank orthogonal approximation of tensors. SIAM J. Matrix Anal. Appl. 30(4):1709–1734.Crossref, Google Scholar
[14] Comon P, Lim LH, Qi Y, Ye K (2020) Topology of tensor ranks. Adv. Math. 367:107128.Crossref, Google Scholar
[15] De Silva V, Lim LH (2008) Tensor rank and the ill-posedness of the best low-rank approximation problem. SIAM J. Matrix Anal. Appl. 30(3):1084–1127.Crossref, Google Scholar
[16] Deutsch F (2001) Best Approximation in Inner Product Spaces, vol. 7 (Springer, New York).Crossref, Google Scholar
[17] do Carmo MP (1992) Riemannian Geometry (Birkhäuser Boston, Inc., Boston).Crossref, Google Scholar
[18] Dontchev AL, Rockafellar RT (2014) Implicit Functions and Solution Mappings, 2nd ed. (Springer, New York).Crossref, Google Scholar
[19] Draisma J, Horobeţ E, Ottaviani G, Sturmfels B, Thomas RR (2016) The Euclidean distance degree of an algebraic variety. Foundations Comput. Math. 16(1):99–149.Crossref, Google Scholar
[20] Drusvyatskiy D, Ioffe AD, Lewis AS (2016) Generic minimizing behavior in semialgebraic optimization. SIAM J. Optim. 26(1):513–534.Crossref, Google Scholar
[21] Edelman A, Arias TA, Smith ST (1999) The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20(2):303–353.Crossref, Google Scholar
[22] Eisenberger M, Lahner Z, Cremers D (2019) Divergence-free shape correspondence by deformation. Comput. Graphics Forum 38(5):1–12.Crossref, Google Scholar
[23] Espig M, Khachatryan A (2015) Convergence of alternating least squares optimisation for rank-one approximation to high order tensors. Preprint, submitted March 18, https://arxiv.org/abs/1503.05431.Google Scholar
[24] Espig M, Hackbusch W, Khachatryan A (2015) On the convergence of alternating least squares optimisation in tensor format representations. Preprint, submitted May 30, https://arxiv.org/abs/1506.00062.Google Scholar
[25] Facchinei F, Pang JS (2003) Finite-Dimensional Variational Inequalities and Complementarity Problems, vols. I and II (Springer, New York).Google Scholar
[26] Gill PE, Murray W (1978) Algorithms for the solution of the nonlinear least-squares problem. SIAM J. Numerical Anal. 15(5):977–992.Crossref, Google Scholar
[27] Gill PE, Murray W, Wright MH (1981) Practical Optimization (Academic Press, London; New York).Google Scholar
[28] Golub GH, Pereyra V (1973) The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. SIAM J. Numerical Anal. 10(2):413–432.Crossref, Google Scholar
[29] Golub GH, Pereyra V (2003) Separable nonlinear least squares: The variable projection method and its applications. Inverse Problems 19(2):R1.Crossref, Google Scholar
[30] Golub GH, Van Loan CF (2013) Matrix Computations, 4th ed. (Johns Hopkins University Press, Baltimore).Crossref, Google Scholar
[31] Guan Y, Chu D (2019) Numerical computation for orthogonal low-rank approximation of tensors. SIAM J. Matrix Anal. Appl. 40(3):1047–1065.Crossref, Google Scholar
[32] Harris J (1995) Algebraic Geometry (Springer-Verlag, New York).Google Scholar
[33] Hillar CJ, Lim LH (2013) Most tensor problems are NP-hard. J. ACM 60(6):1–39.Crossref, Google Scholar
[34] Hochster M, Eagon JA (1971) Cohen-Macaulay rings, invariant theory, and the generic perfection of determinantal loci. Amer. J. Math. 93(4):1020–1058.Crossref, Google Scholar
[35] Hoefler T, Alistarh D, Ben-Nun T, Dryden N, Peste A (2021) Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. J. Machine Learn. Res. 22(1):10882–11005.Google Scholar
[36] Hu S, Li G (2018) Convergence rate analysis for the higher order power method in best rank one approximations of tensors. Numerische Mathematik 140(4):993–1031.Crossref, Google Scholar
[37] Hu S, Ye K (2023) Linear convergence of an alternating polar decomposition method for low rank orthogonal tensor approximations. Math. Programming 199(1–2):1305–1364.Crossref, Google Scholar
[38] Huang M, Xu Z (2024) Uniqueness and stability for the solution of a nonlinear least squares problem. Math. Comput. 93(347):1247–1264.Crossref, Google Scholar
[39] Huang M, Rong Y, Wang Y, Xu Z (2021) Almost everywhere generalized phase retrieval. Appl. Comput. Harmonic Anal. 50:16–33.Crossref, Google Scholar
[40] Jiang TX, Ng MK, Pan J, Song GJ (2023) Nonnegative low rank tensor approximations with multidimensional image applications. Numerische Mathematik 153(1):141–170.Crossref, Google Scholar
[41] Khouja R, Khalil H, Mourrain B (2022) Riemannian Newton optimization methods for the symmetric tensor approximation problem. Linear Algebra Appl. 637:175–211.Crossref, Google Scholar
[42] Kiefer J, Wolfowitz J (1952) Stochastic estimation of the maximum of a regression function. Ann. Math. Statist. 23(3):462–466.Crossref, Google Scholar
[43] Kruskal JB (1977) Three-way arrays: Rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics. Linear Algebra Appl. 18(2):95–138.Crossref, Google Scholar
[44] Landsberg JM (2012) Tensors: Geometry and Applications (American Mathematical Society, Providence, RI).Google Scholar
[45] Lee J (2010) Introduction to Topological Manifolds (Springer Science & Business Media, New York).Google Scholar
[46] Lee J (2012) Introduction to Smooth Manifolds, vol. 218 (Springer Science & Business Media, New York).Crossref, Google Scholar
[47] Lewis AS, Malick J (2008) Alternating projections on manifolds. Math. Oper. Res. 33(1):216–234.Link, Google Scholar
[48] Li G, Pong TK (2018) Calculus of the exponent of Kurdyka-Łojasiewicz inequality and its applications to linear convergence of first-order methods. Foundations Comput. Math. 18(5):1199–1232.Crossref, Google Scholar
[49] Li G, Mordukhovich BS, Phạm TS (2015) New fractional error bounds for polynomial systems with applications to Hölderian stability in optimization and spectral theory of tensors. Math. Programming 153(2):333–362.Crossref, Google Scholar
[50] Maxim LG, Rodriguez JI, Wang B (2021) Euclidean distance degree of projective varieties. Internat. Math. Res. Notices 20:15788–15802.Crossref, Google Scholar
[51] Milnor J (1963) Morse Theory (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
[52] Mohlenkamp MJ (2019) The dynamics of swamps in the canonical tensor approximation problem. SIAM J. Appl. Dynam. Systems 18(3):1293–1333.Crossref, Google Scholar
[53] Musialski P, Hafner C, Rist F, Birsak M, Wimmer M, Kobbelt L (2016) Non-linear shape optimization using local subspace projections. ACM Trans. Graphics 35(4):1–13.Crossref, Google Scholar
[54] Mustaţă M (2010) An irreducibility criterion solution to problem sets. Accessed June 8, 2023, https://public.websites.umich.edu/mmustata/Note1_09.pdf.Google Scholar
[55] Ni K, Steedly D, Dellaert F (2007) Out-of-core bundle adjustment for large-scale 3D reconstruction. IEEE 11th Internat. Conf. Comput. Vision, 1–8.Google Scholar
[56] Olikier G, Absil PA (2022) On the continuity of the tangent cone to the determinantal variety. Set-Valued Variational Anal. 30(2):769–788.Crossref, Google Scholar
[57] Rockafellar RT (2023) Generic linear convergence through metric subregularity in a variable-metric extension of the proximal point algorithm. Comput. Optim. Appl. 86(3):1327–1346.Crossref, Google Scholar
[58] Rockafellar RT, Wets RJB (1998) Variational Analysis (Springer-Verlag, Berlin).Crossref, Google Scholar
[59] Ruhe A, Wedin PA (1980) Algorithms for separable nonlinear least squares problems. SIAM Rev. 22(3):318–337.Crossref, Google Scholar
[60] Schneider R, Uschmajew A (2015) Convergence results for projected line-search methods on varieties of low-rank matrices via Łojasiewicz inequality. SIAM J. Optim. 25(1):622–646.Crossref, Google Scholar
[61] Shafarevich IR (2013) Basic Algebraic Geometry, Varieties in Projective Space, 3rd ed., vol 1 (Springer Science & Business Media, New York).Google Scholar
[62] Shtengel A, Poranne R, Sorkine-Hornung O, Kovalsky SZ, Lipman Y (2017) Geometric optimization via composite majorization. ACM Trans. Graphics 36(4):1–11.Crossref, Google Scholar
[63] Suwandi RC, Lin Z, Yin F, Wang Z, Theodoridis S (2025) Sparsity-aware distributed learning for Gaussian processes with linear multiple kernel. IEEE Trans. Neural Networks Learn. Systems 36(8):14869–14883.Crossref, Google Scholar
[64] Swijsen L, Van der Veken J, Vannieuwenhoven N (2022) Tensor completion using geodesics on Segre manifolds. Numerical Linear Algebra Appl. 29(6):e2446.Crossref, Google Scholar
[65] Uschmajew A (2012) Local convergence of the alternating least squares algorithm for canonical tensor approximation. SIAM J. Matrix Anal. Appl. 33(2):639–652.Crossref, Google Scholar
[66] Uschmajew A, Vandereycken B (2020) On critical points of quadratic low-rank matrix optimization problems. IMA J. Numerical Anal. 40(4):2626–2651.Crossref, Google Scholar
[67] Van Huffel S, Vandewalle J (1991) The Total Least Squares Problem (SIAM, Philadelphia).Crossref, Google Scholar
[68] Vannieuwenhoven N (2017) Condition numbers for the tensor rank decomposition. Linear Algebra Appl. 535:35–86.Crossref, Google Scholar
[69] Wang L, Chu MT (2014) On the global convergence of the alternating least squares method for rank-one approximation to generic tensors. SIAM J. Matrix Anal. Appl. 35(3):1058–1072.Crossref, Google Scholar
[70] Weyman J (2003) Cohomology of Vector Bundles and Syzygies (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
[71] Wolovich WA, Elliott H (1984) A computational technique for inverse kinematics. Twenty-third IEEE Conf. Decision Control, 1359–1363.Google Scholar
[72] Wu C, Agarwal S, Curless B, Seitz SM (2011) Multicore bundle adjustment. CVPR 2011, 3057–3064.Google Scholar
[73] Yang Y (2020) The epsilon-alternating least squares for orthogonal low-rank tensor approximation and its global convergence. SIAM J. Matrix Anal. Appl. 41(4):1797–1825.Crossref, Google Scholar
[74] Ye K, Hu S (2022) When geometry meets optimization theory: Partially orthogonal tensors. Preprint, submitted January 13, https://arxiv.org/abs/2201.04824.Google Scholar
[75] Yu P, Li G, Pong TK (2022) Kurdyka–Łojasiewicz exponent via inf-projection. Foundations Comput. Math. 22(4):1171–1217.Crossref, Google Scholar
[76] Zarantonello EH (1971) Projections on convex sets in Hilbert space and spectral theory: Part I. Projections on convex sets; Part II. Spectral theory. Eduardo H, ed. Contributions to Nonlinear Functional Analysis (Academic Press, New York), 237–424.Crossref, Google Scholar
[77] Zhang T, Golub GH (2001) Rank-one approximation to high order tensors. SIAM J. Matrix Anal. Appl. 23(2):534–550.Google Scholar
[78] Zhang R, Zhu S, Fang T, Quan L (2017) Distributed very large scale bundle adjustment by global camera consensus. IEEE Internat. Conf. Comput. Vision, 29–38.Google Scholar
[79] Zhao J, Badler NI (1994) Inverse kinematics positioning using nonlinear programming for highly articulated figures. ACM Trans. Graphics 13(4):313–336.Crossref, Google Scholar

cover image Mathematics of Operations Research

Articles In Advance

Article Information

Metrics

Information

Received:May 25, 2025
Accepted:April 25, 2026
Published Online:May 28, 2026

Cite as

Shenglong Hu, Ke Ye (2026) Generic Linear Convergence for Algorithms of Nonlinear Least Squares over Smooth Varieties. Mathematics of Operations Research 0(0).

https://doi.org/10.1287/moor.2025.1060

Keywords

Acknowledgments

The authors acknowledge the anonymous associate editor and two referees for their constructive suggestions and helpful comments on our manuscript.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Generic Linear Convergence for Algorithms of Nonlinear Least Squares over Smooth Varieties

References

Articles In Advance

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News