On the Power of Linear Programming for K-Means Clustering

Published Online:https://doi.org/10.1287/ijoo.2025.0065

References

  • Abdalla P, Bandeira AS (2022) Community detection with a subsampled semidefinite program. Sampling Theory Signal Processing Data Anal. 20(1):6.Google Scholar
  • Aloise D, Deshpande A, Hansen P, Popat P (2009) NP-hardness of Euclidean sum-of-squares clustering. Machine Learn. 75(2):245–248.Google Scholar
  • Applegate D, Díaz M, Hinder O, Lu H, Lubin M, O’Donoghue B, Schudy W (2021) Practical large-scale linear programming using primal-dual hybrid gradient. Adv. Neural Inform. Processing Systems 34:20243–20257.Google Scholar
  • Awasthi P, Bandeira AS, Charikar M, Krishnaswamy R, Villar S, Ward R (2015) Relax, no need to round: Integrality of clustering formulations. Proc. 2015 Conf. Innovations Theoret. Comput. Sci., vol. 165 (ACM, New York), 191–200.Google Scholar
  • Czyzyk J, Mesnier MP, More JJ (1998) The NEOS server. IEEE J. Computational Sci. Engrg. 5(3):68–75.Google Scholar
  • Dagum L, Menon R (1998) OpenMP: An industry standard API for shared-memory programming. IEEE Computational Sci. Engrg. 5(1):46–55.Google Scholar
  • De Rosa A, Khajavirad A (2022) The ratio-cut polytope and K-means clustering. SIAM J. Optim. 32(1):173–203.Google Scholar
  • Dua D, Graff C (2017) Welcome to the UC Irvine machine learning repository. http://archive.ics.uci.edu/ml.Google Scholar
  • Friggstad Z, Rezapour M, Salavatiour MR (2019) Local search yields a PTAS for k-means in doubling metrics. SIAM J. Comput. 48(2):452–480.Google Scholar
  • Gurobi Optimization LLC (2021) Gurobi Optimizer reference manual. https://www.gurobi.com.Google Scholar
  • Iguchi T, Mixon DG, Peterson J, Villar S (2017) Probably certifiably correct K-means clustering. Math. Programming 165(2):605–642.Google Scholar
  • Kanungo T, Mount D, Netanyahu N, Piatko C, Silverman R, Wu A (2002) A local search approximation algorithm for K-means clustering. Proc. 18th Annual ACM Sympos. Computational Geometry (ACM, New York), 10–18.Google Scholar
  • Khajavirad A, Sahinidis NV (2018) A hybrid LP/NLP paradigm for global optimization relaxations. Math. Programming Comput. 10(3):383–421.Google Scholar
  • Li X, Li Y, Ling S, Strohmer T, Wei K (2020) When do birds of a feather flock together? k-Means, proximity, and conic programming. Math. Programming 179(1):295–341.Google Scholar
  • Lloyd S (1982) Least squares quantization in PCM. IEEE Trans. Inform. Theory 28(2):129–137.Google Scholar
  • Lu H, Yang J (2025) cuPDLP.jl: A GPU implementation of restarted primal-dual hybrid gradient for linear programming in Julia. Oper. Res. 73(6):3440–3452.Google Scholar
  • Lu H, Yang J, Hu H, Huangfu Q, Liu J, Liu T, Ye Y, Zhang C, Ge D (2023) cuPDLP-C: A strengthened implementation of cuPDLP for linear programming by C language. Preprint, submitted December 22, https://arxiv.org/abs/2312.14832.Google Scholar
  • Mahajan M, Nimbhorkar P, Varadarajan K (2009) The planar k-means problem is NP-hard. Das S, Uehara R, eds. WALCOM: Algorithms and Computation. WALCOM 2009, Lecture Notes in Computer Science, vol. 5431 (Springer, Berlin), 274–285.Google Scholar
  • Mangasarian O (1979) Uniqueness of solution in linear programming. Linear Algebra Appl. 25:151–162.Google Scholar
  • Marzi F, Rossi F, Smriglio S (2019) Computational study of separation algorithms for clique inequalities. Soft Comput. 23(9):3013–3027.Google Scholar
  • Mixon DG, Xie K (2021) Sketching semidefinite programs for faster clustering. IEEE Trans. Inform. Theory 67(10):6832–6840.Google Scholar
  • Mixon DG, Villar S, Ward R (2017) Clustering subgaussian mixtures by semidefinite programming. Inform. Inference 6(4):389–415.Google Scholar
  • MOSEK ApS (2019) MOSEK FAQ release 9.0.105. (September 25), http://docs.mosek.com/9.0/faq.pdf.Google Scholar
  • Neumaier A, Shcherbina O (2004) Safe bounds in linear and mixed-integer linear programming. Math. Programming 99(2):283–296.Google Scholar
  • Padberg M (1989) The Boolean quadric polytope: Some characteristics, facets and relatives. Math. Programming 45(1):139–172.Google Scholar
  • Peng J, Wei Y (2007) Approximating K-means-type clustering via semidefinite programming. SIAM J. Optim. 18(1):186–205.Google Scholar
  • Peng J, Xia Y (2005) A New Theoretical Framework for K-Means-Type Clustering (Springer, Berlin), 79–96.Google Scholar
  • Piccialli V, Sudoso AM, Wiegele A (2022) SOS-SDP: An exact solver for minimum sum-of-squares clustering. INFORMS J. Comput. 34(4):2144–2162.LinkGoogle Scholar
  • Prasad MN, Hanasusanto GA (2018) Improved conic reformulations for K-means clustering. SIAM J. Optim. 28(4):3105–3126.Google Scholar
  • Vershynin R (2018) High-Dimensional Probability: An Introduction with Applications in Data Science, vol. 47 (Cambridge University Press, Cambridge, UK).Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.