A Global Optimization Algorithm for K-Center Clustering of One Billion Samples

Published Online:https://doi.org/10.1287/mnsc.2023.00218

References

  • Aggarwal CC, Wolf JL, PSl Y (2004) Method for targeted advertising on the web based on accumulated self-learning data, clustering users and semantic node graph techniques. US Patent 6714975.Google Scholar
  • Aloise D, Contardo C (2018) A sampling-based exact algorithm for the solution of the minimax diameter clustering problem. J. Global Optim. 71(3):613–630.CrossrefGoogle Scholar
  • Barilan J, Kortsarz G, Peleg D (1993) How to allocate network centers. J. Algorithms 15(3):385–415.CrossrefGoogle Scholar
  • Bateni M, Esfandiari H, Fischer M, Mirrokni V (2021) Extreme k-center clustering. Proc. AAAI Conf. Artificial Intelligence (AAAI Press, Palo Alto, CA), 3941–3949.Google Scholar
  • Benzi J, Damodaran M (2009) Parallel three dimensional direct simulation Monte Carlo for simulating micro flows. Parallel Computational Fluid Dynamics 2007, vol. 67 (Springer, Berlin), 91–98.CrossrefGoogle Scholar
  • Brusco MJ, Stahl S (2005) Branch-and-Bound Applications in Combinatorial Data Analysis (Springer, New York).Google Scholar
  • Byrne S, Wilcox LC, Churavy V (2021) MPI.jl: Julia bindings for the message passing interface. Proc. JuliaCon Conf. (JuliaCon) 1(1):68.CrossrefGoogle Scholar
  • Calik H (2013) Exact solution methodologies for the P-center problem under single and multiple allocation strategies. Theses, Bilkent University, Turkey.Google Scholar
  • Cao Y, Zavala VM (2019) A scalable global optimization algorithm for stochastic nonlinear programs. J. Global Optim. 75(2):393–416.CrossrefGoogle Scholar
  • Chen D, Chen R (2009) New relaxation-based algorithms for the optimal solution of the continuous and discrete p-center problems. Computers Oper. Res. 36(5):1646–1655.CrossrefGoogle Scholar
  • Chen R, Handler GY (1987) Relaxation method for the solution of the minimax location-allocation problem in Euclidean space. Naval Res. Logist. 34(6):775–788.CrossrefGoogle Scholar
  • Chiplunkar A, Kale S, Ramamoorthy SN (2020) How to solve fair k-center in massive data models. Proc. 37th Internat. Conf. Machine Learn. (PMLR, New York), 1877–1886.Google Scholar
  • Contardo C, Iori M, Kramer R (2019) A scalable exact algorithm for the vertex p-center problem. Computers Oper. Res. 103(March):211–220.CrossrefGoogle Scholar
  • Cook W, Lovasz L, Seymour P, eds. (1995) Combinatorial Optimization, DIMACS Series in Discrete Mathematics and Theoretical Computer Science (American Mathematical Society, Providence, RI).CrossrefGoogle Scholar
  • Cplex II (2020) V20.1.0: User’s Manual for CPLEX (International Business Machines Corporation, Armonk, NY).Google Scholar
  • Dao TBH, Duong KC, Vrain C (2013) A declarative framework for constrained clustering. Joint Eur. Conf. Machine Learn. Knowledge Discovery Databases (Springer, Berlin, Heidelberg), 419–434.CrossrefGoogle Scholar
  • Dao TBH, Duong KC, Vrain C (2017) Constrained clustering by constraint programming. Artificial Intelligence 244(March):70–94.Google Scholar
  • Daskin MS (2000) A new approach to solving the vertex p-center problem to optimality: Algorithm and computational results. Comm. Oper. Res. Soc. Japan 45(9):428–436.Google Scholar
  • Daugulis P (2024) Optimizing administrative divisions: A vertex k-center approach for edge-weighted road graphs. Baltic J. Modern Comput. 12(2):176–188.CrossrefGoogle Scholar
  • Davidović T, Ramljak D, Šelmić M, Teodorović D (2011) Bee colony optimization for the p-center problem. Computers Oper. Res. 38(10):1367–1376.CrossrefGoogle Scholar
  • Dua D, Graff C (2017) UCI machine learning repository. Accessed October 17, 2022, http://archive.ics.uci.edu/ml.Google Scholar
  • Dyer ME, Frieze AM (1985) A simple heuristic for the p-centre problem. Oper. Res. Lett. 3(6):285–288.CrossrefGoogle Scholar
  • Elloumi S, Labbé M, Pochet Y (2004) A new formulation and resolution method for the p-center problem. INFORMS J. Comput. 16(1):84–94.LinkGoogle Scholar
  • Feldmann AE, Marx D (2020) The parameterized hardness of the k-center problem in transportation networks. Algorithmica 82(7):1989–2005.CrossrefGoogle Scholar
  • Garcia-Diaz J, Sanchez-Hernandez J, Menchaca-Mendez R, Menchaca-Mendez R (2017) When a worse approximation factor gives better performance: A 3-approximation algorithm for the vertex k-center problem. J. Heuristics 23(5):349–366.CrossrefGoogle Scholar
  • Garcia-Diaz J, Menchaca-Mendez R, Menchaca-Mendez R, Pomares Hernández S, Pérez-Sansalvador JC, Lakouari N (2019) Approximation algorithms for the vertex K-center problem: Survey and experimental evaluation. IEEE Access 7(August):109228–109245.CrossrefGoogle Scholar
  • Garey M, Johnson D (1979) Computers and Intractability: A Guide to the Theory of NP-Completeness (W. H. Freeman, New York).Google Scholar
  • Gonzalez TF (1985) Clustering to minimize the maximum intercluster distance. Theoretical Comput. Sci. 38:293–306.CrossrefGoogle Scholar
  • Hansen P, Brimberg J, Urošević D, Mladenović N (2009) Solving large p-median clustering problems by primal–dual variable neighborhood search. Data Mining Knowledge Discovery 19(3):351–375.CrossrefGoogle Scholar
  • Hesabi ZR, Tari Z, Goscinski A, Fahad A, Khalil I, Queiroz C (2015) Data summarization techniques for big data—A survey. Handbook on Data Centers (Springer, New York), 1109–1152.CrossrefGoogle Scholar
  • Hochbaum DS, Shmoys DB (1985) A best possible heuristic for the K-Center problem. Math. Oper. Res. 10(2):180–184.LinkGoogle Scholar
  • Horst R, Tuy H (2013) Global Optimization: Deterministic Approaches (Springer Science & Business Media, Boston).Google Scholar
  • Hua K, Shi M, Cao Y (2021) A scalable deterministic global optimization algorithm for clustering problems. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 4391–4401.Google Scholar
  • Ilhan T, Pinar MC (2001) An efficient exact algorithm for the vertex p-center problem. Accessed November 18, 2022, http://www.ie.bilkent.edu.tr/mustafap/pubs.Google Scholar
  • Jia X, Sheth K, Svensson O (2022) Fair colorful k-center clustering. Math. Programming 192(July):339–360.CrossrefGoogle Scholar
  • Jiang J, Wen S, Yu S, Xiang Y, Zhou W (2015) K-center: An approach on the multi-source identification of information diffusion. IEEE Trans. Inform. Forensics Security 10(12):2616–2626.CrossrefGoogle Scholar
  • Kaufman L, Rousseeuw PJ (2009) Finding Groups in Data: An Introduction to Cluster Analysis (John Wiley & Sons, New York).Google Scholar
  • Kleindessner M, Awasthi P, Morgenstern J (2019) Fair k-center clustering for data summarization. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 3448–3457.Google Scholar
  • Li X, Zhao Z, Zhu X, Wyatt T (2011) Covering models and optimization techniques for emergency response facility location and planning: A review. Math. Methods Oper. Res. 74(3):281–310.CrossrefGoogle Scholar
  • Lim A, Rodrigues B, Wang F, Xu Z (2005) K-center problems with minimum coverage. Theoretical Comput. Sci. 332(1):1–17.CrossrefGoogle Scholar
  • Malkomes G, Kusner MJ, Chen W, Weinberger KQ, Moseley B (2015) Fast distributed k-center clustering with outliers on massive data. Proc. Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 1063–1071.Google Scholar
  • Mihelič J, Robic B (2005) Solving the k-center problem efficiently with a dominating set algorithm. J. Comput. Inform. Tech. 13(3):225–234.CrossrefGoogle Scholar
  • Minieka E (1970) The m-center problem. SIAM Rev. 12(01):138–139.CrossrefGoogle Scholar
  • Mladenović N, Labbé M, Hansen P (2003) Solving the p-center problem with Tabu search and variable neighborhood search. Networks 42(1):48–64.CrossrefGoogle Scholar
  • Padberg M, Rinaldi G (1991) A branch-and-cut algorithm for the resolution of large-scale symmetric traveling salesman problems. SIAM Rev. 33(1):60–100.CrossrefGoogle Scholar
  • Plesník J (1987) A heuristic for the p-center problems in graphs. Discrete Appl. Math. 17(3):263–268.CrossrefGoogle Scholar
  • Prokop B, Owsiński JW, Sep K, Sapiecha P (2016) Solving the k-centre problem as a method for supporting the Park and Ride facilities location decision. Proc. Federated Conf. Computer Sci. Inform. Systems (IEEE, Piscataway, NJ), 1223–1228.Google Scholar
  • Pullan W (2008) A memetic genetic algorithm for the vertex p-center problem. Evolutionary Comput. 16(3):417–436.CrossrefGoogle Scholar
  • Schneider T (2015) Analyzing 1.1 billion NYC taxi and uber trips with a vengeance. Accessed December 15, 2022, https://toddwschneider.com/posts/analyzing-1-1-billion-nyc-taxi-and-uber-trips-with-a-vengeance/.Google Scholar
  • Shi M, Hua K, Ren J, Cao Y (2022) Global optimization of k-center clustering. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 19956–19966.Google Scholar
  • Wang E, Ballachay R, Cai G, Cao Y, Trajano HL (2022) Predicting xylose yield in prehydrolysis of hardwoods: A machine learning approach. Frontiers Chemical Engrg. 4(October):994428.Google Scholar
  • Zhang C, Gao X, Li Y, Feng L (2019) Fault detection strategy based on weighted distance of k nearest neighbors for semiconductor manufacturing processes. IEEE Trans. Semiconductor Manufacturing 32(1):75–81. CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.