An Optimization-Based Order-and-Cut Approach for Fair Clustering of Data Sets

Published Online:https://doi.org/10.1287/ijds.2022.0005

References

  • Abbasi M, Bhaskara A, Venkatasubramanian S (2021) Fair clustering via equitable group representations. Proc. 2021 ACM Conf. Fairness Accountability Transparency (Association for Computing Machinery, New York), 504–514.Google Scholar
  • Abraham SS, P D, Sundaram SS (2019) Fairness in clustering with multiple sensitive attributes. Preprint, submitted October 11, https://arxiv.org/abs/1910.05113.Google Scholar
  • Ahmadian S, Epasto A, Kumar R, Mahdian M (2019) Clustering without over-representation. Proc. 25th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 267–275.Google Scholar
  • Ahmadian S, Epasto A, Kumar R, Mahdian M (2020a) Fair correlation clustering. Chiappa S, Calandra R, eds. Proc. 23rd Internat. Conf. Artificial Intelligence Statist., vol. 108 (PMLR, New York), 4195–4205.Google Scholar
  • Ahmadian S, Epasto A, Knittel M, Kumar R, Mahdian M, Moseley B, Pham P, Vassilvitskii S, Wang Y (2020b) Fair hierarchical clustering. Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H, eds. Adv. Neural Inform. Process. Systems, vol. 33 (Curran Associates, Inc., Red Hook, NY), 21050–21060.Google Scholar
  • Anderson N, Bera SK, Das S, Liu Y (2020) Distributional individual fairness in clustering. Preprint, submitted June 22, https://arxiv.org/abs/2006.12589.Google Scholar
  • Angwin J, Larson J, Mattu S, Kirchner L (2016) Machine bias. ProPublica Online (May 23), https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.Google Scholar
  • Aprahamian H, El-Amine H (2021) Optimal clustering of frequency data with application to disease risk categorization. IISE Trans. 54(8):728–740.Google Scholar
  • Aprahamian H, Bish DR, Bish EK (2019) Optimal risk-based group testing. Management Sci. 65(9):4365–4384.LinkGoogle Scholar
  • Aviran S, Onn S (2002) Momentopes, the complexity of vector partitioning, and Davenport–Schinzel sequences. Discrete Comput. Geometry 27(3):409–417.Google Scholar
  • Backurs A, Indyk P, Onak K, Schieber B, Vakilian A, Wagner T (2019) Scalable fair clustering. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learn., vol. 97 (PMLR, New York), 405–413.Google Scholar
  • Baharlouei S, Nouiehed M, Beirami A, Razaviyayn M (2019) Rényi fair inference. Preprint, submitted June 28, https://arxiv.org/abs/1906.12005.Google Scholar
  • Balas E, Padberg MW (1976) Set partitioning: A survey. SIAM Rev. 18(4):710–760.Google Scholar
  • Baron D (2019) Machine learning in astronomy: A practical overview. Preprint, submitted April 15, https://arxiv.org/abs/1904.07248.Google Scholar
  • Bera S, Chakrabarty D, Flores N, Negahbani M (2019) Fair algorithms for clustering. Wallach H, Larochelle H, Beygelzimer A, d’Alch é-Buc F, Fox E, Garnett R, eds. Adv. Neural Inform. Process. Systems, vol. 32 (Curran Associates, Inc., Red Hook, NY), 4955–4966.Google Scholar
  • Bercea IO, Groß M, Khuller S, Kumar A, Rösner C, Schmidt DR, Schmidt M (2018) On the cost of essentially fair clusterings. Preprint, submitted November 26, https://arxiv.org/abs/1811.10319.Google Scholar
  • Böhm M, Fazzone A, Leonardi S, Schwiegelshohn C (2020) Fair clustering with multiple colors. Preprint, submitted February 18, https://arxiv.org/abs/2002.07892.Google Scholar
  • Buolamwini J, Gebru T (2018) Gender shades: Intersectional accuracy disparities in commercial gender classification. Friedler SA, Wilson C, eds. Proc. 1st Conf. Fairness, Accountability Transparency, vol. 81 (PMLR, New York), 77–91.Google Scholar
  • Chakravarty AK, Orlin JB, Rothblum UG (1982) A partitioning problem with additive objective with an application to optimal inventory groupings for joint replenishment. Oper. Res. 30(5):1018–1022.LinkGoogle Scholar
  • Chakravarty AK, Orlin JB, Rothblum UG (1985) Consecutive optimizers for a partitioning problem with applications to optimal inventory groupings for joint replenishment. Oper. Res. 33(4):820–834.LinkGoogle Scholar
  • Chan LMA, Simchi-Levi D, Bramel J (1998) Worst-case analyses, linear programming and the bin-packing problem. Math. Programming 83(1):213–227.Google Scholar
  • Chhabra A, Mohapatra P (2022) Fair algorithms for hierarchical agglomerative clustering. Proc. 2022 21st IEEE Internat. Conf. Machine Learn. Appl. (IEEE, Piscataway, NJ), 206–211.Google Scholar
  • Chhabra A, Masalkovaitė K, Mohapatra P (2021) An overview of fairness in clustering. IEEE Access 9:130698–130720.Google Scholar
  • Chierichetti F, Kumar R, Lattanzi S, Vassilvitskii S (2017) Fair clustering through fairlets. Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Adv. Neural Inform. Process. Systems, vol. 30 (Curran Associates, Inc., Red Hook, NY), 5029–5037.Google Scholar
  • Coates A, Ng AY (2012) Learning feature representations with k-means. Montavon G, Orr GB, Müller KR, eds. Neural Networks: Tricks of the Trade, 2nd ed. (Springer, Berlin), 561–580.Google Scholar
  • Cook WJ, Cunningham WH, Pulleyblank WR, Schrijver A (1997) Optimal trees and paths, chapter 2. Combinatorial Optimization (John Wiley & Sons, Hoboken, NJ).Google Scholar
  • Dasgupta S (2008) The Hardness of k-Means Clustering (Department of Computer Science and Engineering, University of California, Berkeley, CA). Google Scholar
  • Datta A, Tschantz MC, Datta A (2015) Automated experiments on ad privacy settings: A tale of opacity, choice, and discrimination. Proc. Privacy Enhancing Tech. 2015(1):92–112.Google Scholar
  • El-Amine H, Aprahamian H (2021) A heuristic scheme for multivariate set partitioning problems with application to classifying heterogeneous populations for multiple binary attributes. IISE Trans. 54(6):537–549.Google Scholar
  • El-Amine H, Aprahamian H (2022) A heuristic scheme for multivariate set partitioning problems with application to classifying heterogeneous populations for multiple binary attributes. IISE Trans. 54(6):537–549.Google Scholar
  • Fisher WD (1958) On grouping for maximum homogeneity. J. Amer. Statist. Assoc. 53(284):789–798.Google Scholar
  • Gal S, Klots B (1995) Optimal partitioning which maximizes the sum of the weighted averages. Oper. Res. 43(3):500–508.LinkGoogle Scholar
  • Ghosal A, Nandy A, Das AK, Goswami S, Panday M (2020) A short review on different clustering techniques and their applications. Mandal JK, Bhattacharya D, eds. Emerging Technology in Modelling and Graphics (Springer, Singapore), 69–83.Google Scholar
  • Grari V, Ruf B, Lamprier S, Detyniecki M (2020) Fairness-aware neural Rényi minimization for continuous features. Bessiere C, ed. Proc. 29th Internat. Joint Conf. Artificial Intelligence (IJCAI-20), 2262–2268.Google Scholar
  • Handler GY, Zang I (1980) A dual algorithm for the constrained shortest path problem. Networks 10(4):293–309.Google Scholar
  • Hwang FK, Onn S, Rothblum UG (1999) A polynomial time algorithm for shaped partition problems. SIAM J. Optim. 10(1):70–81.Google Scholar
  • Hwang FK, Sun J, Yao E (1985) Optimal set partitioning. SIAM J. Algebraic Discrete Methods 6(1):163–170.Google Scholar
  • Jin K (2017) Optimal partitioning which maximizes the weighted sum of products. Xiao M, Rosamond F, eds. Frontiers in Algorithmics (Springer International Publishing, Cham, Switzerland), 127–138.Google Scholar
  • Kamishima T, Akaho S, Sakuma J (2011) Fairness-aware learning through regularization approach. 2011 IEEE 11th Internat. Conf. Data Mining Workshops (IEEE, Piscataway, NJ), 643–650.Google Scholar
  • Kleindessner M, Awasthi P, Morgenstern J (2019a) Fair k-center clustering for data summarization. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learn., vol. 97 (PMLR, New York), 3448–3457.Google Scholar
  • Kleindessner M, Samadi S, Awasthi P, Morgenstern J (2019b) Guarantees for spectral clustering with fairness constraints. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learn., vol. 97 (PMLR, New York), 3458–3467.Google Scholar
  • Kuo R, Ho L, Hu CM (2002) Integration of self-organizing feature map and k-means algorithm for market segmentation. Comput. Oper. Res. 29(11):1475–1493.Google Scholar
  • Lewis M, Kochenberger G, Alidaee B (2008) A new modeling and solution approach for the set-partitioning problem. Comput. Oper. Res. 35(3):807–813.Google Scholar
  • Li P, Zhao H, Liu H (2020) Deep fair clustering for visual learning. Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognition, 9070–9079.Google Scholar
  • Lin J, Aprahamian H, El-Amine H (2023) Optimal unlabeled set partitioning with application to risk-based quarantine policies. IISE Trans. Forthcoming.Google Scholar
  • Liu K, Tovar A, Nutwell E, Detwiler D (2015) Toward nonlinear multimaterial topology optimization using unsupervised machine learning and metamodel-based optimization. 41st Design Automation Conf. Internat. Design Engrg. Technical Conf. Comput. Inform. Engrg. Conf., vol. 2B (American Society of Mechanical Engineers, New York).Google Scholar
  • Lloyd S (1982) Least squares quantization in PCM. IEEE Trans. Inform. Theory 28(2):129–137.Google Scholar
  • Magzhan K, Jani HM (2013) A review and evaluations of shortest path algorithms. Internat. J. Sci. Tech. Res. 2(6):99–104.Google Scholar
  • Mary J, Calauzenes C, El Karoui N (2019) Fairness-aware learning for continuous attributes and treatments. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learn., vol. 97 (PMLR, New York), 4382–4391.Google Scholar
  • Mead A (1992) Review of the development of multidimensional scaling methods. Statistician 41(1):27–39.Google Scholar
  • Micha E, Shah N (2020) Proportionally fair clustering revisited. Czumaj A, Dawar A, Merelli E, eds. Proc. 47th Internat. Colloquium Automata Languages Programming, vol. 168, Leibniz International Proceedings in Informatics (Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany), 85:1–85:16.Google Scholar
  • Mitchell K, Grupac M, Zauskova A (2021) Ethical management and implementation of COVID-19 immunity passports and vaccination certificates: Lawfulness, fairness, and transparency. Linguistic Philos. Investigations 20:45–54.Google Scholar
  • Munguía-López AC, Ponce-Ortega JM (2021) Fair allocation of potential COVID-19 vaccines using an optimization-based strategy. Process Integration Optim. Sustainability 5(1):3–12.Google Scholar
  • Onn S, Schulman LJ (2001) The vector partition problem for convex objective functions. Math. Oper. Res. 26(3):583–590.LinkGoogle Scholar
  • Pérez-Suay A, Laparra V, Mateo-García G, Muñoz-Marí J, Gómez-Chova L, Camps-Valls G (2017) Fair kernel learning. Ceci M, Hollmén J, Todorovski L, Vens C, Džeroski S, eds. Machine Learning and Knowledge Discovery in Databases (Springer International Publishing, Cham, Switzerland), 339–355.Google Scholar
  • Rösner C, Schmidt M (2018) Privacy preserving clustering with constraints. Preprint, submitted February 7, https://arxiv.org/abs/1802.02497.Google Scholar
  • Sandrin R, Simpson R (2021) Public assessments of police during the COVID-19 pandemic: The effects of procedural justice and personal protective equipment. Policing 45(1):154–168.Google Scholar
  • Saxena A, Prasad M, Gupta A, Bharill N, Patel OP, Tiwari A, Er MJ, Ding W, Lin C (2017) A review of clustering techniques and developments. Neurocomputing 267:664–681.Google Scholar
  • Schmidt M, Schwiegelshohn C, Sohler C (2020) Fair coresets and streaming algorithms for fair k-means. Bampis E, Megow N, eds. Approximation and Online Algorithms (Springer International Publishing, Cham, Switzerland), 232–251.Google Scholar
  • Shaham U, Stanton K, Li H, Nadler B, Basri R, Kluger Y (2018) Spectralnet: Spectral clustering using deep neural networks. Preprint, submitted January 4, https://arxiv.org/abs/1801.01587.Google Scholar
  • Sweeney L (2013) Discrimination in online ad delivery. Queue 11(3):10–29.Google Scholar
  • Tang M, Marin D, Ayed IB, Boykov Y (2019) Kernel cuts: Kernel and spectral clustering meet regularization. Internat. J. Comput. Vision 127(5):477–511.Google Scholar
  • Tang R, Fong S, Yang X, Deb S (2012) Integrating nature-inspired optimization algorithms to k-means clustering. Seventh Internat. Conf. Digital Inform. Management (IEEE, Piscataway, NJ), 116–123.Google Scholar
  • Thejaswi S, Ordozgoiti B, Gionis A (2021) Diversity-aware k-median: Clustering with fair center representation. Oliver N, Pérez-Cruz F, Kramer S, Read J, Lozano JA, eds. Machine Learning and Knowledge Discovery in Databases. Research Track (Springer International Publishing, Cham, Switzerland), 765–780.Google Scholar
  • Von Luxburg U, Williamson RC, Guyon I (2012) Clustering: Science or art? Guyon I, Dror G, Lemaire V, Taylor G, Silver D, eds. Proc. ICML Workshop Unsupervised Transfer Learn., vol. 27 (PMLR, New York), 65–79.Google Scholar
  • Witsenhausen HS (1975) On sequences of pairs of dependent random variables. SIAM J. Appl. Math. 28(1):100–113.Google Scholar
  • Zafar MB, Valera I, Rodriguez MG, Gummadi KP (2017) Fairness constraints: Mechanisms for fair classification. Singh A, Zhu J, eds. Proc. 20th Internat. Conf. Artificial Intelligence Statist., vol. 54 (PMLR, New York), 962–970.Google Scholar
  • Ziegelmann M (2007) Constrained Shortest Paths and Related Problems: Constrained Network Optimization (Verlag Dr. Muller, Saarbrucken, Germany).Google Scholar
  • Ziko IM, Yuan J, Granger E, Ben Ayed I (2021) Variational fair clustering. Proc. AAAI Conf. Artificial Intelligence 35(12):11202–11209.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.