Mixed Integer Linear Optimization Formulations for Learning Optimal Binary Classification Trees
Published Online:12 Mar 2026https://doi.org/10.1287/ijoc.2023.0068
References
- (2019) Learning optimal and fair decision trees for non-discriminative decision-making. Proc. 33rd AAAI Conf. Artificial Intelligence, 31st Innovative Appl. Artificial Intelligence Conf., 9th AAAI Sympos. Ed. Adv. Artificial Intelligence, vol. 33 (AAAI Press, Palo Alto, CA), 1418–1426.Google Scholar
- (2025) Strong optimal classification trees. Oper. Res. 73(4):2223–2241.Link, Google Scholar
- (2020) Learning optimal decision trees using caching branch-and-bound search. Proc. AAAI Conf. Artificial Intelligence 34(04):3146–3153.Crossref, Google Scholar
- (2024) Bridging the gap between operations research and machine learning with decision trees and neural nets. PhD thesis, Rice University, Houston, TX.Google Scholar
- (2026) Mixed integer linear optimization formulations for learning optimal binary classification trees. https://doi.org/10.1287/ijoc.2023.0068.cd, https://github.com/INFORMSJoC/2023.0068.Google Scholar
- (2017) Learning certifiably optimal rule lists. Proc. 23rd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 35–44.Google Scholar
- (2017) Neural decision trees. Preprint, submitted March 6, https://arxiv.org/abs/1702.07360.Google Scholar
- (2000) Combining binary decision tree and geostatistical methods to estimate snow distribution in a mountain watershed. Water Resources Res. 36(1):13–26.Crossref, Google Scholar
- (2024) Fair tree learning. Machine Learn. 113(5):3305–3324.Google Scholar
- (1995) Global tree optimization: A non-greedy decision tree algorithm. Comput. Sci. Statist. 26(1):156–156.Google Scholar
- (1996) Optimal decision trees. Rensselaer Polytechnic Inst. Math Rep. 214(24):1–28.Google Scholar
- (1993) Bilinear separation of two sets inn-space. Comput. Optim. Appl. 2(3):207–227.Crossref, Google Scholar
- (2017) Optimal classification trees. Machine Learn. 106(7):1039–1082.Crossref, Google Scholar
- (2007) Classification and regression via integer optimization. Oper. Res. 55(2):252–271.Link, Google Scholar
- (2019) Optimal prescriptive trees. INFORMS J. Optim. 1(2):164–183.Link, Google Scholar
- (2012) A brief history of linear and mixed-integer programming computation. Documenta Math. 107–121.Google Scholar
- (1999) MIP: Theory and practice—Closing the gap. System Modelling and Optimization (Springer, Berlin), 19–49.Google Scholar
- (2021) Optimal randomized classification trees. Computers Oper. Res. 132(C):105281.Crossref, Google Scholar
- (1992) A training algorithm for optimal margin classifiers. Proc. 5th Annual Workshop Comput. Learn. Theory (Association for Computing Machinery, New York), 144–152.Crossref, Google Scholar
- (2022) Shattering inequalities for learning optimal decision trees. Integration of Constraint Programming, Artificial Intelligence, and Operations Research (Springer, Berlin), 74–90.Crossref, Google Scholar
- (1984) Classification and Regression Trees (CRC Press, Boca Raton, FL).Google Scholar
- (1995) Multivariate decision trees. Machine Learn. 19:45–77.Crossref, Google Scholar
- (2001) Combinatorial Optimization Packing and Covering, CBMS-NSF Regional Conference Series in Applied Mathematics (Society for Industrial and Applied Mathematics, Philadelphia).Google Scholar
- (2007) Revival of the Gomory cuts in the 1990’s. Ann. Oper. Res. 149(1):63–66.Crossref, Google Scholar
- (2008) Valid inequalities for mixed integer linear programs. Math. Programming 112(1):3–44.Crossref, Google Scholar
- (2018) Boolean decision rules via column generation. Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 31 (Curran Associates Inc., Red Hook, NY), 4660–4670.Google Scholar
- (2013) Measuring firm performance using financial ratios: A decision tree approach. Expert Systems Appl. 40(10):3970–3983.Crossref, Google Scholar
- (2022) MurTree: Optimal decision trees via dynamic programming and search. J. Machine Learn. Res. 23(26):1–47.Google Scholar
- (2018) Avoiding numerical issues in optimization models. Accessed February 1, 2025, https://www.gurobi.com/events/avoiding-numerical-issues-in-optimization-models/.Google Scholar
- (2020) Column generation based heuristic for learning classification trees. Computers Oper. Res. 116(1):104866.Crossref, Google Scholar
- (2017) Thinning out Steiner trees: A node-based model for uniform edge costs. Math. Programming Comput. 9(2):203–229.Crossref, Google Scholar
- (1962) Flows in Networks (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
- (2020) Artificial intelligence and algorithmic bias: Source, detection, mitigation, and implications. INFORMS Tutorials in Operations Research, 39–63.Google Scholar
- (2021) Optimal decision trees for categorical data via integer programming. J. Global Optim. 81(1):233–260.Crossref, Google Scholar
- (2018) Machine learning and data mining with combinatorial optimization algorithms. INFORMS Tutorials in Operations Research, 109–129.Google Scholar
- (2003) Logic-based benders decomposition. Math. Programming 96(1):33–60.Crossref, Google Scholar
- (1976) Constructing optimal binary decision trees is NP-complete. Inform. Processing Lett. 5(1):15–17.Crossref, Google Scholar
- (2013) Efficient symmetry breaking formulations for the job grouping problem. Comput. Oper. Res. 40(4):1132–1142.Crossref, Google Scholar
- (2022) Learning optimal fair classification trees. Proc. AAAI/ACM Conf. AI, Ethics, Society, AIES ’23 (ACM, New York), 181–192.Google Scholar
- (2011) Minimum s−t cut in undirected planar graphs when the source and the sink are close. Proc. Sympos. Theoretical Aspects Comput. Sci., vol. 9 (Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl, Germany), 117–128.Google Scholar
- (2015) Deep neural decision forests. Proc. IEEE Internat. Conf. Comput. Vision (AAAI Press, Palo Alto, CA), 1467–1475.Google Scholar
- (2013) Fuzzy binary decision tree for biometric based personal authentication. Neurocomputing 99:87–97.Crossref, Google Scholar
- (1960) An automatic method of solving discrete programming problems. Econometrica 28(3):497–520.Crossref, Google Scholar
- (2020) Oblique decision trees from derivatives of ReLu networks. Preprint, submitted May 3, https://doi.org/10.48550/arXiv.1909.13488.Google Scholar
- (1987) Factoring integers with elliptic curves. Ann. Math. 126(3):649–673.Crossref, Google Scholar
- (2021) Efficient management strategy of COVID-19 patients based on cluster analysis and clinical decision tree classification. Sci. Rep. 11(1):9626.Crossref, Google Scholar
- (2020) Generalized and scalable optimal sparse decision trees. Proc. 37th Internat. Conf. Machine Learn. (JMLR, New York), 6150–6160.Google Scholar
- (1964) Decision Trees for Decision Making (Harvard Business Review, Cambridge, MA).Google Scholar
- (2019) Deep learning in computer vision: Methods, interpretation, causation, and fairness. INFORMS Tutorials in Operations Research, 73–100.Google Scholar
- (2010) Symmetry in Integer Linear Programming (Springer, Berlin), 647–686.Crossref, Google Scholar
- (2022) Fast sparse decision tree optimization via reference ensembles. Proc. AAAI Conf. Artificial Intelligence 36(9):9604–9613.Crossref, Google Scholar
- (2011) On oblique random forests. Machine Learning and Knowledge Discovery in Databases (Springer, Berlin), 453–469.Crossref, Google Scholar
- (1998) Automatic construction of decision trees from data: A multi-disciplinary survey. Data Mining Knowledge Discovery 2(4):345–389.Crossref, Google Scholar
- (1994) A system for induction of decision trees. J. Artificial Intelligence Res. 2(1):1–32.Crossref, Google Scholar
- (2018) Learning optimal decision trees with sat. Proc. 27th Internat. Joint Conf. Artificial Intelligence (AAAI Press, Palo Alto, CA), 1362–1368.Crossref, Google Scholar
- (2003) Multivariate classification trees based on minimum features discrete support vector machines. IMA J. Management Math. 14(3):221–234.Google Scholar
- (2024) An improved column-generation-based matheuristic for learning classification trees. Comput. Oper. Res. 165(C):106579.Crossref, Google Scholar
- (2021) Measuring financial performance of Indian manufacturing firms: Application of decision tree algorithms. Measuring Bus. Excellence 26(3):288–307.Google Scholar
- (2021) How fair can we go in machine learning? Assessing the boundaries of accuracy and fairness. Internat. J. Intelligent Systems 36(4):1619–1643.Crossref, Google Scholar
- (2019) Learning optimal classification trees using a binary linear program formulation. Proc. AAAI Conf. Artificial Intelligence, vol. 33 (AAAI Press, Palo Alto, CA), 1625–1632.Crossref, Google Scholar
- (2015) Trading interpretability for accuracy: Oblique treed sparse additive models. Proc. 21th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining. (ACM, New York), 1245–1254.Google Scholar
- (2016) HHCART: An oblique decision tree. Comput. Statist. Data Anal. (Oxford) 96(C):12–23.Crossref, Google Scholar
- (2018) Deep neural decision trees. Preprint, submitted June 19, https://arxiv.org/abs/1806.06988.Google Scholar
- (2020) Learning binary decision trees by argmin differentiation. Meila M, Zhang T, eds. Proc. 38th Internat. Conf. Machine Learn., vol. 139 (PMLR, New York), 12298–12309.Google Scholar
- (2019) FAHT: An adaptive fairness-aware decision tree classifier. Kraus S, ed. Proc. 28th Internat. Joint Conf. Artificial Intelligence (AAAI Press, Palo Alto, CA), 1480–1486.Google Scholar

