Multiobjective Linear Ensembles for Robust and Sparse Training of Few-Bit Neural Networks
References
- (2020) Strong mixed-integer programming formulations for trained neural networks. Math. Programming 183(1):3–39.
- (2018) Scalable methods for 8-bit training of neural networks. Adv. Neural Inform. Processing Systems 31:5145–5153.
- (2021) Machine learning for combinatorial optimization: A methodological tour d’horizon. Eur. J. Oper. Res. 290(2):405–421.
- (2022) Janos: An integrated predictive and prescriptive modeling framework. INFORMS J. Comput. 34(2):807–816.
- (2023) The BeMi stardust: A structured ensemble of binarized neural networks. Internat. Conf. Learn. Intelligent Optimization (LION) (Springer, Cham, Switzerland), 443–458.
- (2024) Multi-objective linear ensembles for robust and sparse training of few-bit neural networks. http://dx.doi.org/10.1287/ijoc.2023.0281.cd, https://github.com/INFORMSJoC/2023.0281.
- (2006) Pattern Recognition and Machine Learning (Springer, New York).
- (2020) What is the state of neural network pruning? Proc. Machine Learn. Systems, 129–146.
- (2019) Qutibench: Benchmarking neural networks on heterogeneous hardware. ACM J. Emerging Tech. Comput. Systems 15(4):1–38.
- (2020) Efficient verification of ReLu-based neural networks via dependency analysis. Proc. Conf. AAAI Artificial Intelligence 34(4):3291–3299.
- (2022) Image classification with small datasets: Overview and benchmark. IEEE Access 10(2022):49233–49250.
- (2023) Getting away with more network pruning: From sparsity to geometry and linear regions. Internat. Conf. Integration Constraint Programming Artificial Intelligence Oper. Res. (CPAIOR) (Springer, Cham, Switzerland), 200–218.
- (2023) Combinatorial optimization and reasoning with graph neural networks. J. Machine Learn. Res. 24(130):1–61.
- (2023) OAMIP: Optimizing ANN architectures using mixed-integer programming. Internat. Conf. Integration Constraint Programming Artificial Intelligence Oper. Res. (CPAIOR) (Springer, Berlin, Heidelberg), 219–237.
- (2018) Deep neural networks and mixed integer linear optimization. Constraints 23(3):296–309.
- (2022) A survey of quantization methods for efficient neural network inference. Thiruvathukal GK, Lu Y-H, Kim J, Chen Y, Chen B, eds. Low-Power Computer Vision (Chapman and Hall/CRC, New York), 291–326.
- (2022) Recall distortion in neural network pruning and the undecayed pruning algorithm. Adv. Neural Inform. Processing Systems 35:32762–32776.
- Gurobi Optimization LLC (2023) Gurobi Optimizer reference manual. Accessed August 9, 2023, https://www.gurobi.com.
- (2015) Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. Preprint, submitted October 1, https://arxiv.org/abs/1510.00149.
- (2023) Searching large neighborhoods for integer linear programs with contrastive learning. Proc. 40th Internat. Conf. Machine Learn., vol. 202 (PMLR, New York), 13869–13890.
- (2016) Binarized neural networks. Adv. Neural Inform. Processing Systems 29:4107–4115.
- (2023) When deep learning meets polyhedral theory: A survey. Preprint, submitted April 29, https://arxiv.org/abs/2305.00241.
- (1988) Heart disease data set. Accessed August 9, 2023, http://archive.ics.uci.edu/ml/datasets/Heart+Disease.
- (2019) Predicting the generalization gap in deep networks with margin distributions. Internat. Conf. Learn. Representations (ICLR) (OpenReview.net).
- (2017) Generalization in deep learning. Preprint, submitted October 16, https://arxiv.org/abs/1710.05468.
- (2017) On large-batch training for deep learning: Generalization gap and sharp minima. Internat. Conf. Learn. Representations (ICLR), vol. 5 (OpenReview.net).
- (2019) Combinatorial attacks on binarized neural networks. Internat. Conf. Learn. Representations (ICLR) (OpenReview.net).
- (2021) Efficient and robust mixed-integer optimization methods for training binarized deep neural networks. Preprint, submitted October 21, https://arxiv.org/abs/2110.11382.
- (2015) Deep learning. Nature 521(7553):436–444.
- (1998) The MNIST database of handwritten digits. Accessed August 9, 2023, http://yann.lecun.com/exdb/mnist.
- (2020) Simple and fast algorithm for binary integer and online linear programming. Adv. Neural Inform. Processing Systems 33:9412–9421.
- (2017) Toward accurate binary convolutional neural network. Adv. Neural Inform. Processing Systems 30:345–353.
- (2018) Boosting combinatorial problem modeling with machine learning. Preprint, submitted July 15, https://arxiv.org/abs/1807.05517.
- (2021) Mixed-integer convex nonlinear optimization with gradient-boosted trees embedded. INFORMS J. Comput. 33(3):1103–1119.
- (1991) The effective number of parameters: An analysis of generalization and regularization in nonlinear learning systems. Adv. Neural Inform. Processing Systems 4:847–854.
- (2019) One ticket to win them all: Generalizing lottery ticket initializations across datasets and optimizers. Adv. Neural Inform. Processing Systems 32:4932–4942.
- (2017) Exploring generalization in deep learning. Adv. Neural Inform. Processing Systems 30:5947–5956.
- (2022) A mixed-integer programming approach to training dense neural networks. Preprint, submitted January 3, https://arxiv.org/abs/2201.00723.
- (2022) Training thinner and deeper neural networks: Jumpstart regularization. Schaus P, ed. Integration Constraint Programming Artificial Intelligence Oper. Res. CPAIOR 2022, Lecture Notes in Computer Science, vol. 13292 (Springer, Cham, Switzerland), 345–357.
- (2018) True gradient-based training of deep binary activated neural networks via continuous binarization. 2018 IEEE Internat. Conf. Acoustics Speech Signal Processing (ICASSP) (IEEE, Piscataway, NJ), 2346–2350.
- (2020) Lossless compression of deep neural networks. Hebrard E, Musliu N, eds. Integration Constraint Programming Artificial Intelligence Oper. Res. CPAIOR 2020, Lecture Notes in Computer Science, vol. 12296 (Springer, Cham, Switzerland), 417–430.
- (2021) Scaling up exact neural network compression by ReLU stability. Adv. Neural Inform. Processing Systems 34:27081–27093.
- (2017) How to train a compact binary neural network with high accuracy? Thirty-First AAAI Conf. Artificial Intelligence (AAAI Press, Palo Alto, CA), 2625–2631.
- (2023) Optimal training of integer-valued neural networks with mixed integer programming. PLoS One 18(2):e0261029.
- (2020) The convex relaxation barrier, revisited: Tightened single-neuron relaxations for neural network verification. Adv. Neural Inform. Processing Systems 33:21675–21686.
- (2018) Evaluating robustness of neural networks with mixed integer programming. Internat. Conf. Learn. Representations (OpenReview.net).
- (2019) Training binarized neural networks using MIP and CP. Internat. Conf. Principles Practice Constraint Programming, vol. 11802 (Springer, Cham, Switzerland), 401–417.
- (2021) Partition-based formulations for mixed-integer optimization of trained ReLU neural networks. Adv. Neural Inform. Processing Systems 34:3068–3080.
- (2019) Meta-learning. Hutter F, Kotthoff L, Vanschoren J, eds. Automated Machine Learning, Springer Series on Challenges in Machine Learning (Springer, Cham, Switzerland), 35–61.
- (2023) Optimizing over an ensemble of trained neural networks. INFORMS J. Comput. 35(3):652–674.
- (2013) Model Building in Mathematical Programming (John Wiley & Sons, Chichester, UK).
- (2017) Fashion-mnist: A novel image data set for benchmarking machine learning algorithms. Preprint, submitted August 25, https://arxiv.org/abs/1708.07747.
- (2023) GNN&GBDT-guided fast optimizing framework for large-scale integer programming. Proc. 40th Internat. Conf. Machine Learn., vol. 202 (PMLR, New York), 39864–39878.
- (1988) Condorcet’s theory of voting. Amer. Political Sci. Rev. 82(4):1231–1244.
- (2022) The combinatorial brain surgeon: Pruning weights that cancel one another in neural networks. Internat. Conf. Machine Learn. (ICML) (PMLR, New York), 25668–25683.

