Abrahamsen M, Kleist L, Miltzow T (2021) Training neural networks is ∃ℝ-complete. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. NIPS’21: Proc. 35th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 18293–18306.
Agostinelli F, Hoffman M, Sadowski P, Baldi P (2015) Learning activation functions to improve deep neural networks. Preprint, submitted April 21, https://arxiv.org/abs/1412.6830.
Alcántara A, Ruiz C (2023) A neural network-based distributional constraint learning methodology for mixed-integer stochastic optimization. Expert Systems Appl. 232:120895.
Alcántara A, Ruiz C, Tsay C (2025) A quantile neural network framework for two-stage stochastic optimization. Expert Systems Appl. 284:127876.
Amrami A, Goldberg Y (2021) A simple geometric proof for the benefit of depth in ReLU networks. Preprint, submitted January 18, https://arxiv.org/abs/2101.07126.
Anderson R, Huchette J, Tjandraatmadja C, Vielma J (2019) Strong mixed-integer programming formulations for trained neural networks. Lodi A, Nagarajan V, eds. Integer Programming Combin. Optim. IPCO 2019, Lecture Notes in Computer Science, vol. 11480 (Springer, Cham, Switzerland), 27–42.
Anderson R, Huchette J, Ma W, Tjandraatmadja C, Vielma JP (2020) Strong mixed-integer programming formulations for trained neural networks. Math. Programming 183(1–2):3–39.
Anil C, Lucas J, Grosse R (2019) Sorting out Lipschitz function approximation. Chaudhuri K, Salakhutdinov R, eds. Internat. Conf. Machine Learn. (ICML), vol. 97 (PMLR, New York), 291–301.
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. Precup D, Teh YW, eds. Internat. Conf. Machine Learn. (ICML), vol. 70 (JMLR), 214–223.
Arora R, Basu A, Mianjy P, Mukherjee A (2018) Understanding deep neural networks with rectified linear units. Internat. Conf. Learn. Representations (ICLR).
Averkov G, Hojny C, Merkert M (2025) On the expressiveness of rational ReLU neural networks with bounded depth. Internat. Conf. Learn. Representations (ICLR).
Aziznejad S, Gupta H, Campos J, Unser M (2020) Deep neural networks with trainable activations and controlled Lipschitz constant. IEEE Trans. Signal Processing 68:4688–4699.
Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. Internat. Conf. Learn. Representations (ICLR).
Bakaev E, Brunck F, Hertrich C, Reichman D, Yehudayoff A (2025a) On the depth of monotone ReLU neural networks and ICNNs. Preprint, submitted May 9, https://arxiv.org/abs/2505.06169.
Bakaev E, Brunck F, Hertrich C, Stade J, Yehudayoff A (2025b) Better neural network expressivity: Subdividing the simplex. Preprint, submitted May 20, https://arxiv.org/abs/2505.14338.
Balas E (1998) Disjunctive programming: Properties of the convex hull of feasible points. Discrete Appl. Math. 89(1–3):3–44.
Balas E (2018) Disjunctive Programming (Springer, Cham, Switzerland).
Balas E, Ceria S, Cornuéjols G (1993) A lift-and-project cutting plane algorithm for mixed 0–1 programs. Math. Programming 58(1–3):295–324.
Balas E, Ceria S, Cornuéjols G (1996) Mixed 0-1 programming by lift-and-project in a branch-and-cut framework. Management Sci. 42(9):1229–1246.
Balestriero R, Baraniuk RG (2018) A spline theory of deep networks. Internat. Conf. Machine Learn. (ICML) (PMLR, New York).
Balunović M, Vechev M (2020) Adversarial training and provable defenses: Bridging the gap. Internat. Conf. Learn. Representations (ICLR).
Batten B, Kouvaros P, Lomuscio A, Zheng Y (2021) Efficient neural network verification via layer-based semidefinite relaxations and linear cuts. Zhou Z, ed. Proc. 30th Internat. Joint Conf. Artificial Intelligence (IJCAI), 2184–2190.
Bengio Y (2009) Learning deep architectures for AI. Foundations Trends Machine Learn. 2(1):1–127.
Bengio Y, Lodi A, Prouvost A (2021) Machine learning for combinatorial optimization: A methodological tour d’horizon. Eur. J. Oper. Res. 290(2):405–421.
Bennett KP (1992) Decision tree construction via linear programming. Technical report, University of Wisconsin-Madison Department of Computer Sciences, Madison.
Bennett KP, Mangasarian OL (1990) Neural network training via linear programming. Computer Sciences Technical Report 1067, University of Wisconsin–Madison Department of Computer Sciences, Madison.
Bennett KP, Mangasarian OL (1992) Robust linear programming discrimination of two linearly inseparable sets. Optim. Methods Software 1(1):23–34.
Benussi E, Patane A, Wicker M, Laurenti L, Kwiatkowska M (2022) Individual fairness guarantees for neural networks. De Raedt L, ed. Proc. 31st Internat. Joint Conf. Artificial Intelligence (IJCAI), 651–658.
Bergman D, Huang T, Brooks P, Lodi A, Raghunathan AU (2022) JANOS: An integrated predictive and prescriptive modeling framework. INFORMS J. Comput. 34(2):807–816.
Bernardelli AM, Gualandi S, Lau HC, Milanesi S (2023) The BeMi stardust: A structured ensemble of binarized neural networks. Sellmann M, Tierney K, eds. Learn. Intelligent Optim. LION 2023, Lecture Notes in Computer Science, vol. 14286 (Springer, Cham, Switzerland), 443–458.
Berner C, Brockman G, Chan B, Cheung V, Dȩbiak P, Dennison C, Farhi D, et al. (2019) Dota 2 with large scale deep reinforcement learning. Preprint, submitted December 13, https://arxiv.org/abs/1912.06680.
Berrada L, Zisserman A, Mudigonda P (2019) Deep Frank-Wolfe for neural network optimization. 7th Internat. Conf. Learn. Representations (ICLR 2019).
Bertschinger D, Hertrich C, Jungeblut P, Miltzow T, Weber S (2024) Training fully connected neural networks is ∃ℝ-complete. Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S, eds. NIPS’23: Proc. 37th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 36222–36237.
Bhosekar A, Ierapetritou M (2018) Advances in surrogate based modeling, feasibility analysis, and optimization: A review. Comput. Chemical Engrg. 108:250–267.
Bianchini M, Scarselli F (2014) On the complexity of neural network classifiers: A comparison between shallow and deep architectures. IEEE Trans. Neural Networks Learn. Systems 25(8):1553–1565.
Biau G, Sangnier M, Tanielian U (2021) Some theoretical insights into Wasserstein GANs. J. Machine Learn. Res. 22(1):5287–5331.
Bienstock D, Muñoz G (2018) LP formulations for polynomial optimization problems. SIAM J. Optim. 28(2):1121–1150.
Bienstock D, Muñoz G, Pokutta S (2023) Principled deep neural network training through linear programming. Discrete Optim. 49:100795.
Blum AL, Rivest RL (1992) Training a 3-node neural network is NP-complete. Neural Networks 5(1):117–127.
Bohra P, Campos J, Gupta H, Aziznejad S, Unser M (2020) Learning activation functions in deep (spline) neural networks. IEEE Open J. Signal Processing 1:295–309.
Bonami P, Lodi A, Tramontani A, Wiese S (2015) On mathematical programming with indicator constraints. Math. Programming 151:191–223.
Boob D, Dey SS, Lan G (2022) Complexity of training ReLU neural network. Discrete Optim. 44:100620.
Botoeva E, Kouvaros P, Kronqvist J, Lomuscio A, Misener R (2020) Efficient verification of ReLU-based neural networks via dependency analysis. Proc. AAAI Conf. Artificial Intelligence 34(4):3291–3299.
Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60(2):223–311.
Brandenburg MC, Grillo ML, Hertrich C (2025) Decomposition polyhedra of piecewise linear functions. Internat. Conf. Learn. Representations (ICLR).
Bridle JS (1990) Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. Soulié FF, Hérault J, eds. Neurocomput., NATO ASI Series, vol. 68 (Springer, Berlin, Heidelberg), 227–236.
Bubeck S (2015) Convex optimization: Algorithms and complexity. Foundations Trends Machine Learn. 8(3–4):231–357.
Bunel RR, Hinder O, Bhojanapalli S, Dvijotham K (2020c) An efficient nonconvex reformulation of stagewise convex optimization problems. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 8247–8258.
Bunel RR, Turkaslan I, Torr P, Kohli P, Mudigonda PK (2018) A unified view of piecewise linear neural network verification. Bengio S, Beygelzimer A, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, eds. NIPS’18: Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 4795–4804.
Bunel R, Mudigonda P, Turkaslan I, Torr P, Lu J, Kohli P (2020b) Branch and bound for piecewise linear neural network verification. J. Machine Learn. Res. 21(1):1574–1612.
Bunel R, De Palma A, Desmaison A, Dvijotham K, Kohli P, Torr P, Pawan Kumar M (2020a) Lagrangian decomposition for neural network verification. Peters J, Sontag D, eds. Proc. 36th Conf. Uncertainty Artificial Intelligence (UAI), vol. 124 (PMLR, New York), 370–379.
Burer S, Monteiro RD (2003) A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Math. Programming 95(2):329–357.
Burtea R, Tsay C (2024) Constrained continuous-action reinforcement learning for supply chain inventory management. Comput. Chemical Engrg. 181:108518.
Cai J, Nguyen KN, Shrestha N, Good A, Tu R, Yu X, Zhe S, Serra T (2023) Getting away with more network pruning: From sparsity to geometry and linear regions. Cire AA, ed. Integration Constraint Programming Artificial Intelligence Oper. Res. CPAIOR 2023, Lecture Notes in Computer Science, vol. 13884 (Springer, Cham, Switzerland), 200–218.
Carrasco P, Muñoz G (2024) Tightening convex relaxations of trained neural networks: A unified approach for convex and s-shaped activations. Preprint, submitted October 30, https://arxiv.org/abs/2410.23362.
Ceccon F, Jalving J, Haddad J, Thebelt A, Tsay C, Laird CD, Misener R (2022) OMLT: Optimization & machine learning toolkit. J. Machine Learn. Res. 23(1):15829–15836.
Charisopoulos V, Maragos P (2018) A tropical approach to neural networks with piecewise linear activations. Preprint, submitted May 22, https://arxiv.org/abs/1805.08749.
Chaudhry A, Khan N, Dokania P, Torr P (2020) Continual learning in low-rank orthogonal subspaces. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 9900–9911.
Chen KL, Garudadri H, Rao BD (2022a) Improved bounds on neural complexity for representing piecewise linear functions. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. NIPS’22: Proc. 36th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 7167–7180.
Chen W, Gong X, Wang Z (2021) Neural architecture search on ImageNet in four GPU hours: A theoretically inspired perspective. Internat. Conf. Learn. Representations (ICLR).
Chen S, Klivans AR, Meka R (2022b) Learning deep ReLU networks is fixed-parameter tractable. 2021 IEEE 62nd Annual Sympos. Foundations Comput. Sci. (FOCS) (IEEE, Piscataway, NJ), 696–707.
Chen H, Wang YG, Xiong H (2023a) Lower and upper bounds for numbers of linear regions of graph convolutional networks. Neural Networks 168:394–404.
Chen T, Lasserre JB, Magron V, Pauwels E (2020) Semialgebraic optimization for Lipschitz constants of ReLU networks. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 19189–19200.
Chen W, Gong X, Wu J, Wei Y, Shi H, Yan Z, Yang Y, Wang Z (2023b) Understanding and accelerating neural architecture search with training-free and theory-grounded metrics. IEEE Trans. Pattern Anal. Machine Intelligence 46(2):749–763.
Cheng C, Nührenberg G, Ruess H (2017) Maximum resilience of artificial neural networks. D’Souza D, Narayan Kumar K, eds. Automated Tech. Verification Anal. ATVA 2017, Lecture Notes in Computer Science, vol. 10482 (Springer, Cham, Switzerland), 251–268.
Cheng CH, Nührenberg G, Huang CH, Ruess H (2018) Verification of binarized neural networks via inter-neuron factoring: (short paper). Piskac R, Rümmer P, eds. Verified Software. Theories Tools Experiments. VSTTE 2018, Lecture Notes in Computer Science, vol. 11294 (Springer, Cham, Switzerland), 279–290.
Cheon MS (2022) An outer-approximation guided optimization approach for constrained neural network inverse problems. Math. Programming 196(1–2):173–202.
Chu L, Hu X, Hu J, Wang L, Pei J (2018) Exact and consistent interpretation for piecewise linear neural networks: A closed form solution. KDD’18: Proc. ACM SIGKDD Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 1244–1253.
Ciresan D, Meier U, Masci J, Schmidhuber J (2012) Multi column deep neural network for traffic sign classification. Neural Networks 32:333–338.
Cisse M, Bojanowski P, Grave E, Dauphin Y, Usunier N (2017) Parseval networks: Improving robustness to adversarial examples. Precup D, Teh YW, eds. ICML’17: Proc. 34th Internat. Conf. Machine Learn., vol. 70 (JMLR), 854–863.
Cohan S, Kim NH, Rolnick D, van de Panne M (2022) Understanding the evolution of linear regions in deep reinforcement learning. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. NIPS’22: Proc. 36th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 10891–10903.
Collobert R (2004) Large scale machine learning. PhD thesis, University of Paris, Paris.
Combettes PL, Pesquet JC (2020) Lipschitz certificates for layered network structures driven by averaged activation operators. SIAM J. Math. Data Sci. 2(2):529–557.
Courbariaux M, Bengio Y, David JP (2015) BinaryConnect: Training deep neural networks with binary weights during propagations. Cortes C, Lee DD, Sugiyama M, Garnett R, eds. NIPS’15: Proc. 29th Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 3123–3131.
Craighero F, Angaroni F, Graudenzi A, Stella F, Antoniotti M (2020a) Investigating the compositional structure of deep neural networks. Nicosia G, ed. Machine Learn. Optim. Data Sci. LOD 2020, Lecture Notes in Computer Science, vol. 12565 (Springer, Cham, Switzerland), 322–334.
Craighero F, Angaroni F, Graudenzi A, Stella F, Antoniotti M (2020b) Understanding deep learning with activation pattern diagrams. CEUR Workshop Proc. 2742:119–126.
Croce F, Hein M (2018) A randomized gradient-free attack on ReLU networks. Brox T, Bruhn A, Fritz M, eds. Pattern Recognition. GCPR 2018, Lecture Notes in Computer Science, vol. 11269 (Springer, Cham, Switzerland), 215–227.
Croce F, Andriushchenko M, Hein M (2019) Provable robustness of ReLU networks via maximization of linear regions. Chaudhuri K, Sugiyama M, eds. Proc. 22nd Internat. Conf. Artificial Intelligence Statist., vol. 89 (PMLR, New York), 2057–2066.
Croce F, Rauber J, Hein M (2020) Scaling up the randomized gradient-free adversarial attack reveals overestimation of robustness using established attacks. Internat. J. Comput. Vision 128:1028–1046.
Croxton KL, Gendron B, Magnanti TL (2003) A comparison of mixed-integer programming models for nonconvex piecewise linear cost minimization problems. Management Sci. 49(9):1268–1273.
Curtis FE, Scheinberg K (2017) Optimization methods for supervised machine learning: From linear models to deep learning. INFORMS TutORials in Operations Research (INFORMS, Catonsville, MD), 89–114.
Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math. Control Signals Systems 2:303–314.
Danna E, Fenelon M, Gu Z, Wunderling R (2007) Generating multiple solutions for mixed integer programming problems. Fischetti M, Williamson DP, eds. Integer Programming Combin. Optim. IPCO 2007, Lecture Notes in Computer Science, vol. 4513 (Springer, Cham, Switzerland), 280–294.
Dantzig GB (1960) On the significance of solving linear programming problems with some integer variables. Econometrica 28(1):30–44.
Dantzig GB, Eaves BC (1973) Fourier-Motzkin elimination and its dual. J. Combin. Theory Series A 14(3):288–297.
Dathathri S, Dvijotham K, Kurakin A, Raghunathan A, Uesato J, Bunel RR, Shankar S, et al. (2020) Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 5318–5331.
Daubechies I, DeVore R, Foucart S, Hanin B, Petrova G (2022) Nonlinear approximation and (deep) ReLU networks. Constructive Approximation 55:127–172.
De Palma A, Behl H, Bunel RR, Torr P, Kumar MP (2021) Scaling the convex barrier with active sets. Internat. Conf. Learn. Representations (ICLR).
Delarue A, Anderson R, Tjandraatmadja C (2020) Reinforcement learning with combinatorial actions: An application to vehicle routing. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 609–620.
Deng Y, Zheng X, Zhang T, Chen C, Lou G, Kim M (2020) An analysis of adversarial attacks and defenses on autonomous driving models. 2020 IEEE Internat. Conf. Pervasive Comput. Comm. (PerCom) (IEEE, Piscataway, NJ), 1–10.
Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. Conf. North Amer. Chapter Assoc. Comput. Linguistics (NAACL) (Association for Computational Linguistics, Stroudsburg, PA), 4171–4186.
Dey SS, Wang G, Xie Y (2020) Approximation algorithms for training one-node ReLU neural networks. IEEE Trans. Signal Processing 68:6696–6706.
Dowson O, Parker RB, Bent R (2025) MathOptAI.jl: Embed trained machine learning predictors into JuMP models. Preprint, submitted July 3, https://arxiv.org/abs/2507.03159.
Dubey SR, Singh SK, Chaudhuri BB (2022) Activation functions in deep learning: A comprehensive survey and benchmark. Neurocomput. 503:92–108.
Dutta S, Jha S, Sankaranarayanan S, Tiwari A (2018) Output range analysis for deep feedforward networks. Dutle A, Muñoz C, Narkawicz A, eds. NASA Formal Methods. NFM 2018, Lecture Notes in Computer Science, vol. 10811 (Springer, Cham, Switzerland), 121–138.
Dvijotham K, Stanforth R, Gowal S, Mann TA, Kohli P (2018a) A dual approach to scalable verification of deep networks. Globerson A, Silva R, eds. Conf. Uncertainty Artificial Intelligence (UAI) (Monterey), 550–559.
Dvijotham K, Gowal S, Stanforth R, Arandjelovic R, O’Donoghue B, Uesato J, Kohli P (2018b) Training verified learners with learned verifiers. Preprint, submitted May 25, https://arxiv.org/abs/1805.10265.
Dym N, Sober B, Daubechies I (2020) Expression of fractals through neural network functions. IEEE J. Selected Areas Inform. Theory 1(1):57–66.
Ehlers R (2017) Formal verification of piece-wise linear feed-forward neural networks. D’Souza D, Narayan Kumar K, eds. Automated Tech. Verification Anal. ATVA 2017, Lecture Notes in Computer Science, vol. 10482 (Springer, Cham, Switzerland), 269–286.
ElAraby M, Wolf G, Carvalho M (2020) Identifying efficient sub-networks using mixed integer programming. OPT2020: 12th Annual Workshop Optim. Machine Learn.
Elsken T, Metzen JH, Hutter F (2019) Neural architecture search: A survey. J. Machine Learn. Res. 20(1):1997–2017.
Ergen E, Grillo M (2024) Topological expressivity of ReLU neural networks. Agrawal S, Roth A, eds. Proc. 37th Conf. Learn. Theory (COLT), vol. 247 (PMLR, New York), 1599–1642.
Ergen T, Pilanci M (2020) Convex geometry of two-layer ReLU networks: Implicit autoencoding and interpretable models. Chiappa S, Calandra R, eds. Proc. 22nd Internat. Conf. Artificial Intelligence Statist., vol. 108 (PMLR, New York), 4024–4033.
Ergen T, Pilanci M (2021a) Convex geometry and duality of over-parameterized neural networks. J. Machine Learn. Res. 22(1):9646–9708.
Ergen T, Pilanci M (2021b) Global optimality beyond two layers: Training deep ReLU networks via convex programs. Proc. 38th Internat. Conf. Machine Learn. (ICML) (PMLR, New York), 2993–3003.
Ergen T, Pilanci M (2021c) Implicit convex regularizers of CNN architectures: Convex optimization of two-and three-layer networks in polynomial time. Internat. Conf. Learn. Representations (ICLR).
Ergen T, Pilanci M (2021d) Revealing the structure of deep neural networks via convex duality. Proc. 38th Internat. Conf. Machine Learn., vol. 139 (PMLR, New York), 3004–3014.
Ergen T, Pilanci M (2024) Path regularization: A convexity and sparsity inducing regularization for parallel ReLU networks. Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S, eds. NIPS’23: Proc. 37th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 59761–59786.
Ergen T, Pilanci M (2025) The convex landscape of neural networks: Characterizing global optima and stationary points via Lasso models. IEEE Trans. Inform. Theory 71(5):3854–3870.
Ergen T, Gulluk HI, Lacotte J, Pilanci M (2023) Globally optimal training of neural networks with threshold activation functions. Internat. Conf. Learn. Representations (ICLR).
Ergen T, Sahiner A, Ozturkler B, Pauly JM, Mardani M, Pilanci M (2022) Demystifying batch normalization in ReLU networks: Equivalent convex optimization models and implicit regularization. Internat. Conf. Learn. Representations (ICLR).
Eykholt K, Evtimov I, Fernandes E, Li B, Rahmati A, Xiao C, Prakash A, Kohno T, Song D (2018) Robust physical-world attacks on deep learning visual classification. 2018 IEEE/CVF Conf. Comput. Vision Pattern Recognition (CVPR) (IEEE, Piscataway, NJ), 1625–1634.
Fan F, Lai R, Wang G (2023a) Quasi-equivalence between width and depth of neural networks. J. Machine Learn. Res. 24(183):1–22.
Fan FL, Huang W, Zhong X, Ruan L, Zeng T, Xiong H, Wang F (2023b) Deep ReLU networks have surprisingly simple polytopes. Preprint, submitted May 16, https://arxiv.org/abs/2305.09145.
Fazlyab M, Morari M, Pappas GJ (2020) Safety verification and robustness analysis of neural networks via quadratic constraints and semidefinite programming. IEEE Trans. Automatic Control 67(1):1–15.
Fazlyab M, Robey A, Hassani H, Morari M, Pappas GJ (2019) Efficient and accurate estimation of Lipschitz constants for deep neural networks. Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, eds. NIPS’19: Proc. 33rd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 11427–11438.
Ferlez J, Shoukry Y (2020) AReN: Assured ReLU NN architecture for model predictive control of LTI systems. HSCC’20: Proc. 23rd Internat. Conf. Hybrid Systems: Comput. Control (Association for Computing Machinery, New York), 1–11.
Ferrari C, Mueller MN, Jovanović N, Vechev M (2022) Complete verification via multi-neuron relaxation guided branch-and-bound. Internat. Conf. Learn. Representations (ICLR).
Finlayson SG, Bowers JD, Ito J, Zittrain JL, Beam AL, Kohane IS (2019) Adversarial attacks on medical machine learning. Science 363(6433):1287–1289.
Fischetti M, Jo J (2018) Deep neural networks and mixed integer linear optimization. Constraints 23:296–309.
Fourier J (1826) Solution d’une question particulière du calcul des inégalités. Nouveau Bull. Des Sci. Par la Société Philomatique de Paris.
Francobaldi M, Lombardi M (2025) SMLE: Safe machine learning via embedded overapproximation. AAAI Conf. Artificial Intelligence 39(26):27286–27294.
Frank M, Wolfe P (1956) An algorithm for quadratic programming. Naval Res. Logist. Quart. 3(1–2):95–110.
Froese V, Hertrich C (2024) Training neural networks is NP-hard in fixed dimension. Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S, eds. NIPS’23: Proc. 37th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 44039–44049.
Froese V, Grillo M, Skutella M (2024) Complexity of injectivity and verification of ReLU neural networks. Preprint, submitted May 30, https://arxiv.org/abs/2405.19805.
Froese V, Hertrich C, Niedermeier R (2022) The computational complexity of ReLU network training parameterized by data dimensionality. J. Artificial Intelligence Res. 74:1775–1790.
Fukushima K (1980) Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybernetics 36:193–202.
Funahashi KI (1989) On the approximate realization of continuous mappings by neural networks. Neural Networks 2(3):183–192.
Gamba M, Carlsson S, Azizpour H, Björkman M (2020) Hyperplane arrangements of trained ConvNets are biased. Preprint, submitted March 17, https://arxiv.org/abs/2003.07797.
Gamba M, Chmielewski-Anders A, Sullivan J, Azizpour H, Björkman M (2022) Are all linear regions created equal? Camps-Valls G, Ruiz FJR, Valera I, eds. Proc. 22nd Internat. Conf. Artificial Intelligence Statist., vol. 151 (PMLR, New York), 6573–6590.
Gambella C, Ghaddar B, Naoum-Sawaya J (2021) Optimization problems for machine learning: A survey. Eur. J. Oper. Res. 290(3):807–828.
Gao J, Sun C, Zhao H, Shen Y, Anguelov D, Li C, Schmid C (2020) VectorNet: Encoding HD maps and agent dynamics from vectorized representation. 2020 IEEE/CVF Conf. Comput. Vision Pattern Recognition (CVPR) (IEEE, Piscataway, NJ), 1625–1634.
Geißler B, Martin A, Morsi A, Schewe L (2012) Using piecewise linear functions for solving MINLPs. Lee J, Leyffer S, eds. Mixed Integer Nonlinear Programming, The IMA Volumes in Mathematics and its Applications, vol. 154 (Springer, New York), 287–314.
Glass L, Hilali W, Nelles O (2021) Compressing interpretable representations of piecewise linear neural networks using neuro-fuzzy models. IEEE Sympos. Series Comput. Intelligence (SSCI) (IEEE, Piscataway, NJ), 2057–2066.
Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. Gordon G, Dunson D, Dudík M, eds. Proc. 14th Internat. Conf. Artificial Intelligence Statist., vol. 15 (PMLR, New York), 315–323.
Goebbels S (2021) Training of ReLU activated multilayerd neural networks with mixed integer linear programs. Technical Report No. 2021-01, Hochschule Niederrhein, Fachbereich Elektrotechnik & Informatik, Krefeld, Germany.
Goel S, Klivans A, Manurangsi P, Reichman D (2021) Tight hardness results for training depth-2 ReLU networks. 12th Innovations Theoret. Comput. Sci. Conf. (ITCS 2021), Leibniz International Proceedings in Informatics (LIPIcs), vol. 185 (Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Wadern, Germany), 22:1–22:14.
Goerigk M, Kurtz J (2023) Data-driven robust optimization using deep neural networks. Comput. Oper. Res. 151:106087.
Goodfellow I, Bengio Y, Courville A (2016) Deep Learning (MIT Press, Cambridge, MA).
Goodfellow I, Shlens J, Szegedy C (2015) Explaining and harnessing adversarial examples. Internat. Conf. Learn. Representations (ICLR).
Goodfellow I, Warde-Farley D, Mirza M, Courville A, Bengio Y (2013) Maxout networks. Dasgupta S, McAllester D, eds. ICML’13: Proc. 30th Internat. Conf. Machine Learn. (ICML), vol. 28 (JMLR), 1319–1327.
Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. NIPS’14: Proc. 28th Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 2672–2680.
Gopinath D, Converse H, Pasareanu CS, Taly A (2019) Property inference for deep neural networks. ASE’19: Proc. 34th IEEE/ACM Internat. Conf. Automated Software Engrg. (IEEE, Piscataway, NJ), 797–809.
Goujon A, Etemadi A, Unser M (2024) On the number of regions of piecewise linear neural networks. J. Comput. Appl. Math. 441:115667.
Gowal S, Dvijotham K, Stanforth R, Bunel R, Qin C, Uesato J, Arandjelovic R, Mann T, Kohli P (2018) On the effectiveness of interval bound propagation for training verifiably robust models. Preprint, submitted October 30, https://arxiv.org/abs/1810.12715.
Graves A, Jaitly N (2014) Towards end-to-end speech recognition with recurrent neural networks. Xing EP, Jebara T, eds. ICML’14: Proc. 31st Internat. Conf. Machine Learn. (ICML), vol. 32 (JMLR), II-1764–II-1772.
Grigsby JE, Lindsey K (2022) On transversality of bent hyperplane arrangements and the topological expressiveness of ReLU neural networks. SIAM J. Appl. Algebra Geometry 6(2):216–242.
Grigsby JE, Lindsey K, Rolnick D (2023) Hidden symmetries of ReLU networks. Krause A, Brunskill E, Cho K, Engelhardt B, Sabato S, Scarlett J, eds. ICML’23: Proc. 40th Internat. Conf. Machine Learn. (ICML) (JMLR), 11734–11760.
Grillo M, Hertrich C, Loho G (2025) Depth-bounds for neural networks via the braid arrangement. Thirty-ninth Annual Conf. Neural Inform. Processing Systems (OpenReview).
Grimstad B, Andersson H (2019) ReLU networks as surrogate models in mixed-integer linear programs. Comput. Chemical Engrg. 131:106580.
Grossmann IE, Ruiz JP (2012) Generalized disjunctive programming: A framework for formulation and alternative algorithms for MINLP optimization. Lee J, Leyffer S, eds. Mixed Integer Nonlinear Programming, The IMA Volumes in Mathematics and its Applications, vol. 154 (Springer, New York), 93–115.
Haase CA, Hertrich C, Loho G (2023) Lower bounds on the depth of integral ReLU neural networks via lattice polytopes. Internat. Conf. Learn. Representations (ICLR).
Hahnloser R, Sarpeshkar R, Mahowald M, Douglas R, Seung S (2000) Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature 405:947–951.
Han S, Gómez A (2021) Single-neuron convexification for binarized neural networks. Preprint, submitted May 27, https://optimization-online.org/?p=17148.
Hanin B, Rolnick D (2019a) Complexity of linear regions in deep networks. Internat. Conf. Machine Learn. (ICML) (PMLR, New York).
Hanin B, Rolnick D (2019b) Deep ReLU networks have surprisingly few activation patterns. Wallach HM, Larochelle HM, Beygelzimer A, d’Alché-Buc F, Fox EB, eds. NIPS’19: Proc. 33rd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 18293–18306.Google Scholar
Hanin B, Sellke M (2017) Approximating continuous functions by ReLU nets of minimal width. Preprint, submitted October 31, https://arxiv.org/abs/1710.11278.Google Scholar
Hashemi V, Kouvaros P, Lomuscio A (2021) OSIP: Tightened bound propagation for the verification of ReLU neural networks. Calinescu R, Păsăreanu CS, eds. Software Engrg. Formal Methods. SEFM 2021, Lecture Notes in Computer Science, vol. 13085 (Springer, Cham, Switzerland), 463–480.Google Scholar
He F, Lei S, Ji J, Tao D (2021) Neural networks behave as hash encoders: An empirical study. Preprint, submitted January 14, https://arxiv.org/abs/2101.05490.Google Scholar
He J, Li L, Xu J, Zheng C (2020) ReLU deep neural networks and linear finite elements. J. Comput. Math. 38(3):502–527.Crossref, Google Scholar
He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. IEEE Internat. Conf. Comput. Vision (ICCV) (IEEE, Piscataway, NJ), 1026–1034.Google Scholar
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. 2016 IEEE Conf. Comput. Vision Pattern Recognition (CVPR) (IEEE, Piscataway, NJ), 770–778.Google Scholar
Henriksen P, Lomuscio A (2021) DEEPSPLIT: An efficient splitting method for neural network verification via indirect effect analysis. Zhou Z, ed. Proc. 30th Internat. Joint Conf. Artificial Intelligence (IJCAI), 2549–2555.Google Scholar
Henriksen P, Leofante F, Lomuscio A (2022) Repairing misclassifications in neural networks using limited data. SAC’22: Proc. ACM/SIGAPP Sympos. Appl. Comput. (Association for Computing Machinery, New York), 1625–1634.Google Scholar
Hertrich C, Loho G (2024) Neural networks and (virtual) extended formulations. Preprint, submitted November 5, https://arxiv.org/abs/2411.03006.Google Scholar
Hertrich C, Basu A, Di Summa M, Skutella M (2023) Towards lower bounds on the depth of ReLU neural networks. SIAM J. Discrete Math. 37(2):997–1029.Crossref, Google Scholar
Hinton G, Deng L, Dahl G, Mohamed A, Jaitly N, Senior A, Vanhoucke V, Nguyen P, Sainath T, Kingsbury B (2012) Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine 29(6):82–97.Crossref, Google Scholar
Hinz P (2021) Using activation histograms to bound the number of affine regions in ReLU feed-forward neural networks. Preprint, submitted March 31, https://arxiv.org/abs/2103.17174.Google Scholar
Hinz P, van de Geer S (2019) A framework for the construction of upper bounds on the number of affine linear regions of ReLU feed-forward neural networks. IEEE Trans. Inform. Theory 65(11):7304–7324.Crossref, Google Scholar
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput. 9(8):1735–1780.Crossref, Google Scholar
Hopfield J (1982) Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA 79(8):2554–2558.Crossref, Google Scholar
Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Networks 2(5):359–366.Crossref, Google Scholar
Hu T, Shang Z, Cheng G (2020a) Sharp rate of convergence for deep neural network classifiers under the teacher-student setting. Preprint, submitted January 19, https://arxiv.org/abs/2001.06892.Google Scholar
Hu X, Liu W, Bian J, Pei J (2020b) Measuring model complexity of neural networks with curve activation functions. KDD’20: Proc. 26th ACM SIGKDD Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 1521–1531.Google Scholar
Hu X, Chu L, Pei J, Liu W, Bian J (2021) Model complexity of deep learning: A survey. Knowledge Inform. Systems 63:2585–2619.Crossref, Google Scholar
Huang Y, Zhang H, Shi Y, Kolter JZ, Anandkumar A (2021) Training certifiably robust neural networks with efficient local Lipschitz bounds. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. NIPS’21: Proc. 35th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 22745–22757.Google Scholar
Huang X, Kroening D, Ruan W, Sharp J, Sun Y, Thamo E, Wu M, Yi X (2020) A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability. Comput. Sci. Rev. 37:100270.Crossref, Google Scholar
Huchette J, Vielma JP (2022) Nonconvex piecewise linear functions: Advanced formulations and simple modeling tools. Oper. Res. 71(5):1835–1856.Link, Google Scholar
Huster T, Chiang CYJ, Chadha R (2018) Limitations of the Lipschitz constant as a defense against adversarial examples. Alzate C, ed. ECML PKDD 2018 Workshops. ECML PKDD 2018, Lecture Notes in Computer Science, vol. 11329 (Springer, Cham, Switzerland), 16–29.Google Scholar
Hwang WL, Heinecke A (2020) Un-rectifying non-linear networks for signal representation. IEEE Trans. Signal Processing 68:196–210.Crossref, Google Scholar
Icarte RT, Illanes L, Castro MP, Cire AA, McIlraith SA, Beck JC (2019) Training binarized neural networks using MIP and CP. Schiex T, de Givry S, eds. Principles Practice Constraint Programming. CP 2019, Lecture Notes in Computer Science, vol. 11802 (Springer, Cham, Switzerland), 401–417.Google Scholar
Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. Bach F, Blei D, eds. ICML’15: Proc. 32nd Internat. Conf. Machine Learn. (ICML), vol. 37 (JMLR), 448–456.Google Scholar
Jeroslow RG, Lowe JK (1984) Modelling with integer variables. Korte B, Ritter K, eds. Mathematical Programming at Oberwolfach II, Mathematical Programming Studies, vol. 22 (Springer, Berlin, Heidelberg), 167–184.Crossref, Google Scholar
Jia K, Rinard M (2020) Efficient exact verification of binarized neural networks. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 1782–1795.Google Scholar
Johnson TT, Lopez DM, Musau P, Tran HD, Botoeva E, Leofante F, Maleki A, Sidrane C, Fan J, Huang C (2020) ARCH-COMP20 category report: Artificial intelligence and neural network control systems (AINNCS) for continuous and hybrid systems plants. EPiC Series Comput. 74:107–139.Crossref, Google Scholar
Jordan M, Dimakis AG (2020) Exactly computing the local Lipschitz constant of ReLU networks. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 7344–7353.Google Scholar
Jordan M, Lewis J, Dimakis AG (2019) Provable certificates for adversarial examples: Fitting a ball in the union of polytopes. Wallach HM, Larochelle HM, Beygelzimer A, d’Alché-Buc F, Fox EB, eds. NIPS’19: Proc. 33rd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 14082–14092.Google Scholar
Kaleem W, Subramanyam A (2024) Neural embedded mixed-integer optimization for location-routing problems. Preprint, submitted December 7, https://arxiv.org/abs/2412.05665.Google Scholar
Kanamori K, Takagi T, Kobayashi K, Ike Y, Uemura K, Arimura H (2021) Ordered counterfactual explanation by mixed-integer linear optimization. Proc. AAAI Conf. Artificial Intelligence 35(13):11564–11574.Crossref, Google Scholar
Karg B, Lucia S (2020) Efficient representation and approximation of model predictive control laws via deep learning. IEEE Trans. Cybernetics 50(9):3866–3878.Crossref, Google Scholar
Karia T, Lastrucci G, Schweidtmann AM (2025) Deterministic global optimization over trained Kolmogorov Arnold networks. Preprint, submitted March 4, https://arxiv.org/abs/2503.02807.Google Scholar
Katz J, Pappas I, Avraamidou S, Pistikopoulos EN (2020) Integrating deep learning models and multiparametric programming. Comput. Chemical Engrg. 136:106801.Crossref, Google Scholar
Katz G, Barrett C, Dill DL, Julian K, Kochenderfer MJ (2017) Reluplex: An efficient SMT solver for verifying deep neural networks. Majumdar R, Kunčak V, eds. Comput. Aided Verification. CAV 2017, Lecture Notes in Computer Science, vol. 10426 (Springer, Cham, Switzerland), 97–117.Google Scholar
Katz G, Huang DA, Ibeling D, Julian K, Lazarus C, Lim R, Shah P, et al. (2019) The Marabou framework for verification and analysis of deep neural networks. Dillig I, Tasiran S, eds. Comput. Aided Verification. CAV 2019, Lecture Notes in Computer Science, vol. 11561 (Springer, Cham, Switzerland), 443–452.Google Scholar
Keup C, Helias M (2022) Origami in N dimensions: How feed-forward networks manufacture linear separability. Preprint, submitted March 21, https://arxiv.org/abs/2203.11355.Google Scholar
Khalife S, Basu A (2022) Neural networks with linear threshold activations: Structure and algorithms. Aardal K, Sanità L, eds. Integer Programming Combin. Optim. IPCO 2022, Lecture Notes in Computer Science, vol. 13265 (Springer, Cham, Switzerland), 347–360.Google Scholar
Khalife S, Cheng H, Basu A (2023) Neural networks with linear threshold activations: Structure and algorithms. Math. Programming 206:333–356.Crossref, Google Scholar
Khedr H, Ferlez J, Shoukry Y (2021) PEREGRiNN: Penalized-relaxation greedy neural network verifier. Internat. Conf. Comput. Aided Verification (Springer, Cham, Switzerland), 287–300.Google Scholar
Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. Preprint, submitted December 22, https://arxiv.org/abs/1412.6980.Google Scholar
Kody A, Chevalier S, Chatzivasileiadis S, Molzahn D (2022) Modeling the AC power flow equations with optimally compact neural networks: Application to unit commitment. Electric Power Systems Res. 213:108282.Crossref, Google Scholar
Kouvaros P, Kyono T, Leofante F, Lomuscio A, Margineantu D, Osipychev D, Zheng Y (2021) Formal analysis of neural network-based systems in the aircraft domain. Huisman M, Păsăreanu C, Zhan N, eds. Formal Methods. FM 2021, Lecture Notes in Computer Science, vol. 13047 (Springer, Cham, Switzerland), 730–740.Google Scholar
Krizhevsky A, Sutskever I, Hinton G (2012) ImageNet classification with deep convolutional neural networks. Pereira F, Burges CJC, Bottou L, Weinberger KQ, eds. NIPS’12: Proc. 26th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 1097–1105.Google Scholar
Kronqvist J, Misener R, Tsay C (2021) Between steps: Intermediate relaxations between big-M and convex hull formulations. Stuckey PJ, ed. Integration Constraint Programming Artificial Intelligence Oper. Res. CPAIOR 2021, Lecture Notes in Computer Science, vol. 12735 (Springer, Cham, Switzerland), 200–218.Google Scholar
Kronqvist J, Misener R, Tsay C (2025) P-split formulations: A class of intermediate formulations between big-M and convex hull for disjunctive constraints. Math. Programming, ePub ahead of print June 9, https://doi.org/10.1007/s10107-025-02232-1.Crossref, Google Scholar
Kumar A, Serra T, Ramalingam S (2019) Equivalent and approximate transformations of deep neural networks. Preprint, submitted April 11, https://arxiv.org/abs/1905.11428.Google Scholar
Lacoste-Julien S, Jaggi M, Schmidt M, Pletscher P (2013) Block-coordinate Frank-Wolfe optimization for structural SVMs. Dasgupta S, McAllester D, eds. ICML’13: Proc. 30th Internat. Conf. Machine Learn. (ICML), vol. 28 (PMLR), 53–61.Google Scholar
Lan J, Zheng Y, Lomuscio A (2022) Tight neural network verification via semidefinite relaxations and linear reformulations. Proc. AAAI Conf. Artificial Intelligence 36(7):7272–7280.Crossref, Google Scholar
Latorre F, Rolland P, Cevher V (2020) Lipschitz constant estimation of neural networks via sparse polynomial optimization. Internat. Conf. Learn. Representations (ICLR).Google Scholar
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444.Crossref, Google Scholar
LeCun Y, Bottou L, Orr GB, Müller KR (1998) Efficient BackProp. Montavon G, Orr G, Müller K, eds. Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science, vol. 1524 (Springer, Berlin, Heidelberg), 9–50.Crossref, Google Scholar
LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput. 1(4):541–551.Crossref, Google Scholar
Lee J, Wilson D (2001) Polyhedral methods for piecewise-linear functions I: The lambda method. Discrete Appl. Math. 108(3):269–285.Crossref, Google Scholar
Lee GH, Alvarez-Melis D, Jaakkola TS (2019) Towards robust, locally linear deep networks. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Leino K, Wang Z, Fredrikson M (2021) Globally-robust neural networks. Meila M, Zhang T, eds. Internat. Conf. Machine Learn. (ICML), vol. 139 (PMLR, New York), 6212–6222.Google Scholar
Leofante F, Narodytska N, Pulina L, Tacchella A (2018) Automated verification of neural networks: Advances, challenges and perspectives. Preprint, submitted May 25, https://arxiv.org/abs/1805.09938.Google Scholar
Li L, Xie T, Li B (2022) SoK: Certified robustness for deep neural networks. 2023 IEEE Sympos. Security and Privacy (SP) (IEEE Computer Society, Washington, DC), 1289–1310.Google Scholar
Liang X, Xu J (2021) Biased ReLU neural networks. Neurocomput. 423:71–79.Crossref, Google Scholar
Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, Tassa Y, Silver D, Wierstra D (2015) Continuous control with deep reinforcement learning. Preprint, submitted September 9, https://arxiv.org/abs/1509.02971.Google Scholar
Linnainmaa S (1970) The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master’s thesis, University of Helsinki, Helsinki, Finland. [In Finnish.]Google Scholar
Little W (1974) The existence of persistent states in the brain. Math. Biosci. 19(1–2):101–120.Crossref, Google Scholar
Liu X, Dvorkin V (2025) Optimization over trained neural networks: Difference-of-convex algorithm and application to data center scheduling. IEEE Control Systems Lett. 9:835–840.Crossref, Google Scholar
Liu B, Liang Y (2021) Optimal function approximation with ReLU neural networks. Neurocomput. 435:216–227.Crossref, Google Scholar
Liu X, Han X, Zhang N, Liu Q (2020) Certified monotonic neural networks. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 15427–15438.Google Scholar
Liu C, Arnon T, Lazarus C, Strong C, Barrett C, Kochenderfer MJ (2021) Algorithms for verifying deep neural networks. Foundations Trends Optim. 4(3–4):244–404.Crossref, Google Scholar
Liu Z, Wang Y, Vaidya S, Ruehle F, Halverson J, Soljačić M, Hou TY, Tegmark M (2025) KAN: Kolmogorov-Arnold Networks. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Lombardi M, Milano M, Bartolini A (2017) Empirical decision model learning. Artificial Intelligence 244:343–367.Crossref, Google Scholar
Lomuscio A, Maganti L (2017) An approach to reachability analysis for feed-forward ReLU neural networks. Preprint, submitted June 22, https://arxiv.org/abs/1706.07351.Google Scholar
Loukas A, Poiitis M, Jegelka S (2021) What training reveals about neural network complexity. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. NIPS’21: Proc. 35th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 494–508.Google Scholar
Lu Z, Pu H, Wang F, Hu Z, Wang L (2017) The expressive power of neural networks: A view from the width. von Luxburg U, Guyon I, Bengio S, Wallach H, Fergus R, eds. NIPS’17: Proc. 31st Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 6232–6240.Google Scholar
Lueg L, Grimstad B, Mitsos A, Schweidtmann AM (2021) reluMIP: Open source tool for MILP optimization of ReLU neural networks. Accessed February 25, 2026, https://zenodo.org/records/5601907.Google Scholar
Lyu Z, Ko CY, Kong Z, Wong N, Lin D, Daniel L (2020) Fastened CROWN: Tightened neural network robustness certificates. Proc. AAAI Conf. Artificial Intelligence 34(4):5037–5044.Crossref, Google Scholar
Maas A, Hannun A, Ng A (2013) Rectifier nonlinearities improve neural network acoustic models. ICML Workshop Deep Learn. Audio, Speech Language Processing.Google Scholar
Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A (2018) Towards deep learning models resistant to adversarial attacks. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Makhoul J, Schwartz R, El-Jaroudi A (1989) Classification capabilities of two-layer neural nets. Internat. Conf. Acoustics Speech Signal Processing (ICASSP) (IEEE, Piscataway, NJ), 635–638.Google Scholar
Malach E, Shalev-Shwartz S (2019) Is deeper better only when shallow is good? Wallach HM, Larochelle HM, Beygelzimer A, d’Alché-Buc F, Fox EB, eds. NIPS’19: Proc. 33rd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 6429–6438.Google Scholar
Mangasarian OL (1993) Mathematical programming in neural networks. ORSA J. Comput. 5(4):349–360.Link, Google Scholar
Maragos P, Charisopoulos V, Theodosis E (2021) Tropical geometry and machine learning. Proc. IEEE 109(5):728–755.Crossref, Google Scholar
Maragno D, Wiberg H, Bertsimas D, Birbil ŞI, den Hertog D, Fajemisin AO (2023) Mixed-integer optimization with constraint learning. Oper. Res. 73(2):1011–1028.Link, Google Scholar
Maragno D, Kurtz J, Röber TE, Goedhart R, Birbil ŞI, den Hertog D (2024) Finding regions of counterfactual explanations via robust optimization. INFORMS J. Comput. 36(5):1316–1334.Link, Google Scholar
Masden M (2025) Algorithmic determination of the combinatorial structure of the linear regions of ReLU neural networks. SIAM J. Appl. Algebra Geometry 9(2):374–404.Google Scholar
Matoba K, Dimitriadis N, Fleuret F (2022) The theoretical expressiveness of maxpooling. Preprint, submitted March 2, https://arxiv.org/abs/2203.01016.Google Scholar
Matoušek J (2002) Lectures on Discrete Geometry, Graduate Texts in Mathematics, vol. 212 (Springer, New York).Crossref, Google Scholar
McBride K, Sundmacher K (2019) Overview of surrogate modeling in chemical process engineering. Chemie Ingenieur Technik 91(3):228–239.Crossref, Google Scholar
McCulloch W, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5:115–133.Crossref, Google Scholar
McDonald T, Tsay C, Schweidtmann AM, Yorke-Smith N (2024) Mixed-integer optimisation of graph neural networks for computer-aided molecular design. Comput. Chemical Engrg. 185:108660.Crossref, Google Scholar
Mhaskar HN, Poggio T (2020) Function approximation by deep networks. Comm. Pure Appl. Anal. 19(8):4085–4095.Crossref, Google Scholar
Minsky M, Papert S (1969) Perceptrons: An Introduction to Computational Geometry (MIT Press, Cambridge, MA).Google Scholar
Mirman M, Gehr T, Vechev M (2018) Differentiable abstract interpretation for provably robust neural networks. Dy J, Krause A, eds. Proc. 35th Internat. Conf. Machine Learn. (ICML), vol. 80 (PMLR, New York), 3578–3586.Google Scholar
Misener R, Floudas CA (2012) Global optimization of mixed-integer quadratically-constrained quadratic programs (MIQCQP) through piecewise-linear and edge-concave relaxations. Math. Programming 136(1):155–182.Crossref, Google Scholar
Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, et al. (2015) Human-level control through deep reinforcement learning. Nature 518:529–533.Crossref, Google Scholar
Montúfar G (2017) Notes on the number of linear regions of deep neural networks. Internat. Conf. Sampling Theory Appl. (SampTA) (IEEE, Piscataway, NJ).Google Scholar
Montúfar G, Ren Y, Zhang L (2022) Sharp bounds for the number of regions of maxout networks and vertices of Minkowski sums. SIAM J. Appl. Algebra Geometry 6(4).Google Scholar
Montúfar G, Pascanu R, Cho K, Bengio Y (2014) On the number of linear regions of deep neural networks. Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. NIPS’14: Proc. 28th Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 2924–2932.Google Scholar
Motzkin T (1936) Beiträge zur Theorie der linearen Ungleichungen. PhD thesis, University of Basel, Basel, Switzerland.Google Scholar
Mukhopadhyay S, Roy A, Kim LS, Govil S (1993) A polynomial time algorithm for generating neural networks for pattern classification: Its stability properties and some test results. Neural Comput. 5(2):317–330.Crossref, Google Scholar
Nair V, Hinton G (2010) Rectified linear units improve restricted Boltzmann machines. Fürnkranz J, Joachims T, eds. ICML’10: Proc. 27th Internat. Conf. Machine Learn. (ICML) (Omnipress, Madison, WI), 807–814.Google Scholar
Narodytska N, Kasiviswanathan S, Ryzhyk L, Sagiv M, Walsh T (2018) Verifying properties of binarized deep neural networks. AAAI Conf. Artificial Intelligence 32(1).Google Scholar
Nelles O, Fink A, Isermann R (2000) Local linear model trees (LOLIMOT) toolbox for nonlinear system identification. IFAC Proc. Vol. 33(15):845–850.Crossref, Google Scholar
Nesterov YE (1983) A method of solving a convex programming problem with convergence rate O(1/k^2). Proc. USSR Acad. Sci. 269:543–547.Google Scholar
Newton M, Papachristodoulou A (2021) Exploiting sparsity for neural network verification. Jadbabaie A, Lygeros J, Pappas GJ, Parrilo PA, Recht B, Tomlin CJ, Zeilinger MN, eds. Proc. 3rd Conf. Learn. Dynamics Control (L4DC), vol. 144 (PMLR, New York), 715–727.Google Scholar
Nguyen T, Huchette J (2022) Neural network verification as piecewise linear optimization: Formulations for the composition of staircase functions. Preprint, submitted November 17, https://arxiv.org/abs/2211.14706.Google Scholar
Nguyen Q, Mukkamala MC, Hein M (2018) Neural networks should be wide enough to learn disconnected decision regions. Dy J, Krause A, eds. Proc. 35th Internat. Conf. Machine Learn. (ICML), vol. 80 (PMLR, New York), 3740–3749.Google Scholar
Novak R, Bahri Y, Abolafia DA, Pennington J, Sohl-Dickstein J (2018) Sensitivity and generalization in neural networks: An empirical study. Internat. Conf. Learn. Representations (ICLR).Google Scholar
OpenAI (2022) Introducing ChatGPT. Accessed February 25, 2026, https://openai.com/blog/chatgpt.Google Scholar
Padberg M (2000) Approximating separable nonlinear functions via mixed zero-one programs. Oper. Res. Lett. 27(1):1–5.Crossref, Google Scholar
Papalexopoulos TP, Tjandraatmadja C, Anderson R, Vielma JP, Belanger D (2022) Constrained discrete black-box optimization using mixed-integer programming. Chaudhuri K, Jegelka S, Song L, Szepesvari C, Niu G, Sabato S, eds. Internat. Conf. Machine Learn. (ICML), vol. 162 (PMLR, New York), 17295–17322.Google Scholar
Park Y, Lee S, Kim G, Blei DM (2021a) Unsupervised representation learning via neural activation coding. Meila M, Zhang T, eds. Internat. Conf. Machine Learn. (ICML), vol. 139 (PMLR, New York), 8391–8400.Google Scholar
Park S, Yun C, Lee J, Shin J (2021b) Minimum width for universal approximation. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Park DS, Chan W, Zhang Y, Chiu CC, Zoph B, Cubuk ED, Le QV (2019) SpecAugment: A simple data augmentation method for automatic speech recognition. Proc. Interspeech 2019 (International Speech Communication Association), 2613–2617.Google Scholar
Pascanu R, Montúfar G, Bengio Y (2014) On the number of response regions of deep feedforward networks with piecewise linear activations. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Patel RM, Dumouchelle J, Khalil E, Bodur M (2022) Neur2SP: Neural two-stage stochastic programming. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. NIPS’22: Proc. 36th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 23992–24005.Google Scholar
Perakis G, Tsiourvas A (2022) Optimizing objective functions from trained ReLU neural networks via sampling. Preprint, submitted May 27, https://arxiv.org/abs/2205.14189.Google Scholar
Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. Proc. 2018 Conf. North Amer. Chapter Assoc. Comput. Linguistics (NAACL) (Association for Computational Linguistics, Stroudsburg, PA), 2227–2237.Google Scholar
Pham H, Ren A, Tahir I, Tong J, Serra T (2025) Optimization over trained (and sparse) neural networks: A surrogate within a surrogate. Preprint, submitted May 4, https://arxiv.org/abs/2505.01985.Google Scholar
Phuong M, Lampert CH (2020) Functional vs. parametric equivalence of ReLU networks. Internat. Conf. Learn. Representations (ICLR) (PMLR, New York).Google Scholar
Pilanci M, Ergen T (2020) Neural networks are convex regularizers: Exact polynomial-time convex optimization formulations for two-layer networks. Daumé H, Singh A, eds. ICML’20: Proc. 37th Internat. Conf. Machine Learn. (ICML) (JMLR), 7695–7705.Google Scholar
Plate C, Hahn M, Klimek A, Ganzer C, Sundmacher K, Sager S (2025) An analysis of optimization problems involving ReLU neural networks. Preprint, submitted February 5, https://arxiv.org/abs/2502.03016.Google Scholar
Pokutta S, Spiegel C, Zimmer M (2020) Deep neural network training with Frank-Wolfe. Preprint, submitted October 14, https://arxiv.org/abs/2010.07243.Google Scholar
Polyak BT (1964) Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys. 4(5):1–17.Crossref, Google Scholar
Pulina L, Tacchella A (2010) An abstraction-refinement approach to verification of artificial neural networks. Touili T, Cook B, Jackson P, eds. Comput. Aided Verification. CAV 2010, Lecture Notes in Computer Science, vol. 6174 (Springer, Berlin, Heidelberg), 243–257.Google Scholar
Puthawala M, Kothari K, Lassas M, Dokmanić I, de Hoop M (2022) Globally injective ReLU networks. J. Machine Learn. Res. 23(1):4544–4598.Google Scholar
Radford A, Narasimhan K, Salimans T, Sutskever I (2018) Improving language understanding by generative pre-training. Technical report, OpenAI, https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf.Google Scholar
Raghu M, Poole B, Kleinberg J, Ganguli S, Sohl-Dickstein J (2017) On the expressive power of deep neural networks. Precup D, Teh YW, eds. ICML’17: Proc. 34th Internat. Conf. Machine Learn. (ICML), vol. 70 (JMLR), 2847–2854.Google Scholar
Raghunathan A, Steinhardt J, Liang PS (2018) Semidefinite relaxations for certifying robustness to adversarial examples. Bengio S, Beygelzimer A, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, eds. NIPS’18: Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 10900–10910.Google Scholar
Ramachandran P, Zoph B, Le QV (2018) Searching for activation functions. ICLR Workshop Track.Google Scholar
Raman R, Grossmann I (1994) Modelling and computational techniques for logic based integer programming. Comput. Chemical Engrg. 18(7):563–578.Crossref, Google Scholar
Ramesh A, Dhariwal P, Nichol A, Chu C, Chen M (2022) Hierarchical text-conditional image generation with CLIP latents. Preprint, submitted April 13, https://arxiv.org/abs/2204.06125.Google Scholar
Robbins H, Monro S (1951) A stochastic approximation method. Ann. Math. Statist. 22(3):400–407.Crossref, Google Scholar
Robinson H, Rasheed A, San O (2019) Dissecting deep neural networks. Preprint, submitted October 9, https://arxiv.org/abs/1910.03879.Google Scholar
Rolnick D, Kording K (2020) Reverse-engineering deep ReLU networks. Daumé H, Singh A, eds. ICML’20: Proc. 37th Internat. Conf. Machine Learn. (ICML), vol. 119 (JMLR), 8178–8187.Google Scholar
Rosenblatt F (1957) The perceptron—A perceiving and recognizing automaton. Technical Report 85-460-1, Cornell Aeronautical Laboratory, Buffalo, NY.Google Scholar
Rössig A, Petkovic M (2021) Advances in verification of ReLU neural networks. J. Global Optim. 81:109–152.Crossref, Google Scholar
Roth K (2021) A primer on multi-neuron relaxation-based adversarial robustness certification. ICML 2021 Workshop Adversarial Machine Learn.Google Scholar
Roy A, Kim LS, Mukhopadhyay S (1993) A polynomial time algorithm for the construction and training of a class of multilayer perceptrons. Neural Networks 6(4):535–545.Crossref, Google Scholar
Rubies-Royo V, Calandra R, Stipanovic DM, Tomlin C (2019) Fast neural network verification via shadow prices. Preprint, submitted June 21, https://arxiv.org/abs/1902.07247.Google Scholar
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536.Crossref, Google Scholar
Ryu M, Chow Y, Anderson R, Tjandraatmadja C, Boutilier C (2020) CaQL: Continuous action Q-learning. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Safran I, Reichman D, Valiant P (2024) How many neurons does it take to approximate the maximum? Proc. 2024 Annual Sympos. Discrete Algorithms (SODA) (SIAM, Philadelphia), 3156–3183.Google Scholar
Sahiner A, Ergen T, Pauly JM, Pilanci M (2021) Vector-output ReLU neural network problems are copositive programs: Convex analysis of two layer networks and polynomial-time algorithms. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Sahiner A, Ergen T, Ozturkler B, Pauly JM, Mardani M, Pilanci M (2024) Scaling convex neural networks with Burer-Monteiro factorization. 12th Internat. Conf. Learn. Representations (ICLR).Google Scholar
Salman H, Yang G, Zhang H, Hsieh CJ, Zhang P (2019) A convex relaxation barrier to tight robustness verification of neural networks. Wallach HM, Larochelle HM, Beygelzimer A, d’Alché-Buc F, Fox EB, eds. NIPS’19: Proc. 33rd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 9835–9846.Google Scholar
Sandler M, Howard A, Zhu M, Zhmoginov A, Chen LC (2018) MobileNetV2: Inverted residuals and linear bottlenecks. 2018 IEEE/CVF Conf. Comput. Vision Pattern Recognition (CVPR) (IEEE, Piscataway, NJ), 4510–4520.Google Scholar
Sattelberg B, Cavalieri R, Kirby M, Peterson C, Beveridge R (2023) Locally linear attributes of ReLU neural networks. Frontiers Artificial Intelligence 6:1255192.Crossref, Google Scholar
Say B, Wu G, Zhou YQ, Sanner S (2017) Nonlinear hybrid planning with deep net learned transition models and mixed-integer linear programming. Proc. 30th Internat. Joint Conf. Artificial Intelligence (IJCAI), 750–756.Google Scholar
Scaman K, Virmaux A (2018) Lipschitz regularity of deep neural networks: Analysis and efficient estimation. Bengio S, Beygelzimer A, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, eds. NIPS’18: Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 3839–3848.Google Scholar
Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural Networks 61:85–117.Crossref, Google Scholar
Schumann J, Gupta P, Nelson S (2003) On verification & validation of neural network based controllers. Engrg. Appl. Neural Networks (EANN).Google Scholar
Schwan R, Jones CN, Kuhn D (2023) Stability verification of neural network controllers using mixed-integer programming. IEEE Trans. Automatic Control 68(12):7514–7529.Crossref, Google Scholar
Schweidtmann AM, Mitsos A (2019) Deterministic global optimization with artificial neural networks embedded. J. Optim. Theory Appl. 180(3):925–948.Crossref, Google Scholar
Schweidtmann AM, Weber JM, Wende C, Netze L, Mitsos A (2022) Obey validity limits of data-driven models through topological data analysis and one-class classification. Optim. Engrg. 23(2):855–876.Crossref, Google Scholar
Seck I, Loosli G, Canu S (2021) Linear program powered attack. 2021 Internat. Joint Conf. Neural Networks (IJCNN) (IEEE, Piscataway, NJ), 1–8.Google Scholar
Serra T (2020) Enumerative branching with less repetition. Hebrard E, Musliu N, eds. Integration Constraint Programming Artificial Intelligence Oper. Res. CPAIOR 2020, Lecture Notes in Computer Science, vol. 12296 (Springer, Cham, Switzerland), 399–416.Google Scholar
Serra T, Hooker J (2020) Compact representation of near-optimal integer programming solutions. Math. Programming 182:199–232.Crossref, Google Scholar
Serra T, Ramalingam S (2020) Empirical bounds on linear regions of deep rectifier networks. Proc. AAAI Conf. Artificial Intelligence 34(4):5628–5635.Crossref, Google Scholar
Serra T, Kumar A, Ramalingam S (2020) Lossless compression of deep neural networks. Hebrard E, Musliu N, eds. Integration Constraint Programming Artificial Intelligence Oper. Res. CPAIOR 2020, Lecture Notes in Computer Science, vol. 12296 (Springer, Cham, Switzerland), 417–430.Google Scholar
Serra T, Tjandraatmadja C, Ramalingam S (2018) Bounding and counting linear regions of deep neural networks. Dy J, Krause A, eds. Internat. Conf. Machine Learn. (ICML), vol. 80 (PMLR, New York), 4558–4566.Google Scholar
Serra T, Yu X, Kumar A, Ramalingam S (2021) Scaling up exact neural network compression by ReLU stability. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. NIPS’21: Proc. 35th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 27081–27093.Google Scholar
Shi C, Emadikhiav M, Lozano L, Bergman D (2022) Careful! Training relevance is real. Preprint, submitted January 12, https://arxiv.org/abs/2201.04429.Google Scholar
Shi Z, Jin Q, Kolter Z, Jana S, Hsieh CJ, Zhang H (2025) Neural network verification with branch-and-bound for general nonlinearities. Gurfinkel A, Heule M, eds. Tools Algorithms Construction Anal. Systems. TACAS 2025, Lecture Notes in Computer Science, vol. 15696 (Springer, Cham, Switzerland), 315–335.Google Scholar
Sidrane C, Maleki A, Irfan A, Kochenderfer MJ (2022) OVERT: An algorithm for safety verification of neural network control policies for nonlinear systems. J. Machine Learn. Res. 23(117):1–45.Google Scholar
Sildir H, Aydin E (2022) A mixed-integer linear programming based training and feature selection method for artificial neural networks using piece-wise linear approximations. Chem. Engrg. Sci. 249:117273.Crossref, Google Scholar
Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, Hubert T, et al. (2017) Mastering the game of Go without human knowledge. Nature 550:354–359.Crossref, Google Scholar
Singh G, Ganvir R, Püschel M, Vechev M (2019a) Beyond the single neuron convex barrier for neural network certification. Wallach HM, Larochelle HM, Beygelzimer A, d’Alché-Buc F, Fox EB, eds. NIPS’19: Proc. 33rd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 15098–15109.Google Scholar
Singh G, Gehr T, Püschel M, Vechev M (2019b) An abstract domain for certifying neural networks. Proc. ACM Programming Languages (POPL) 3:1–30.Google Scholar
Singh H, Kumar MP, Torr P, Dvijotham KD (2021) Overcoming the convex barrier for simplex inputs. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. NIPS’21: Proc. 35th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 4871–4882.Google Scholar
Singh G, Gehr T, Mirman M, Püschel M, Vechev M (2018) Fast and effective robustness certification. Bengio S, Beygelzimer A, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, eds. NIPS’18: Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 10825–10836.Google Scholar
Smith JE, Winkler RL (2006) The optimizer’s curse: Skepticism and postdecision surprise in decision analysis. Management Sci. 52(3):311–322.Link, Google Scholar
Sosnin P, Tsay C (2024) Scaling mixed-integer programming for certification of neural network controllers using bounds tightening. 2024 IEEE 63rd Conf. Decision Control (CDC) (IEEE, Piscataway, NJ), 1645–1650.Google Scholar
Sosnin P, Müller MN, Baader M, Tsay C, Wicker M (2025) Certified robustness to data poisoning in gradient-based training. Trans. Machine Learn. Res. (TMLR).Google Scholar
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: A simple way to prevent neural networks from overfitting. J. Machine Learn. Res. 15(56):1929–1958.Google Scholar
Stargalla M, Hertrich C, Reichman D (2025) The computational complexity of counting linear regions in ReLU neural networks. Preprint, submitted May 22, https://arxiv.org/abs/2505.16716.Google Scholar
Strong CA, Katz SM, Corso AL, Kochenderfer MJ (2022) ZoPE: A fast optimizer for ReLU networks with low-dimensional inputs. Deshmukh JV, Havelund K, Perez I, eds. NASA Formal Methods. NFM 2022, Lecture Notes in Computer Science, vol. 13260 (Springer, Cham, Switzerland), 299–317.Google Scholar
Strong CA, Wu H, Zeljić A, Julian KD, Katz G, Barrett C, Kochenderfer MJ (2021) Global optimization of objective functions represented by ReLU networks. Machine Learn. 112:3685–3712.Crossref, Google Scholar
Sudjianto A, Knauth W, Singh R, Yang Z, Zhang A (2020) Unwrapping the black box of deep ReLU networks: Interpretability, diagnostics, and simplification. Preprint, submitted November 8, https://arxiv.org/abs/2011.04041.Google Scholar
Sutskever I, Vinyals O, Le Q (2014) Sequence to sequence learning with neural networks. Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. NIPS’14: Proc. 28th Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 3104–3112.Google Scholar
Sutskever I, Martens J, Dahl G, Hinton G (2013) On the importance of initialization and momentum in deep learning. Dasgupta S, McAllester D, eds. ICML’13: Proc. 30th Internat. Conf. Machine Learn. (ICML), vol. 28 (JMLR), 1139–1147.Google Scholar
Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, Fergus R (2014) Intriguing properties of neural networks. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. 2015 IEEE/CVF Conf. Comput. Vision Pattern Recognition (CVPR) (IEEE, Piscataway, NJ), 1–9.Google Scholar
Takai Y, Sannai A, Cordonnier M (2021) On the number of linear functions composing deep neural network: Towards a refined definition of neural networks complexity. Banerjee A, Fukumizu K, eds. Proc. 24th Internat. Conf. Artificial Intelligence Statist., vol. 130 (PMLR, New York), 3799–3809.Google Scholar
Tao Q, Li L, Huang X, Xi X, Wang S, Suykens JA (2022) Piecewise linear neural networks and deep learning. Nature Rev. Methods Primers 2:42.Crossref, Google Scholar
Telgarsky M (2015) Representation benefits of deep feedforward networks. Preprint, submitted September 27, https://arxiv.org/abs/1509.08101.Google Scholar
Thorbjarnarson T, Yorke-Smith N (2021) On training neural networks with mixed integer programming. IJCAI-PRICAI’20 Workshop Data Sci. Meets Optim., Yokohama, Japan.Google Scholar
Thorbjarnarson T, Yorke-Smith N (2023) Optimal training of integer-valued neural networks with mixed integer programming. PLoS One 18(2):e0261029.Crossref, Google Scholar
Tiwari S, Konidaris G (2022) Effects of data geometry in early deep learning. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. NIPS’22: Proc. 36th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 30099–30113.Google Scholar
Tjandraatmadja C, Anderson R, Huchette J, Ma W, Patel KK, Vielma JP (2020) The convex relaxation barrier, revisited: Tightened single-neuron relaxations for neural network verification. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 21675–21686.Google Scholar
Tjeng V, Xiao K, Tedrake R (2019) Evaluating robustness of neural networks with mixed integer programming. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Tong J, Cai J, Serra T (2024) Optimization over trained neural networks: Taking a relaxing walk. Dilkina B, ed. Integration Constraint Programming Artificial Intelligence Oper. Res. CPAIOR 2024, Lecture Notes in Computer Science, vol. 14743 (Springer, Cham, Switzerland), 221–233.Google Scholar
Trimmel M, Petzka H, Sminchisescu C (2021) TropEx: An algorithm for extracting linear terms in deep neural networks. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Tsay C, Baldea M (2019) 110th anniversary: Using data to bridge the time and length scales of process systems. Indust. Engrg. Chemistry Res. 58(36):16696–16708.Crossref, Google Scholar
Tsay C, Kronqvist J, Thebelt A, Misener R (2021) Partition-based formulations for mixed-integer optimization of trained ReLU neural networks. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. NIPS’21: Proc. 35th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 3068–3080.Google Scholar
Tseran H, Montúfar G (2021) On the expected complexity of maxout networks. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. NIPS’21: Proc. 35th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 28995–29008.Google Scholar
Turner M, Chmiela A, Koch T, Winkler M (2025) PySCIPOpt-ML: Embedding trained machine learning models into mixed-integer programs. Tack G, ed. Integration Constraint Programming Artificial Intelligence Oper. Res. CPAIOR 2025, Lecture Notes in Computer Science, vol. 15763 (Springer, Cham, Switzerland), 218–234.Google Scholar
Unser M (2019) A representer theorem for deep neural networks. J. Machine Learn. Res. 20(110):1–30.Google Scholar
Valerdi JL (2024) On minimal depth in neural networks. Preprint, submitted February 23, https://arxiv.org/abs/2402.15315.Google Scholar
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. von Luxburg U, Guyon I, Bengio S, Wallach H, Fergus R, eds. NIPS’17: Proc. 31st Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 6000–6010.Google Scholar
Vielma JP (2015) Mixed integer linear programming formulation techniques. SIAM Rev. 57(1):3–57.Crossref, Google Scholar
Vielma JP (2019) Small and strong formulations for unions of convex sets from the Cayley embedding. Math. Programming 177(1–2):21–53.Crossref, Google Scholar
Vielma JP, Ahmed S, Nemhauser G (2010) Mixed-integer models for nonseparable piecewise-linear optimization: Unifying framework and extensions. Oper. Res. 58(2):303–315.Link, Google Scholar
Villani MJ, Schoots N (2023) Any deep ReLU network is shallow. Preprint, submitted June 20, https://arxiv.org/abs/2306.11827.Google Scholar
Vincent JA, Schwager M (2021) Reachable polyhedral marching (RPM): A safety verification algorithm for robotic systems with deep neural network components. IEEE Internat. Conf. Robotics Automation (ICRA) (IEEE, Piscataway, NJ), 9029–9035.Google Scholar
Vinyals O, Ewalds T, Bartunov S, Georgiev P, Vezhnevets AS, Yeo M, Makhzani A, et al. (2017) StarCraft II: A new challenge for reinforcement learning. Preprint, submitted August 16, https://arxiv.org/abs/1708.04782.Google Scholar
Volpp M, Fröhlich LP, Fischer K, Doerr A, Falkner S, Hutter F, Daniel C (2020) Meta-learning acquisition functions for transfer learning in Bayesian optimization. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Wang Y (2022) Estimation and comparison of linear regions for ReLU networks. De Raedt L, ed. Proc. 31th Internat. Joint Conf. Artificial Intelligence (IJCAI), 3544–3550.Google Scholar
Wang S, Sun X (2005) Generalization of hinging hyperplanes. IEEE Trans. Inform. Theory 51(12):4425–4431.Crossref, Google Scholar
Wang K, Lozano L, Bergman D, Cardonha C (2021) A two-stage exact algorithm for optimization of neural network ensemble. Stuckey PJ, ed. Integration Constraint Programming Artificial Intelligence Oper. Res. CPAIOR 2021, Lecture Notes in Computer Science, vol. 12735 (Springer, Cham, Switzerland), 106–114.Google Scholar
Wang K, Lozano L, Cardonha C, Bergman D (2023) Optimizing over an ensemble of trained neural networks. INFORMS J. Comput. 35(3):652–674.Link, Google Scholar
Wang S, Pei K, Whitehouse J, Yang J, Jana S (2018a) Efficient formal safety analysis of neural networks. Bengio S, Beygelzimer A, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, eds. NIPS’18: Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 6369–6379.Google Scholar
Wang S, Pei K, Whitehouse J, Yang J, Jana S (2018b) Formal security analysis of neural networks using symbolic intervals. SEC’18: Proc. 27th USENIX Conf. Security Sympos. (USENIX Association, San Francisco), 1599–1614.Google Scholar
Weng J, Ahuja N, Huang T (1992) Cresceptron: A self-organizing neural network which grows adaptively. 2021 Internat. Joint Conf. Neural Networks (IJCNN), vol. 1 (IEEE, Piscataway, NJ), 576–581.Google Scholar
Weng L, Zhang H, Chen H, Song Z, Hsieh CJ, Daniel L, Boning D, Dhillon I (2018) Towards fast computation of certified robustness for ReLU networks. Dy J, Krause A, eds. Proc. 35th Internat. Conf. Machine Learn. (ICML), vol. 80 (PMLR, New York), 5276–5285.Google Scholar
Werbos P (1974) Beyond regression: New tools for prediction and analysis in the behavioral sciences. PhD thesis, Harvard University, Cambridge, MA.Google Scholar
Wicker MR, Heo J, Costabello L, Weller A (2023) Robust explanation constraints for neural networks. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Wicker M, Laurenti L, Patane A, Kwiatkowska M (2020) Probabilistic safety for Bayesian neural networks. Peters J, Sontag D, eds. Proc. 36th Conf. Uncertainty Artificial Intelligence (UAI), vol. 124 (PMLR, New York), 1198–1207.Google Scholar
Wicker MR, Sosnin P, Shilov I, Janik A, Mueller MN, de Montjoye YA, Weller A, Tsay C (2025) Certification for differentially private prediction in gradient-based training. Singh A, Fazel M, Hsu D, Lacoste-Julien S, Berkenkamp F, Maharaj T, Wagstaff K, Zhu J, eds. Proc. 42nd Internat. Conf. Machine Learn. (ICML), vol. 267 (PMLR, New York), 66726–66745.Google Scholar
Wilhelm ME, Wang C, Stuber MD (2022) Convex and concave envelopes of artificial neural network activation functions for deterministic global optimization. J. Global Optim. 85:569–594.Crossref, Google Scholar
Witte C, Lüthje JT, Schulte V, Mitsos A, Bongartz D (2025) Deterministic global optimization with trained neural networks: Is the envelope of single neurons worth it? Preprint, submitted April 28, https://optimization-online.org/2025/04/deterministic-global-optimization-with-trained-neural-networks-is-the-envelope-of-single-neurons-worth-it/.Google Scholar
Wong E, Kolter Z (2018) Provable defenses against adversarial examples via the convex outer adversarial polytope. Dy J, Krause A, eds. Proc. 35th Internat. Conf. Machine Learn. (ICML), vol. 80 (PMLR, New York), 5286–5295.Google Scholar
Wong E, Schmidt F, Metzen JH, Kolter JZ (2018) Scaling provable adversarial defenses. Bengio S, Beygelzimer A, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, eds. NIPS’18: Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 8410–8419.Google Scholar
Wright SJ (2018) Optimization algorithms for data analysis. Mahoney MW, Duchi JC, Gilbert AC, eds. The Mathematics of Data, IAS/Park City Mathematics Series, vol. 25 (American Mathematical Society, Providence, RI), 49–98.Crossref, Google Scholar
Wu G, Say B, Sanner S (2020) Scalable planning with deep neural network learned transition models. J. Artificial Intelligence Res. 68:571–606.Crossref, Google Scholar
Wu H, Zeljić A, Katz G, Barrett C (2022) Efficient neural network analysis with sum-of-infeasibilities. Fisman D, Rosu G, eds. Tools Algorithms Construction Anal. Systems. TACAS 2025, Lecture Notes in Computer Science, vol. 13243 (Springer, Cham, Switzerland), 143–163.Google Scholar
Xiang W, Tran HD, Johnson TT (2017) Reachable set computation and safety verification for neural networks with ReLU activations. Preprint, submitted December 21, https://arxiv.org/abs/1712.08163.Google Scholar
Xiao K, Tjeng V, Shafiullah N, Madry A (2019) Training for faster adversarial robustness verification via inducing ReLU stability. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Xie Y, Chen G, Li Q (2020a) A general computational framework to measure the expressiveness of complex networks using a tighter upper bound of linear regions. Preprint, submitted December 8, https://arxiv.org/abs/2012.04428.Google Scholar
Xie Q, Luong MT, Hovy E, Le QV (2020b) Self-training with noisy student improves ImageNet classification. 2018 IEEE/CVF Conf. Comput. Vision Pattern Recognition (CVPR) (IEEE, Piscataway, NJ), 10684–10695.Google Scholar
Xie J, Shen Z, Zhang C, Wang B, Qian H (2020c) Efficient projection-free online methods with stochastic recursive gradient. Proc. AAAI Conf. Artificial Intelligence 34(4):6446–6453.Crossref, Google Scholar
Xiong H, Huang L, Yu M, Liu L, Zhu F, Shao L (2020) On the number of linear regions of convolutional neural networks. Daumé III H, Singh A, eds. Proc. 37th Internat. Conf. Machine Learn. (ICML), vol. 119 (PMLR, New York), 10514–10523.Google Scholar
Xu S, Vaughan J, Chen J, Zhang A, Sudjianto A (2022) Traversing the local polytopes of ReLU neural networks. AAAI Workshop AdvML (AAAI Press, Washington, DC).Google Scholar
Yang D, Balaprakash P, Leyffer S (2022) Modeling design and control problems involving neural network surrogates. Comput. Optim. Appl. 83:759–800.Crossref, Google Scholar
Yang X, Tran HD, Xiang W, Johnson T (2020) Reachability analysis for feed-forward neural networks using face lattices. Preprint, submitted March 2, https://arxiv.org/abs/2003.01226.Google Scholar
Yang X, Yamaguchi T, Tran HD, Hoxha B, Johnson TT, Prokhorov D (2021) Reachability analysis of convolutional neural networks. Preprint, submitted June 22, https://arxiv.org/abs/2106.12074.Google Scholar
Yarotsky D (2017) Error bounds for approximations with deep ReLU networks. Neural Networks 94:103–114.Crossref, Google Scholar
Zakrzewski RR (2001) Verification of a trained neural network accuracy. Internat. Joint Conf. Neural Networks (IJCNN), vol. 3 (IEEE, Piscataway, NJ), 1657–1662.Google Scholar
Zanotti L (2025) Linear-size neural network representation of piecewise affine functions in R2. Preprint, submitted March 17, https://arxiv.org/abs/2503.13001.Google Scholar
Zaslavsky T (1975) Facing Up to Arrangements: Face-Count Formulas for Partitions of Space by Hyperplanes (American Mathematical Society, Providence, RI).Google Scholar
Zhang R (2020) On the tightness of semidefinite relaxations for certifying robustness to adversarial examples. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. NIPS’20: Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 3808–3820.Google Scholar
Zhang X, Wu D (2020) Empirical studies on the properties of linear regions in deep neural networks. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Zhang L, Naitzat G, Lim LH (2018a) Tropical geometry of deep neural networks. Dy J, Krause A, eds. Proc. 35th Internat. Conf. Machine Learn. (ICML), vol. 80 (PMLR, New York), 5824–5832.Google Scholar
Zhang A, Lipton ZC, Li M, Smola AJ (2023a) Dive into Deep Learning (Cambridge University Press, Cambridge, UK).Google Scholar
Zhang H, Weng TW, Chen PY, Hsieh CJ, Daniel L (2018b) Efficient neural network robustness certification with general activation functions. Bengio S, Beygelzimer A, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, eds. NIPS’18: Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 4944–4953.Google Scholar
Zhang S, Campos J, Feldmann C, Walz D, Sandfort F, Mathea M, Tsay C, Misener R (2023b) Optimizing over trained GNNs via symmetry breaking. Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S, eds. NIPS’23: Proc. 37th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 44898–44924.Google Scholar
Zhang H, Chen H, Xiao C, Gowal S, Stanforth R, Li B, Boning D, Hsieh CJ (2020) Towards stable and efficient training of verifiably robust neural networks. Internat. Conf. Learn. Representations (ICLR).Google Scholar
Zhang H, Wang S, Xu K, Li L, Li B, Jana S, Hsieh CJ, Kolter JZ (2022) General cutting planes for bound-propagation-based neural network verification. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. NIPS’22: Proc. 36th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 1656–1670.Google Scholar
Zhao S, Tsay C, Kronqvist J (2023) Model-based feature selection for neural networks: A mixed-integer programming approach. Sellmann M, Tierney K, eds. Learn. Intelligent Optim. LION 2023, Lecture Notes in Computer Science, vol. 14286 (Springer, Cham, Switzerland), 223–238.Google Scholar
Zhou S, Schoellig AP (2019) An analysis of the expressiveness of deep neural network architectures based on their Lipschitz constants. Preprint, submitted January 18, https://arxiv.org/abs/1912.11511.Google Scholar
Zhou D, Brix C, Hanasusanto GA, Zhang H (2024) Scalable neural network verification with branch-and-bound inferred cutting planes. Globerson A, Mackey L, Belgrave D, Fan A, Paquet U, Tomczak J, Zhang C, eds. NIPS’24: Proc. 38th Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 29324–29353.Google Scholar
Zhu R, Lin B, Tang H (2020) Bounding the number of linear regions in local area for neural networks with ReLU activations. Preprint, submitted July 14, https://arxiv.org/abs/2007.06803.Google Scholar
Zou D, Balan R, Singh M (2019) On Lipschitz bounds of general convolutional neural networks. IEEE Trans. Inform. Theory 66(3):1738–1759.Crossref, Google Scholar

Article Information

Received: August 09, 2024
Accepted: December 14, 2025
Published Online: March 12, 2026

Cite as

Joey Huchette, Gonzalo Muñoz, Thiago Serra, Calvin Tsay (2026) When Deep Learning Meets Polyhedral Theory: A Survey. INFORMS Journal on Computing 0(0).

https://doi.org/10.1287/ijoc.2024.0902

Acknowledgments

The authors thank Christian Tjandraatmadja and Toon Tran for early feedback on the manuscript and for asking questions that helped shape it. The authors also thank the three anonymous reviewers for their thorough reviews and recommendations.
