Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size

Christoph Hertrich
Corresponding Author
Christoph Hertrich
[email protected]
https://orcid.org/0000-0001-5646-8567
Department of Mathematics, London School of Economics and Political Science, London, WC2A 2AE, United Kingdom;
Search for more papers by this author
,
Martin Skutella
Martin Skutella
[email protected]
https://orcid.org/0000-0002-9814-1703
Institute of Mathematics, Technische Universität Berlin, 10587 Berlin, Germany
Search for more papers by this author

Christoph Hertrich

Corresponding Author

Christoph Hertrich

[email protected]

https://orcid.org/0000-0001-5646-8567

Department of Mathematics, London School of Economics and Political Science, London, WC2A 2AE, United Kingdom;

Search for more papers by this author

Martin Skutella

[email protected]

https://orcid.org/0000-0002-9814-1703

Institute of Mathematics, Technische Universität Berlin, 10587 Berlin, Germany

Search for more papers by this author

Published Online:4 May 2023https://doi.org/10.1287/ijoc.2021.0225

References

Arora R, Basu A, Mianjy P, Mukherjee A (2018) Understanding deep neural networks with rectified linear units. Proc. Internat. Conf. on Learn. Representations.Google Scholar
Arora S, Barak B (2009) Computational Complexity: A Modern Approach (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
Bahdanau D, Cho KH, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. Proc. 3rd Internat. Conf. on Learn. Representations.Google Scholar
Bellman R (1962) Dynamic programming treatment of the travelling salesman problem. J. ACM 9(1):61–63.Crossref, Google Scholar
Bello I, Pham H, Le QV, Norouzi M, Bengio S (2016) Neural combinatorial optimization with reinforcement learning. Preprint, submitted November 29, https://arxiv.org/abs/1611.09940.Google Scholar
Bengio Y, Lodi A, Prouvost A (2021) Machine learning for combinatorial optimization: A methodological tour d’horizon. Eur. J. Oper. Res. 290(2):405–421.Crossref, Google Scholar
Bertschinger D, Hertrich C, Jungeblut P, Miltzow T, Weber S (2022) Training fully connected neural networks is ∃R-complete. Preprint, submitted April 4, https://arxiv.org/abs/2204.01368.Google Scholar
Bertsimas D, Weismantel R (2005) Optimization over Integers (Dynamic Ideas).Google Scholar
Cappart Q, Chételat D, Khalil EB, Lodi A, Morris C, Veličković P (2021) Combinatorial optimization and reasoning with graph neural networks. Proc. 30th Internat. Joint Conf. on Artificial Intelligence, 4348–4355.Google Scholar
Cormen TH, Leiserson CE, Rivest RL, Stein C (2001) Introduction to Algorithms, 2nd ed. (MIT Press, Cambridge, MA).Google Scholar
Eldan R, Shamir O (2016) The power of depth for feedforward neural networks. Proc. Conf. on Learn. Theory, 907–940.Google Scholar
Emami P, Ranka S (2018) Learning permutations with sinkhorn policy gradient. Preprint, submitted May 18, https://arxiv.org/abs/1805.07010.Google Scholar
Fischetti M, Lodi A (2010) On the knapsack closure of 0-1 integer linear programs. Electronic Notes Discrete Math. 36:799–804.Crossref, Google Scholar
Froese V, Hertrich C, Niedermeier R (2022) The computational complexity of ReLU network training parameterized by data dimensionality. J. Artificial Intelligence Res. 74:1775–1790.Crossref, Google Scholar
Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. Gordon GJ, Dunson DB, Dudìk N, eds. Proc. 14th Internat. Conf. on Artificial Intelligence and Statist., vol. 15, 315–323.Google Scholar
Goel S, Klivans AR, Manurangsi P, Reichman D (2021) Tight hardness results for training depth-2 ReLU networks. Lee JR, eds. Proc. 12th Innovations in Theoretical Comput. Sci. Conf., LIPIcs Series, vol. 185 (Schloss Dagstuhl - Leibniz-Zentrum für Informatik), 22:1–22:14.Google Scholar
Goodfellow I, Warde-Farley D, Mirza M, Courville A, Bengio Y (2013) Maxout networks. Proc. Internat. Conf. on Machine Learn, 1319–1327.Google Scholar
Graves A, Fernández S, Schmidhuber J (2007) Multi-dimensional recurrent neural networks. Proc. Internat. Conf. on Artificial Neural Networks, 549–558.Google Scholar
Graves A, Mohamed Ar, Hinton G (2013) Speech recognition with deep recurrent neural networks. Proc. IEEE Internat. Conf. on Acoustics, Speech and Signal Processing (IEEE, Piscataway, NJ), 6645–6649.Crossref, Google Scholar
Gu S, Hao T (2018) A pointer network based deep learning algorithm for 0–1 knapsack problem. Proc. 10th Internat. Conf. on Advanced Comput. Intelligence, 473–477.Google Scholar
Haase CA, Hertrich C, Loho G (2023) Lower bounds on the depth of integral ReLU neural networks via lattice polytopes. Proc. 11th Internat. Conf. on Learn. Representations.Google Scholar
Hanin B (2019) Universal function approximation by deep neural nets with bounded width and ReLU activations. Mathematics 7(10):992.Crossref, Google Scholar
Hanin B, Rolnick D (2019) Complexity of linear regions in deep networks. Proc. Internat. Conf. on Machine Learn. (ICML, San Diego), 2596–2604.Google Scholar
Hanin B, Sellke M (2017) Approximating continuous functions by ReLU nets of minimal width. Preprint, submitted October 31, https://arxiv.org/abs/1710.11278.Google Scholar
Held M, Karp RM (1962) A dynamic programming approach to sequencing problems. J. Soc. Industry Appl. Math. 10(1):196–210.Crossref, Google Scholar
Hertrich C, Sering L (2021) ReLU neural networks of polynomial size for exact maximum flow computation. Proc. 24th Conf. on Integer Programming and Combinatorial Optim.Google Scholar
Hertrich C, Skutella M (2021) Provably good solutions to the knapsack problem via neural networks of bounded size. Proc. Conf. AAAI Artificial Intelligence 35:7685–7693.Crossref, Google Scholar
Hertrich C, Basu A, Di Summa M, Skutella M (2021) Toward lower bounds on the depth of ReLU neural networks. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Adv. Neural Information Processing Systems, vol. 34 (NeurIPS, San Diego), 3336–3348.Google Scholar
Hochbaum DS (1997) Various notions of approximations: Good, better, best, and more. Hochbaum DS, ed. Approximation Algorithms for NP-Hard Problems (PWS Publishing Co.), 346–446.Google Scholar
Hopfield JJ, Tank DW (1985) “Neural” computation of decisions in optimization problems. Biological Cybernetics 52(3):141–152.Crossref, Google Scholar
Karp RM (1972) Reducibility among combinatorial problems. Miller RE, Thatcher JW, eds. Complexity of Computer Computations (Springer, Berlin) 85–103.Crossref, Google Scholar
Kellerer H, Pferschy U, Pisinger D (2004) Knapsack Problems (Springer, Berlin).Crossref, Google Scholar
Khalife S, Basu A (2022) Neural networks with linear threshold activations: Structure and algorithms. Proc. Internat. Conf. on Integer Programming and Combinatorial Optim. (Springer, Berlin), 347–360.Google Scholar
Khalil E, Dai H, Zhang Y, Dilkina B, Song L (2017) Learning combinatorial optimization algorithms over graphs, Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Adv. Neural Information Processing Systems, vol. 30 (NeurIPS, San Diego), 6348–6358.Google Scholar
Kleinberg J, Tardos E (2006) Algorithm Design (Pearson Education).Google Scholar
Kool W, van Hoof H, Welling M (2019) Attention, learn to solve routing problems! Proc. Internat. Conf. on Learn. Representations.Google Scholar
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444.Crossref, Google Scholar
Leighton FT (1991) Introduction to Parallel Algorithms and Architectures: Arrays ⋅ Trees ⋅ Hypercubes (Morgan-Kaufmann).Google Scholar
Li D, Liu J, Lee D, Seyedmazloom A, Kaushik G, Lee K, Park N (2021) A novel method to solve neural knapsack problems. Proc. Internat. Conf. on Machine Learn., 6414–6424.Google Scholar
Liang S, Srikant R (2017) Why deep neural networks for function approximation? Proc. Internat. Conf. on Learn. Representations.Google Scholar
Lodi A, Zarpellon G (2017) On learning and branching: A survey. TOP 25(2):207–236.Crossref, Google Scholar
Lombardi M, Milano M (2018) Boosting combinatorial problem modeling with machine learning. Proc. 27th Internat. Joint Conf. on Artificial Intelligence, 5472–5478.Google Scholar
Martello S, Toth P (1990) Knapsack Problems: Algorithms and Computer Implementations (John Wiley & Sons, Hoboken, NJ).Google Scholar
Montufar GF, Pascanu R, Cho K, Bengio Y (2014) On the number of linear regions of deep neural networks, vol. 27. Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. Adv. Neural Information Processing Systems, vol. 27 (NeurIPS, San Diego), 2924–2932.Google Scholar
Mukherjee A, Basu A (2017) Lower bounds over Boolean inputs for deep neural networks with ReLU gates. Preprint, submitted November 8, https://arxiv.org/abs/1711.03073.Google Scholar
Nguyen Q, Mukkamala MC, Hein M (2018) Neural networks should be wide enough to learn disconnected decision regions. Proc. Internat. Conf. on Machine Learn. (ICML, San Diego), 3737–3746.Google Scholar
Nowak A, Villar S, Bandeira AS, Bruna J (2018) Revised note on learning quadratic assignment with graph neural networks. Proc. IEEE Data Sci. Workshop (IEEE, Piscataway, NJ), 1–5.Google Scholar
Ohlsson M, Peterson C, Söderberg B (1993) Neural networks for optimization problems with inequality constraints: The knapsack problem. Neural Comput. 5(2):331–339.Crossref, Google Scholar
Pascanu R, Montufar G, Bengio Y (2014) On the number of inference regions of deep feed forward networks with piece-wise linear activations. Proc. Internat. Conf. on Learn. Representations.Google Scholar
Raghu M, Poole B, Kleinberg J, Ganguli S, Dickstein JS (2017) On the expressive power of deep neural networks. Proc. Internat. Conf. on Machine Learn. (ICML, San Diego), 2847–2854.Google Scholar
Safran I, Shamir O (2017) Depth-width tradeoffs in approximating natural functions with neural networks. Proc. Internat. Conf. on Machine Learn., 2979–2987.Google Scholar
Serra T, Tjandraatmadja C, Ramalingam S (2018) Bounding and counting linear regions of deep neural networks. Proc. Internat. Conf. on Machine Learn. (ICML, San Diego), 4565–4573.Google Scholar
Shalev-Shwartz S, Ben-David S (2014) Understanding Machine Learning: From Theory to Algorithms (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
Smith KA (1999) Neural networks for combinatorial optimization: A review of more than a decade of research. INFORMS J. Comput. 11(1):15–34.Link, Google Scholar
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Ghahramani Z, Welling M, Cortes C, Lawrence L, Weinberger KQ, eds. Adv. Neural Information Processing Systems, vol. 27 (NeurIPS, San Diego).Google Scholar
Telgarsky M (2015) Representation benefits of deep feedforward networks. Preprint, submitted September 27, https://arxiv.org/abs/1509.08101.Google Scholar
Telgarsky M (2016) Benefits of depth in neural networks. Proc. Conf. on Learn. Theory, 1517–1539.Google Scholar
Vazirani VV (2001) Approximation Algorithms (Springer, Berlin).Google Scholar
Veličković P, Ying R, Padovano M, Hadsell R, Blundell C (2020) Neural execution of graph algorithms. Proc. Internat. Conf. on Learn. Representations.Google Scholar
Vinyals O, Fortunato M, Jaitly N (2015) Pointer networks. Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R, eds. Adv. Neural Information Processing Systems, vol. 28 (NeurIPS, San Diego), 2692–2700.Google Scholar
Williamson DP, Shmoys DB (2011) The Design of Approximation Algorithms (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
Woeginger GJ (2000) When does a dynamic programming formulation guarantee the existence of a fully polynomial time approximation scheme (FPTAS)? INFORMS J. Comput. 12(1):57–74.Link, Google Scholar
Xu K, Li J, Zhang M, Du SS, Kawarabayashi Ki, Jegelka S (2020a) What can neural networks reason about? Proc. Internat. Conf. on Learn. Representations.Google Scholar
Xu S, Panwar SS, Kodialam MS, Lakshman TV (2020b) Deep neural network approximated dynamic programming for combinatorial optimization. Proc. AAAI Conf. on Artificial Intelligence (AAAI Press, Washington, DC), 1684–1691.Google Scholar
Yang F, Jin T, Liu TY, Sun X, Zhang J (2018) Boosting dynamic programming with neural networks for solving np-hard problems. Proc. Asian Conf. on Machine Learn., 726–739.Google Scholar
Yarotsky D (2017) Error bounds for approximations with deep relu networks. Neural Networks 94:103–114.Crossref, Google Scholar
Ziegelmann M (2001) Constrained shortest paths and related problems. PhD thesis, Universität des Saarlandes Saarbrücken, Germany.Google Scholar

cover image INFORMS Journal on Computing

Volume 35, Issue 5

September-October 2023

Pages 909-1213, C2

Article Information

Metrics

Information

Received:August 26, 2021
Accepted:March 18, 2023
Published Online:May 04, 2023

Cite as

Christoph Hertrich, Martin Skutella (2023) Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size. INFORMS Journal on Computing 35(5):1079-1097.

https://doi.org/10.1287/ijoc.2021.0225

Keywords

Acknowledgments

The authors thank the referees for insightful comments that helped to improve the presentation of this paper. A large portion of this research was conducted while the first author was affiliated with TU Berlin and supported by DFG-GRK 2434 “Facets of Complexity.”

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size

References

Volume 35, Issue 5

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News