Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size

Published Online:https://doi.org/10.1287/ijoc.2021.0225

References

  • Arora R, Basu A, Mianjy P, Mukherjee A (2018) Understanding deep neural networks with rectified linear units. Proc. Internat. Conf. on Learn. Representations.Google Scholar
  • Arora S, Barak B (2009) Computational Complexity: A Modern Approach (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Bahdanau D, Cho KH, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. Proc. 3rd Internat. Conf. on Learn. Representations.Google Scholar
  • Bellman R (1962) Dynamic programming treatment of the travelling salesman problem. J. ACM 9(1):61–63.CrossrefGoogle Scholar
  • Bello I, Pham H, Le QV, Norouzi M, Bengio S (2016) Neural combinatorial optimization with reinforcement learning. Preprint, submitted November 29, https://arxiv.org/abs/1611.09940.Google Scholar
  • Bengio Y, Lodi A, Prouvost A (2021) Machine learning for combinatorial optimization: A methodological tour d’horizon. Eur. J. Oper. Res. 290(2):405–421.CrossrefGoogle Scholar
  • Bertschinger D, Hertrich C, Jungeblut P, Miltzow T, Weber S (2022) Training fully connected neural networks is ∃R-complete. Preprint, submitted April 4, https://arxiv.org/abs/2204.01368.Google Scholar
  • Bertsimas D, Weismantel R (2005) Optimization over Integers (Dynamic Ideas).Google Scholar
  • Cappart Q, Chételat D, Khalil EB, Lodi A, Morris C, Veličković P (2021) Combinatorial optimization and reasoning with graph neural networks. Proc. 30th Internat. Joint Conf. on Artificial Intelligence, 4348–4355.Google Scholar
  • Cormen TH, Leiserson CE, Rivest RL, Stein C (2001) Introduction to Algorithms, 2nd ed. (MIT Press, Cambridge, MA).Google Scholar
  • Eldan R, Shamir O (2016) The power of depth for feedforward neural networks. Proc. Conf. on Learn. Theory, 907–940.Google Scholar
  • Emami P, Ranka S (2018) Learning permutations with sinkhorn policy gradient. Preprint, submitted May 18, https://arxiv.org/abs/1805.07010.Google Scholar
  • Fischetti M, Lodi A (2010) On the knapsack closure of 0-1 integer linear programs. Electronic Notes Discrete Math. 36:799–804.CrossrefGoogle Scholar
  • Froese V, Hertrich C, Niedermeier R (2022) The computational complexity of ReLU network training parameterized by data dimensionality. J. Artificial Intelligence Res. 74:1775–1790.CrossrefGoogle Scholar
  • Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. Gordon GJ, Dunson DB, Dudìk N, eds. Proc. 14th Internat. Conf. on Artificial Intelligence and Statist., vol. 15, 315–323.Google Scholar
  • Goel S, Klivans AR, Manurangsi P, Reichman D (2021) Tight hardness results for training depth-2 ReLU networks. Lee JR, eds. Proc. 12th Innovations in Theoretical Comput. Sci. Conf., LIPIcs Series, vol. 185 (Schloss Dagstuhl - Leibniz-Zentrum für Informatik), 22:1–22:14.Google Scholar
  • Goodfellow I, Warde-Farley D, Mirza M, Courville A, Bengio Y (2013) Maxout networks. Proc. Internat. Conf. on Machine Learn, 1319–1327.Google Scholar
  • Graves A, Fernández S, Schmidhuber J (2007) Multi-dimensional recurrent neural networks. Proc. Internat. Conf. on Artificial Neural Networks, 549–558.Google Scholar
  • Graves A, Mohamed Ar, Hinton G (2013) Speech recognition with deep recurrent neural networks. Proc. IEEE Internat. Conf. on Acoustics, Speech and Signal Processing (IEEE, Piscataway, NJ), 6645–6649.CrossrefGoogle Scholar
  • Gu S, Hao T (2018) A pointer network based deep learning algorithm for 0–1 knapsack problem. Proc. 10th Internat. Conf. on Advanced Comput. Intelligence, 473–477.Google Scholar
  • Haase CA, Hertrich C, Loho G (2023) Lower bounds on the depth of integral ReLU neural networks via lattice polytopes. Proc. 11th Internat. Conf. on Learn. Representations.Google Scholar
  • Hanin B (2019) Universal function approximation by deep neural nets with bounded width and ReLU activations. Mathematics 7(10):992.CrossrefGoogle Scholar
  • Hanin B, Rolnick D (2019) Complexity of linear regions in deep networks. Proc. Internat. Conf. on Machine Learn. (ICML, San Diego), 2596–2604.Google Scholar
  • Hanin B, Sellke M (2017) Approximating continuous functions by ReLU nets of minimal width. Preprint, submitted October 31, https://arxiv.org/abs/1710.11278.Google Scholar
  • Held M, Karp RM (1962) A dynamic programming approach to sequencing problems. J. Soc. Industry Appl. Math. 10(1):196–210.CrossrefGoogle Scholar
  • Hertrich C, Sering L (2021) ReLU neural networks of polynomial size for exact maximum flow computation. Proc. 24th Conf. on Integer Programming and Combinatorial Optim.Google Scholar
  • Hertrich C, Skutella M (2021) Provably good solutions to the knapsack problem via neural networks of bounded size. Proc. Conf. AAAI Artificial Intelligence 35:7685–7693.CrossrefGoogle Scholar
  • Hertrich C, Basu A, Di Summa M, Skutella M (2021) Toward lower bounds on the depth of ReLU neural networks. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Adv. Neural Information Processing Systems, vol. 34 (NeurIPS, San Diego), 3336–3348.Google Scholar
  • Hochbaum DS (1997) Various notions of approximations: Good, better, best, and more. Hochbaum DS, ed. Approximation Algorithms for NP-Hard Problems (PWS Publishing Co.), 346–446.Google Scholar
  • Hopfield JJ, Tank DW (1985) “Neural” computation of decisions in optimization problems. Biological Cybernetics 52(3):141–152.CrossrefGoogle Scholar
  • Karp RM (1972) Reducibility among combinatorial problems. Miller RE, Thatcher JW, eds. Complexity of Computer Computations (Springer, Berlin) 85–103.CrossrefGoogle Scholar
  • Kellerer H, Pferschy U, Pisinger D (2004) Knapsack Problems (Springer, Berlin).CrossrefGoogle Scholar
  • Khalife S, Basu A (2022) Neural networks with linear threshold activations: Structure and algorithms. Proc. Internat. Conf. on Integer Programming and Combinatorial Optim. (Springer, Berlin), 347–360.Google Scholar
  • Khalil E, Dai H, Zhang Y, Dilkina B, Song L (2017) Learning combinatorial optimization algorithms over graphs, Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Adv. Neural Information Processing Systems, vol. 30 (NeurIPS, San Diego), 6348–6358.Google Scholar
  • Kleinberg J, Tardos E (2006) Algorithm Design (Pearson Education).Google Scholar
  • Kool W, van Hoof H, Welling M (2019) Attention, learn to solve routing problems! Proc. Internat. Conf. on Learn. Representations.Google Scholar
  • LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444.CrossrefGoogle Scholar
  • Leighton FT (1991) Introduction to Parallel Algorithms and Architectures: Arrays ⋅ Trees ⋅ Hypercubes (Morgan-Kaufmann).Google Scholar
  • Li D, Liu J, Lee D, Seyedmazloom A, Kaushik G, Lee K, Park N (2021) A novel method to solve neural knapsack problems. Proc. Internat. Conf. on Machine Learn., 6414–6424.Google Scholar
  • Liang S, Srikant R (2017) Why deep neural networks for function approximation? Proc. Internat. Conf. on Learn. Representations.Google Scholar
  • Lodi A, Zarpellon G (2017) On learning and branching: A survey. TOP 25(2):207–236.CrossrefGoogle Scholar
  • Lombardi M, Milano M (2018) Boosting combinatorial problem modeling with machine learning. Proc. 27th Internat. Joint Conf. on Artificial Intelligence, 5472–5478.Google Scholar
  • Martello S, Toth P (1990) Knapsack Problems: Algorithms and Computer Implementations (John Wiley & Sons, Hoboken, NJ).Google Scholar
  • Montufar GF, Pascanu R, Cho K, Bengio Y (2014) On the number of linear regions of deep neural networks, vol. 27. Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, eds. Adv. Neural Information Processing Systems, vol. 27 (NeurIPS, San Diego), 2924–2932.Google Scholar
  • Mukherjee A, Basu A (2017) Lower bounds over Boolean inputs for deep neural networks with ReLU gates. Preprint, submitted November 8, https://arxiv.org/abs/1711.03073.Google Scholar
  • Nguyen Q, Mukkamala MC, Hein M (2018) Neural networks should be wide enough to learn disconnected decision regions. Proc. Internat. Conf. on Machine Learn. (ICML, San Diego), 3737–3746.Google Scholar
  • Nowak A, Villar S, Bandeira AS, Bruna J (2018) Revised note on learning quadratic assignment with graph neural networks. Proc. IEEE Data Sci. Workshop (IEEE, Piscataway, NJ), 1–5.Google Scholar
  • Ohlsson M, Peterson C, Söderberg B (1993) Neural networks for optimization problems with inequality constraints: The knapsack problem. Neural Comput. 5(2):331–339.CrossrefGoogle Scholar
  • Pascanu R, Montufar G, Bengio Y (2014) On the number of inference regions of deep feed forward networks with piece-wise linear activations. Proc. Internat. Conf. on Learn. Representations.Google Scholar
  • Raghu M, Poole B, Kleinberg J, Ganguli S, Dickstein JS (2017) On the expressive power of deep neural networks. Proc. Internat. Conf. on Machine Learn. (ICML, San Diego), 2847–2854.Google Scholar
  • Safran I, Shamir O (2017) Depth-width tradeoffs in approximating natural functions with neural networks. Proc. Internat. Conf. on Machine Learn., 2979–2987.Google Scholar
  • Serra T, Tjandraatmadja C, Ramalingam S (2018) Bounding and counting linear regions of deep neural networks. Proc. Internat. Conf. on Machine Learn. (ICML, San Diego), 4565–4573.Google Scholar
  • Shalev-Shwartz S, Ben-David S (2014) Understanding Machine Learning: From Theory to Algorithms (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Smith KA (1999) Neural networks for combinatorial optimization: A review of more than a decade of research. INFORMS J. Comput. 11(1):15–34.LinkGoogle Scholar
  • Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Ghahramani Z, Welling M, Cortes C, Lawrence L, Weinberger KQ, eds. Adv. Neural Information Processing Systems, vol. 27 (NeurIPS, San Diego).Google Scholar
  • Telgarsky M (2015) Representation benefits of deep feedforward networks. Preprint, submitted September 27, https://arxiv.org/abs/1509.08101.Google Scholar
  • Telgarsky M (2016) Benefits of depth in neural networks. Proc. Conf. on Learn. Theory, 1517–1539.Google Scholar
  • Vazirani VV (2001) Approximation Algorithms (Springer, Berlin).Google Scholar
  • Veličković P, Ying R, Padovano M, Hadsell R, Blundell C (2020) Neural execution of graph algorithms. Proc. Internat. Conf. on Learn. Representations.Google Scholar
  • Vinyals O, Fortunato M, Jaitly N (2015) Pointer networks. Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R, eds. Adv. Neural Information Processing Systems, vol. 28 (NeurIPS, San Diego), 2692–2700.Google Scholar
  • Williamson DP, Shmoys DB (2011) The Design of Approximation Algorithms (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Woeginger GJ (2000) When does a dynamic programming formulation guarantee the existence of a fully polynomial time approximation scheme (FPTAS)? INFORMS J. Comput. 12(1):57–74.LinkGoogle Scholar
  • Xu K, Li J, Zhang M, Du SS, Kawarabayashi Ki, Jegelka S (2020a) What can neural networks reason about? Proc. Internat. Conf. on Learn. Representations.Google Scholar
  • Xu S, Panwar SS, Kodialam MS, Lakshman TV (2020b) Deep neural network approximated dynamic programming for combinatorial optimization. Proc. AAAI Conf. on Artificial Intelligence (AAAI Press, Washington, DC), 1684–1691.Google Scholar
  • Yang F, Jin T, Liu TY, Sun X, Zhang J (2018) Boosting dynamic programming with neural networks for solving np-hard problems. Proc. Asian Conf. on Machine Learn., 726–739.Google Scholar
  • Yarotsky D (2017) Error bounds for approximations with deep relu networks. Neural Networks 94:103–114.CrossrefGoogle Scholar
  • Ziegelmann M (2001) Constrained shortest paths and related problems. PhD thesis, Universität des Saarlandes Saarbrücken, Germany.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.