Training Fully Connected Neural Networks is $\exists\mathbb{R}$-Complete

April 04, 2022 · The Ethereal · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Daniel Bertschinger, Christoph Hertrich, Paul Jungeblut, Tillmann Miltzow, Simon Weber arXiv ID 2204.01368 Category cs.CC: Computational Complexity Cross-listed cs.LG, cs.NE Citations 35 Venue Neural Information Processing Systems Last Checked 1 month ago

Abstract

We consider the problem of finding weights and biases for a two-layer fully connected neural network to fit a given set of data points as well as possible, also known as EmpiricalRiskMinimization. Our main result is that the associated decision problem is $\exists\mathbb{R}$-complete, that is, polynomial-time equivalent to determining whether a multivariate polynomial with integer coefficients has any real roots. Furthermore, we prove that algebraic numbers of arbitrarily large degree are required as weights to be able to train some instances to optimality, even if all data points are rational. Our result already applies to fully connected instances with two inputs, two outputs, and one hidden layer of ReLU neurons. Thereby, we strengthen a result by Abrahamsen, Kleist and Miltzow [NeurIPS 2021]. A consequence of this is that a combinatorial search algorithm like the one by Arora, Basu, Mianjy and Mukherjee [ICLR 2018] is impossible for networks with more than one output dimension, unless $\mathsf{NP}=\exists\mathbb{R}$.