The loss surface of deep linear networks viewed through the algebraic geometry lens
October 17, 2018 · Declared Dead · IEEE Transactions on Pattern Analysis and Machine Intelligence
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Dhagash Mehta, Tianran Chen, Tingting Tang, Jonathan D. Hauenstein
arXiv ID
1810.07716
Category
stat.ML: Machine Learning (Stat)
Cross-listed
cs.LG, math.AG
Citations
35
Venue
IEEE Transactions on Pattern Analysis and Machine Intelligence
Last Checked
4 months ago
Abstract
Using the viewpoint of modern computational algebraic geometry, we explore properties of the optimization landscapes of deep linear neural network models. After clarifying the various definitions of "flat" minima, we show that the geometrically flat minima, which are merely artifacts of residual continuous symmetries of the deep linear networks, can be straightforwardly removed by a generalized $L_2$ regularization. Then, we establish upper bounds on the number of isolated stationary points of these networks with the help of algebraic geometry. Using these upper bounds and a numerical algebraic geometry method, we find all stationary points for networks of modest depth and matrix size. We show that in the presence of non-zero regularization, deep linear networks indeed possess local minima which are not global minima. Our computational results clarify certain aspects of the loss surfaces of deep linear networks and provide novel insights.
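The continuous symmetry mentioned in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical two-layer linear network with weights `W1` and `W2`, random toy data `X` and `Y`, and a plain Frobenius-norm penalty with weight `lam`; none of these names or sizes come from the paper, and the plain penalty is a stand-in, not the paper's generalized $L_2$ regularization. The sketch only checks that the unregularized loss is unchanged when an invertible matrix `G` is inserted between the layers, and that the penalized loss does change under the same non-orthogonal rescaling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: fit Y with the two-layer linear map W2 @ W1 @ X.
X = rng.standard_normal((3, 20))   # inputs, one column per sample
Y = rng.standard_normal((2, 20))   # targets
W1 = rng.standard_normal((4, 3))   # first-layer weights
W2 = rng.standard_normal((2, 4))   # second-layer weights


def loss(W1, W2, lam=0.0):
    """Squared-error loss plus a plain Frobenius penalty (an assumption)."""
    residual = W2 @ W1 @ X - Y
    data_term = 0.5 * np.sum(residual ** 2)
    reg_term = 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    return data_term + reg_term


# An invertible, non-orthogonal rescaling of the hidden layer.
G = np.diag([2.0, 0.5, 3.0, 1.0])
W1_g = G @ W1                  # W1 -> G W1
W2_g = W2 @ np.linalg.inv(G)   # W2 -> W2 G^{-1}, so the product W2 W1 is unchanged

# Without regularization the two weight settings give the same loss,
# so every minimum lies on a continuous family of equivalent points.
unreg_gap = abs(loss(W1, W2) - loss(W1_g, W2_g))

# With the penalty switched on, the same transformation changes the loss.
reg_gap = abs(loss(W1, W2, lam=0.1) - loss(W1_g, W2_g, lam=0.1))

print(f"unregularized gap: {unreg_gap:.2e}")
print(f"regularized gap:   {reg_gap:.2e}")
```

An orthogonal `G` would leave the plain Frobenius penalty unchanged as well, which is consistent with the abstract's reference to residual symmetries and a generalized regularizer; the sketch does not attempt to reproduce that construction.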
Similar Papers
In the same crypt: Machine Learning (Stat)
The Ethereal
The Ethereal
Layer Normalization
The Ethereal
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
R.I.P.
Ghosted
Variational Inference with Normalizing Flows
The Cartographer
Towards A Rigorous Science of Interpretable Machine Learning
R.I.P.
Ghosted
Optimization Methods for Large-Scale Machine Learning
Died the same way: Ghosted
R.I.P.
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
Ghosted