The loss surface of deep linear networks viewed through the algebraic geometry lens

October 17, 2018 ยท Declared Dead ยท ๐Ÿ› IEEE Transactions on Pattern Analysis and Machine Intelligence

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Dhagash Mehta, Tianran Chen, Tingting Tang, Jonathan D. Hauenstein arXiv ID 1810.07716 Category stat.ML: Machine Learning (Stat) Cross-listed cs.LG, math.AG Citations 35 Venue IEEE Transactions on Pattern Analysis and Machine Intelligence Last Checked 4 months ago
Abstract
By using the viewpoint of modern computational algebraic geometry, we explore properties of the optimization landscapes of the deep linear neural network models. After clarifying on the various definitions of "flat" minima, we show that the geometrically flat minima, which are merely artifacts of residual continuous symmetries of the deep linear networks, can be straightforwardly removed by a generalized $L_2$ regularization. Then, we establish upper bounds on the number of isolated stationary points of these networks with the help of algebraic geometry. Using these upper bounds and utilizing a numerical algebraic geometry method, we find all stationary points of modest depth and matrix size. We show that in the presence of the non-zero regularization, deep linear networks indeed possess local minima which are not the global minima. Our computational results clarify certain aspects of the loss surfaces of deep linear networks and provide novel insights.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Machine Learning (Stat)

๐Ÿ”ฎ ๐Ÿ”ฎ The Ethereal

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML ๐Ÿ› arXiv ๐Ÿ“š 12.0K cites 9 years ago

Died the same way โ€” ๐Ÿ‘ป Ghosted