Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks

November 20, 2017 ยท Declared Dead ยท ๐Ÿ› Neural Information Processing Systems

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Ziming Zhang, Matthew Brand arXiv ID 1711.07354 Category stat.ML: Machine Learning (Stat) Cross-listed cs.CV, cs.LG Citations 79 Venue Neural Information Processing Systems Last Checked 3 months ago
Abstract
By lifting the ReLU function into a higher dimensional space, we develop a smooth multi-convex formulation for training feed-forward deep neural networks (DNNs). This allows us to develop a block coordinate descent (BCD) training algorithm consisting of a sequence of numerically well-behaved convex optimizations. Using ideas from proximal point methods in convex analysis, we prove that this BCD algorithm will converge globally to a stationary point with R-linear convergence rate of order one. In experiments with the MNIST database, DNNs trained with this BCD algorithm consistently yielded better test-set error rates than identical DNN architectures trained via all the stochastic gradient descent (SGD) variants in the Caffe toolbox.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Machine Learning (Stat)

๐Ÿ”ฎ ๐Ÿ”ฎ The Ethereal

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML ๐Ÿ› arXiv ๐Ÿ“š 12.0K cites 9 years ago

Died the same way โ€” ๐Ÿ‘ป Ghosted