DiracNets: Training Very Deep Neural Networks Without Skip-Connections

June 01, 2017 · Entered Twilight · 🏛 arXiv.org

"Last commit was 7.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, README.md, diracconv.py, diracnet-export.ipynb, diracnet.py, requirements.txt, test.py, train.py

Authors: Sergey Zagoruyko, Nikos Komodakis
arXiv ID: 1706.00388
Category: cs.CV (Computer Vision)
Citations: 127
Venue: arXiv.org
Repository: https://github.com/szagoruyko/diracnets (⭐ 588)
Last Checked: 2 months ago

Abstract

Deep neural networks with skip-connections, such as ResNet, show excellent performance in various image classification benchmarks. It has been observed, though, that the initial motivation behind them, training deeper networks, does not actually hold true, and that the benefits come from increased capacity rather than from depth. Motivated by this, and inspired by ResNet, we propose a simple Dirac weight parameterization, which allows us to train very deep plain networks without explicit skip-connections and achieve nearly the same performance. This parameterization has a minor computational cost at training time and no cost at all at inference, as both the Dirac parameterization and batch normalization can be folded into the convolutional filters, so that the network becomes a simple chain of convolution-ReLU pairs. We are able to match ResNet-1001 accuracy on CIFAR-10 with a 28-layer wider plain DiracNet, and closely match ResNets on ImageNet. Our parameterization also mostly eliminates the need for careful initialization in residual and non-residual networks. The code and models for our experiments are available at https://github.com/szagoruyko/diracnets
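
The folding idea in the abstract can be illustrated with a small sketch. It is not taken from the repository: it assumes a single-channel 1-D convolution in pure Python, and the helper names (`conv1d_same`, `dirac_kernel`, `dirac_parameterize`) and the scalar scales `a` and `b` are hypothetical simplifications. It only shows that adding a Dirac delta (identity) kernel to a filter gives the same output as an explicit skip-connection, which is why the skip can live inside the weights.

```python
def conv1d_same(x, kernel):
    """Cross-correlate x with an odd-length kernel, zero-padded to keep the length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(padded[i + j] * kernel[j] for j in range(k)) for i in range(len(x))]


def dirac_kernel(k):
    """Kernel with 1 at the center and 0 elsewhere; convolving with it returns the input."""
    delta = [0.0] * k
    delta[k // 2] = 1.0
    return delta


def dirac_parameterize(weights, a=1.0, b=1.0):
    """Assumed form for illustration: a * delta + b * weights (a, b are scalars here)."""
    delta = dirac_kernel(len(weights))
    return [a * d + b * w for d, w in zip(delta, weights)]


x = [1.0, 2.0, 3.0, 4.0]
w = [0.5, -1.0, 0.25]

# One plain convolution with the folded filter ...
folded = conv1d_same(x, dirac_parameterize(w))
# ... versus an explicit skip-connection around the original filter.
explicit = [xi + yi for xi, yi in zip(x, conv1d_same(x, w))]

print(folded)
print(explicit)
```

Because convolution is linear in the filter, the two lists agree, so at inference only the folded filter needs to be stored. The multi-channel 2-D case and the batch normalization folding mentioned in the abstract are not covered by this sketch.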