Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks
May 27, 2019 ยท Declared Dead ยท ๐ IEEE Transactions on Emerging Topics in Computing
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Xiaoliang Dai, Hongxu Yin, Niraj K. Jha
arXiv ID
1905.10952
Category
cs.NE: Neural & Evolutionary
Cross-listed
cs.LG
Citations
33
Venue
IEEE Transactions on Emerging Topics in Computing
Last Checked
3 months ago
Abstract
Deep neural networks (DNNs) have become a widely deployed model for numerous machine learning applications. However, their fixed architecture, substantial training cost, and significant model redundancy make it difficult to efficiently update them to accommodate previously unseen data. To solve these problems, we propose an incremental learning framework based on a grow-and-prune neural network synthesis paradigm. When new data arrive, the neural network first grows new connections based on the gradients to increase the network capacity to accommodate new data. Then, the framework iteratively prunes away connections based on the magnitude of weights to enhance network compactness, and hence recover efficiency. Finally, the model rests at a lightweight DNN that is both ready for inference and suitable for future grow-and-prune updates. The proposed framework improves accuracy, shrinks network size, and significantly reduces the additional training cost for incoming data compared to conventional approaches, such as training from scratch and network fine-tuning. For the LeNet-300-100 and LeNet-5 neural network architectures derived for the MNIST dataset, the framework reduces training cost by up to 64% (63%) and 67% (63%) compared to training from scratch (network fine-tuning), respectively. For the ResNet-18 architecture derived for the ImageNet dataset and DeepSpeech2 for the AN4 dataset, the corresponding training cost reductions against training from scratch (network fine-tunning) are 64% (60%) and 67% (62%), respectively. Our derived models contain fewer network parameters but achieve higher accuracy relative to conventional baselines.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Neural & Evolutionary
๐ฎ
๐ฎ
The Ethereal
R.I.P.
๐ป
Ghosted
Deep Learning using Rectified Linear Units (ReLU)
R.I.P.
๐ป
Ghosted
Generative Adversarial Text to Image Synthesis
R.I.P.
๐ป
Ghosted
Regularized Evolution for Image Classifier Architecture Search
R.I.P.
๐ป
Ghosted
Temporal Ensembling for Semi-Supervised Learning
๐
๐
Old Age
Learning Structured Sparsity in Deep Neural Networks
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted