Navigating Local Minima in Quantized Spiking Neural Networks
February 15, 2022 · Entered Twilight · International Conference on Artificial Intelligence Circuits and Systems
Repo contents: .gitignore, LICENSE, README.md, dvs, earlystopping.py, evaluate.py, extract_test_set_accuracy.py, fmnist, mnist, plot_results.py, quickstart.ipynb, requirements.txt, set_all_seeds.py
Authors
Jason K. Eshraghian, Corey Lammie, Mostafa Rahimi Azghadi, Wei D. Lu
arXiv ID
2202.07221
Category
cs.LG: Machine Learning
Cross-listed
cs.NE
Citations
19
Venue
International Conference on Artificial Intelligence Circuits and Systems
Repository
https://github.com/jeshraghian/QSNNs
⭐ 53
Last Checked
2 months ago
Abstract
Spiking and Quantized Neural Networks (NNs) are becoming exceedingly important for hyper-efficient implementations of Deep Learning (DL) algorithms. However, these networks face challenges when trained using error backpropagation, due to the absence of gradient signals when applying hard thresholds. The broadly accepted trick to overcoming this is through the use of biased gradient estimators: surrogate gradients which approximate thresholding in Spiking Neural Networks (SNNs), and Straight-Through Estimators (STEs), which completely bypass thresholding in Quantized Neural Networks (QNNs). While noisy gradient feedback has enabled reasonable performance on simple supervised learning tasks, it is thought that such noise increases the difficulty of finding optima in loss landscapes, especially during the later stages of optimization. By periodically boosting the Learning Rate (LR) during training, we expect the network can navigate unexplored solution spaces that would otherwise be difficult to reach due to local minima, barriers, or flat surfaces. This paper presents a systematic evaluation of a cosine-annealed LR schedule coupled with weight-independent adaptive moment estimation as applied to Quantized SNNs (QSNNs). We provide a rigorous empirical evaluation of this technique on high precision and 4-bit quantized SNNs across three datasets, demonstrating (close to) state-of-the-art performance on the more complex datasets. Our source code is available at this link: https://github.com/jeshraghian/QSNNs.
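The training recipe the abstract describes, an adaptive-moment optimizer whose learning rate follows a cosine curve and is periodically boosted back to its starting value, can be approximated with stock PyTorch schedulers. Below is a minimal sketch using torch.optim.lr_scheduler.CosineAnnealingWarmRestarts; the stand-in model, synthetic data, and hyperparameters (T_0, eta_min, epoch count) are illustrative assumptions, not the authors' configuration, which lives in the linked repository.

```python
# Minimal sketch of the recipe in the abstract: Adam (standing in for the
# "weight-independent adaptive moment estimation" the paper names) combined
# with a cosine-annealed learning rate that is periodically boosted via
# warm restarts. Model, data, and hyperparameters are placeholders, not
# the authors' settings.
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in model and synthetic MNIST-shaped data, just to make this runnable.
net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
train_loader = DataLoader(
    TensorDataset(torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,))),
    batch_size=64,
)

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3, betas=(0.9, 0.999))

# Warm restarts: the LR decays along a cosine curve for T_0 epochs, then
# jumps back to the initial value -- the periodic "boost" the abstract
# argues helps the network escape local minima and flat loss regions.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=10, eta_min=1e-5
)

loss_fn = torch.nn.CrossEntropyLoss()
for epoch in range(50):
    for data, targets in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(net(data), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()  # advance the cosine schedule once per epoch
```

Each restart returns the learning rate to its initial value, which is the periodic boost the abstract credits with letting the network traverse barriers and flat surfaces that noisy surrogate/STE gradients would otherwise struggle to cross.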
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
Similar Papers
In the same crypt · Machine Learning
XGBoost: A Scalable Tree Boosting System · R.I.P. · 💻 Ghosted
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift · R.I.P. · 💻 Ghosted
Semi-Supervised Classification with Graph Convolutional Networks · R.I.P. · 💻 Ghosted
Proximal Policy Optimization Algorithms · R.I.P. · 💻 Ghosted