SparseSpikformer: A Co-Design Framework for Token and Weight Pruning in Spiking Transformer
November 15, 2023 Β· Declared Dead Β· π IEEE International Conference on Acoustics, Speech, and Signal Processing
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yue Liu, Shanlin Xiao, Bo Li, Zhiyi Yu
arXiv ID
2311.08806
Category
cs.CV: Computer Vision
Cross-listed
cs.AI
Citations
5
Venue
IEEE International Conference on Acoustics, Speech, and Signal Processing
Last Checked
4 months ago
Abstract
As the third-generation neural network, the Spiking Neural Network (SNN) has the advantages of low power consumption and high energy efficiency, making it suitable for implementation on edge devices. More recently, the most advanced SNN, Spikformer, combines the self-attention module from Transformer with SNN to achieve remarkable performance. However, it adopts larger channel dimensions in MLP layers, leading to an increased number of redundant model parameters. To effectively decrease the computational complexity and weight parameters of the model, we explore the Lottery Ticket Hypothesis (LTH) and discover a very sparse ($\ge$90%) subnetwork that achieves comparable performance to the original network. Furthermore, we also design a lightweight token selector module, which can remove unimportant background information from images based on the average spike firing rate of neurons, selecting only essential foreground image tokens to participate in attention calculation. Based on that, we present SparseSpikformer, a co-design framework aimed at achieving sparsity in Spikformer through token and weight pruning techniques. Experimental results demonstrate that our framework can significantly reduce 90% model parameters and cut down Giga Floating-Point Operations (GFLOPs) by 20% while maintaining the accuracy of the original model.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Computer Vision
π
π
Old Age
π
π
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
π
π
Old Age
SSD: Single Shot MultiBox Detector
π
π
Old Age
Squeeze-and-Excitation Networks
π
π
Old Age
Fast R-CNN
π
π
Old Age
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted