Trimming Down Large Spiking Vision Transformers via Heterogeneous Quantization Search

December 07, 2024 ยท Declared Dead ยท ๐Ÿ› IEEE International Conference on Application-Specific Systems, Architectures, and Processors

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Boxun Xu, Yufei Song, Peng Li arXiv ID 2412.05505 Category cs.NE: Neural & Evolutionary Cross-listed cs.AI Citations 2 Venue IEEE International Conference on Application-Specific Systems, Architectures, and Processors Last Checked 4 months ago
Abstract
Spiking Neural Networks (SNNs) are amenable to deployment on edge devices and neuromorphic hardware due to their lower dissipation. Recently, SNN-based transformers have garnered significant interest, incorporating attention mechanisms akin to their counterparts in Artificial Neural Networks (ANNs) while demonstrating excellent performance. However, deploying large spiking transformer models on resource-constrained edge devices such as mobile phones, still poses significant challenges resulted from the high computational demands of large uncompressed high-precision models. In this work, we introduce a novel heterogeneous quantization method for compressing spiking transformers through layer-wise quantization. Our approach optimizes the quantization of each layer using one of two distinct quantization schemes, i.e., uniform or power-of-two quantification, with mixed bit resolutions. Our heterogeneous quantization demonstrates the feasibility of maintaining high performance for spiking transformers while utilizing an average effective resolution of 3.14-3.67 bits with less than a 1% accuracy drop on DVS Gesture and CIFAR10-DVS datasets. It attains a model compression rate of 8.71x-10.19x for standard floating-point spiking transformers. Moreover, the proposed approach achieves a significant energy reduction of 5.69x, 8.72x, and 10.2x while maintaining high accuracy levels of 85.3%, 97.57%, and 80.4% on N-Caltech101, DVS-Gesture, and CIFAR10-DVS datasets, respectively.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Neural & Evolutionary

๐Ÿ”ฎ ๐Ÿ”ฎ The Ethereal

LSTM: A Search Space Odyssey

Klaus Greff, Rupesh Kumar Srivastava, ... (+3 more)

cs.NE ๐Ÿ› IEEE TNNLS ๐Ÿ“š 6.0K cites 11 years ago

Died the same way โ€” ๐Ÿ‘ป Ghosted