Multiplier-free In-Memory Vector-Matrix Multiplication Using Distributed Arithmetic
October 02, 2025 Β· Declared Dead Β· π International Symposium on Embedded Multicore/Many-core Systems-on-Chip
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Felix Zeller, John Reuben, Dietmar Fey
arXiv ID
2510.02099
Category
cs.AR: Hardware Architecture
Cross-listed
cs.NE
Citations
1
Venue
International Symposium on Embedded Multicore/Many-core Systems-on-Chip
Last Checked
3 months ago
Abstract
Vector-Matrix Multiplication (VMM) is the fundamental and frequently required computation in inference of Neural Networks (NN). Due to the large data movement required during inference, VMM can benefit greatly from in-memory computing. However, ADC/DACs required for in-memory VMM consume significant power and area. `Distributed Arithmetic (DA)', a technique in computer architecture prevalent in 1980s was used to achieve inner product or dot product of two vectors without using a hard-wired multiplier when one of the vectors is a constant. In this work, we extend the DA technique to multiply an input vector with a constant matrix. By storing the sum of the weights in memory, DA achieves VMM using shift-and-add circuits in the periphery of ReRAM memory. We verify functional and also estimate non-functional properties (latency, energy, area) by performing transistor-level simulations. Using energy-efficient sensing and fine grained pipelining, our approach achieves 4.5 x less latency and 12 x less energy than VMM performed in memory conventionally by bit slicing. Furthermore, DA completely eliminated the need for power-hungry ADCs which are the main source of area and energy consumption in the current VMM implementations in memory.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Hardware Architecture
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Corona: System Implications of Emerging Nanophotonic Technology
R.I.P.
π»
Ghosted
A scalable multi-core architecture with heterogeneous memory structures for Dynamic Neuromorphic Asynchronous Processors (DYNAPs)
R.I.P.
π»
Ghosted
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
R.I.P.
π»
Ghosted
Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks
R.I.P.
π»
Ghosted
SpArch: Efficient Architecture for Sparse Matrix Multiplication
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
π»
Ghosted