Fully-parallel Convolutional Neural Network Hardware

June 22, 2020 · Declared Dead · 🏛 arXiv.org

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Christiam F. Frasser, Pablo Linares-Serrano, V. Canals, Miquel Roca, T. Serrano-Gotarredona, Josep L. Rossello arXiv ID 2006.12439 Category cs.NE: Neural & Evolutionary Citations 1 Venue arXiv.org Last Checked 4 months ago

Abstract

A new trans-disciplinary knowledge area, Edge Artificial Intelligence or Edge Intelligence, is beginning to receive a tremendous amount of interest from the machine learning community due to the ever increasing popularization of the Internet of Things (IoT). Unfortunately, the incorporation of AI characteristics to edge computing devices presents the drawbacks of being power and area hungry for typical machine learning techniques such as Convolutional Neural Networks (CNN). In this work, we propose a new power-and-area-efficient architecture for implementing Articial Neural Networks (ANNs) in hardware, based on the exploitation of correlation phenomenon in Stochastic Computing (SC) systems. The architecture purposed can solve the difficult implementation challenges that SC presents for CNN applications, such as the high resources used in binary-tostochastic conversion, the inaccuracy produced by undesired correlation between signals, and the stochastic maximum function implementation. Compared with traditional binary logic implementations, experimental results showed an improvement of 19.6x and 6.3x in terms of speed performance and energy efficiency, for the FPGA implementation. We have also realized a full VLSI implementation of the proposed SC-CNN architecture demonstrating that our optimization achieve a 18x area reduction over previous SC-DNN architecture VLSI implementation in a comparable technological node. For the first time, a fully-parallel CNN as LENET-5 is embedded and tested in a single FPGA, showing the benefits of using stochastic computing for embedded applications, in contrast to traditional binary logic implementations.