Inference of Quantized Neural Networks on Heterogeneous All-Programmable Devices

June 21, 2018 · Declared Dead · 🏛 Design, Automation and Test in Europe

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Thomas B. Preußer, Giulio Gambardella, Nicholas Fraser, Michaela Blott
arXiv ID: 1806.08085
Category: cs.NE: Neural & Evolutionary
Citations: 44
Venue: Design, Automation and Test in Europe
Last Checked: 3 months ago

Abstract

Neural networks have established themselves as a generic and powerful means of approaching challenging problems such as image classification, object detection, or decision making. Their successful deployment rests on an enormous demand for compute. Quantizing the network parameters and the processed data has proven a valuable measure for reducing the cost of network inference, so effectively that the feasible scope of applications extends even into the embedded domain. This paper describes the construction of a real-time object detection system that operates on a live video stream and runs on an embedded all-programmable device. The presented case illustrates how the required processing is tamed and parallelized across both the CPU cores and the programmable logic, and how the most suitable resources and powerful extensions, such as NEON vectorization, are leveraged for the individual processing steps. The result is an extended Darknet framework that implements a fully integrated, end-to-end solution from video capture through object annotation to video output, applying neural network inference at different quantization levels and running at 16 frames per second on an embedded Zynq UltraScale+ (XCZU3EG) platform.
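
The abstract does not say which quantization scheme the authors use, so the following is only a minimal sketch of the general idea, assuming a generic uniform symmetric quantizer. The names `quantize` and `dequantize` and the sample weights are hypothetical and do not come from the paper or from Darknet. It shows how the same weights map to integer codes at two different bit widths, and that the reconstruction error stays within half a quantization step.

```python
def quantize(values, bits):
    """Map floats to signed integer codes of the given bit width.

    Assumption: uniform symmetric quantization with one scale per tensor.
    This is an illustrative scheme, not the one used in the paper.
    """
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed symmetric range")
    qmax = 2 ** (bits - 1) - 1              # largest representable code
    peak = max(abs(v) for v in values)      # largest magnitude in the input
    scale = peak / qmax if peak > 0 else 1.0
    # Round to the nearest code and clamp into [-qmax, qmax].
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


weights = [0.6, -1.0, 0.2, 0.0]             # hypothetical sample weights

codes8, scale8 = quantize(weights, 8)       # 8-bit codes in [-127, 127]
codes2, scale2 = quantize(weights, 2)       # 2-bit codes in [-1, 1]

print(codes8, codes2)
```

Lower bit widths shrink storage and arithmetic cost at the price of a coarser step size: here the 2-bit version can only represent -1, 0, and 1 times the scale.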