SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud

August 14, 2020 · Declared Dead · 🏛 ACM/IEEE International Conference on Mobile Computing and Networking

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Stefanos Laskaridis, Stylianos I. Venieris, Mario Almeida, Ilias Leontiadis, Nicholas D. Lane arXiv ID 2008.06402 Category cs.LG: Machine Learning Cross-listed cs.CV, cs.DC, stat.ML Citations 327 Venue ACM/IEEE International Conference on Mobile Computing and Networking Last Checked 3 months ago

Abstract

Despite the soaring use of convolutional neural networks (CNNs) in mobile applications, uniformly sustaining high-performance inference on mobile has been elusive due to the excessive computational demands of modern CNNs and the increasing diversity of deployed devices. A popular alternative comprises offloading CNN processing to powerful cloud-based servers. Nevertheless, by relying on the cloud to produce outputs, emerging mission-critical and high-mobility applications, such as drone obstacle avoidance or interactive applications, can suffer from the dynamic connectivity conditions and the uncertain availability of the cloud. In this paper, we propose SPINN, a distributed inference system that employs synergistic device-cloud computation together with a progressive inference method to deliver fast and robust CNN inference across diverse settings. The proposed system introduces a novel scheduler that co-optimises the early-exit policy and the CNN splitting at run time, in order to adapt to dynamic conditions and meet user-defined service-level requirements. Quantitative evaluation illustrates that SPINN outperforms its state-of-the-art collaborative inference counterparts by up to 2x in achieved throughput under varying network conditions, reduces the server cost by up to 6.8x and improves accuracy by 20.7% under latency constraints, while providing robust operation under uncertain connectivity conditions and significant energy savings compared to cloud-centric execution.