AsyncTaichi: On-the-fly Inter-kernel Optimizations for Imperative and Spatially Sparse Programming
December 15, 2020 Β· Declared Dead Β· + Add venue
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yuanming Hu, Mingkuan Xu, Ye Kuang, FrΓ©do Durand
arXiv ID
2012.08141
Category
cs.PL: Programming Languages
Cross-listed
cs.AI,
cs.GR
Citations
0
Last Checked
4 months ago
Abstract
Leveraging spatial sparsity has become a popular approach to accelerate 3D computer graphics applications. Spatially sparse data structures and efficient sparse kernels (such as parallel stencil operations on active voxels), are key to achieve high performance. Existing work focuses on improving performance within a single sparse computational kernel. We show that a system that looks beyond a single kernel, plus additional domain-specific sparse data structure analysis, opens up exciting new space for optimizing sparse computations. Specifically, we propose a domain-specific data-flow graph model of imperative and sparse computation programs, which describes kernel relationships and enables easy analysis and optimization. Combined with an asynchronous execution engine that exposes a wide window of kernels, the inter-kernel optimizer can then perform effective sparse computation optimizations, such as eliminating unnecessary voxel list generations and removing voxel activation checks. These domain-specific optimizations further make way for classical general-purpose optimizations that are originally challenging to directly apply to computations with sparse data structures. Without any computational code modification, our new system leads to $4.02\times$ fewer kernel launches and $1.87\times$ speed up on our GPU benchmarks, including computations on Eulerian grids, Lagrangian particles, meshes, and automatic differentiation.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Programming Languages
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions
R.I.P.
π»
Ghosted
Glow: Graph Lowering Compiler Techniques for Neural Networks
R.I.P.
π»
Ghosted
Learnable Programming: Blocks and Beyond
R.I.P.
π»
Ghosted
Scenic: A Language for Scenario Specification and Scene Generation
R.I.P.
π»
Ghosted
Vandal: A Scalable Security Analysis Framework for Smart Contracts
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted