R.I.P.
๐ป
Ghosted
Partitioning Unstructured Sparse Tensor Algebra for Load-Balanced Parallel Execution
April 19, 2026 ยท Grace Period ยท + Add venue
Authors
Atharva Chougule, Alexander J Root, Rubens Lacouture, Bobby Yan, Rohan Yadav, Fredrik Kjolstad
arXiv ID
2604.17198
Category
cs.PL: Programming Languages
Citations
0
Abstract
Sparse tensor algebra is challenging to efficiently parallelize due to the irregular, data-dependent, and potentially skewed structure of sparse computation. We propose the first partitioning algorithm that provably load balances the computation of any sparse tensor algebra expression across parallel execution units. Our algorithm generalizes parallel merging algorithms to any number of operands, and to multi-dimensional, hierarchical sparse data structures. We implement our algorithm within an existing sparse tensor algebra compilation framework to automatically generate parallel sparse tensor algebra kernels that target multi-core CPUs and GPUs. We show that our generated code is competitive with hand-implemented parallelization strategies used by vendor libraries like Intel MKL and NVIDIA cuSPARSE (geo-means of $0.73$--$3.4\times$) and \textsc{Taco} (geo-means of $1.0$--$2.4\times$), and significantly outperforms general-purpose strategies for sparse tensor expressions where specialized algorithms have not been developed (geo-means of $2.0$--$6.4\times$).
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Programming Languages
R.I.P.
๐ป
Ghosted
Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions
R.I.P.
๐ป
Ghosted
Glow: Graph Lowering Compiler Techniques for Neural Networks
R.I.P.
๐ป
Ghosted
Learnable Programming: Blocks and Beyond
R.I.P.
๐ป
Ghosted
Scenic: A Language for Scenario Specification and Scene Generation
R.I.P.
๐ป
Ghosted