Accelerating Discrete Wavelet Transforms on Parallel Architectures

April 27, 2017 · Declared Dead · 🏛 arXiv.org

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: David Barina, Michal Kula, Michal Matysek, Pavel Zemcik
arXiv ID: 1704.08657
Category: cs.PF (Performance)
Cross-listed: cs.GR, cs.MM
Citations: 4
Venue: arXiv.org
Last Checked: 2 months ago

Abstract

The 2-D discrete wavelet transform (DWT) lies at the heart of many image-processing algorithms. To date, several studies have compared the performance of this transform on various shared-memory parallel architectures, especially graphics processing units (GPUs). All of these studies, however, considered only separable calculation schemes. We show that corresponding separable parts can be merged into non-separable units, which halves the number of steps. In addition, we introduce an optional optimization approach that reduces the number of arithmetic operations. The discussed schemes were adapted to the OpenCL framework and to pixel shaders, and then evaluated on GPUs from the two largest vendors. We demonstrate the performance of the proposed non-separable methods by comparing them with existing separable schemes. The non-separable schemes outperform their separable counterparts on numerous setups, especially with pixel shaders.
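The merging of separable parts into non-separable units can be illustrated with a small sketch. The abstract does not name a wavelet or give the exact schemes, so everything below is an assumption made for illustration: the CDF 5/3 lifting filters, periodic boundary handling, and all function names (`split`, `separable`, `nonseparable`, and the helpers) are hypothetical and are not the authors' implementation. The sketch shows one way four separable lifting steps (horizontal predict, horizontal update, vertical predict, vertical update) can be regrouped into two non-separable steps that produce the same coefficients.

```python
import random

def pair(m, dy, dx, w):
    """w * (m[i][j] + m[i + dy][j + dx]) with periodic wrap-around (assumed boundary rule)."""
    rows, cols = len(m), len(m[0])
    return [[w * (m[i][j] + m[(i + dy) % rows][(j + dx) % cols])
             for j in range(cols)] for i in range(rows)]

# CDF 5/3 lifting filters (an assumed wavelet; the abstract does not name one).
def px(m): return pair(m, 0, 1, 0.5)    # horizontal predict filter
def py(m): return pair(m, 1, 0, 0.5)    # vertical predict filter
def ux(m): return pair(m, 0, -1, 0.25)  # horizontal update filter
def uy(m): return pair(m, -1, 0, 0.25)  # vertical update filter

def add(x, y):
    return [[p + q for p, q in zip(xr, yr)] for xr, yr in zip(x, y)]

def sub(x, y):
    return [[p - q for p, q in zip(xr, yr)] for xr, yr in zip(x, y)]

def split(img):
    """Polyphase split into (a, h, v, d): even/odd rows by even/odd columns."""
    a = [row[0::2] for row in img[0::2]]
    h = [row[1::2] for row in img[0::2]]
    v = [row[0::2] for row in img[1::2]]
    d = [row[1::2] for row in img[1::2]]
    return a, h, v, d

def separable(a, h, v, d):
    """Four steps: each one reads the result of the previous step."""
    h, d = sub(h, px(a)), sub(d, px(v))   # 1. horizontal predict
    a, v = add(a, ux(h)), add(v, ux(d))   # 2. horizontal update
    v, d = sub(v, py(a)), sub(d, py(h))   # 3. vertical predict
    a, h = add(a, uy(v)), add(h, uy(d))   # 4. vertical update
    return a, h, v, d

def nonseparable(a, h, v, d):
    """Two steps: both predicts merged into one, both updates merged into one."""
    # 1. merged predict, computed entirely from the step's inputs
    h1 = sub(h, px(a))
    v1 = sub(v, py(a))
    d1 = add(sub(sub(d, px(v)), py(h)), px(py(a)))
    # 2. merged update, computed entirely from the predict outputs
    a2 = add(add(add(a, ux(h1)), uy(v1)), ux(uy(d1)))
    return a2, add(h1, uy(d1)), add(v1, ux(d1)), d1

def max_diff(xs, ys):
    """Largest absolute difference across corresponding subbands."""
    return max(abs(p - q) for x, y in zip(xs, ys)
               for xr, yr in zip(x, y) for p, q in zip(xr, yr))

random.seed(0)
image = [[float(random.randint(0, 255)) for _ in range(8)] for _ in range(8)]
planes = split(image)
print(max_diff(separable(*planes), nonseparable(*planes)) < 1e-9)
```

In this sketch the merged steps contain 2-D terms such as `px(py(a))`, which is why they are non-separable. Each of the two steps depends only on the output of the step before it, so a parallel implementation would need fewer synchronization points, at the cost of extra arithmetic per step in this unoptimized form.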

📜 Similar Papers

In the same crypt — Performance

R.I.P. 👻 Ghosted

GraphMat: High performance graph analytics made productive

Narayanan Sundaram, Nadathur Rajagopalan Satish, ... (+5 more)

cs.PF 🏛 VLDB 📚 339 cites 11 years ago

R.I.P. 👻 Ghosted

A General Formula for the Stationary Distribution of the Age of Information and Its Application to Single-Server Queues

Yoshiaki Inoue, Hiroyuki Masuyama, ... (+2 more)

cs.PF 🏛 IEEE TIT 📚 257 cites 8 years ago

R.I.P. 👻 Ghosted

AI Benchmark: All About Deep Learning on Smartphones in 2019

Andrey Ignatov, Radu Timofte, ... (+7 more)

cs.PF 🏛 ICCV W 📚 239 cites 6 years ago

R.I.P. 👻 Ghosted

BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning

Yuqing Zhu, Jianxun Liu, ... (+6 more)

cs.PF 🏛 SoCC 📚 237 cites 8 years ago

R.I.P. 👻 Ghosted

Online normalizer calculation for softmax

Maxim Milakov, Natalia Gimelshein

cs.PF 🏛 arXiv 📚 152 cites 7 years ago

R.I.P. 👻 Ghosted

CLTune: A Generic Auto-Tuner for OpenCL Kernels

Cedric Nugteren, Valeriu Codreanu

cs.PF 🏛 ICEMS 📚 132 cites 9 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 5 years ago

R.I.P. 👻 Ghosted

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, ... (+19 more)

cs.LG 🏛 NeurIPS 📚 49.7K cites 6 years ago

R.I.P. 👻 Ghosted

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

cs.LG 🏛 KDD 📚 49.2K cites 10 years ago

R.I.P. 👻 Ghosted

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago