End-to-End Optimized Image Compression with the Frequency-Oriented Transform

January 16, 2024 · Declared Dead · 🏛 Machine Vision and Applications

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Yuefeng Zhang, Kai Lin
arXiv ID: 2401.08194
Category: cs.CV (Computer Vision)
Cross-listed: cs.AI, cs.MM
Citations: 7
Venue: Machine Vision and Applications
Last Checked: 4 months ago

Abstract

Image compression is a significant challenge in the era of information explosion. Recent studies employing deep learning have demonstrated the superior performance of learning-based image compression methods over traditional codecs. However, an inherent challenge of these methods is their lack of interpretability. Following an analysis of the varying degrees of compression degradation across different frequency bands, we propose an end-to-end optimized image compression model built on a frequency-oriented transform. The proposed model consists of four components: spatial sampling, frequency-oriented transform, entropy estimation, and frequency-aware fusion. The frequency-oriented transform separates the original image signal into distinct frequency bands, a decomposition that aligns with human-interpretable concepts. Leveraging the non-overlapping hypothesis, the model enables scalable coding through the selective transmission of arbitrary frequency components. Extensive experiments demonstrate that our model outperforms all traditional codecs, including the next-generation standard H.266/VVC, on the MS-SSIM metric. Moreover, visual analysis tasks (i.e., object detection and semantic segmentation) are conducted to verify that the proposed compression method preserves semantic fidelity in addition to signal-level precision.
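
The abstract's idea of separating an image into non-overlapping frequency bands and transmitting only some of them can be illustrated with a small sketch. This is not the paper's learned frequency-oriented transform: it substitutes a fixed FFT with radial masks as a stand-in, and the function names `split_frequency_bands` and `reconstruct`, the equal-width band edges, and the choice of three bands are all assumptions made for illustration.

```python
import numpy as np


def split_frequency_bands(image, num_bands=3):
    """Split a 2-D grayscale image into non-overlapping radial frequency bands.

    Illustrative stand-in only: a fixed FFT with radial masks, not the
    learned transform described in the abstract.
    """
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Normalised radial frequency of every (shifted) FFT bin.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Equal-width band edges (an assumption; any partition would do).
    edges = np.linspace(0.0, radius.max(), num_bands + 1)

    bands = []
    for i in range(num_bands):
        lo, hi = edges[i], edges[i + 1]
        if i == num_bands - 1:
            mask = (radius >= lo) & (radius <= hi)  # include the outer edge
        else:
            mask = (radius >= lo) & (radius < hi)
        # The masks are radially symmetric, so each band is real-valued
        # up to floating-point error.
        band = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
        bands.append(band)
    return bands


def reconstruct(bands, keep):
    """Sum only the selected bands, mimicking selective transmission."""
    return sum(bands[i] for i in keep)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64))
    bands = split_frequency_bands(image, num_bands=3)

    full = reconstruct(bands, keep=[0, 1, 2])
    low_only = reconstruct(bands, keep=[0])
    print("max error, all bands:", np.abs(full - image).max())
    print("mean squared error, lowest band only:", np.mean((low_only - image) ** 2))
```

Because the masks partition the spectrum, the bands sum back to the original image, and any subset gives a coarser approximation. That is the property that makes scalable coding possible in this toy setting; the paper's model would additionally learn the transform and entropy-code each band.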