CCMR: High Resolution Optical Flow Estimation via Coarse-to-Fine Context-Guided Motion Reasoning
November 05, 2023 Β· Declared Dead Β· π IEEE Workshop/Winter Conference on Applications of Computer Vision
Authors
Azin Jahedi, Maximilian Luz, Marc Rivinius, AndrΓ©s Bruhn
arXiv ID
2311.02661
Category
cs.CV: Computer Vision
Citations
13
Venue
IEEE Workshop/Winter Conference on Applications of Computer Vision
Repository
https://github.com/cv-stuttgart
Last Checked
1 month ago
Abstract
Attention-based motion aggregation concepts have recently shown their usefulness in optical flow estimation, in particular when it comes to handling occluded regions. However, due to their complexity, such concepts have been mainly restricted to coarse-resolution single-scale approaches that fail to provide the detailed outcome of high-resolution multi-scale networks. In this paper, we hence propose CCMR: a high-resolution coarse-to-fine approach that leverages attention-based motion grouping concepts to multi-scale optical flow estimation. CCMR relies on a hierarchical two-step attention-based context-motion grouping strategy that first computes global multi-scale context features and then uses them to guide the actual motion grouping. As we iterate both steps over all coarse-to-fine scales, we adapt cross covariance image transformers to allow for an efficient realization while maintaining scale-dependent properties. Experiments and ablations demonstrate that our efforts of combining multi-scale and attention-based concepts pay off. By providing highly detailed flow fields with strong improvements in both occluded and non-occluded regions, our CCMR approach not only outperforms both the corresponding single-scale attention-based and multi-scale attention-free baselines by up to 23.0% and 21.6%, respectively, it also achieves state-of-the-art results, ranking first on KITTI 2015 and second on MPI Sintel Clean and Final. Code and trained models are available at https://github.com/cv-stuttgart /CCMR.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Computer Vision
π
π
Old Age
π
π
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
R.I.P.
π»
Ghosted
You Only Look Once: Unified, Real-Time Object Detection
π
π
Old Age
SSD: Single Shot MultiBox Detector
π
π
Old Age
Squeeze-and-Excitation Networks
R.I.P.
π»
Ghosted
Rethinking the Inception Architecture for Computer Vision
Died the same way β π 404 Not Found
R.I.P.
π
404 Not Found
Deep High-Resolution Representation Learning for Visual Recognition
R.I.P.
π
404 Not Found
HuggingFace's Transformers: State-of-the-art Natural Language Processing
R.I.P.
π
404 Not Found
CCNet: Criss-Cross Attention for Semantic Segmentation
R.I.P.
π
404 Not Found