Semantically Stable Image Composition Analysisvia Saliency and Gradient Vector Flow Fusion

April 14, 2026 ยท Grace Period ยท ๐Ÿ› ICPR 2026

โณ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors Armin Dadras, Robert Sablatnig, Franziska Proksa, Markus Seidl arXiv ID 2604.16500 Category cs.CV: Computer Vision Citations 0 Venue ICPR 2026
Abstract
The reliable computational assessment of photographic composition requires features that are discriminative of spatial layout yet robust to semantic content. This paper proposes a low-level representation grounded in the assumption that composition can be understood as the flow of visual attention across geometric structure. We introduce VFCNet, which fuses saliency and edge information into a gradient vector flow (GVF) field. The model computes dual-stream GVF representations, integrates them via attention, and extracts multi-scale flow features with a DINOv3 backbone. VFCNet achieves state-of-the-art performance on the PICD benchmark (CDA-1: 0.683, CDA-2: 0.629), improving by 33.1\% and 36.1\% over the previous best method. We also show that a simple classifier on self-supervised DINOv3 features substantially outperforms more sophisticated, composition-specialized models. Code is available at https://github.com/ADadras/VFCNet
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Computer Vision

๐ŸŒ… ๐ŸŒ… Old Age

Fast R-CNN

Ross Girshick

cs.CV ๐Ÿ› ICCV ๐Ÿ“š 27.7K cites 11 years ago