Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks

December 05, 2017 · Declared Dead · 🏛 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Long Chen, Hanwang Zhang, Jun Xiao, Wei Liu, Shih-Fu Chang arXiv ID 1712.01928 Category cs.CV: Computer Vision Citations 310 Venue 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Last Checked 2 months ago

Abstract

We propose a novel framework called Semantics-Preserving Adversarial Embedding Network (SP-AEN) for zero-shot visual recognition (ZSL), where test images and their classes are both unseen during training. SP-AEN aims to tackle the inherent problem --- semantic loss --- in the prevailing family of embedding-based ZSL, where some semantics would be discarded during training if they are non-discriminative for training classes, but could become critical for recognizing test classes. Specifically, SP-AEN prevents the semantic loss by introducing an independent visual-to-semantic space embedder which disentangles the semantic space into two subspaces for the two arguably conflicting objectives: classification and reconstruction. Through adversarial learning of the two subspaces, SP-AEN can transfer the semantics from the reconstructive subspace to the discriminative one, accomplishing the improved zero-shot recognition of unseen classes. Comparing with prior works, SP-AEN can not only improve classification but also generate photo-realistic images, demonstrating the effectiveness of semantic preservation. On four popular benchmarks: CUB, AWA, SUN and aPY, SP-AEN considerably outperforms other state-of-the-art methods by an absolute performance difference of 12.2\%, 9.3\%, 4.0\%, and 3.6\% in terms of harmonic mean values