Redundancy-Adaptive Multimodal Learning for Imperfect Data
October 23, 2023 Β· Declared Dead Β· π arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Mengxi Chen, Jiangchao Yao, Linyu Xing, Yu Wang, Ya Zhang, Yanfeng Wang
arXiv ID
2310.14496
Category
cs.MM: Multimedia
Citations
8
Venue
arXiv.org
Last Checked
3 months ago
Abstract
Multimodal models trained on complete modality data often exhibit a substantial decrease in performance when faced with imperfect data containing corruptions or missing modalities. To address this robustness challenge, prior methods have explored various approaches from aspects of augmentation, consistency or uncertainty, but these approaches come with associated drawbacks related to data complexity, representation, and learning, potentially diminishing their overall effectiveness. In response to these challenges, this study introduces a novel approach known as the Redundancy-Adaptive Multimodal Learning (RAML). RAML efficiently harnesses information redundancy across multiple modalities to combat the issues posed by imperfect data while remaining compatible with the complete modality. Specifically, RAML achieves redundancy-lossless information extraction through separate unimodal discriminative tasks and enforces a proper norm constraint on each unimodal feature representation. Furthermore, RAML explicitly enhances multimodal fusion by leveraging fine-grained redundancy among unimodal features to learn correspondences between corrupted and untainted information. Extensive experiments on various benchmark datasets under diverse conditions have consistently demonstrated that RAML outperforms state-of-the-art methods by a significant margin.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Multimedia
π
π
Old Age
R.I.P.
π»
Ghosted
Viewport-Adaptive Navigable 360-Degree Video Delivery
π
π
The Cartographer
A Comprehensive Survey on Cross-modal Retrieval
π
π
The Cartographer
An Overview of Cross-media Retrieval: Concepts, Methodologies, Benchmarks and Challenges
R.I.P.
π»
Ghosted
A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding
R.I.P.
π»
Ghosted
Video Generation From Text
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted