Controlling Decision Drift in Multimodal Sentiment Analysis with Missing Modalities

May 16, 2026 · Grace Period · 🏛 IJCAI 2026

โณ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors: Chenglizhao Chen, Yuchen Cao, Xinyu Liu, Mengke Song, Guisheng Zhang, Xiaomin Yu
arXiv ID: 2605.16889
Category: cs.CV (Computer Vision)
Citations: 0
Venue: IJCAI 2026
Abstract
Multimodal sentiment analysis relies on textual, acoustic, and visual signals, yet real-world data often suffer from missing modalities and quality imbalance. Existing methods generate features for missing modalities from the available ones, but differences in expression mechanisms and sentiment dynamics across modalities may cause the generated features to deviate from the true distributions and mislead prediction. In addition, unreliable modalities may dominate fusion, causing representation shift across modality combinations and unstable sentiment representations. To address these challenges, we propose a two-level reference alignment framework. The framework introduces stable references at the feature representation level and the sentiment decision level to improve robustness when modalities are missing. First-level reference alignment uses complete-modality samples to constrain representations and align different modality combinations into a shared sentiment space. Second-level reference alignment enforces cross-modal consistency at the decision level by suppressing unreliable modalities through prototype retrieval and voting. As a result, the framework maintains stable and reliable sentiment predictions under diverse missing-modality patterns. Experiments on CMU-MOSI and CMU-MOSEI show consistent improvements across various missing-modality settings. Under full-modality input, the proposed method achieves state-of-the-art performance, with ACC of 86.28% and 85.88%, and F1 of 86.24% and 85.86%.
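The abstract gives no implementation details and no code has been released, so the following is only a rough sketch of what decision-level "prototype retrieval and voting" could look like, not the authors' method. It assumes that each available modality has already been embedded into a shared sentiment space, that one prototype vector exists per sentiment label, and that a modality is treated as unreliable when its best prototype match falls below a similarity threshold. The names `retrieve_label`, `vote`, `min_sim`, and all vectors are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_label(feature, prototypes):
    """Return (label, similarity) of the prototype closest to `feature`."""
    best_label, best_sim = None, -2.0
    for label, proto in prototypes.items():
        sim = cosine(feature, proto)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim


def vote(features, prototypes, min_sim=0.8):
    """Similarity-weighted vote over the available modalities.

    `features` maps a modality name to its vector, or to None if missing.
    Modalities whose best match is below `min_sim` are treated as
    unreliable and do not vote (an assumption of this sketch).
    Returns the winning label, or None if no modality could vote.
    """
    votes = Counter()
    for feat in features.values():
        if feat is None:  # missing modality: skip it
            continue
        label, sim = retrieve_label(feat, prototypes)
        if sim >= min_sim:
            votes[label] += sim
    if not votes:
        return None
    return max(votes, key=votes.get)


# Hypothetical two-dimensional prototypes and inputs.
prototypes = {"positive": [1.0, 0.0], "negative": [0.0, 1.0]}

full_input = {"text": [0.9, 0.1], "audio": [0.8, 0.3], "vision": [0.5, 0.6]}
audio_missing = {"text": [0.9, 0.1], "audio": None, "vision": [0.5, 0.6]}
nothing_available = {"text": None, "audio": None, "vision": None}

full_prediction = vote(full_input, prototypes)
partial_prediction = vote(audio_missing, prototypes)
empty_prediction = vote(nothing_available, prototypes)
```

In this toy setup the ambiguous vision vector sits roughly between the two prototypes, so it falls below the threshold and is excluded from the vote. The prediction therefore stays the same whether or not the audio modality is present, which is the kind of stability across missing-modality patterns the abstract describes.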
