Hölder++: Improving the Quality-Coherence Trade-off in Multimodal VAEs
June 11, 2026 · Grace Period · 🏛 ICML 2026
Authors
Huyen Vo, María Martínez-García, Isabel Valera
arXiv ID
2606.13381
Category
cs.LG: Machine Learning
Citations
0
Venue
ICML 2026
Abstract
Existing approaches for multimodal variational autoencoders (VAEs) face a trade-off between generative quality and coherence, i.e., they struggle to generate realistic and diverse samples that are, at the same time, semantically consistent across modalities. Recent work shows that using a simple approximation to Hölder pooling as an aggregation method improves coherence over the SOTA MMVAE+, despite assuming a single shared representation across all modalities. Yet this approach slightly compromises sample diversity. Inspired by this insight, we propose Hölder++, a novel multimodal VAE that improves the generative quality-coherence trade-off through: (i) the first implementation of Hölder pooling without any approximation for multimodal VAEs; (ii) an extended architecture that models distinct shared and private (i.e., modality-specific) representations (Hölder+); and (iii) hierarchical inference that further enhances the disentanglement between the shared and private representations (Hölder++). Our experiments corroborate that Hölder++ consistently improves the generative quality-coherence trade-off, yields more structured latent spaces, and learns shared representations that are informative for downstream tasks.
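The abstract does not spell out how Hölder pooling is computed. As a rough illustration only, the sketch below assumes it refers to a weighted power (Hölder) mean of the per-modality densities, applied pointwise and then renormalized. The function names, the 1-D Gaussian experts, and the grid-based normalization are all hypothetical choices made for this example and are not taken from the paper.

```python
import math

def gaussian_pdf(z, mean, std):
    """Density of a 1-D Gaussian at z."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def holder_mean(values, alpha, weights=None):
    """Weighted power (Hölder) mean of strictly positive values.

    alpha = 1 gives the arithmetic mean, alpha -> 0 the geometric mean,
    and alpha = -1 the harmonic mean.
    """
    n = len(values)
    if weights is None:
        weights = [1.0 / n] * n
    if abs(alpha) < 1e-12:
        # Limit alpha -> 0: weighted geometric mean.
        return math.exp(sum(w * math.log(v) for w, v in zip(weights, values)))
    return sum(w * v ** alpha for w, v in zip(weights, values)) ** (1.0 / alpha)

def holder_pool_on_grid(experts, alpha, grid):
    """Pool expert densities pointwise, then renormalize on a uniform grid.

    `experts` is a list of (mean, std) pairs: a hypothetical stand-in
    for per-modality encoder posteriors.
    """
    step = grid[1] - grid[0]
    pooled = [
        holder_mean([gaussian_pdf(z, m, s) for m, s in experts], alpha)
        for z in grid
    ]
    total = sum(pooled) * step  # Riemann-sum normalizer
    return [p / total for p in pooled]

# Two unit-variance experts centred at -1 and +1, pooled on [-6, 6].
grid = [-6.0 + 0.01 * i for i in range(1201)]
experts = [(-1.0, 1.0), (1.0, 1.0)]
mixture_like = holder_pool_on_grid(experts, 1.0, grid)  # arithmetic mean of densities
product_like = holder_pool_on_grid(experts, 0.0, grid)  # normalized geometric mean
```

Under this assumption, the exponent interpolates between two familiar aggregation rules: alpha = 1 averages the expert densities (a mixture-of-experts form), while alpha -> 0 takes their normalized geometric mean (a product-of-experts form). For the two experts above, the latter collapses to a single standard Gaussian centred at 0.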
📜 Similar Papers
In the same crypt — Machine Learning
Continuous control with deep reinforcement learning
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
SGDR: Stochastic Gradient Descent with Warm Restarts