PaCX-MAE: Physiology-Augmented Chest X-Ray Masked Autoencoder
June 01, 2026 · Grace Period · the ICML 2026 3rd Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences
Authors
Yancheng Liu, Kenichi Maeda, Manan Pancholy
arXiv ID
2606.01537
Category
cs.CV: Computer Vision
Cross-listed
cs.LG
Citations
0
Venue
the ICML 2026 3rd Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences
Abstract
Clinical diagnosis often requires combining imaging with physiological measurements, yet deployed models typically operate on unimodal data. We present PaCX-MAE, a cross-modal distillation framework that injects physiological priors into chest X-ray (CXR) encoders while remaining strictly unimodal at inference. PaCX-MAE augments in-domain masked autoencoding with a dual contrastive-predictive objective, aligning CXR representations with paired ECG and laboratory embeddings. Extensive evaluation across nine benchmarks demonstrates consistent improvements over domain-specific MAE, particularly on physiology-dependent tasks (e.g., +2.7 AUROC on MedMod; +6.5 F1 on VinDr). The method proves highly label-efficient in the 1% regime and preserves anatomical fidelity, achieving parity with MAE on segmentation tasks. Zero-shot and attention analyses confirm that PaCX-MAE successfully learns to attend to physiological indicators, such as the cardiac silhouette, absent in standard visual pretraining.