PaCX-MAE: Physiology-Augmented Chest X-Ray Masked Autoencoder

June 01, 2026 · Grace Period · 🏛 the ICML 2026 3rd Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences

โณ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors: Yancheng Liu, Kenichi Maeda, Manan Pancholy
arXiv ID: 2606.01537
Category: cs.CV (Computer Vision)
Cross-listed: cs.LG
Citations: 0
Venue: ICML 2026 3rd Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences
Abstract
Clinical diagnosis often requires combining imaging with physiological measurements, yet deployed models typically operate on unimodal data. We present PaCX-MAE, a cross-modal distillation framework that injects physiological priors into chest X-ray (CXR) encoders while remaining strictly unimodal at inference. PaCX-MAE augments in-domain masked autoencoding with a dual contrastive-predictive objective, aligning CXR representations with paired ECG and laboratory embeddings. Extensive evaluation across nine benchmarks demonstrates consistent improvements over domain-specific MAE, particularly on physiology-dependent tasks (e.g., +2.7 AUROC on MedMod; +6.5 F1 on VinDr). The method proves highly label-efficient in the 1% regime and preserves anatomical fidelity, achieving parity with MAE on segmentation tasks. Zero-shot and attention analyses confirm that PaCX-MAE successfully learns to attend to physiological indicators, such as the cardiac silhouette, absent in standard visual pretraining.
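The abstract names a dual contrastive-predictive objective but gives no formulas, so the following is only a minimal sketch of what such an objective could look like, not the authors' implementation. It assumes an InfoNCE-style contrastive term and a mean-squared-error predictive term over one physiological modality. The function names, the temperature, and the loss weights are all hypothetical, and the masked-autoencoding reconstruction term is omitted.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(image_embs, physio_embs, temperature=0.1):
    """Image-to-physiology InfoNCE loss.

    Row i of image_embs and row i of physio_embs are assumed to come from
    the same patient; every other row in the batch acts as a negative.
    The temperature value is an assumption, not taken from the paper.
    """
    img = [l2_normalize(v) for v in image_embs]
    phy = [l2_normalize(v) for v in physio_embs]
    total = 0.0
    for i, anchor in enumerate(img):
        logits = [sum(a * b for a, b in zip(anchor, cand)) / temperature
                  for cand in phy]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_denom - logits[i]
    return total / len(img)


def predictive_mse(predicted, target):
    """Mean squared error between predicted and target embeddings."""
    count = sum(len(row) for row in target)
    squared = sum((p - t) ** 2
                  for p_row, t_row in zip(predicted, target)
                  for p, t in zip(p_row, t_row))
    return squared / count


def combined_loss(image_embs, physio_embs, predicted_physio,
                  contrastive_weight=1.0, predictive_weight=1.0):
    """Weighted sum of the two terms; both weights are hypothetical."""
    return (contrastive_weight * info_nce(image_embs, physio_embs)
            + predictive_weight * predictive_mse(predicted_physio, physio_embs))
```

In this sketch the physiological embeddings serve only as training targets, which is consistent with the abstract's claim that the CXR encoder remains strictly unimodal at inference. Handling both ECG and laboratory embeddings would presumably mean applying a term like this once per modality, but the abstract does not say how the terms are combined.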