Latent Diffusion Pretraining for Crystal Property Prediction

May 30, 2026 · Grace Period · ICML 2026

Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors: Shrimon Mukherjee, Kishalay Das, Partha Basuchowdhuri, Pawan Goyal, Niloy Ganguly
arXiv ID: 2606.00776
Category: cs.LG (Machine Learning)
Citations: 0
Venue: ICML 2026
Abstract
Fast and accurate prediction of crystal properties is a central challenge in new materials design. Graph neural networks and Transformer-based models have emerged as powerful tools for this task due to their ability to encode the local structural environment of atoms within a crystal. However, these models are data-hungry, and in practice, labeled data for crystal properties are scarce. Pretraining-finetuning strategies, particularly those based on diffusion models, have shown promise in addressing these limitations. In this work, we introduce a novel latent diffusion based pretraining framework, CrysLDNet, designed to mitigate data scarcity. Our approach integrates a Variational Autoencoder (VAE) with a diffusion model during the pretraining stage. The VAE encoder maps 3D crystal structures into a smooth latent space within which the diffusion process is applied. This latent diffusion pretraining enables the graph encoder to effectively capture structural and chemical semantics from large-scale unlabeled data, which can then be finetuned for specific property prediction tasks. Comprehensive experiments on popular DFT datasets for property prediction reveal that CrysLDNet significantly outperforms both training-from-scratch and pretrained baselines, with improvements of 4.26% and 4.90% on the JARVIS and MP datasets, respectively. Additionally, the learned representations remain robust in sparse-data conditions and are expressive enough to correct DFT errors when finetuned with limited experimental data. Code is available at: https://github.com/shrimonmuke0202/CrysLDNet.git.
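The abstract says that diffusion is applied in the VAE latent space but gives no equations or hyperparameters. As a rough illustration only, the sketch below shows the standard DDPM-style forward noising of a latent vector and the noise-prediction loss that such pretraining commonly minimizes. Everything here is an assumption and not taken from the paper or its repository: the linear beta schedule, the step count, the toy latent `z0`, and all function names are hypothetical, and a real setup would use a graph encoder and a learned denoiser in place of the placeholder predictions.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM default, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def noise_latent(z0, t, alpha_bars, rng):
    """Sample z_t from q(z_t | z_0) in closed form; returns (z_t, eps)."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps


def denoising_loss(eps_true, eps_pred):
    """Mean squared error between the injected and the predicted noise."""
    return sum((a - b) ** 2 for a, b in zip(eps_true, eps_pred)) / len(eps_true)


rng = random.Random(0)
z0 = [0.5, -1.2, 0.3, 0.8]            # hypothetical latent from a VAE encoder
betas = linear_beta_schedule(1000)    # assumed number of diffusion steps
alpha_bars = cumulative_alphas(betas)
zt, eps = noise_latent(z0, 500, alpha_bars, rng)

# Placeholder "denoisers": a perfect predictor and one that always outputs zero.
loss_perfect = denoising_loss(eps, eps)
loss_zero = denoising_loss(eps, [0.0] * len(eps))
```

In this toy setting a denoiser that recovers the injected noise exactly reaches zero loss, while any other prediction is penalized in proportion to its squared error. Under the assumptions above, that is the signal that would drive the encoder to learn structure from unlabeled crystals.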