Sliced Denoising: A Physics-Informed Molecular Pre-Training Method
November 03, 2023 · Declared Dead · International Conference on Learning Representations
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yuyan Ni, Shikun Feng, Wei-Ying Ma, Zhi-Ming Ma, Yanyan Lan
arXiv ID
2311.02124
Category
q-bio.BM
Cross-listed
cs.AI, cs.LG
Citations
17
Venue
International Conference on Learning Representations
Last Checked
2 months ago
Abstract
While molecular pre-training has shown great potential in enhancing drug discovery, the lack of a solid physical interpretation in current methods raises concerns about whether the learned representation truly captures the underlying explanatory factors in observed data, ultimately resulting in limited generalization and robustness. Although denoising methods offer a physical interpretation, their accuracy is often compromised by ad-hoc noise design, leading to inaccurate learned force fields. To address this limitation, this paper proposes a new method for molecular pre-training, called sliced denoising (SliDe), which is based on the classical mechanical intramolecular potential theory. SliDe utilizes a novel noise strategy that perturbs bond lengths, angles, and torsion angles to achieve better sampling over conformations. Additionally, it introduces a random slicing approach that circumvents the computationally expensive calculation of the Jacobian matrix, which is otherwise essential for estimating the force field. By aligning with physical principles, SliDe shows a 42% improvement in the accuracy of estimated force fields compared to current state-of-the-art denoising methods, and thus outperforms traditional baselines on various molecular property prediction tasks.
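The abstract says random slicing avoids computing the full Jacobian. The sketch below is a generic illustration of that idea, not the paper's implementation: projecting onto a random direction v needs only the product J v, which can be approximated with two function evaluations. The toy 3-atom coordinate map, the function names, and the finite-difference step are all assumptions made for illustration. The analytic Jacobian appears only as a reference to check the result.

```python
import math
import random

def internal_to_cartesian(q):
    """Toy map (hypothetical): (r1, r2, theta) -> planar coordinates of a 3-atom chain."""
    r1, r2, theta = q
    # atom 0 at the origin, atom 1 on the x-axis, atom 2 placed by bond length r2 and angle theta
    return [0.0, 0.0, r1, 0.0, r1 - r2 * math.cos(theta), r2 * math.sin(theta)]

def jvp_fd(f, q, v, eps=1e-5):
    """Central-difference approximation of J(q) v without forming J."""
    plus = f([qi + eps * vi for qi, vi in zip(q, v)])
    minus = f([qi - eps * vi for qi, vi in zip(q, v)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]

def analytic_jacobian(q):
    """Reference only: the full 6x3 Jacobian of the toy map."""
    r1, r2, theta = q
    return [
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
        [1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
        [1.0, -math.cos(theta), r2 * math.sin(theta)],
        [0.0, math.sin(theta), r2 * math.cos(theta)],
    ]

random.seed(0)
q = [1.1, 1.5, 1.9]                               # bond lengths and angle (arbitrary values)
v = [random.gauss(0.0, 1.0) for _ in range(3)]    # random slicing direction

sliced = jvp_fd(internal_to_cartesian, q, v)
reference = [sum(row[j] * v[j] for j in range(3)) for row in analytic_jacobian(q)]
max_abs_diff = max(abs(s - r) for s, r in zip(sliced, reference))
```

For a map with n inputs, building the full Jacobian by finite differences takes on the order of n such evaluations, whereas each random direction costs two. That is the general appeal of slicing-style estimators.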
Similar Papers
In the same crypt: q-bio.BM
Protein secondary structure prediction using deep convolutional neural fields (Ghosted)
Protein structure generation via folding diffusion (Ghosted)
LinearFold: linear-time approximate RNA folding by 5'-to-3' dynamic programming and beam search (404 Not Found)
What is a meaningful representation of protein sequences? (Ghosted)
Protein Secondary Structure Prediction Using Cascaded Convolutional and Recurrent Neural Networks (Ghosted)
Died the same way: Ghosted
Language Models are Few-Shot Learners (Ghosted)
PyTorch: An Imperative Style, High-Performance Deep Learning Library (Ghosted)
XGBoost: A Scalable Tree Boosting System (Ghosted)