Heterogenous Ensemble of Models for Molecular Property Prediction

November 20, 2022 · Entered Twilight · 🏛 arXiv.org

💤 TWILIGHT: Eternal Rest
Repo abandoned since publication

Repo contents: LICENSE, NVIDIA_PCQM4Mv2.pdf, README.md, TransformerM, cnn, data, ensemble, molecular_transformer, pd_dgn

Authors: Sajad Darabi, Shayan Fazeli, Jiwei Liu, Alexandre Milesi, Pawel Morkisz, Jean-François Puget, Gilberto Titericz
arXiv ID: 2211.11035
Category: cs.LG (Machine Learning)
Cross-listed: q-bio.QM
Citations: 0
Venue: arXiv.org
Repository: https://github.com/jfpuget/NVIDIA-PCQM4Mv2 (⭐ 17)
Last Checked: 3 months ago
Abstract
Previous works have demonstrated the importance of considering different modalities of molecules, each of which provides a different granularity of information for downstream property prediction tasks. Our method combines variants of the recent TransformerM architecture with Transformer, GNN, and ResNet backbone architectures. Models are trained on the 2D data, 3D data, and image modalities of molecular graphs. We ensemble these models with a HuberRegressor. The models are trained on 4 different train/validation splits of the original train + valid datasets. This yields a winning solution to the 2nd edition of the OGB Large-Scale Challenge (2022) on the PCQM4Mv2 molecular property prediction dataset. Our proposed method achieves a test-challenge MAE of 0.0723 and a validation MAE of 0.07145. Total inference time for our solution is less than 2 hours. We open-source our code at https://github.com/jfpuget/NVIDIA-PCQM4Mv2.
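
The ensembling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn's `HuberRegressor` is fit on held-out predictions from the base models (a standard stacking setup), and the data, noise levels, and names such as `base_preds` and `stacker` are hypothetical stand-ins rather than anything taken from the repository.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the regression target (NOT real PCQM4Mv2 data).
n_molecules = 2000
y = rng.normal(loc=5.0, scale=1.0, size=n_molecules)

# Hypothetical held-out predictions from three base models, simulated as the
# target plus independent noise of different magnitudes.
noise_scales = [0.10, 0.12, 0.15]
base_preds = np.column_stack(
    [y + rng.normal(scale=s, size=n_molecules) for s in noise_scales]
)

# Fit the stacker on one half and evaluate on the other half.
half = n_molecules // 2
X_fit, y_fit = base_preds[:half], y[:half]
X_eval, y_eval = base_preds[half:], y[half:]

# HuberRegressor learns one weight per base model plus an intercept,
# using a loss that is less sensitive to outliers than squared error.
stacker = HuberRegressor()
stacker.fit(X_fit, y_fit)
ensemble_pred = stacker.predict(X_eval)


def mae(y_true, y_pred):
    """Mean absolute error, the metric reported in the abstract."""
    return float(np.mean(np.abs(y_true - y_pred)))


single_maes = [mae(y_eval, X_eval[:, j]) for j in range(X_eval.shape[1])]
ensemble_mae = mae(y_eval, ensemble_pred)

print("single-model MAEs:", [round(m, 4) for m in single_maes])
print("ensemble MAE:", round(ensemble_mae, 4))
print("learned weights:", stacker.coef_)
```

In this toy setting the base models make independent errors, so the learned weighted combination scores a lower MAE than any single model. How the actual solution assembles its stacking features across the 4 train/validation splits is not specified in the abstract and is not modeled here.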