SnapBoost: A Heterogeneous Boosting Machine
Thomas Parnell, Andreea Anghel, et al.
NeurIPS 2020
Chemical reactions can be classified into distinct categories that encapsulate concepts for how one molecule is transformed into another. These concepts can be encoded as rules specifying the set of atoms and bonds that change during a transformation; such a rule is commonly known as a reaction template. While there are several ways to represent a chemical reaction as a vector, or fingerprint, no such representation exists for reaction templates. As a consequence, methods to navigate the space of reaction templates are limited. In this work, we introduce the first reaction template fingerprint. To this end, we follow a data-driven approach relying on a masked language modelling task on SMIRKS strings. We combine unsupervised pre-training with fine-tuning on the classification of templates according to the RXNO ontology, for which we achieve up to 98.4% classification accuracy. We highlight how the learned embeddings can be extracted and used in downstream applications.
Girmaw Abebe Tadesse, Celia Cintas, et al.
ICML 2020
Shiqiang Wang, Jake Perazzone, et al.
INFOCOM 2023
Gaetano Rossiello, Nhan Pham, et al.
ICLR 2025