
The two fundamental questions of organic chemistry are: how can we make the molecule, and how can we be sure we made the molecule?
Together with Exscientia, we have investigated how language models can predict the selectivity of C–H borylation. By fine-tuning a pre-trained language model, we have achieved 95% site accuracy, surpassing the state-of-the-art performance of graph neural networks.
Our new DP5q analysis quantifies the confidence in the interpretation of an NMR spectrum thousands of times faster than classic DFT methods. It can now be installed from GitHub to run on your local machine!
Publications
Leveraging Language Model Multi-Tasking to Predict C–H Borylation Selectivity
– Journal of Chemical Information and Modeling (2024) 64, 4286 (doi: 10.1021/acs.jcim.4c00137)