Assigning confidence to molecular property prediction

Introduction: Computational modeling has rapidly advanced over the last decades. Recently, machine learning has emerged as a powerful and cost-effective strategy to learn from existing datasets and perform predictions on unseen molecules. Accordingly, the explosive rise of data-driven techniques raises an important question: What confidence can be assig…

Stacking Gaussian processes to improve pKa predictions in the SAMPL7 challenge

Accurate predictions of acid dissociation constants are essential to rational molecular design in the pharmaceutical industry and elsewhere. There has been much interest in developing new machine learning methods that can produce fast and accurate pKa predictions for arbitrary species, as well as estimates of prediction uncertainty. Previously, as part of the …

Expanded Ensemble Methods Can be Used to Accurately Predict Protein-Ligand Relative Binding Free Energies

Alchemical free energy methods have become indispensable in computational drug discovery for their ability to calculate highly accurate estimates of protein-ligand affinities. Expanded ensemble (EE) methods, which involve single simulations visiting all of the alchemical intermediates, have some key advantages for alchemical free energy calculation. However, t…

Antagonism between substitutions in β-lactamase explains a path not taken in the evolution of bacterial drug resistance

CTX-M β-lactamases are widespread in Gram-negative bacterial pathogens and provide resistance to the cephalosporin cefotaxime but not to the related antibiotic ceftazidime. Nevertheless, variants have emerged that confer resistance to ceftazidime. Two natural mutations, causing P167S and D240G substitutions in the CTX-M enzyme, result in 10-fold increased…

SARS-CoV-2 Simulations Go Exascale to Capture Spike Opening and Reveal Cryptic Pockets Across the Proteome

SARS-CoV-2 has intricate mechanisms for initiating infection, immune evasion/suppression, and replication, which depend on the structure and dynamics of its constituent proteins. Many protein structures have been solved, but far less is known about their relevant conformational changes. To address this challenge, over a million citizen scientists banded togeth…

Protein sequence models for prediction and comparative analysis of the SARS-CoV-2-human interactome

Viruses such as the novel coronavirus SARS-CoV-2, which is wreaking havoc on the world, depend on interactions of their own proteins with those of the human host cells. Relatively small changes in sequence, such as those between SARS-CoV and SARS-CoV-2, can dramatically change clinical phenotypes of the virus, including transmission rates and severity of the disease. On…

Deep learning the structural determinants of protein biochemical properties by comparing structural ensembles with DiffNets

Understanding the structural determinants of a protein’s biochemical properties, such as activity and stability, is a major challenge in biology and medicine. Comparing computer simulations of protein variants with different biochemical properties is an increasingly powerful means to drive progress. However, success often hinges on dimensionality reduction alg…

SARS-CoV-2 simulations go exascale to predict dramatic spike opening and cryptic pockets across the proteome

SARS-CoV-2 has intricate mechanisms for initiating infection, immune evasion/suppression and replication that depend on the structure and dynamics of its constituent proteins. Many protein structures have been solved, but far less is known about their relevant conformational changes. To address this challenge, over a million citizen scientists banded together …
