Folding@home has long sought to understand how proteins self-assemble, or fold, into their functional structures, and how the dynamics of a folded protein shape its function. Many may therefore wonder what it means for the project when they hear that a new algorithm called AlphaFold has “solved” the protein folding problem.
For background, AlphaFold is a machine learning algorithm that was trained to predict the structure of a protein from the sequence of chemicals, called amino acids, that the protein is made of. The algorithm was trained on the Protein Data Bank (PDB) (7), a publicly available repository of over 200K protein structures accumulated over decades by requiring structural biologists to deposit their structures during peer review of their work. Many other algorithms had been developed to predict protein structures using a combination of physics and machine learning based on available structures. For decades, the performance of these methods was regularly tested through blind predictions in the Critical Assessment of protein Structure Prediction (CASP) competition. While the field made great progress over time, it had hit something of a plateau in recent years. AlphaFold broke this trend, making a substantial stride in accuracy. Its predictive power is one of the most compelling examples of what computational methods have to offer biomedical research.
While AlphaFold is an amazing advance, it does not solve the problems that Folding@home focuses on. A key tenet of much of our work at Folding@home is that individual protein structures are enormously valuable but are just the tip of the iceberg. A single structure does not tell us how a protein folds up into that structure, nor does it reveal the moving parts that allow a protein to function.
The upshot is that AlphaFold has created many new opportunities for Folding@home. Our work on the dynamics of folded proteins generally depends on having at least one high-resolution structure from experiments. For many proteins, no such structure is available, so we at Folding@home have had little to contribute to understanding them. Now, however, structures predicted with AlphaFold are accurate enough that we can use them as starting points for our work even when no experimental structure is available. In one recent example, we used the AlphaFold-predicted structure of an important drug target called PPM1D to understand how some mysterious inhibitors of the protein likely work.
If you’d like to learn more, I recently wrote a perspective piece on this topic here.