Comparison between the FAH and ANTON approaches

Right now, the two most powerful supercomputers for studying protein folding are Folding@home and a very impressive special-purpose computer from D. E. Shaw Research, called ANTON. We’re often asked “how do they compare?” The approaches are very different, so comparisons aren’t completely straightforward. ANTON takes the traditional approach to studying protein folding, in which one runs a few (often 1 or 2) long trajectories to study the process. Folding@home takes a statistical approach, which has two primary benefits: 1) it can access folding on dramatically longer timescales (milliseconds, instead of the microsecond folding events seen in a single long trajectory) and 2) it can give statistically significant results on those long timescales.

The main concern about FAH’s method is whether such a radically new approach works reliably. Previous tests of FAH have been comparisons to experiment, which is the gold standard test but also brings in other issues, such as how good our models of reality are. Thus, while FAH’s approach has done well compared to experiment, it is useful to compare FAH and ANTON directly, since they use the same models, etc. Comparing our statistical approach (using Markov State Models, aka MSMs) directly with data from ANTON would go a long way toward showing that the MSM approach works even for non-trivial systems (MSMs have previously been tested for long dynamics on small systems).

In a recently published paper, we make this comparison. By applying MSMs to data from ANTON, we find that FAH’s approach (MSMs) can reproduce the long timescales in the ANTON data very well. Moreover, we find that the MSM approach can uncover important new features, relevant for understanding folding and function, that were missed by the more traditional analysis originally applied to the ANTON data.
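
For readers curious what “applying an MSM to a trajectory” means in practice, here is a minimal toy sketch in Python. It is not the code or the model from the paper: it assumes a trajectory that has already been reduced to just two states (0 = unfolded, 1 = folded), and every function name and parameter below is made up for illustration. The idea is to count how often the system hops between states over a fixed lag time, turn those counts into transition probabilities, and read the slow relaxation timescale off the second eigenvalue of that matrix. Real MSMs use many states and more careful estimators.

```python
import math
import random


def count_transitions(dtraj, n_states, lag):
    """Count transitions i -> j observed `lag` frames apart (sliding window)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(len(dtraj) - lag):
        counts[dtraj[t]][dtraj[t + lag]] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize a count matrix into transition probabilities."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state has no outgoing counts; cannot normalize")
        matrix.append([c / total for c in row])
    return matrix


def two_state_timescale(T, lag):
    """Implied timescale (in frames) of a 2x2 row-stochastic matrix.

    For two states the eigenvalues are 1 and (trace - 1), so no
    linear-algebra library is needed in this toy case.
    """
    lam = T[0][0] + T[1][1] - 1.0
    if not 0.0 < lam < 1.0:
        raise ValueError("second eigenvalue outside (0, 1); timescale undefined")
    return -lag / math.log(lam)


def simulate_two_state(n_frames, p_fold, p_unfold, seed=0):
    """Generate a synthetic folded/unfolded trajectory (hypothetical test data)."""
    rng = random.Random(seed)
    state, traj = 0, []
    for _ in range(n_frames):
        traj.append(state)
        if state == 0 and rng.random() < p_fold:
            state = 1
        elif state == 1 and rng.random() < p_unfold:
            state = 0
    return traj


if __name__ == "__main__":
    # Synthetic data only; these probabilities are arbitrary, not from ANTON or FAH.
    traj = simulate_two_state(200_000, p_fold=0.01, p_unfold=0.02)
    T = transition_matrix(count_transitions(traj, n_states=2, lag=1))
    print("transition matrix:", T)
    print("implied timescale (frames):", two_state_timescale(T, lag=1))
```

The same counting step works whether the input is one very long trajectory or many short ones, which is why this style of analysis can be applied to both ANTON-style and FAH-style data.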

For us, this is exciting since it shows the capabilities of the MSM method. However, I want to stress that perhaps the most exciting part is how ANTON and FAH could be used together. A run on ANTON followed by more thorough sampling in Folding@home could be the best of both worlds.

PS Here’s the abstract for our paper (http://pubs.acs.org/doi/abs/10.1021/ja207470h?prevSearch=lane%2Bpande&searchHistoryKey=):

Two strategies have been recently employed to push molecular simulation to long, biologically relevant timescales: projection-based analysis of results from specialized hardware producing a small number of ultra-long trajectories and the statistical interpretation of massive parallel sampling performed with Markov state models (MSMs). Here, we assess the MSM as an analysis method by constructing a Markov model from ultra-long trajectories, specifically two previously reported 100 µs trajectories of the FiP35 WW domain (Shaw et al. (2010) Science, 330: 341-346). We find that the MSM approach yields novel insights. It discovers new statistically significant folding pathways, in which either beta-hairpin of the WW domain can form first. The rates of this process approach experimental values in a direct quantitative comparison (timescales of 5.0 µs and 100 ns), within a factor of ~2. Finally, the hub-like topology of the MSM and identification of a holo conformation predicts how WW domains may function through a conformational selection mechanism.