What is adaptive sampling, and how is it related to MSMs?

When researchers use computers to study protein conformational dynamics (how a protein changes shape, for example as it folds), the conventional approach for unbiased all-atom molecular dynamics has two steps. First, they run a set of simulations; second, after the simulations have completed, they analyze the resulting data. The adaptive sampling Markov State Model (MSM) approach breaks this paradigm by interleaving the two steps. Instead of building the model only after all the data has been collected, the model is built on the fly as the data is generated. A feedback loop can then be set up in which the current state of the model is used to guide further simulations.

Imagine, for example, that you are exploring a maze for the first time. Although you have no map, you do have a GPS that tracks your progress and displays the parts of the maze you've explored. One approach is to put the GPS in your purse and walk around blindly, bumping off walls, for as long as possible. Once you're tired, you take out the GPS and analyze the path you took; by looking at that path you can see the structure of the maze, and you have effectively built a map. Unfortunately, you notice that you wasted a lot of time stuck in various parts of the maze. The smarter strategy is to watch the GPS as you walk around and build your map of the maze incrementally. Using that map, you can tell when you're "stuck" in a certain part of the maze, and you can avoid re-exploring parts that you're confident you've already covered.

In many ways, these two approaches to exploring a maze are analogous to the two approaches to collecting and analyzing molecular simulations. Because the adaptive sampling approach builds the model incrementally, on the fly, it can make simulations more efficient: computer time is steered away from regions that are already well sampled and toward regions that are still poorly explored.
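
To make the feedback loop concrete, here is a minimal Python sketch that compares the two strategies on a toy "maze". Everything in it is an assumption made for illustration: the corridor of 30 cells, the biased random walk standing in for a molecular simulation, the names run_trajectory, pick_restart_states, conventional and adaptive, and the rule of restarting from the least-visited states. That restart rule is one common heuristic, not necessarily the one any particular adaptive sampling code uses, and simple visit counts stand in for a real MSM, which would estimate transition probabilities between states.

```python
import random
from collections import Counter

N_STATES = 30      # hypothetical toy "maze": a corridor of 30 cells
P_FORWARD = 0.35   # assumed bias: a forward step is less likely than a backward one


def run_trajectory(start, n_steps, rng):
    """Simulate a biased random walk and return the list of visited states."""
    state = start
    visited = [state]
    for _ in range(n_steps):
        step = 1 if rng.random() < P_FORWARD else -1
        # Walls at both ends of the corridor.
        state = min(max(state + step, 0), N_STATES - 1)
        visited.append(state)
    return visited


def pick_restart_states(counts, n_restarts):
    """Return the least-visited discovered states (ties broken by state index)."""
    ranked = sorted(counts, key=lambda s: (counts[s], s))
    return ranked[:n_restarts]


def conventional(n_traj, n_steps, rng):
    """Two-step approach: run every simulation first, analyze afterwards."""
    counts = Counter()
    for _ in range(n_traj):
        counts.update(run_trajectory(0, n_steps, rng))
    return counts


def adaptive(n_rounds, n_traj, n_steps, rng):
    """Interleaved approach: update the counts each round and reseed from them."""
    counts = Counter()
    starts = [0] * n_traj
    for _ in range(n_rounds):
        for start in starts:
            counts.update(run_trajectory(start, n_steps, rng))
        picked = pick_restart_states(counts, n_traj)
        # Reuse picked states cyclically if fewer than n_traj are known so far.
        starts = [picked[i % len(picked)] for i in range(n_traj)]
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    # Both strategies get the same budget of 2000 simulation steps.
    conv = conventional(n_traj=4, n_steps=500, rng=rng)
    adap = adaptive(n_rounds=10, n_traj=4, n_steps=50, rng=rng)
    print("conventional discovered", len(conv), "states")
    print("adaptive discovered", len(adap), "states")
```

Both strategies spend the same total number of steps; the only difference is that the adaptive one stops periodically, looks at its running "map" (the visit counts), and restarts from the places it knows least about. Because the walk is biased backward, the conventional runs tend to linger near the start, which is the toy equivalent of being stuck in one part of the maze.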