A peek into Core 17 benchmarking

Our primary goal with benchmarking is "equal points for equal work." However, keeping this consistent across many different types of work units (WUs) and many different types of hardware is tricky. We recently had an internal discussion about the points per day (PPD) for two projects (7810 and 8900), and we thought donors might find the details interesting.

We were working to rebalance the points to make the PPD consistent, but doing that across a wide range of hardware is difficult. Check out the graph below, which shows PPD on the y-axis and donor GPUs sorted along the x-axis by typical PPD. The dark line shows the averages, and the gray area shows the error bars (the variation between WUs for a given project on the same GPU type).
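For readers who want to try this kind of summary on their own logs, here is a minimal sketch of how the averages and error bars could be computed. It is not our actual benchmarking code: the GPU names, the PPD values, and the function names `summarize` and `gpus_by_typical_ppd` are all hypothetical, and using the sample standard deviation as the "error bar" is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: (gpu_type, project, ppd). All values are made up
# for illustration and are not real measurements.
SAMPLES = [
    ("GPU-A", 7810, 10000.0), ("GPU-A", 7810, 12000.0),
    ("GPU-A", 8900, 11000.0), ("GPU-A", 8900, 11000.0),
    ("GPU-B", 7810, 90000.0), ("GPU-B", 7810, 110000.0),
    ("GPU-B", 8900, 120000.0), ("GPU-B", 8900, 160000.0),
]


def summarize(samples):
    """Return {(gpu, project): (mean_ppd, spread)} over the given samples.

    The spread is the sample standard deviation (an assumption about how
    the error bars are defined); it is 0.0 when only one WU is available.
    """
    groups = defaultdict(list)
    for gpu, project, ppd in samples:
        groups[(gpu, project)].append(ppd)
    summary = {}
    for key, values in groups.items():
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[key] = (mean(values), spread)
    return summary


def gpus_by_typical_ppd(summary):
    """Order GPU types by their average PPD across projects (the x-axis)."""
    per_gpu = defaultdict(list)
    for (gpu, _project), (avg, _spread) in summary.items():
        per_gpu[gpu].append(avg)
    return sorted(per_gpu, key=lambda gpu: mean(per_gpu[gpu]))


if __name__ == "__main__":
    summary = summarize(SAMPLES)
    for gpu in gpus_by_typical_ppd(summary):
        for project in (7810, 8900):
            avg, spread = summary[(gpu, project)]
            print(f"{gpu} project {project}: {avg:.0f} PPD +/- {spread:.0f}")
```

Plotting the means as a line and the spread as a shaded band for each project, with GPUs in the order returned by `gpus_by_typical_ppd`, would give a figure laid out like the one described above.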

What we see is that our protocol balanced the PPD on the low end, but on the high end there is both bigger variation (more shaded area) and bigger differences on the very highest-powered GPUs. In these situations we usually go with our protocols, but this time, given all the analysis we did, I thought it would be interesting for donors to see these sorts of details.

It's these sorts of variations that lead to PPD fluctuations, so perhaps the main lesson here is that even with our protocols and plans, it's really hard to be consistent across all the different hardware, even when we're talking about just GPUs and just two projects.