For native GPU FLOPs, we tried to count each operation as a single FLOP, so a sqrt is one FLOP, a reciprocal sqrt is two FLOPs, and so on. This leads to native GPU FLOP counts of the form:
Sqrt = 1
Rsqrt = 2
Log = 1
Exp = 1
Acos = 1
Cos = 1
Sin = 1
Dot3 = 5
Cross3 = 9
For x86 FLOPs, we followed
http://ai.stanford.edu/~paskin/slam/javadoc/javaslam/util/Flops.html
which, for example, leads to these x86 FLOP counts:
Sqrt = 15
Rsqrt = 16
Log = 20
Exp = 20
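
To make the difference between the two conventions concrete, here is a minimal Python sketch that totals FLOPs for a kernel from a table of per-operation weights. The weights are copied from the two lists above; the helper name count_flops and the example operation mix in kernel_ops are hypothetical and only for illustration. Operations for which no x86 weight is listed above are deliberately rejected rather than guessed.

```python
# Per-operation FLOP weights, copied from the two lists above.
NATIVE_GPU_FLOPS = {
    "sqrt": 1, "rsqrt": 2, "log": 1, "exp": 1,
    "acos": 1, "cos": 1, "sin": 1, "dot3": 5, "cross3": 9,
}
# Only the x86 weights quoted above; other operations are left out on purpose.
X86_FLOPS = {"sqrt": 15, "rsqrt": 16, "log": 20, "exp": 20}


def count_flops(op_counts, weights):
    """Return the weighted FLOP total for a mapping of operation -> call count.

    Raises KeyError if an operation has no weight in the chosen table.
    """
    missing = sorted(op for op in op_counts if op not in weights)
    if missing:
        raise KeyError("no FLOP weight for: " + ", ".join(missing))
    return sum(n * weights[op] for op, n in op_counts.items())


# Hypothetical kernel: two sqrt calls, one rsqrt, one exp per work item.
kernel_ops = {"sqrt": 2, "rsqrt": 1, "exp": 1}

native_total = count_flops(kernel_ops, NATIVE_GPU_FLOPS)  # 2*1 + 1*2 + 1*1
x86_total = count_flops(kernel_ops, X86_FLOPS)            # 2*15 + 1*16 + 1*20

print("native GPU FLOPs per item:", native_total)
print("x86 FLOPs per item:", x86_total)
```

For this made-up mix the same kernel is credited with 5 FLOPs under the native GPU convention and 66 under the x86 convention, so a FLOPS figure is only meaningful when it states which convention was used.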
