For native GPU FLOPs, we counted each natively supported operation as a single FLOP wherever possible, so a sqrt is one FLOP, a reciprocal sqrt is two FLOPs, and so on. This leads to native GPU FLOP counts of the form:
Sqrt = 1
Rsqrt = 2
Log = 1
Exp = 1
Acos = 1
Cos = 1
Sin = 1
Dot3 = 5
Cross3 = 9
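
The Dot3 and Cross3 entries are consistent with counting every multiply, add, and subtract as one FLOP: a 3-component dot product needs 3 multiplies and 2 adds, and a 3-component cross product needs 6 multiplies and 3 subtracts. A minimal Python sketch of that arithmetic follows. The function names and the `count_flops` helper are hypothetical and only illustrate how such a table might be applied; they are not taken from the original measurement code.

```python
# Native GPU FLOP costs, copied from the list above.
NATIVE_GPU_FLOPS = {
    "sqrt": 1, "rsqrt": 2, "log": 1, "exp": 1,
    "acos": 1, "cos": 1, "sin": 1, "dot3": 5, "cross3": 9,
}

def dot3_flops(mul=1, add=1):
    # a.x*b.x + a.y*b.y + a.z*b.z: 3 multiplies and 2 adds
    return 3 * mul + 2 * add

def cross3_flops(mul=1, sub=1):
    # each of the 3 components has the form p*q - r*s:
    # 2 multiplies and 1 subtract per component
    return 3 * (2 * mul + sub)

def count_flops(op_counts, table):
    """Total FLOPs for a {op_name: number_of_calls} mapping."""
    return sum(table[op] * calls for op, calls in op_counts.items())

# The derived costs agree with the table entries.
assert dot3_flops() == NATIVE_GPU_FLOPS["dot3"]
assert cross3_flops() == NATIVE_GPU_FLOPS["cross3"]
```

For instance, `count_flops({"rsqrt": 1, "dot3": 1}, NATIVE_GPU_FLOPS)` gives 2 + 5 = 7 native FLOPs.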
For x86 FLOPs, we followed the counts given in
http://ai.stanford.edu/~paskin/slam/javadoc/javaslam/util/Flops.html
which, for example, leads to x86 FLOP counts:
Sqrt = 15
Rsqrt = 16
Log = 20
Exp = 20
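
To show how much the choice of convention changes a reported total, the sketch below applies both sets of costs to the same operation mix. The mix itself (`kernel_ops`) is a made-up example, not a measurement from this work, and only the four operations that appear in both lists are included.

```python
# Per-operation FLOP costs copied from the two lists above.
# Only the four operations that appear in both lists are included.
NATIVE_GPU = {"sqrt": 1, "rsqrt": 2, "log": 1, "exp": 1}
X86 = {"sqrt": 15, "rsqrt": 16, "log": 20, "exp": 20}

def total_flops(op_counts, table):
    """Sum table[op] * calls over a {op_name: calls} mapping."""
    return sum(table[op] * calls for op, calls in op_counts.items())

# Hypothetical per-iteration operation mix (not from the original text).
kernel_ops = {"rsqrt": 1, "exp": 2}

native = total_flops(kernel_ops, NATIVE_GPU)  # 1*2 + 2*1 = 4
x86 = total_flops(kernel_ops, X86)            # 1*16 + 2*20 = 56
print(native, x86, x86 / native)              # 4 56 14.0
```

With this mix, the x86 convention reports 14 times as many FLOPs as the native GPU convention for identical work, so any quoted FLOP rate should state which convention it uses.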