Even the FLOPS itself is a fitted curve.
I guess it depends on whether the benchmarks run for a fixed period of time and you "count the results" (e.g. Tropics, where the "score" is the FPS), or whether they run until a certain amount of work is done and then stop (e.g. C-Ray or the timed kernel compile, where the "score" is effectively 1 over the time taken to complete the work).
In the first case time cancels out, because the score is effectively # frames over energy required, or (average FPS * time) / (average W * time); but in the second case the score is effectively a constant (the fixed amount of work) over energy required, or 1 / (average W * time). The denominator is the same in both cases, but you don't have time in the numerator for the latter two benchmarks.
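To make that concrete, here is a minimal sketch with made-up numbers (none of these figures come from the actual benchmark runs) showing how time cancels in the FPS-style score but stays in the fixed-work score:

Code:
# Made-up numbers: why time cancels for an FPS-style score
# but not for a fixed-work score.
avg_fps = 60.0      # hypothetical average frames per second
avg_watts = 80.0    # hypothetical average power draw
runtime_s = 120.0   # benchmark duration in seconds

# Case 1: fixed run time (e.g. Tropics): score = frames / energy
frames = avg_fps * runtime_s
energy_j = avg_watts * runtime_s
score_case1 = frames / energy_j    # equals avg_fps / avg_watts; time cancels

# Case 2: fixed amount of work (e.g. C-Ray): score = 1 / energy
score_case2 = 1.0 / energy_j       # time stays in the denominator

print(score_case1, avg_fps / avg_watts)   # identical: 0.75 0.75
print(score_case2)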
In the compilation benchmark, you can think of the performance as being measured in "compilations per second" (since the compilation takes many seconds, the numerical value would be less than one).
So whatever the benchmark, the performance can be stated as "operations per second" for some definition of operation (compilation, frame render, etc.). The denominator is Power (or average Power), which is equal to Energy per second. So in all cases:

Code:
Performance per Watt = ( operations / sec ) / ( Energy / sec ) = operations / Energy

But since the computational task is the same for each CPU within a given benchmark, we can define (normalize) the operation to always be equal to 1 (for any benchmark) -- in other words, one operation equals one compilation, one set of frame renderings, etc.
Bottom line is that comparing relative Performance per Watt among various CPUs is equivalent to comparing the reciprocal of total energy required to complete the computation for each CPU. And total energy required can be determined by first finding the average power consumed during the computation and multiplying that by the time required to complete the computation. This is exactly what tuke81 did in his graph in this thread.
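To illustrate with hypothetical numbers (the CPU names and figures below are invented, not tuke81's measurements), the whole comparison reduces to average power times time:

Code:
# Hypothetical CPUs and numbers, just to show the comparison:
# perf per watt reduces to 1 / (average power * time) = 1 / total energy.
runs = {
    "CPU A": {"avg_watts": 95.0, "time_s": 140.0},
    "CPU B": {"avg_watts": 65.0, "time_s": 190.0},
}
for name, r in runs.items():
    energy_j = r["avg_watts"] * r["time_s"]   # total energy for the task
    perf_per_watt = 1.0 / energy_j            # operations normalized to 1
    print(f"{name}: {energy_j:.0f} J, perf/W = {perf_per_watt:.2e}")

Note that in this made-up example CPU B wins on performance per watt despite taking longer, because its total energy for the task is lower.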
Just to be clear, I wasn't trying to come up with a new formula, just trying to break up a discussion that seemed to be looping around the fact that sometimes time mattered and sometimes it didn't. Agreed that you can always include time, as long as it's also included in the numerator for things like FPS benchmarks to properly calculate the total # of "operations".
Also, how is it possible that in Unigine Tropics v1.3 the i7 3770 is faster than any of the Haswells? Could it be that there is something wrong with the GPU turbo or the Haswell driver?
To complicate further and just generally mess with people, isn't this what some people are grasping for:
Performance / Total energy expended = (1/sec) / (average watts * sec) = (1/sec) * (1/(watts * sec)) = 1/(watts * sec^2)
This would rate a computer that's twice as fast, but uses double the energy to complete a given workload, as equal to the comparison machine. Though I agree it's an odd metric. :D
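A quick sketch with hypothetical numbers (nothing here is measured data) showing that the two machines do come out equal under this metric:

Code:
# Hypothetical numbers: performance per total energy, i.e. 1/(watts * sec^2),
# rates a machine that is twice as fast but burns twice the energy
# as equal to the baseline.
def perf_per_energy(time_s, avg_watts):
    performance = 1.0 / time_s     # fixed workload, so perf = 1 / time
    energy_j = avg_watts * time_s  # total energy expended
    return performance / energy_j  # = 1 / (watts * sec^2)

baseline = perf_per_energy(time_s=200.0, avg_watts=50.0)
faster   = perf_per_energy(time_s=100.0, avg_watts=200.0)  # 2x speed, 2x energy
print(baseline, faster)  # identical: 5e-07 5e-07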
Well, while using SI units like joules is OK, I have always used Wh for electric energy (batteries and the electric grid). For e.g. the c-ray bench, that is just average watts times run time in seconds, divided by 3600:
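Something like this, with made-up numbers for the average draw and run time (not the actual c-ray results):

Code:
# Made-up numbers for a c-ray style run: convert joules to Wh (1 Wh = 3600 J).
avg_watts = 107.1   # hypothetical average draw during the run
time_s = 95.0       # hypothetical run time in seconds
energy_j = avg_watts * time_s
energy_wh = energy_j / 3600.0    # watt-hours, as used for batteries/grid
print(f"{energy_j:.0f} J = {energy_wh:.4f} Wh")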
The other thing that puzzles me is the power supply used. Is power supply efficiency accounted for at all in the tests? Efficiency drops the lower the load is as a percentage of the power supply's rating.
Let's assume a Platinum-rated 450W power supply is used. Then e.g. Kaveri draws approx 90 * 119W / 100 = 107.1W, which is 100% * 107.1W / 450W = 23.8% load. At 23.8% load a Platinum-rated supply runs at over 90% efficiency, so the math is fine.
But now let's look at the lowest power usage scenario: the i3 2120 draws approx 90 * 58.8W / 100 = 52.92W, which is 100% * 52.92W / 450W = 11.76% load. At 11.76% load a Platinum-rated power supply is nowhere near 90% efficiency; the real efficiency will be closer to 80%, so the real CPU power consumption is actually lower.
But of course, the power drawn at the wall is what is measured. The thing is, if you want to remove power supply efficiency from the equation, you either:
a) have to know the power supply's efficiency curve and use it to calculate the real system power usage, taking the power supply out of the picture altogether, or
b) use a small enough power supply that the load stays between 20% and 100% for all CPUs.
A rough sketch of approach a) follows below.
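Here is what approach a) could look like, assuming a made-up efficiency curve for a Platinum-rated 450W supply (real curves vary by model, so the 0.80/0.92 figures are just placeholders):

Code:
# Sketch of approach a), assuming a made-up efficiency curve for a
# Platinum-rated 450W supply: estimate the real DC-side power from
# the measured wall power.
PSU_RATED_W = 450.0

def psu_efficiency(load_fraction):
    # Hypothetical curve: efficiency falls off below ~20% load.
    return 0.80 if load_fraction < 0.20 else 0.92

def dc_power_from_wall(wall_watts):
    load = wall_watts / PSU_RATED_W
    return wall_watts * psu_efficiency(load)

print(dc_power_from_wall(119.0))   # ~26% load, ~92% efficient -> ~109.5W
print(dc_power_from_wall(58.8))    # ~13% load, ~80% efficient -> ~47.0W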
The thing is, I cannot imagine a situation where it would be more useful to know the Performance per Joule instead of knowing the energy AND time required to complete the operation (which is what we know from phoronix's timed benchmark combined with tuke81's graph). If you are trying to minimize energy usage, you choose the one that used the lowest energy to complete the operation. If you have to complete the operation in minimum time, you choose the one that completed the operation in the shortest amount of time.
Therefore, I agree that Performance per Joule is an "odd metric". Not very useful.