AMD Cool 'n' Quiet, Turbo Core Impact On Linux
Phoronix: AMD Cool 'n' Quiet, Turbo Core Impact On Linux
For those wondering about the impact that AMD's Cool 'n' Quiet and Turbo Core technologies have under Linux for the latest-generation Bulldozer processors, here are some tests illustrating the changes in performance, power consumption, and operating temperature.
Is Turbo Core even working like it should in multi-threaded workloads?
Last edited by PsynoKhi0; 11-16-2011 at 11:15 AM.
Oh wow, Michael... you need to read up more about how and when turbo core works:
There are two stages of boost; 500 MHz, and 1 GHz. The 500 MHz boost will happen when all cores are active, but the workload is such that the TDP is NOT EXCEEDED by giving it the extra 500 MHz. The 1 GHz boost will happen only if NO MORE THAN HALF of the cores are active, and the extra boost will not cause it to exceed the TDP. In addition, as I understand it, the 1 GHz boost will only happen if the PROPER HALF of the cores are under load, i.e., one core from each module. If 2 cores in 1 module are loaded, it will only give you 500 MHz.
In order to achieve the 1 GHz boost, the PROCESSOR SCHEDULING ALGORITHM needs to account for the requirements. It needs to distribute the work to the proper cores in order to allow one core of each module to shut down.
The reason why you are observing that turbo core only works with SINGLE THREADED workloads is that this is the only way you are able to trick it into keeping the load off of at least one core of each module.
Finally, you have totally misunderstood the 500 MHz boost conditions; it will NOT boost if all the cores are fully loaded.
"However, if all cores are being pushed to their limits, Turbo Core is activated but at a lower frequency than the maximum." -- this is totally wrong. If *SOME* of the cores are being pushed to their limits, which may or may not exceed half of the cores, such that giving the loaded cores an extra 500 MHz **WILL NOT** cause it to exceed TDP, that is when the extra 500 MHz is applied. If all the cores are loaded to their limits, it will definitely exceed TDP with the 500 MHz boost. You can have some of them pushed to their limits with the balance of them running *MODERATE* workloads.
Multi-threaded workloads that stress all the cores will NOT be boosted. The boost applies only if some of the cores are NOT pegged and the TDP will not be exceeded by the boost.
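The rules described in this post can be sketched as a small decision function. This is only a model of the behaviour as stated above, not AMD's actual firmware logic: the `Module` class, the `boost_mhz` function, and the idea of passing in estimated power figures are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Module:
    """One Bulldozer module with two cores (hypothetical model)."""
    core0_loaded: bool
    core1_loaded: bool


def boost_mhz(modules: List[Module],
              power_with_500_w: float,
              power_with_1000_w: float,
              tdp_w: float) -> int:
    """Return the boost (in MHz) the rules in the post would allow.

    power_with_500_w / power_with_1000_w are the caller's estimates of
    package power if that boost were applied; they are illustrative inputs.
    """
    loaded = sum(int(m.core0_loaded) + int(m.core1_loaded) for m in modules)
    total = 2 * len(modules)
    if loaded == 0:
        return 0  # nothing to boost

    # "Proper half": at most one loaded core in every module.
    one_per_module = all(not (m.core0_loaded and m.core1_loaded)
                         for m in modules)

    # 1 GHz stage: no more than half the cores loaded, spread one per
    # module, and the boost keeps the chip within TDP.
    if loaded <= total // 2 and one_per_module and power_with_1000_w <= tdp_w:
        return 1000

    # 500 MHz stage: any core layout, as long as TDP is not exceeded.
    if power_with_500_w <= tdp_w:
        return 500

    return 0
```

For example, four modules with one loaded core each and enough power headroom would get 1000 MHz under this model, while two loaded cores sharing a module would get at most 500 MHz.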
To test CnQ, try some more relevant workloads like "emerge -v wine" or any workload that repeatedly spawns short-lived processes.
The new processes tend to be scheduled on another core (an idle one) and stop before the cores have time to speed up. It would be interesting to see how different processors handle that.
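A rough way to approximate that kind of workload without a full emerge run is to time a loop of short-lived child processes, once with CnQ enabled and once with it disabled. This is only a sketch: the function name `time_short_lived`, the process count, and the tiny Python one-liner used as the per-process work are all assumptions, and it is a stand-in for a real build, not a replacement.

```python
import subprocess
import sys
import time


def time_short_lived(count: int = 200,
                     work: str = "sum(range(200000))") -> float:
    """Spawn `count` short-lived processes one after another.

    Each child is a fresh Python interpreter running a small snippet,
    so it exits before a slow governor has much time to ramp up.
    Returns the total wall-clock time in seconds.
    """
    start = time.perf_counter()
    for _ in range(count):
        subprocess.run([sys.executable, "-c", work], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = time_short_lived()
    print(f"200 short-lived processes took {elapsed:.2f} s")
```

Comparing the elapsed time across the two settings gives a crude idea of how much the frequency ramp-up delay costs for this pattern.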
Two things that I wonder about. Is "and the TDP will not be exceeded by the boost" temperature-related at all? I mean, I would guess that if I use watercooling or extremely awesome air cooling, there should be no issue ramping up a single-threaded app to max clock, right?
Also, with regard to Phenom II: I've never noticed it kicking in, but it's also really hard to see, without benchmarking, what the current CPU frequency is. /proc/cpuinfo unfortunately does not show this, which I find a big shame.
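One thing worth trying is the cpufreq sysfs interface, which exposes a per-core `scaling_cur_freq` file (in kHz) when a cpufreq driver is loaded. The sketch below just reads those files; the function name `read_cur_freqs_mhz` is made up for illustration, and whether the reported value reflects hardware boost states on a given CPU and kernel is an assumption that would need checking.

```python
from pathlib import Path
from typing import Dict


def read_cur_freqs_mhz(sysfs_cpu_dir: str = "/sys/devices/system/cpu") -> Dict[str, float]:
    """Return {cpu name: current frequency in MHz} from cpufreq sysfs.

    Cores without a readable scaling_cur_freq file are skipped.
    """
    freqs: Dict[str, float] = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_cur_freq"
    for path in sorted(Path(sysfs_cpu_dir).glob(pattern)):
        cpu_name = path.parent.parent.name  # e.g. "cpu0"
        try:
            khz = int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable or malformed entry
        freqs[cpu_name] = khz / 1000.0
    return freqs


if __name__ == "__main__":
    for cpu, mhz in read_cur_freqs_mhz().items():
        print(f"{cpu}: {mhz:.0f} MHz")
```

Running this in a loop while a single-threaded benchmark is active should at least show which cores change P-state, even if the boost itself is not visible.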
As for the article, there seem to be a few things missing (CPU speed graphs when turbo-boost is kicking in properly), and what I found really odd was that CnQ + Turbo Core (was that combo enabled? I assumed so) didn't reduce the power used. CnQ should work just as if turbo-boost wasn't enabled when load is low, right? I just always thought of turbo-boost as an extra P-state.
What about enabling both features at the same time?
The tests seem to be done using just one of the features at a time. Can't they both be enabled?
Michael, please at least fix this article by embedding the correct graphs for OpenSSL; at the moment there are graphs for Smallpt there instead.
Secondly, it would be good if you would fix your slightly incorrect description of turbo-core 2.0 technology. A pretty correct description has already been posted in this thread, but it lacks some implementation details about how temperature relates to TDP. The value the hardware actually measures is the actual CPU power consumption (which is pretty much the same as the amount of heat produced by the CPU), and turbo-core deactivates the boost if the temperature gets too close to the design limits and CPU power consumption exceeds the maximum TDP. If the temperature keeps rising even after the boost has been turned off (for example, if your cooling solution is not adequate) and exceeds the design limits, throttling comes into play, eventually followed by an emergency system reset/halt if the temperature still continues to rise.
What would be more interesting to see is the results for a system with both turbo core and cool-n-quiet enabled. A typical user wants his or her system to be as cool and quiet as possible while doing "idle" tasks, but demands max performance when it comes to really CPU-intensive tasks. That's why these two technologies complement each other: cool-n-quiet takes care of the "idle" case, while turbo boost helps speed up CPU-intensive tasks.
I've got another situation that CnQ completely screws up on. I've got an OpenCL WebM (VP8 video) decoder which spawns thousands of threads per 1080p frame of video that is decoded. Enabling CnQ chops about 30% off of the decoding frame rate, even when it's just shuffling the work to/from the video card. It's not perfectly optimized, but when I'm trying to get repeatable benchmarking numbers, power management must be disabled first.