Here's Why Radeon Graphics Are Faster On Linux 3.12


  • Luke
    replied
    There are 5 power states on my FX-8120

    Originally posted by agd5f View Post
    The number of power states the CPU has may also be a factor; e.g., if the CPU has four vs two states. If the CPU has fewer states, the CPU will spend less or no time in the "middle" states. If there are just two states, you'll end up with max performance anytime there is a call for performance.
    Power states available on my FX-8120 are: 1.4 GHz, 1.9 GHz, 2.3 GHz, 2.8 GHz, and 4.4 GHz (the last one overclocked via the multiplier setting in the BIOS). "Turbo mode" is disabled; there is plenty of cooling, so if one core can run at 4.4 GHz, all of them can at once unless one is bad, which is not the case. No game I have can use more than one core, but video rendering uses all of them and produces torrents of heat for a short time.



  • Vim_User
    replied
    Originally posted by agd5f View Post
    The number of power states the CPU has may also be a factor; e.g., if the CPU has four vs two states. If the CPU has fewer states, the CPU will spend less or no time in the "middle" states. If there are just two states, you'll end up with max performance anytime there is a call for performance.
    Isn't the default behavior for ondemand to switch to the highest available step immediately? I thought it was only the conservative governor that steps through the middle states.
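    For reference, ondemand's jump-to-max behaviour is governed by its up_threshold tunable (the load percentage above which it goes straight to the top frequency before stepping back down). Here is a minimal C sketch to print the relevant tunables; the global sysfs path below is the usual one for kernels of this era, but it has moved around between versions, so treat it as an assumption:

        /* Print the ondemand governor's tunables from sysfs.
           Assumes the ondemand governor is active and exposes the
           global /sys/devices/system/cpu/cpufreq/ondemand/ directory. */
        #include <stdio.h>

        static void show(const char *name) {
            char path[128], buf[32];
            snprintf(path, sizeof path,
                     "/sys/devices/system/cpu/cpufreq/ondemand/%s", name);
            FILE *f = fopen(path, "r");
            if (f && fgets(buf, sizeof buf, f))
                printf("%s = %s", name, buf);   /* sysfs values end in '\n' */
            if (f)
                fclose(f);
        }

        int main(void) {
            show("up_threshold");   /* load % above which ondemand jumps to max */
            show("sampling_rate");  /* how often load is re-evaluated, in usec */
            return 0;
        }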



  • Dukenukemx
    replied
    So if the drivers are that sensitive to CPU performance, does that mean the drivers aren't multi-threaded?



  • agd5f
    replied
    The number of power states the CPU has may also be a factor; e.g., if the CPU has four vs two states. If the CPU has fewer states, the CPU will spend less or no time in the "middle" states. If there are just two states, you'll end up with max performance anytime there is a call for performance.
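    For readers who want to check their own machine, here is a minimal C sketch (assuming a driver such as acpi-cpufreq that exposes the standard scaling_available_frequencies file in sysfs) that lists the power states the kernel sees for cpu0:

        /* List the available frequencies (P-states) for cpu0.
           The file below is provided by drivers like acpi-cpufreq;
           not every cpufreq driver exposes it. */
        #include <stdio.h>

        int main(void) {
            char line[512];
            FILE *f = fopen("/sys/devices/system/cpu/cpu0/cpufreq/"
                            "scaling_available_frequencies", "r");
            if (!f) {
                perror("scaling_available_frequencies");
                return 1;
            }
            if (fgets(line, sizeof line, f))
                printf("available frequencies (kHz): %s", line);
            fclose(f);
            return 0;
        }

    On a CPU with five power states, like the FX-8120 described above, this prints five frequencies; with only two, the "middle" states simply don't exist.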



  • liam
    replied
    Originally posted by GreatEmerald View Post
    And will make your FPS dip to 30 if 60 can't be sustained, instead of just 59... No, you should have VSync on only for games you know will never dip below 60 (or whatever your refresh rate may be).
    Wouldn't that only be the case if either you didn't have extra buffers, or the rendering time was consistently over 16.7 ms? For the latter, I'd imagine it's highly tied to the particular scene.
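    As a rough illustration of the frame-timing point, here is a small C sketch of the usual model: with double-buffered vsync, a frame that misses the 16.7 ms deadline waits for the next refresh, so the rate snaps to refresh/ceil(render_time/budget), while triple buffering only caps the rate at the refresh rate. The exact behaviour depends on the driver and compositor, so this is a model, not a measurement:

        /* Effective framerate under vsync for a few render times,
           comparing double buffering (rate snaps to integer divisors
           of the refresh rate) with triple buffering (rate merely
           capped at the refresh rate). Build with -lm. */
        #include <math.h>
        #include <stdio.h>

        int main(void) {
            const double refresh_hz = 60.0;
            const double budget_ms = 1000.0 / refresh_hz;  /* ~16.7 ms */

            for (double render_ms = 10.0; render_ms <= 25.0; render_ms += 5.0) {
                double dbl = refresh_hz / ceil(render_ms / budget_ms);
                double tpl = fmin(refresh_hz, 1000.0 / render_ms);
                printf("render %.1f ms -> double-buffered %.1f fps, "
                       "triple-buffered %.1f fps\n", render_ms, dbl, tpl);
            }
            return 0;
        }

    At 20 ms per frame this gives 30 fps double-buffered but 50 fps triple-buffered, which is exactly the dip-to-30 behaviour being debated.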



  • GreatEmerald
    replied
    Originally posted by Ericg View Post
    It's not an "Intel governor"; it's the ondemand governor in the subsystem that handles ALL CPU scaling. This change affects every CPU that uses the ondemand governor -- interestingly enough (in the context of your post), no modern Intel CPU actually uses the ondemand governor UNLESS you're on *buntu.
    By "ondemand governor" you actually mean the acpi_cpufreq driver.

    Originally posted by schmidtbag View Post
    My bad - I should have been more precise; I forgot this is the internet and everything stated must be 100% accurate. While governors such as "ondemand" or "performance" apply to either AMD or Intel, there are still drivers (if that's even the right word) that affect how these governors work between CPUs. In other words, the governors ARE specific to, at the very least, the CPU family. It could even be specific to each generation or each model, but I wouldn't know for sure. So for example, if you have an AMD system that can clock from 1.2 GHz to 3.5 GHz, it doesn't mean an Intel CPU can operate the same way and remain stable. If the governors were indifferent to the CPU, problems like this would have been found a long time ago.

    The point of me saying this is that there's a possibility the ondemand governor for AMD might have done a better job of determining what frequency to operate at.
    I highly doubt that. The driver is the same and generic, called acpi_cpufreq. The reason you can't use the same clocks on different processors is that it probes your processor's capabilities. Much like writing a program with runtime SIMD support: you first ask the processor what SIMD it supports before using it, but you don't need to compile your program for every CPU out there.
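    As a sketch of that runtime-detection pattern (the __builtin_cpu_* calls are real GCC/Clang x86 builtins; the dispatch itself is just illustrative):

        /* Runtime SIMD detection: one binary asks the CPU what it
           supports instead of being compiled per model, much as
           acpi_cpufreq probes each CPU's frequency table. x86 only. */
        #include <stdio.h>

        int main(void) {
            __builtin_cpu_init();   /* initialise before __builtin_cpu_supports */
            if (__builtin_cpu_supports("avx"))
                puts("AVX available: dispatch to the AVX code path");
            else if (__builtin_cpu_supports("sse2"))
                puts("SSE2 available: dispatch to the SSE2 code path");
            else
                puts("no SIMD detected: fall back to scalar code");
            return 0;
        }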

    Now, historically there were drivers like amd-powernow that were specific to AMD processors, but they have long been deprecated in favour of acpi_cpufreq.

    Originally posted by chrisb View Post
    That's not true for vsync and triple buffering.

    Also note that most console games actually lock to 30 fps for a more consistently smooth experience:
    Sure. My point was that you can't just say "always use VSync for games", because that's just not good advice. It depends on the game and hardware.

    Originally posted by Ericg View Post
    I guess we'll need to wait for Michael's power consumption benchmarks to figure out whether this is a good change in the subsystem or not... I mean, yes, we're getting higher performance, but what about non-gaming workloads? Is 3.12 going to kill battery life (compared to 3.11) because of this change? For gaming I have no problem with higher power consumption; it's expected. But what about Flash? Or other 'constant' workloads that DON'T require maxed-out frequencies.
    Does PTS have any good power consumption benchmarks? I know you can measure power consumption during any benchmark, but benchmarks are made to stress things, so that's not a good metric; it would be exactly the same as using the performance governor. Is there a specific test for idle power consumption or, as you said, things like Flash?



  • Luke
    replied
    Both kernels still affected on AMD Bulldozer, 3.12 not as badly

    Originally posted by schmidtbag View Post
    That makes me feel a lot better then - at least that means this isn't just a problem with the radeon drivers. As Luke has pointed out with his FX-8120, he didn't get any performance hit between the kernel versions, so in my personal opinion it seems the blame is on the Intel ondemand governor.

    What I'd be more interested in at this point is seeing a test with the HD6870 (due to having the greatest impact all around) on an AMD FX-8XXX system between kernels 3.11 and 3.12, AND comparing that to the Intel results. A CPU like that ought to be plenty sufficient to give similar results, so assuming the CPU isn't a bottleneck, that would be a good way to prove that the Intel governor was faulty. If the overall frame rate is significantly lower regardless of CPU power state, this might be more than just a governor problem.

    Assuming the Intel governor has been faulty all along, at least we now know it is working properly and all future benchmarks can remain accurate and meaningful without Michael having to change the governor.
    On my FX-8120, both Linux 3.11 and Linux 3.12 suffer a GPU performance hit when running the "ondemand" governor in Critter, the only non-CPU-limited game I have.

    I just benchmarked Linux 3.11 and Linux 3.12 in Critter with the "ondemand" governor. Here's what I got:

    Linux 3.11, "ondemand:"

    max framerate 432fps, typical in the mid 300's, lowest dip around 280fps

    Linux 3.12, "ondemand:"

    Max framerate 527fps, typical high 400's, one dip to 291fps

    Linux 3.12, "performance"

    647fps highest seen, high 500's typical, some dips to 447

    Clearly something in Linux 3.12 did in fact help, but only partially in the case of AMD Bulldozer. Still a big gain, but not all the way to what I am used to. I normally set cpufreq-applet to the maximum frequency for all games, so I did not see any gain until I tried leaving the governor on "ondemand", which is what I use for almost everything else.

    Kdenlive rendering, BTW, is also strongly affected by the governor setting: with 3.11 it can mean the difference between a render taking twice the video's runtime or less than 1.5 times its runtime.
    Last edited by Luke; 15 October 2013, 04:35 PM.



  • Marc Driftmeyer
    replied
    Wake me when games fully leverage OpenCL and offload much of this wasted CPU work to the GPGPU(s). The number of cycles still being wasted on CPUs for games is absurd.



  • markg85
    replied
    I have to say, this is where you really shine, Michael! Good article, nicely explained! Just a top piece.



  • Temar
    replied
    Originally posted by pingufunkybeat View Post
    So you wouldn't test binary drivers at all?

    They are not installed out-of-the-box and are considered a non-default configuration?
    There is a huge difference. People are used to installing new drivers, whereas they are not used to tuning scheduler parameters.

