Here's Why Radeon Graphics Are Faster On Linux 3.12

  • #81
    I have to say, this is where you really shine Michael! Good article, nicely explained! Just a top piece

    Comment


    • #82
      Wake me when games fully leverage OpenCL and off-load much of this waste on the CPU to the GPGPU(s). The amount of cycles still being wasted on CPUs for games is absurd.

      Comment


      • #83
        Both kernels still affected on AMD Bulldozer, 3.12 not as badly

        Originally posted by schmidtbag View Post
        That makes me feel a lot better then - at least that means this isn't just a problem with the radeon drivers. As Luke has pointed out with his FX-8120, he didn't get any performance hits between the kernel versions so in my personal opinion, it seems the blame is the intel ondemand governor.

        What I'd be more interested in at this point is seeing a test with the HD6870 (due to having the greatest impact all around) on an AMD FX-8XXX system between kernels 3.11 and 3.12 AND compare that to the intel results. A CPU like that ought to be plenty sufficient to give similar results, so assuming the CPU isn't a bottleneck, that would be a good way to prove that the intel governor was faulty. If the overall frame rate is significantly lower regardless of CPU power state, this might be more than just a governor problem.


        Assuming the intel governor has been faulty all along, at least we now know it is working properly and all future benchmarks can remain accurate and meaningful without Michael having to change the governor.
        On my FX8120, both Linux 3.11 and Linux 3.12 suffer a GPU performance hit when running the "ondemand" governor in Critter, the only non-CPU limited game I have.

        I just benchmarked Linux 3.11 and Linux 3.12 in Critter, with the "ondemand" governor. Here's what I got:

        Linux 3.11, "ondemand:"

        max framerate 432fps, typical in the mid 300's, lowest dip around 280fps

        Linux 3.12, "ondemand:"

        Max framerate 527fps, typical high 400's, one dip to 291fps

        Linux 3.12, "performance"

        647fps highest seen, high 500's typical, some dips to 447

        Clearly something in Linux 3.12 did in fact help, but only partially in the case of AMD bulldozer. Still a big gain, but not all the way to what I am used to. I normally set cpufreq-applet to max frequency for all games, thus I did not see any gain until I tried leaving the governor on "ondemand" where I set it for almost everything else.
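To put the figures above in proportion, here is a minimal sketch that computes the relative gains from the peak frame rates quoted in this post. It uses only the three reported maxima (432, 527 and 647 fps); the dictionary layout and the `percent_gain` helper are illustrative names, not part of any benchmark tool.

```python
# Peak frame rates reported above (Critter on an FX-8120).
peak_fps = {
    ("3.11", "ondemand"): 432,
    ("3.12", "ondemand"): 527,
    ("3.12", "performance"): 647,
}


def percent_gain(baseline: float, value: float) -> float:
    """Return the gain of value over baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Gain from the kernel upgrade alone, governor unchanged.
kernel_gain = percent_gain(
    peak_fps[("3.11", "ondemand")], peak_fps[("3.12", "ondemand")]
)
# Gain still left on the table by "ondemand" on the newer kernel.
governor_gain = percent_gain(
    peak_fps[("3.12", "ondemand")], peak_fps[("3.12", "performance")]
)

print(f"3.11 -> 3.12 (ondemand): {kernel_gain:.1f}%")
print(f"ondemand -> performance (3.12): {governor_gain:.1f}%")
```

These are single peak readings from one game, so the percentages are only a rough guide.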

        Kdenlive rendering, BTW, is also strongly affected by the governor setting. With 3.11 it can mean the difference between twice the run time and less than 1 1/2 times the run time to render out a video.
        Last edited by Luke; 15 October 2013, 04:35 PM.

        Comment


        • #84
          Originally posted by Ericg View Post
          It's not an "intel governor", it's the ondemand governor in the subsystem that handles ALL CPU scaling. This change affects every CPU that uses the ondemand governor-- interestingly enough (in perspective of your post) no modern intel CPU actually uses the ondemand governor UNLESS you're on *buntu.
          By "ondemand governor" you actually mean the acpi_cpufreq driver.

          Originally posted by schmidtbag View Post
          My bad - I should have been more precise, I forgot this is the internet and everything stated must be 100% accurate. While governors such as "ondemand" or "performance" apply to either AMD or Intel, there are still drivers (if that's even the right word) that affect how these governors work between CPUs. In other words, the governors ARE specific to, at the very least, the CPU family. It could even be specific to each generation or each model, but I wouldn't know for sure. So for example if you have an AMD system that can clock from 1.2GHz to 3.5GHz, it doesn't mean an intel CPU can operate the same way and remain stable. If the governors were indifferent to the CPU, problems like this would have been found a long time ago.

          The point of me saying this is there's a possibility that the ondemand governor for AMD might have done a better job at determining what frequency to operate at.
          I highly doubt that. The driver is the same and generic, called acpi_cpufreq. The reason why you can't use the same clocks on different processors is that it probes your processor capabilities. Much like when writing a program with runtime SIMD support: you first ask the processor what SIMD it supports before using it, but you don't need to compile your program for every CPU out there.

          Now historically there were drivers like amd-powernow that were specific to AMD processors, but they have been long deprecated in favour of acpi_cpufreq.
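If you want to check which driver and governor your own machine is using, the kernel exposes them through the cpufreq sysfs directory. The following is a minimal sketch, assuming a Linux system with cpufreq enabled; `read_cpufreq` and `FIELDS` are names made up for this example, and not every driver exposes `scaling_available_frequencies`, so missing files are reported as `None`.

```python
from pathlib import Path

# Standard cpufreq sysfs location for the first CPU; other CPUs use cpu1, cpu2, ...
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")

FIELDS = (
    "scaling_driver",                 # driver that talks to the hardware
    "scaling_governor",               # policy currently in use
    "scaling_available_governors",
    "scaling_available_frequencies",  # not exposed by every driver
)


def read_cpufreq(base: Path = CPUFREQ_DIR) -> dict:
    """Return the cpufreq attributes found under base; missing ones map to None."""
    info = {}
    for name in FIELDS:
        try:
            info[name] = (base / name).read_text().strip()
        except OSError:
            info[name] = None
    return info


if __name__ == "__main__":
    for key, value in read_cpufreq().items():
        print(f"{key}: {value}")
```

The driver and the governor show up as two separate entries, which is the distinction being argued about above.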

          Originally posted by chrisb View Post
          That's not true for vsync and triple buffering.

          Also note that most console games actually lock to 30fps for a more consistent smooth experience:
          Sure. My point was that you can't just say "always use VSync for games", because that's just not good advice. It depends on the game and hardware.

          Originally posted by Ericg View Post
          I guess we'll need to wait for Michael's power consumption benchmarks to figure out if this is a good change in the subsystem or not... I mean yes we're getting higher performance, but what about non-gaming workloads? Is 3.12 going to kill battery life (compared to 3.11) because of this change? For gaming I have no problem with higher power consumption, it's expected. But what about flash? Or other 'constant' workloads that DON'T require maxed out freqs.
          Does PTS have any good power consumption benchmarks? I know you can measure power consumption during any benchmark, but they are made to stress things, so it's not a good metric, because it will be exactly the same as using the performance governor. Is there a specific test for idle power consumption, or as you said, things like flash?

          Comment


          • #85
            Originally posted by GreatEmerald View Post
            And will make your FPS dip to 30 if 60 can't be sustained, instead of just 59... No, you should have VSync on only for games you know will never dip below 60 (or whatever your refresh rate may be).
            Wouldn't that only be the case if either you didn't have extra buffers, or the rendering time was consistently over 16.7ms? For the latter, I'd imagine it was highly tied to the particular scene.
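The 60-to-30 drop being discussed can be sketched with a bit of arithmetic. This is a simplified model, assuming plain double buffering, a 60 Hz display and a constant render time per frame; `vsync_fps` is an illustrative helper, and triple buffering or variable frame times would behave differently.

```python
import math


def vsync_fps(render_ms: float, refresh_hz: float = 60.0) -> float:
    """Effective frame rate under double-buffered vsync with a constant render time."""
    period_ms = 1000.0 / refresh_hz  # about 16.7 ms at 60 Hz
    # A finished frame is only shown at the next vertical blank, so the
    # time per displayed frame rounds up to a whole number of refreshes.
    intervals = max(1, math.ceil(render_ms / period_ms))
    return refresh_hz / intervals


for ms in (10.0, 16.0, 17.0, 34.0):
    print(f"{ms:5.1f} ms per frame -> {vsync_fps(ms):.0f} fps")
```

In this model the rate only halves when every frame misses the 16.7 ms budget, which matches the second condition raised above.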

            Comment


            • #86
              The number of power states the CPU has may also be a factor; e.g., if the CPU has four vs two states. If the CPU has fewer states, the CPU will spend less or no time in the "middle" states. If there are just two states, you'll end up with max performance anytime there is a call for performance.

              Comment


              • #87
                So if the drivers are that sensitive to cpu performance, then does that mean the drivers aren't multi-threaded?

                Comment


                • #88
                  Originally posted by agd5f View Post
                  The number of power states the CPU has may also be a factor; e.g., if the CPU has four vs two states. If the CPU has fewer states, the CPU will spend less or no time in the "middle" states. If there are just two states, you'll end up with max performance anytime there is a call for performance.
                  Isn't the default behavior for ondemand to switch to the highest available step immediately? I thought it is only the conservative governor that switches through the middle states.
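The difference being asked about can be illustrated with a toy model. This is a rough sketch and not the kernel code: the frequency table is hypothetical, the thresholds (80 and 20) are assumed values passed as parameters, and the real conservative governor steps by a percentage of the maximum frequency rather than by one table entry. It only shows the "jump to the top" versus "climb gradually" behaviour.

```python
def ondemand_next(freqs, current, load, up_threshold=80):
    """Toy ondemand: jump straight to the highest state when load is high."""
    if load > up_threshold:
        return freqs[-1]
    # Otherwise pick the lowest state that could carry the present work
    # while staying under the threshold (a rough approximation).
    needed = current * load / up_threshold
    for f in freqs:
        if f >= needed:
            return f
    return freqs[-1]


def conservative_next(freqs, current, load, up_threshold=80, down_threshold=20):
    """Toy conservative: move at most one state per sample."""
    i = freqs.index(current)
    if load > up_threshold and i < len(freqs) - 1:
        return freqs[i + 1]
    if load < down_threshold and i > 0:
        return freqs[i - 1]
    return current


# Hypothetical four-state table in MHz, sorted from lowest to highest.
freqs = [1400, 2100, 2800, 3500]

od = co = freqs[0]
od_trace, co_trace = [], []
for _ in range(4):  # four consecutive samples at 100% load
    od = ondemand_next(freqs, od, 100)
    co = conservative_next(freqs, co, 100)
    od_trace.append(od)
    co_trace.append(co)

print("ondemand:    ", od_trace)
print("conservative:", co_trace)
```

With only two states in the table, both toy governors would reach the top after a single sample, which is the point made in the quoted post.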

                  Comment


                  • #89
                    There are 5 power states on my FX 8120

                    Originally posted by agd5f View Post
                    The number of power states the CPU has may also be a factor; e.g., if the CPU has four vs two states. If the CPU has fewer states, the CPU will spend less or no time in the "middle" states. If there are just two states, you'll end up with max performance anytime there is a call for performance.
                    Power states available on my FX-8120 are: 1.4 GHz, 1.9 GHz, 2.3 GHz, 2.8 GHz, 4.4 GHz (the last one is overclocked with the multiplier setting in BIOS). "Turbo mode" is disabled, as there is plenty of cooling, so if one core can go to 4.4 GHz, all of them can at once unless one is bad, which is not the case. No game I have can use more than one core, but video rendering uses all of them and makes torrents of heat for a short time.

                    Comment


                    • #90
                      On demand apparently does not correspond to its name

                      Originally posted by schmidtbag View Post
                      I seriously cannot believe the amount of finger-pointing going on here, and sadly it's making me lose a lot of respect for people who I otherwise thought were humble and hard workers.


                      It is NOT by any means Michael's fault that the results came out this way. He is not obligated to run under a different governor, just as he isn't obligated to use some obscure distro, make some minor kernel tweak or tweak the drivers for individual tests - the point of these benchmarks is to show what the average person will/should encounter from everyday upgrades. If the governor is known to be problematic then fine, switch out of it, but if the CPU is actually underclocking due to a lack of stress, that really gets me to think the governor is not the problem.

                      What Michael is obligated to do is benchmark using the most typical/average software setup, and a hardware setup that has the lowest probability of skewing the results (meaning, his choice of CPU, mobo, RAM, and storage were fine for testing GPUs, because all of those parts are good enough that they SHOULDN'T be a bottleneck). If you want benchmarks for the utmost highest possible results, you're in the wrong place and always have been. Even if this website was strictly benchmarks and nothing else, no single person would ever have the time to set up some of the silly or unrealistic requests here.

                      I'm not (yet) blaming the driver developers either, since they were being affected by an outside source.

                      HOWEVER

                      The CPU Michael used was better than almost anything AMD has to offer. That being said, it is absolutely unacceptable for the drivers to be THAT held back by the CPU, even in its low-freq state. This could mean that APUs are behind in performance simply because the CPU isn't fast enough. I'm not bashing AMD CPUs either - AMD CPUs are fully capable of playing most modern games without being maxed out.

                      But like I said, I'm not yet blaming the driver developers. I think tests should be done with Catalyst on the HD6870 (because that had the greatest performance impact) to see how much of a difference that makes. If, while testing Catalyst, 3.11 ondemand vs 3.12 ondemand shows a performance impact of less than 5%, that's where I think the open source radeon drivers are to blame.
                      I fully agree with this. I think the way Michael has done the benchmarks is the correct way to do it, and he has done an amazing job in unraveling this mystery.

                      As far as I can see it's the cpufreq subsystem that is at fault here. Just read the name out loud a couple of times: ON DEMAND. That would indicate that when there is a demand for cpu-power, the governor should make sure that the demand is met. That's sorta indicated in the name of the governor, isn't it? Apparently that has not been the case. As far as I can see, a governor named ondemand should deliver max frequency if that is what the tasks demand, and therefore there should be no major difference between the performance governor and the ondemand governor.

                      And I firmly disagree with marek's comment about not being interested in the result of a benchmark without the governor set to performance. As far as I know marek works for AMD? Therefore he should be interested in his graphics driver performing well under "vanilla" conditions, although that does not stress his work to the max. He is, after all, contributing to a product that has to compete to be the better option, vanilla settings or not. Just because one is not to blame for a problem does not mean one is not affected by it, or that one should not care.

                      "Oh, look. One of the machines in my gym has a wire that's almost about to break. Whatever, that's not my problem. They should probably fix that, but I won't inform them". Wire breaks, I'm not able to use the pec dec until they get spare parts. Not my fault, but still I'm affected.

                      Comment
