Ondemand governor dramatically slows down mesa performance


  • #51
    Originally posted by vadimg View Post
    My understanding of the issue is that even though glxgears is a single-threaded app and one might expect 100% load on a single core, that thread has to wait while kernel driver and GPU process the command streams (it's offloaded to separate thread in r600g). If for example kernel + gpu processing takes about 15% of time, main thread will spend these 15% waiting, and the load of main thread will never be higher than 100 - 15 = 85%, thus it will never trigger frequency increase with the ondemand governor because default threshold is 95%. Also I think that these waits encourage the scheduler to move the waiting thread between cpus, distributing the load and reducing average load per cpu even more. In the end typical distribution of load per cpu with glxgears for me looks like the following: ...

    Also I think the arguments that ondemand is intended for powersaving are not very correct, I think the described behavior is not what the most users expect. Users expect that the frequency will be raised if the application needs it, so that they'll always have the best performance with performance-sensitive apps. And in windows cpu governor works more like this expectation, at least with 3d apps, though I suspect they simply detect 3d activity and raise the frequency in such cases. Anyway, even in windows forcing the cpus to max freq also helped me sometimes with 3d apps performance, though usually it doesn't provide noticeable benefits.
    You are not looking at the bigger picture. There is only so much a CPU does that can count as its load. As long as a CPU has power, its clock does not stop ticking, and it is either executing user space code, executing kernel space code, or waiting for I/O. If you would add all of this into the load average number then you would get a fixed 100%, because it is all it ever does. Makes sense?

    If you will, the average load of a CPU is a lie. It is an artificial number, which was introduced to tell how much a CPU is running user code and how much kernel code. There is nothing wrong with it. An operating system should leave as much CPU time as possible to the user and use only a little for itself. As long as this is the case, the concept of the load average works. Just like nobody blames the clock on the wall for the dinner being late...

    There are two, maybe three, problems here. One is with the benchmarks sitting in user space doing little else but causing a CPU to spend most of its time running kernel code. The other is with the drivers that are sitting in kernel space and probably should have most of their code running in user space. If I am not mistaken, the Nvidia kernel module of the proprietary driver is about 13 MB? Whatever its exact size is, it is a very large kernel module, and I am guessing that it is doing a lot more than a kernel module should be doing. The third problem might be an unusual CPU/GPU combination where an expensive and fast CPU is combined with a cheap, slow GPU, which adds to the I/O wait times.

    The first problem, the one of the benchmarks, can be solved by choosing the right benchmarks and running them correctly. Making sure a benchmark does not get outsmarted by other software or hardware is a part of this. The second problem can only be solved by the people who write kernel modules. I am guessing it is only a matter of time until we see some changes here. Last, but not least, one has to question the combination of a slow GPU with a fast CPU when the goal is to get a high frame rate, because GPUs are still a lot cheaper than CPUs while doing most of the work.

    So when you take a look at the bigger picture, you will see that there is more to it than the cpufreq governors, and that these are not a problem, as they can either be tweaked or set to do nothing (with the "performance" governor). There is even a user space governor for when one wants to control the frequency directly from user space.
    Last edited by sdack; 04 June 2013, 12:43 PM.

    Comment


    • #52
      Originally posted by sdack View Post
      You are not looking at the bigger picture. There is only so much a CPU does that can count as its load. As long as a CPU has power, its clock does not stop ticking, and it is either executing user space code, executing kernel space code, or waiting for I/O. If you would add all of this into the load average number then you would get a fixed 100%, because it is all it ever does. Makes sense?

      If you will, the average load of a CPU is a lie. It is an artificial number, which was introduced to tell how much a CPU is running user code and how much kernel code.
      I used the term 'average load' as a percentage of time when cpu is not idle, that is, when CPU is running in C0 power state, as opposed to power-saving states (C1,...) with stopped CPU clocks. That's exactly what the "cpupower monitor" shows in the C0 column, here is full description:
      Code:
      # sudo cpupower monitor -l
      Monitor "Mperf" (3 states) - Might overflow after 922000000 s
      C0    [T] -> Processor Core not idle
      Cx    [T] -> Processor Core in an idle state
      Freq    [T] -> Average Frequency (including boost) in MHz
      As for the remaining part of your comment, I'm not sure what you are trying to say. This thread is not about choosing the right benchmarks that will demonstrate max performance with ondemand governor, and not about choosing the hardware configuration that will work best with ondemand governor. It's about the ondemand governor that in many cases results in bad performance with many OpenGL apps on different hardware configurations.

      And I know that I can tune something or set the frequency manually. I just tried to explain the existing problem with ondemand governor which was intentionally created to relieve the people of the need to adjust the frequency manually.
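The arithmetic behind this complaint can be sketched in a few lines. The following is only a toy model, not kernel code: it takes the figures quoted earlier in the thread (a render thread that waits about 15% of the time, a default threshold of 95%) and assumes, purely for illustration, a four-core machine over which the scheduler spreads the thread evenly. The function names are made up for this sketch.

```python
# Toy model of an ondemand-style decision. Not kernel code: the names and
# the four-core split are illustrative assumptions.
UP_THRESHOLD = 95.0  # percent; the default threshold as quoted in this thread


def busy_percent(busy_time: float, idle_time: float) -> float:
    """Share of the sampling window in which the core was not idle (C0)."""
    total = busy_time + idle_time
    if total <= 0:
        raise ValueError("sampling window must not be empty")
    return 100.0 * busy_time / total


def would_raise_frequency(load_percent: float,
                          threshold: float = UP_THRESHOLD) -> bool:
    """Step the frequency up only when the load exceeds the threshold."""
    return load_percent > threshold


# A render thread that spends 15% of every window waiting on kernel + GPU.
single_core = busy_percent(busy_time=85.0, idle_time=15.0)

# The same work spread evenly over 4 cores by the scheduler (assumption).
cores = 4
per_core = busy_percent(busy_time=85.0 / cores,
                        idle_time=100.0 - 85.0 / cores)

print(f"one core:  {single_core:.2f}% busy, "
      f"raise frequency: {would_raise_frequency(single_core)}")
print(f"{cores} cores:   {per_core:.2f}% busy each, "
      f"raise frequency: {would_raise_frequency(per_core)}")
```

In this model neither case crosses the threshold, which is the behaviour described above: the thread is the bottleneck, yet no single core ever looks busy enough.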

      Comment


      • #53
        Originally posted by vadimg View Post
        I used the term 'average load' as a percentage of time when cpu is not idle, that is, when CPU is running in C0 power state, as opposed to power-saving states (C1,...) with stopped CPU clocks. That's exactly what the "cpupower monitor" shows in the C0 column, here is full description:
        Code:
        # sudo cpupower monitor -l
        Monitor "Mperf" (3 states) - Might overflow after 922000000 s
        C0    [T] -> Processor Core not idle
        Cx    [T] -> Processor Core in an idle state
        Freq    [T] -> Average Frequency (including boost) in MHz
        As for the remaining part of your comment, I'm not sure what you are trying to say. This thread is not about choosing the right benchmarks that will demonstrate max performance with ondemand governor, and not about choosing the hardware configuration that will work best with ondemand governor. It's about the ondemand governor that in many cases results in bad performance with many OpenGL apps on different hardware configurations.

        And I know that I can tune something or set the frequency manually. I just tried to explain the existing problem with ondemand governor which was intentionally created to relieve the people of the need to adjust the frequency manually.
        I am not saying anything. I am telling you where to look for your problems. Do not blame the ondemand governor for not working right when you simply do not know how it is working and when to use it.
        Last edited by sdack; 04 June 2013, 05:11 PM.

        Comment


        • #54
          Originally posted by sdack View Post
          I am not saying anything. I am telling you where to look for your problems. Do not blame the ondemand governor for not working right when you simply do not know how it is working and when to use it.
          I'm pretty sure he does know how it's working, given that he just went through and explained exactly what was going on in great detail.

          Your point is apparently that the ondemand governor is working as expected and you don't want to change it. I, and many others, would argue that it kind of sucks and should probably be improved.

          Comment


          • #55
            Originally posted by smitty3268 View Post
            I'm pretty sure he does know how it's working, given that he just went through and explained exactly what was going on in great detail.

            Your point is apparently that the ondemand governor is working as expected and you don't want to change it. I, and many others, would argue that it kind of sucks and should probably be improved.
            I believe he did not. He used his own definition of a load average while there is already a definition of it. I was not blaming him for not knowing about it. He only gave his explanation of what he sees as being the load average. Its actual definition has been around for longer than I have been programming UNIX systems, which is what I have been doing for almost 25 years now.

            The problem is not with the ondemand governor. As I wrote earlier, the performance governor works too, and one can leave the power saving to the idle routine. The progress in microprocessor technology has now led to newer CPUs implementing better power saving methods in hardware, making current software implementations obsolete. You can read about it here.

            Would you then try to complain about the ondemand governor for failing here, when the idea of controlling the CPU frequency in order to save power was first realised in software? No, of course not. It would be as absurd as trying to blame a horse for not wanting to drink gasoline. What you do is you learn to use the right power saving features and move on. Just like you stick to feeding horses apples and use the gasoline for the engines.

            You imagine that one needs to modify the ondemand governor, because you believe in fixing the symptom rather than the cause. Well, the problem is more complex and does not originate from the governor. You will need to make the governor aware of details within the user space, which is what you'd be doing if you made it aware of OpenGL applications. An operating system needs to draw a line between user space and kernel space for a lot of reasons. More precisely, its developers need to draw this line, as they do not want to care, and cannot care, about what goes on in user space. It is the very reason why we have a user space governor, which passes control over the frequency to user space code, so that we can run governors within user space that are aware of details within user space and keep these out of the kernel.

            Also the problem you see then does not occur with every OpenGL application. On top of that, many people actually limit their frame rate with vsync, which not only saves them power, but makes it unlikely that they will have the problem. A modified ondemand governor would be counterproductive for them and only cause them a higher electricity bill. You would first need to gather more evidence of the problem, find out the precise conditions under which it occurs, and then implement a working solution with a user space governor.

            In short, I judge the problem as a minor one, which is not worth doing anything about, but suggest to anyone who sees it to tweak the ondemand governor or to use the performance governor.
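For readers who want to try the workaround suggested here, the following is a minimal sketch of switching the governor and adjusting the ondemand threshold through the cpufreq sysfs files. Several things are assumptions: the helper names are made up, the location of `up_threshold` under `cpufreq/ondemand/` may differ between kernel versions and drivers (some expose it per policy), and writing these files requires root. The `root` parameter exists only so the functions can be exercised against a scratch directory.

```python
from pathlib import Path

# Usual cpufreq sysfs root on Linux.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def read_governor(cpu: int, root: Path = SYSFS_CPU) -> str:
    """Return the governor currently active for the given CPU."""
    path = root / f"cpu{cpu}" / "cpufreq" / "scaling_governor"
    return path.read_text().strip()


def set_governor(cpu: int, governor: str, root: Path = SYSFS_CPU) -> None:
    """Switch one CPU to another governor, e.g. 'performance'."""
    base = root / f"cpu{cpu}" / "cpufreq"
    available = (base / "scaling_available_governors").read_text().split()
    if governor not in available:
        raise ValueError(f"{governor!r} not offered; available: {available}")
    (base / "scaling_governor").write_text(governor)


def set_up_threshold(percent: int, root: Path = SYSFS_CPU) -> None:
    """Lower or raise the load percentage at which ondemand steps up.

    Assumption: the tunable lives in the global cpufreq/ondemand directory.
    """
    if not 0 < percent <= 100:
        raise ValueError("threshold must be a percentage between 1 and 100")
    (root / "cpufreq" / "ondemand" / "up_threshold").write_text(str(percent))
```

On a real system one might call `set_governor(0, "performance")` for each CPU, or keep ondemand and call `set_up_threshold(80)` so that a thread stuck at around 85% utilization still triggers a frequency increase. Whether 80 is a sensible value is a guess for illustration, not a recommendation from this thread.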
            Last edited by sdack; 05 June 2013, 08:53 AM.

            Comment


            • #56
              Originally posted by sdack View Post
              In short, I judge the problem as a minor one, which is not worth doing anything about, but suggest to anyone who sees it to tweak the ondemand governor or to use the performance governor.
              Sure, a 50% fps drop is a minor problem that is delivered broken and everyone is to fix it from his side manually.
              I fail to see any out of box solution from your side, other than criticizing people for discussion using abstract references.
              Sure it's nice to make a professor-look, whilst sitting on 5th point and doing nothing. I guess you still load kernel modules manually?

              The Intel problem you linked is utterly irrelevant and pointless. While modern AND Intel processors can intelligently manage the load balance on their side, other processors still can't; Intel already posted the correct decision: each processor gets its own driver.
              But this HAL-like cpufreq is inefficient even for non-Intel AND non-modern processors, because it's irritatingly named, polls instead of acting on demand, and is wrongly preconfigured by default for generic (non-server) tasks.
              Last edited by brosis; 05 June 2013, 09:25 AM.

              Comment


              • #57
                Originally posted by brosis View Post
                Sure, a 50% fps drop is a minor problem that is delivered broken and everyone is to fix it from his side manually.
                I fail to see any out of box solution from your side, other than criticizing people for discussion using abstract references.
                Sure it's nice to make a professor-look, whilst sitting on 5th point and doing nothing. I guess you still load kernel modules manually?

                The Intel problem you linked is utterly irrelevant and pointless. While modern AND Intel processors can intelligently manage the load balance on their side, other processors still can't; Intel already posted the correct decision: each processor gets its own driver.
                But this HAL-like cpufreq is inefficient even for non-Intel AND non-modern processors, because it's irritatingly named, polls instead of acting on demand, and is wrongly preconfigured by default for generic (non-server) tasks.
                There is no 50% drop other than with outdated benchmarks. This can be avoided by using the performance governor. If this is too difficult for you and you require out-of-the-box products then you had better play your games under Windows. And when you do, be aware that one can tweak Windows through its registry. So better be prepared to shout loud and clear for out-of-the-box products when someone suggests you change a registry setting.

                Comment


                • #58
                  Originally posted by sdack View Post
                  I believe he did not. He used his own definition of a load average while there is already a definition of it. I was not blaming him for not knowing about it. He only gave his explanation of what he sees as being the load average. Its actual definition has been around for longer than I have been programming UNIX systems, which is what I have been doing for almost 25 years now.
                  Well, I probably should have used something like 'CPU utilization' instead of 'CPU load' to avoid misunderstanding. English is not my native language, and some terms do not map very well when translated literally. Anyway, I think it was pretty clear from my comments that I didn't use the standard UNIX definition of 'load average', because it's not even measured in percent. By the way, what definition did you rely on when you also measured it in percent and wrote the following:

                  "If you would add all of this into the load average number then you would get a fixed 100%"
                  "If you will, the average load of a CPU is a lie. It is an artificial number, which was introduced to tell how much a CPU is running user code and how much kernel code."

                  Your words also don't map very well to the standard definition of 'load average' that you know from your 25 years of unix programming experience, so please don't try to say that you misunderstood my comment because of the wrong wording and to turn this into meaningless linguistic discussion.
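To make the distinction between the two terms concrete: the classic UNIX load average is a damped average of the number of runnable tasks (Python exposes it as `os.getloadavg()`, three plain numbers, not percentages), while CPU utilization is a percentage of time spent not idle. A minimal sketch of the latter, computed from two samples of the aggregate `cpu` line in `/proc/stat`, might look as follows. The sample lines are made up for illustration, and counting `iowait` as idle time is a choice made for this sketch.

```python
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (busy_ticks, total_ticks) from the aggregate 'cpu' line.

    Field order in /proc/stat:
    user nice system idle iowait irq softirq steal guest guest_nice
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    if len(values) < 4:
        raise ValueError("too few fields in 'cpu' line")
    # Guest time is already contained in user/nice, so stop after 'steal'.
    total = sum(values[:8])
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    return total - idle, total


def utilization_percent(before: Tuple[int, int],
                        after: Tuple[int, int]) -> float:
    """Percentage of non-idle time between two samples."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return 100.0 * busy / total


# Made-up samples: 100 ticks elapsed, 85 of them busy.
sample_before = parse_cpu_line("cpu  100 0 50 800 50 0 0 0 0 0")
sample_after = parse_cpu_line("cpu  185 0 50 815 50 0 0 0 0 0")
print(f"utilization: {utilization_percent(sample_before, sample_after):.1f}%")
```

On a live system the two lines would come from reading `/proc/stat` twice, a short interval apart.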

                  Originally posted by sdack View Post
                  The problem is not with the ondemand governor. As I wrote earlier, the performance governor works too, and one can leave the power saving to the idle routine.
                  If the problem can be solved by removing the ondemand governor from the equation, does that really mean to you that the problem is not with the ondemand governor?

                  Originally posted by sdack View Post
                  The progress in microprocessor technology has now led to newer CPUs implementing better power saving methods in hardware, making current software implementations obsolete. You can read about it here.
                  Unfortunately, the existence of newer CPUs doesn't make this problem magically disappear for all users.

                  Originally posted by sdack View Post
                  You imagine that one needs to modify the ondemand governor, because you believe in fixing the symptom rather than the cause. Well, the problem is more complex and does not originate from the governor. You will need to make the governor aware of details within the user space, which is what you'd be doing if you made it aware of OpenGL applications.
                  Not really. If we wanted to make it aware of GPU activity, we would only need some interface to the kernel GPU driver.

                  Originally posted by sdack View Post
                  Also the problem you see then does not occur with every OpenGL application. On top of that, many people actually limit their frame rate with vsync, which not only saves them power, but makes it unlikely that they will have the problem.
                  Does it mean that the problem doesn't exist? Also this problem will be observable with OpenCL apps as well, where you don't have vsync to justify the current behavior of the ondemand governor.

                  Originally posted by sdack View Post
                  A modified ondemand governor would be counterproductive for them and only cause them a higher electricity bill.
                  What modifications exactly are you talking about? So far I don't see any proposed patches here. Are you criticizing some theoretically possible modifications that would potentially break ondemand governor for some users? I agree, those would be bad modifications.

                  Originally posted by sdack View Post
                  You would first need to gather more evidence of the problem, find out the precise conditions under which it occurs
                  That's exactly what we were trying to do in this thread before you came to tell us that we have no problems.

                  Originally posted by sdack View Post
                  In short, I judge the problem as a minor one, which is not worth doing anything about, but suggest to anyone who sees it to tweak the ondemand governor or to use the performance governor.
                  Thanks for your judgement that you shared with us from the heights of your experience, though the possibility of using another governor or tweaking some settings to work around the issues with ondemand governor was clear from the first comments in this thread and so far you didn't add anything useful.

                  Comment


                  • #59
                    Originally posted by vadimg View Post
                    That's exactly what we were trying to do in this thread before you came to tell us that we have no problems.
                    Learn to understand comments and how to reply. Who still replies with a million quotes?! If you do not understand a comment then read it again until you do and do not reply before then. You really want to be grateful for the insights I have given you and do not want to become a star in a drama.

                    Back to the topic... Where is your problem with the ondemand governor now? It is gone. You have learned that it is not broken, that there is already code in the kernel to handle your case, and where to look for your actual problem. If you need a few extra FPS then we can discuss what else you can do. Start by telling us your hardware specs, the distribution and the desktop you are using, and also list the applications you have a problem with. If it is just with benchmarks then I suggest you stop running them.

                    Comment


                    • #60
                      Originally posted by sdack View Post
                      Learn to understand comments and how to reply. Who still replies with a million quotes?! If you do not understand a comment then read it again until you do and do not reply before then. You really want to be grateful for the insights I have given you and do not want to become a star in a drama.

                      Back to the topic... Where is your problem with the ondemand governor now? It is gone. You have learned that it is not broken, that there is already code in the kernel to handle your case, and where to look for your actual problem. If you need a few extra FPS then we can discuss what else you can do. Start by telling us your hardware specs, the distribution and the desktop you are using, and also list the applications you have a problem with. If it is just with benchmarks then I suggest you stop running them.
                      Wow. Please stop acting like everyone except you is an idiot. We're not. And we're not grateful for your special "insights" either.

                      Comment
