AMDGPU DRM Driver To Finally Expose GPU Load Via Sysfs

  • AMDGPU DRM Driver To Finally Expose GPU Load Via Sysfs

    Phoronix: AMDGPU DRM Driver To Finally Expose GPU Load Via Sysfs

    The AMDGPU DRM driver appears to finally be crossing the milestone of exposing the current GPU load (as a percentage) in a manner that can be easily queried via sysfs...
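
    A minimal sketch of how the new value could be queried once it lands, assuming the attribute is named gpu_busy_percent under the card's device directory (both the name and the card index are assumptions and may differ depending on kernel version and system):

    Code:
    # Read the GPU busy percentage that amdgpu exposes via sysfs.
    # gpu_busy_percent and card0 are assumptions; adjust for your system.
    from pathlib import Path

    def gpu_busy_percent(card="card0"):
        path = Path(f"/sys/class/drm/{card}/device/gpu_busy_percent")
        return int(path.read_text().strip())

    if __name__ == "__main__":
        print(f"GPU load: {gpu_busy_percent()}%")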


  • #2
    This busy percentage number has been available in debugfs for a few years. Also, this is a measure of how busy the SMU (power controller) thinks the GPU is, based on an aggregation of inputs from various blocks on the GPU. If you want finer-grained readings (e.g., shaders vs video decoder vs DMA engine, etc.), you need to poll the busy states for each block or use perf counters, so it's not exactly a trivial problem to solve and it's not necessarily obvious exactly how to expose the data. Depending on what you are trying to do, it may or may not be particularly useful.

    • #3
      Originally posted by agd5f View Post
      This busy percentage number has been available in debugfs for a few years. Also, this is a measure of how busy the SMU (power controller) thinks the GPU is, based on an aggregation of inputs from various blocks on the GPU. If you want finer-grained readings (e.g., shaders vs video decoder vs DMA engine, etc.), you need to poll the busy states for each block or use perf counters, so it's not exactly a trivial problem to solve and it's not necessarily obvious exactly how to expose the data. Depending on what you are trying to do, it may or may not be particularly useful.
      The problem with debugfs is that on most Linux distros, debugfs isn't accessible unless you're root.

      For comparisons of, say, the GPU utilization of a game under OpenGL versus Vulkan, or for getting a rough estimate, this output appears to be accurate enough for those purposes?
      Michael Larabel
      https://www.michaellarabel.com/

      • #4
        Slightly unrelated, but I wish AMDGPU would allow controlling color spaces and dithering like radeon (apparently) did.
        I had to modify my EDID so AMDGPU would make my monitor use RGB rather than YCbCr.

        • #5
          I hope it will be more useful than the value that is reported in /sys/kernel/debug/dri/0/amdgpu_pm_info. It often jumps between 0 and 100% when there is a constant load, while Mesa's OGL overlay shows a reliable value, e.g. a constant 70%.
          AMD's Windows driver has the same problem, unlike Nvidia's and Intel's.
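
          A hedged sketch of pulling that figure out of the debugfs file mentioned above (needs root, since debugfs is normally root-only; it assumes the file contains a line roughly of the form "GPU Load: <n> %", which can vary between kernel versions):

          Code:
          # Parse the load figure out of /sys/kernel/debug/dri/0/amdgpu_pm_info.
          # Requires root; the "GPU Load" label is an assumption and may differ
          # between kernel versions.
          import re
          from pathlib import Path

          def debugfs_gpu_load(minor=0):
              text = Path(f"/sys/kernel/debug/dri/{minor}/amdgpu_pm_info").read_text()
              match = re.search(r"GPU Load:\s*(\d+)\s*%", text)
              return int(match.group(1)) if match else None

          if __name__ == "__main__":
              print(debugfs_gpu_load())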

          • #6
            Originally posted by Michael View Post

            The problem with debugfs is that on most Linux distros, debugfs isn't accessible unless you're root.

            For comparisons of, say, the GPU utilization of a game under OpenGL versus Vulkan, or for getting a rough estimate, this output appears to be accurate enough for those purposes?
            Originally posted by aufkrawall View Post
            I hope it will be more useful than the value that is reported in /sys/kernel/debug/dri/0/amdgpu_pm_info. It often jumps between 0 and 100% when there is a constant load, while Mesa's OGL overlay shows a reliable value, e.g. a constant 70%.
            AMD's Windows driver has the same problem, unlike Nvidia's and Intel's.
            It's the same busy value that is exposed in both places (both debugfs and now sysfs). It's what the SMU uses to determine busy-ness for clocking purposes. Mesa and umr actually poll the busy state of whatever blocks you want to look at (shaders, RBs, SDMA, etc.) to determine the load. It really depends on what you are looking for. For example, you could be running some compute app in the background that pegs the shader cores but doesn't make use of any other blocks on the GPU (SDMA, UVD, VCE, etc.). Would you consider that to be full load? The SMU would, so you'd see the load at 100% or close to it, but you'd still have full capacity available on the other HW blocks. I suppose the ideal case would be a separate load for each engine on the GPU, but then you are back to polling or perf counters.

            So generally when running a graphics workload it should give you a rough idea of how loaded the GPU is (e.g., 30% vs 50% vs 90%), but I don't know how meaningful small differences would be. We usually use busy polling or perf counters to get numbers for profiling.
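
            For the "rough idea" use described above, one sketch (again assuming the value is exposed as gpu_busy_percent on card0) is to sample it once per second over a run and report the mean, rather than trusting any single instantaneous reading:

            Code:
            # Sample the sysfs busy percentage once per second and average it,
            # smoothing out the coarse instantaneous SMU readings.
            # gpu_busy_percent and card0 are assumptions; adjust for your system.
            import time
            from pathlib import Path
            from statistics import mean

            def average_gpu_load(duration_s=30, card="card0"):
                path = Path(f"/sys/class/drm/{card}/device/gpu_busy_percent")
                samples = []
                for _ in range(duration_s):
                    samples.append(int(path.read_text().strip()))
                    time.sleep(1)
                return mean(samples)

            if __name__ == "__main__":
                print(f"average GPU load: {average_gpu_load():.1f}%")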

            • #7
              So it reports the maximum of the loads of the different HW blocks? Meaning that if the reported load is 100%, some HW block is currently the bottleneck for the GPU. It would be interesting to see which part that is.
              In the case of games, I guess that would probably be the shaders?

              • #8
                Originally posted by agd5f View Post
                It's the same busy value that is exposed in both places (both debugfs and now sysfs). It's what the SMU uses to determine busy-ness for clocking purposes. Mesa and umr actually poll the busy state of whatever blocks you want to look at (shaders, RBs, SDMA, etc.) to determine the load. It really depends on what you are looking for. For example, you could be running some compute app in the background that pegs the shader cores but doesn't make use of any other blocks on the GPU (SDMA, UVD, VCE, etc.). Would you consider that to be full load? The SMU would, so you'd see the load at 100% or close to it, but you'd still have full capacity available on the other HW blocks. I suppose the ideal case would be a separate load for each engine on the GPU, but then you are back to polling or perf counters.
                Hm, then maybe the logic which selects which GPU part is representative for the reported load could use some optimization?
                It gets extremely unreliable when there isn't full load, like with an fps limit in a game or other workloads which by definition don't max out the GPU (e.g. mpv).

                • #9
                  Originally posted by aufkrawall View Post
                  Hm, then maybe the logic which selects which GPU part is representative for the reported load could use some optimization?
                  It gets extremely unreliable when there isn't full load, like with an fps limit in a game or other workloads which by definition don't max out the GPU (e.g. mpv).
                  You can tweak the heuristics that the SMU uses to decide how to select power levels via the pp_power_profile_mode file.
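
                  A rough sketch of poking at that file (it assumes card0, that reading it lists the available profiles with the active one marked, and that it accepts a profile index on write; writing requires root and the exact format differs between ASICs and kernel versions):

                  Code:
                  # Inspect and switch the SMU power profile via pp_power_profile_mode.
                  # card0, the listing format, and index-based selection are assumptions;
                  # writing requires root.
                  from pathlib import Path

                  PROFILE = Path("/sys/class/drm/card0/device/pp_power_profile_mode")

                  def show_profiles():
                      print(PROFILE.read_text())

                  def set_profile(index):
                      PROFILE.write_text(f"{index}\n")

                  if __name__ == "__main__":
                      show_profiles()
                      # set_profile(1)  # uncomment to select a profile by index (root)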

                  • #10
                    Originally posted by valici View Post
                    So it reports the maximum of the loads of the different HW blocks? Meaning that if the reported load is 100%, some HW block is currently the bottleneck for the GPU. It would be interesting to see which part that is.
                    In the case of games, I guess that would probably be the shaders?
                    Since game engines are designed in different ways, and porting losses also shift bottlenecks around, I'm not so sure that shaders are always the bottleneck. But for games it would make sense to take whatever the bottleneck is - wherever it may be - as the 100% load limiter. A fine-grained analysis really only makes sense for devs - the user has pretty much no meaningful way to act on it.

                    "Your geometry pipeline is saturated, while your shaders are at 30% load" - what shall the "average user" do with this kind of information?

                    Game developers also have the option to use this debugging tool from AMD, which would give them a much more informed analysis of exactly where the bottlenecks are and which stages stall the rendering. I think AMD calls it PerfStudio.

                    Anyhow - for other tasks it gives you a quick poor man's overview of what is stalling your GPU, without performing an in-depth analysis. Maybe there is a use case or two for that.
