Mesa Gets Testing Patches For New Zen Optimization Around Thread Pinning


  • Mesa Gets Testing Patches For New Zen Optimization Around Thread Pinning

    Phoronix: Mesa Gets Testing Patches For New Zen Optimization Around Thread Pinning

    It was just yesterday that the AMD Zen L3 thread pinning was dropped from Mesa due to that optimization not panning out as intended for benefiting the new AMD processors with the open-source Linux graphics driver stack. Lead Mesa hacker Marek Olšák is already out with a new Zen tuning implementation that may deliver on the original optimization goal...


  • #2
    Would be great to see these patches tested on a wide variety of 1st and 2nd Gen Ryzen chips with different CCX layouts (e.g. 1600, TR 1920X).



    • #3
      I may be misunderstanding the problem here (and this might be the completely wrong place to ask), but isn't it reasonable to examine the importance of the rendering task in this scenario?

      1./ If we're using a PC with Mesa for a task, odds are we're relying on a GUI of some sort?
      2./ If the GUI is the primary means of visually interpreting the state of the task that is being worked on, does that not imply that this task is of high importance?
      3./ If communicating the visual state of the task is of high importance, then why is it up to the kernel to decide any of this? Shouldn't the kernel be told "this task is important enough that *we* want *you* to schedule everything else around it" instead of us having to chase the kernel's decisions?

      What am I missing here? Is what I am describing above similar to the old approach that was dropped?

      (This is probably a rather naïve question, but I figure it's never too late to learn something...)
      Last edited by ermo; 13 November 2018, 08:16 AM.



      • #4
        Originally posted by ermo View Post
        I may be misunderstanding the problem here (and this might be the completely wrong place to ask), but isn't it reasonable to examine the importance of the rendering task in this scenario?

        1./ If we're using a PC with Mesa for a task, odds are we're relying on a GUI of some sort?
        2./ If the GUI is the primary means of visually interpreting the state of the task that is being worked on, does that not imply that this task is of high importance?
        3./ If communicating the visual state of the task is of high importance, then why is it up to the kernel to decide any of this? Shouldn't the kernel be told "this task is important enough that *we* want *you* to schedule everything else around it" instead of us having to chase the kernel's decisions?

        What am I missing here? Is what I am describing above similar to the old approach that was dropped?

        (This is probably a rather naïve question, but I figure it's never too late to learn something...)
        So, in normal operation the CPU threads jump around from CPU to CPU, sharing the love (and the power budget). This change chases the game threads as they migrate between CPUs so that they all keep using the same L3 cache (so CCX1's L3 cache is used by all threads), avoiding the penalty of moving data around and warming up a cold cache.
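
        For anyone who wants to picture that, here is a minimal sketch of the idea (my own illustration, not Mesa's actual code), assuming a hypothetical layout where logical CPUs 0-7 all belong to CCX0 and therefore share one L3; real code would query the topology at runtime instead of hard-coding it:

        Code:
        /* Restrict the calling thread to the logical CPUs of one CCX so it
         * keeps reusing the same L3 cache. CPUs 0-7 are an assumed CCX0
         * mapping, purely for illustration. */
        #define _GNU_SOURCE
        #include <pthread.h>
        #include <sched.h>
        #include <stdio.h>

        static int pin_self_to_ccx0(void)
        {
            cpu_set_t mask;
            CPU_ZERO(&mask);

            for (int cpu = 0; cpu < 8; cpu++)   /* assumed CCX0: CPUs 0-7 */
                CPU_SET(cpu, &mask);

            /* The scheduler can still bounce the thread between these CPUs,
             * but it can no longer leave the CCX, so the L3 stays warm. */
            int err = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
            if (err != 0) {
                fprintf(stderr, "pthread_setaffinity_np failed: %d\n", err);
                return -1;
            }
            return 0;
        }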



        • #5
          Don't get me wrong - I love these sorts of optimizations, but part of me questions if this should be a Mesa thing. What I mean by that is I think this is something that could/should be done by the CPU scheduler. To my understanding, what's basically required in this scenario is for a parent process and all of its child processes to be within the same CCX. This is something that would benefit any application, not just Mesa. I understand and commend Marek for not wanting to wait for a proper CPU scheduler, but I'm not so sure a CPU-specific optimization in Mesa is the best idea.

          But, I could be wrong. I'm sure at least 1 Phoronix community member is going to ream me out for thinking this way.
          Last edited by schmidtbag; 13 November 2018, 10:08 AM.
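
          One detail that may help here: on Linux the CPU affinity mask is inherited across fork() and by newly created threads, so a single early sched_setaffinity() call on the parent already constrains the whole tree to one CCX. A tiny sketch of that behaviour (CPUs 0-7 stand in for one assumed CCX):

          Code:
          /* A process's affinity mask is inherited by children created with
           * fork() (and by new threads), so one early call is enough to keep
           * a whole process tree on one CCX. CPUs 0-7 are an assumption. */
          #define _GNU_SOURCE
          #include <sched.h>
          #include <stdio.h>
          #include <unistd.h>

          int main(void)
          {
              cpu_set_t mask;
              CPU_ZERO(&mask);
              for (int cpu = 0; cpu < 8; cpu++)
                  CPU_SET(cpu, &mask);

              if (sched_setaffinity(0, sizeof(mask), &mask) != 0)  /* 0 = self */
                  perror("sched_setaffinity");

              if (fork() == 0) {
                  /* The child starts out with the same mask - no extra call. */
                  cpu_set_t inherited;
                  CPU_ZERO(&inherited);
                  sched_getaffinity(0, sizeof(inherited), &inherited);
                  printf("child may run on %d CPUs\n", CPU_COUNT(&inherited));
              }
              return 0;
          }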



          • #6
            Originally posted by schmidtbag View Post
            Don't get me wrong - I love these sorts of optimizations, but part of me questions if this should be a Mesa thing.
            I suspect that having the application decide is more effective, as the application developers know what the best core/cache allocation for their application is, while for a scheduler it's not an easy task to guess what is best on the fly.

            And even if it may not be the best to have applications optimize like this, imho stuff like Mesa should get a free pass as it is a critical system component.



            • #7
              I think the issue is that you don't want ALL threads pinned together on the same CCX, only some of them. So really only the application can know which ones are important to keep together and which can be spread around.

              The kernel definitely needs to add an API to let the app request that certain threads be grouped together, though, rather than doing this manually in userspace.
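
              Until such an API exists, the manual userspace route boils down to putting an explicit affinity mask on every thread that should stay together. A minimal sketch of that workaround (the TIDs and CPU numbers passed in would come from the application; nothing here is Mesa's actual code):

              Code:
              /* Pin every thread in 'tids' to the same set of CPUs (e.g. one
               * CCX). sched_setaffinity() on a TID affects just that thread. */
              #define _GNU_SOURCE
              #include <sched.h>
              #include <stdio.h>
              #include <sys/types.h>

              static int group_threads(const pid_t *tids, int ntids,
                                       const int *cpus, int ncpus)
              {
                  cpu_set_t mask;
                  CPU_ZERO(&mask);
                  for (int i = 0; i < ncpus; i++)
                      CPU_SET(cpus[i], &mask);

                  for (int i = 0; i < ntids; i++) {
                      if (sched_setaffinity(tids[i], sizeof(mask), &mask) != 0) {
                          perror("sched_setaffinity");
                          return -1;
                      }
                  }
                  return 0;
              }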



              • #8
                Originally posted by smitty3268 View Post
                The kernel definitely needs to add an API to let the app request that certain threads be grouped together, though, rather than doing this manually in userspace.
                This would be beneficial from a security point of view as well: if you can choose to have certain threads on the same core while excluding others, it would make some Spectre-type attacks harder or practically impossible without having to disable SMT completely.

                (you would still need to think about whether those threads can attack each other)



                • #9
                  Naïvely, it sounds like maybe it could be beneficial to have (at least) two types of scheduler "hints" then:

                  1./ Physical core proximity (run this group of threads/set of processes on the same core) for SMT purposes.

                  2./ Cache sharing (make sure this group of threads/set of processes shares the same Level N cache).

                  But I'll let the pros figure out if this actually makes sense in practice.
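
                  For what it's worth, both groupings can already be read from sysfs on Linux today; a small sketch that prints them for CPU 0 (it assumes cache index3 is the L3, which is typical on x86 but should really be checked against the cache's "level" file):

                  Code:
                  /* Print which logical CPUs are SMT siblings of CPU 0 and
                   * which ones share its L3 cache, straight from sysfs.
                   * Assumes cache index3 is the L3 (usual on x86). */
                  #include <stdio.h>

                  static void print_cpu0_groups(void)
                  {
                      const char *paths[] = {
                          "/sys/devices/system/cpu/cpu0/topology/thread_siblings_list",
                          "/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list",
                      };

                      for (int i = 0; i < 2; i++) {
                          char buf[64] = "";
                          FILE *f = fopen(paths[i], "r");
                          if (!f)
                              continue;
                          if (fgets(buf, sizeof(buf), f))
                              printf("%s -> %s", paths[i], buf);  /* e.g. "0,8" */
                          fclose(f);
                      }
                  }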

