Mesa Gets Testing Patches For New Zen Optimization Around Thread Pinning


  • Mesa Gets Testing Patches For New Zen Optimization Around Thread Pinning

    Phoronix: Mesa Gets Testing Patches For New Zen Optimization Around Thread Pinning

    It was just yesterday that the AMD Zen L3 thread pinning was dropped from Mesa due to that optimization not panning out as intended for benefiting the new AMD processors with the open-source Linux graphics driver stack. Lead Mesa hacker Marek Olšák is already out with a new Zen tuning implementation that may deliver on the original optimization goal...


  • #2
    Would be great to see these patches tested on a wide variety of 1st and 2nd Gen Ryzen chips with different CCX layouts (e.g. 1600, TR 1920X).



    • #3
      I may be misunderstanding the problem here (and this might be the completely wrong place to ask), but isn't it reasonable to examine the importance of the rendering task in this scenario?

      1./ If we're using a PC with Mesa for a task, odds are we're relying on a GUI of some sort?
      2./ If the GUI is the primary means of visually interpreting the state of the task that is being worked on, does that not imply that this task is of high importance?
      3./ If communicating the visual state of the task is of high importance, then why is it up to the kernel to decide any of this? Shouldn't the kernel be told "this task is important enough that *we* want *you* to schedule everything else around it" instead of us having to chase the kernel's decisions?

      What am I missing here? Is what I am describing above similar to the old approach that was dropped?

      (This is probably a rather naïve question, but I figure it's never too late to learn something...)
      Last edited by ermo; 13 November 2018, 08:16 AM.



      • #4
        Originally posted by ermo View Post
        I may be misunderstanding the problem here (and this might be the completely wrong place to ask), but isn't it reasonable to examine the importance of the rendering task in this scenario?

        1./ If we're using a PC with Mesa for a task, odds are we're relying on a GUI of some sort?
        2./ If the GUI is the primary means of visually interpreting the state of the task that is being worked on, does that not imply that this task is of high importance?
        3./ If communicating the visual state of the task is of high importance, then why is it up to the kernel to decide any of this? Shouldn't the kernel be told "this task is important enough that *we* want *you* to schedule everything else around it" instead of us having to chase the kernel's decisions?

        What am I missing here? Is what I am describing above similar to the old approach that was dropped?

        (This is probably a rather naïve question, but I figure it's never too late to learn something...)
        So, in normal operation the CPU threads jump around from CPU to CPU, sharing the love (and the power budget). This change chases the game threads as they migrate between CPUs so that they all keep using the same L3 cache (so CCX1's L3 cache is used by all threads), avoiding the penalty of moving data around and warming up a cold cache.
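
        For anyone who wants to picture that, here is a minimal sketch of the idea (my own illustration, not Mesa's actual code), assuming a hypothetical layout where logical CPUs 0-7 all belong to CCX0 and therefore share one L3; real code would query the topology at runtime instead of hard-coding it:

        Code:
        /* Restrict the calling thread to the logical CPUs of one CCX so it
         * keeps reusing the same L3 cache. CPUs 0-7 are an assumed CCX0
         * mapping, purely for illustration. */
        #define _GNU_SOURCE
        #include <pthread.h>
        #include <sched.h>
        #include <stdio.h>

        static int pin_self_to_ccx0(void)
        {
            cpu_set_t mask;
            CPU_ZERO(&mask);

            for (int cpu = 0; cpu < 8; cpu++)   /* assumed CCX0: CPUs 0-7 */
                CPU_SET(cpu, &mask);

            /* The scheduler can still bounce the thread between these CPUs,
             * but it can no longer leave the CCX, so the L3 stays warm. */
            int err = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
            if (err != 0) {
                fprintf(stderr, "pthread_setaffinity_np failed: %d\n", err);
                return -1;
            }
            return 0;
        }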



        • #5
          Don't get me wrong - I love these sorts of optimizations, but part of me questions if this should be a Mesa thing. What I mean by that is I think this is something that could/should be done by the CPU scheduler. To my understanding, what's basically required in this scenario is for a parent process and all of its child processes to be within the same CCX. This is something that would benefit any application, not just Mesa. I understand and commend Marek for not wanting to wait for a proper CPU scheduler, but I'm not so sure a CPU-specific optimization in Mesa is the best idea.

          But, I could be wrong. I'm sure at least 1 Phoronix community member is going to ream me out for thinking this way.
          Last edited by schmidtbag; 13 November 2018, 10:08 AM.
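
          One detail that may help here: on Linux the CPU affinity mask is inherited across fork() and by newly created threads, so a single early sched_setaffinity() call on the parent already constrains the whole tree to one CCX. A tiny sketch of that behaviour (CPUs 0-7 stand in for one assumed CCX):

          Code:
          /* A process's affinity mask is inherited by children created with
           * fork() (and by new threads), so one early call is enough to keep
           * a whole process tree on one CCX. CPUs 0-7 are an assumption. */
          #define _GNU_SOURCE
          #include <sched.h>
          #include <stdio.h>
          #include <unistd.h>

          int main(void)
          {
              cpu_set_t mask;
              CPU_ZERO(&mask);
              for (int cpu = 0; cpu < 8; cpu++)
                  CPU_SET(cpu, &mask);

              if (sched_setaffinity(0, sizeof(mask), &mask) != 0)  /* 0 = self */
                  perror("sched_setaffinity");

              if (fork() == 0) {
                  /* The child starts out with the same mask - no extra call. */
                  cpu_set_t inherited;
                  CPU_ZERO(&inherited);
                  sched_getaffinity(0, sizeof(inherited), &inherited);
                  printf("child may run on %d CPUs\n", CPU_COUNT(&inherited));
              }
              return 0;
          }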



          • #6
            Originally posted by schmidtbag View Post
            Don't get me wrong - I love these sorts of optimizations, but part of me questions if this should be a Mesa thing.
            I suspect that having the application decide is more effective, as the application developers know what the best core/cache allocation for their application is, while for a scheduler it's not an easy task to guess what is best on the fly.

            And even if it may not be the best to have applications optimize like this, imho stuff like Mesa should get a free pass as it is a critical system component.



            • #7
              I think the issue is that you don't want ALL threads pinned together on the same CCX, only some of them. So really only the application can know which ones are important to keep together and which can be spread around.

              The kernel definitely needs to add an API to let the app request that certain threads be grouped together, though, rather than doing this manually in userspace.
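
              Until such an API exists, the manual userspace route boils down to putting an explicit affinity mask on every thread that should stay together. A minimal sketch of that workaround (the TIDs and CPU numbers passed in would come from the application; nothing here is Mesa's actual code):

              Code:
              /* Pin every thread in 'tids' to the same set of CPUs (e.g. one
               * CCX). sched_setaffinity() on a TID affects just that thread. */
              #define _GNU_SOURCE
              #include <sched.h>
              #include <stdio.h>
              #include <sys/types.h>

              static int group_threads(const pid_t *tids, int ntids,
                                       const int *cpus, int ncpus)
              {
                  cpu_set_t mask;
                  CPU_ZERO(&mask);
                  for (int i = 0; i < ncpus; i++)
                      CPU_SET(cpus[i], &mask);

                  for (int i = 0; i < ntids; i++) {
                      if (sched_setaffinity(tids[i], sizeof(mask), &mask) != 0) {
                          perror("sched_setaffinity");
                          return -1;
                      }
                  }
                  return 0;
              }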



              • #8
                Originally posted by smitty3268 View Post
                The kernel definitely needs to add an API to let the app request that certain threads be grouped together, though, rather than doing this manually in userspace.
                This would be beneficial from a security point of view as well: if you can choose to have certain threads on the same core while excluding others, it would make some Spectre-type attacks harder or practically impossible without having to disable SMT completely.

                (you would still need to think about whether those threads can attack each other)



                • #9
                  Naïvely, it sounds like maybe it could be beneficial to have (at least) two types of scheduler "hints" then:

                  1./ Physical core proximity (run this group of threads/set of processes on the same core) for SMT purposes.

                  2./ Cache sharing (make sure this group of threads/set of processes shares the same Level N cache).

                  But I'll let the pros figure out if this actually makes sense in practice.
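
                  For what it's worth, both groupings can already be read from sysfs on Linux today; a small sketch that prints them for CPU 0 (it assumes cache index3 is the L3, which is typical on x86 but should really be checked against the cache's "level" file):

                  Code:
                  /* Print which logical CPUs are SMT siblings of CPU 0 and
                   * which ones share its L3 cache, straight from sysfs.
                   * Assumes cache index3 is the L3 (usual on x86). */
                  #include <stdio.h>

                  static void print_cpu0_groups(void)
                  {
                      const char *paths[] = {
                          "/sys/devices/system/cpu/cpu0/topology/thread_siblings_list",
                          "/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list",
                      };

                      for (int i = 0; i < 2; i++) {
                          char buf[64] = "";
                          FILE *f = fopen(paths[i], "r");
                          if (!f)
                              continue;
                          if (fgets(buf, sizeof(buf), f))
                              printf("%s -> %s", paths[i], buf);  /* e.g. "0,8" */
                          fclose(f);
                      }
                  }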

