**mdedetrich** · 27 May 2022, 10:03 AM

Originally posted by MorrisS.

So the two environments are structurally different. At this point an OS cannot become something it is not by nature. If Nvidia cannot implement a solution and Linux cannot switch to explicit sync because of its nature, as you state ("why can Android?"), the only solution is that Linux users don't use Nvidia video cards.

To be clear, Linux can switch to explicit sync; it will just take a while because it has to be done gradually. What can't be done (or, more accurately, is both incredibly stupid and practically infeasible) is for the NVidia driver to be re-coded to support implicit sync. Now that NVidia has opened up the kernel space of their graphics driver so that Nouveau can finally reclock the cards, a third option would be for Nouveau to get up to speed with supporting the newest NVidia cards, but I don't think that will happen faster than Linux slowly transitioning to explicit sync and these issues getting ironed out.

Overall, I think it's safe to say that in terms of total effort it's probably much faster to start moving properly to explicit sync rather than implementing all of these workarounds to try to get implicit sync to work. It may piss off the Wayland devs/ecosystem more because it delays the transition to Wayland, but it will be less painful for everyone involved (and honestly, as is visible, trying to transition to Wayland while using implicit sync hasn't historically been successful).

**mdedetrich** · 27 May 2022, 10:11 AM

Originally posted by oiaohm

The reality is that AMD and Intel, in mainline drivers, get implicit sync and explicit sync to in fact work with each other. Glamor working with AMD, Intel and other mainline drivers basically kills Nvidia's argument by the Linux kernel rules. There is a very high price for not being mainline with the Linux kernel. A big one is that your requests for kernel changes can be vetoed outright by anyone who has mainline code.

As has been said ad nauseam, the fact that Intel/AMD managed to get it to work is a completely moot point. NVidia's driver is not designed like AMD's or Intel's, and it's not feasible to get it to work with implicit sync (NVidia tried to do this, exactly as Mesa wanted, and it failed). So you can accept that, or you can keep on shaking your fist like an old man complaining about people on his lawn.

Also, AMD and Intel have had something like a decade to get all of the peculiarities of GBM/KMS/Mesa worked out. NVidia only started working with this part of the stack 2-3 years ago.

I mean, if you want to delay the Wayland transition by another 10 years, then that sounds like a brilliant idea.

**wertigon** · 27 May 2022, 11:07 AM

Originally posted by mdedetrich

NVidia's driver is not designed like AMD's or Intel's

From a Linux user's point of view this point is *completely*, utterly irrelevant.

Look, it's like what is happening right now in the auto industry. In my town there are two dealers. On the left side of the main street, there is Ford selling ICE trucks, and on the right side of the street there are these upstart Chinese bastards called BYD selling EV trucks that are in pretty much every way superior to whatever Ford is offering, at a comparable price.

Can you give me ANY compelling reason why I should not buy a BYD if the BYD meets all my needs and does so while being cheaper to drive, cheaper to own, and is at price parity with the ICE car? And, why should the consumer care whether or not Ford cannot compete because their cars are built in a different way?

So, I am a Linux user. I run Wayland due to it being the better alternative for me. I want to buy a new GPU. Why should I pick Nvidia, especially if it won't work for my machine?

**mdedetrich** · 27 May 2022, 11:15 AM

Originally posted by wertigon

From a Linux user's point of view this point is *completely*, utterly irrelevant.

Look, it's like what is happening right now in the auto industry. In my town there are two dealers. On the left side of the main street, there is Ford selling ICE trucks, and on the right side of the street there are these upstart Chinese bastards called BYD selling EV trucks that are in pretty much every way superior to whatever Ford is offering, at a comparable price.

Can you give me ANY compelling reason why I should not buy a BYD if the BYD meets all my needs and does so while being cheaper to drive, cheaper to own, and is at price parity with the ICE car? And, why should the consumer care whether or not Ford cannot compete because their cars are built in a different way?

Coming up with inapt analogies is not helping your argument.

Originally posted by wertigon

So, I am a Linux user. I run Wayland due to it being the better alternative for me. I want to buy a new GPU. Why should I pick Nvidia, especially if it won't work for my machine?

Good to know, but also not entirely relevant.

**MorrisS.** · 27 May 2022, 11:32 AM

Originally posted by piotrj3

Jason Ekstrand tells you exactly why it is not useful.

Do you now grasp the scale of the changes needed to move entirely to explicit sync? Well, Nvidia in some cases properly guesses synchronization fences, so a Wayland program on Wayland works; the issue is more about Xwayland and interactions between Wayland and, for example, screen capture. So yes, the problem is way bigger than I suggested above if you want to properly solve *all* issues. But many of those issues are also plaguing OSS drivers or were plaguing OSS drivers in the past. Keep in mind Nvidia is bleeding edge, while Intel/AMD spent more than 8 years ironing out Wayland bugs.

Is there any Wayland compositor that currently supports explicit sync (most likely a Wayland compositor based on Vulkan)? The transition you describe requires replacing the OpenGL stack with Vulkan. Once the Linux graphical environment is completely Vulkan compliant, Wayland and its compositors will be completely explicit sync. Chrome 102 has now implemented WebGPU, so I assume the Chrome developers are preparing the browser to switch to the Vulkan API. If Wayland is easily adaptable, explicit sync will be easy to apply. I assume that Vulkan is the answer to the problem. Nvidia is ahead; Wayland was developed before Vulkan, and Vulkan is only beginning to be adopted. Until Linux operating systems are Vulkan based, the problem will persist. As someone ignorant of the subject, I don't see any other perspective. So the question is whether Wayland should have been developed alongside Vulkan, because everything that came before is based on implicit sync. Probably not... Wayland was too far ahead of the current explicit sync needs.

**piotrj3** · 27 May 2022, 11:42 AM

Originally posted by oiaohm

Buffer Sharing and Synchronization — The Linux Kernel documentation

https://www.kernel.org/doc/html/v5.9/driver-api/dma-buf.html

DMA_BUF_IOCTL_SYNC was in fact merged quite a while back. Yes, it is in the 5.9 documentation. Most of Jason's later patch is test suite and documentation for what was already in the kernel. Yes, it's the horrible Linux kernel problem: a feature gets created and nobody documents how to use it. DMA_BUF_IOCTL_SYNC traces back to Android and its explicit sync.

Android, being explicit sync, would have had to solve the same problem when using DMA-BUF, right?

Buffer Sharing and Synchronization (dma-buf) — The Linux Kernel documentation

https://docs.kernel.org/driver-api/dma-buf.html#implicit-fence-poll-support

You need to read this. Note the last sentence: the first part is clear that it only signals on completion, so poll only returns when the work is done. The way implicit sync is implemented here, you don't do many context switches: because you call poll, you lose your application's CPU slice and get a new CPU slice when the implicit fence is resolved. While your process is not getting CPU slices it is not doing many context switches. What you just wrote is not how DMABUF implicit sync works.

There are no multiple queries and waits in the DMABUF implicit sync implementation; this is why it can deadlock.

Sync files, the Linux explicit sync, are not linked to context switches. So at worst you might spinlock waiting on a sync file, consuming an insane amount of CPU processing power.

The benefit of a futex that is explicit sync compatible is that, while waiting on sync, you are not giving the application time slice after time slice of CPU time in which it basically spinlocks on the explicit sync waiting for it to complete. This is exactly what DMABUF implicit sync is designed to prevent. DMABUF implicit sync, due to using the poll syscalls, is basically integrated with the Linux kernel CPU scheduler.

This is the problem: doing implicit sync on top of Nvidia explicit sync is not going to emulate DMABUF implicit sync, because you are missing the CPU scheduler integration.

And how do you know that?

What happens to a thread that is epolling? It sleeps in __add_wait_queue_exclusive() with set_current_state(TASK_INTERRUPTIBLE). If that thread is taken away by the kernel to do something else, it will context switch.

Do you know what eventfd() is? That's right, it makes the same addition to the epoll() table. Those FDs are used for explicit sync by Nvidia, and this is what they want to get from DMA_BUF. On both you can spinlock to watch whether the work is done. On both you can spinlock with a wait. But there is something explicit sync can do that implicit sync can't.

There is also pthread_cond_wait (which performs the same wait as epoll() would), and pthread_cond_signal. Nothing prevents you from using classic mutexes either. Again, explicit sync can do all the things implicit sync can do with the same complexity and waits, but you also get more options.
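For readers following the eventfd()/poll argument, here is a minimal sketch of a thread sleeping in poll() on a file descriptor until another thread signals it. It is only an illustration of the blocking behaviour being discussed, not of any GPU driver. The helper names (`make_signal_fd`, `signal_done`, `wait_until_done`, `demo`) are made up for this example, and the pipe fallback is an assumption added so the sketch also runs where Python does not expose `os.eventfd` (it needs Linux and Python 3.10+).

```python
import os
import select
import threading
import time


def make_signal_fd():
    """Return (wait_fd, signal_fd, is_eventfd) for a kernel-backed wakeup source.

    Uses eventfd(2) where Python exposes it; otherwise falls back to a pipe,
    which is only a stand-in for the purposes of this sketch.
    """
    if hasattr(os, "eventfd"):
        fd = os.eventfd(0)  # counter starts at 0, so the fd is not readable yet
        return fd, fd, True
    read_end, write_end = os.pipe()
    return read_end, write_end, False


def signal_done(signal_fd, is_eventfd):
    """Hypothetical 'work finished' notification from the producer side."""
    if is_eventfd:
        os.eventfd_write(signal_fd, 1)  # bump the counter, fd becomes readable
    else:
        os.write(signal_fd, b"\x01")


def wait_until_done(wait_fd, timeout_ms):
    """Sleep in poll() until the fd is readable; True if it was signalled."""
    poller = select.poll()
    poller.register(wait_fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(mask & select.POLLIN for _fd, mask in events)


def demo():
    wait_fd, signal_fd, is_eventfd = make_signal_fd()

    def producer():
        time.sleep(0.1)  # pretend some work is in flight
        signal_done(signal_fd, is_eventfd)

    try:
        # Nobody has signalled yet, so a short poll simply times out.
        before = wait_until_done(wait_fd, 50)
        worker = threading.Thread(target=producer)
        worker.start()
        # The waiter is asleep in poll() and is only woken by the signal.
        after = wait_until_done(wait_fd, 5000)
        worker.join()
        return before, after
    finally:
        os.close(wait_fd)
        if signal_fd != wait_fd:
            os.close(signal_fd)


if __name__ == "__main__":
    print(demo())
```

While the waiter sits in `poll()` it is not spinning; it is woken when the descriptor becomes readable or the timeout expires. Whether a given driver's sync primitive can be exposed through such a descriptor is exactly the point under dispute in this thread.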

**piotrj3** · 27 May 2022, 12:06 PM

Originally posted by MorrisS.

So the two environments are structurally different. At this point an OS cannot become something it is not by nature. If Nvidia cannot implement a solution and Linux cannot switch to explicit sync because of its nature, as you state ("why can Android?"), the only solution is that Linux users don't use Nvidia video cards.

Not really, because there are ways explicit sync can be implemented in Linux. eventfd() exists in Linux and works in an explicit-sync-like way. In the same way that DMA_BUF cooperates with epoll(), we could make something like FD_BUF that works with eventfd(). In fact, a lot of recent work allows converting a DMA_BUF to an eventfd event. In Ekstrand's patch you have those funny dma_buf_fd objects that I think might finally solve the issue.

**oiaohm** · 27 May 2022, 07:04 PM

Originally posted by piotrj3

To be honest, the problem has 4 parts:

- Wayland by itself is the smallest part of the problem, because it actually is supposed to support explicit sync, so someone thought about that; it's more laziness: we have one route, why make another?

- Wayland and the open source stack around it were developed before DX12/Vulkan. I don't believe any sane developer would go the implicit sync route when DX12/Vulkan was around.

- Linux and the principle "everything is a file", with files having locks on them, is an implicit-sync-like structure. DMA_BUF and GBM are file-like structures, so maybe it felt natural; KMS is implicit too.

- Maybe there was existing hardware/drivers made only with implicit sync in mind?

KMS implicit sync is like DMABUF implicit sync: the application only gets CPU timeslices from the scheduler when it can proceed forwards because everything is in sync.

The reality here is that neither KMS implicit sync nor DMABUF implicit sync is a generic implicit sync; both have CPU scheduler integration.

**oiaohm** · 27 May 2022, 07:57 PM

Originally posted by piotrj3

And how do you know that?

What happens to a thread that is epolling? It sleeps in __add_wait_queue_exclusive() with set_current_state(TASK_INTERRUPTIBLE). If that thread is taken away by the kernel to do something else, it will context switch.

Do you know what eventfd() is? That's right, it makes the same addition to the epoll() table. Those FDs are used for explicit sync by Nvidia, and this is what they want to get from DMA_BUF. On both you can spinlock to watch whether the work is done. On both you can spinlock with a wait. But there is something explicit sync can do that implicit sync can't.

There is also pthread_cond_wait (which performs the same wait as epoll() would), and pthread_cond_signal. Nothing prevents you from using classic mutexes either. Again, explicit sync can do all the things implicit sync can do with the same complexity and waits, but you also get more options.

You need to look closer at the wait. Nvidia explicit sync is not scheduler integrated. Yes, you can wait, but the kernel scheduler does not have the information from the Nvidia explicit sync to know that it should not wake a process up right now because the explicit sync is not in the right state for the application to continue.

The reason implicit sync is demanded so much in the Linux kernel graphics stack is that it suits the CPU use case, not the GPU use case. What Nvidia is offering does not suit the CPU use case.

Classic mutexes don't solve the problem. Remember, you can have more than one application waiting on a DMABUF, so an application-internal mutex is not going to help you. There needs to be a graphical futex/mutex that is sync aware; the implicit sync in DMABUF and KMS is a horrible implementation of a poll-based mutex.

https://www.kernel.org/doc/html/late...equeue-pi.html

Make sure you know what you were referring to. Yes, the above is how pthread_cond_wait is in fact done in the Linux kernel. Do note that pthread_cond_broadcast/pthread_cond_signal and pthread_cond_wait are a pair. So every time the condition changes state, pthread_cond_broadcast or pthread_cond_signal has to be called on it so that pthread_cond_wait can in fact work. So this does not work with Nvidia explicit sync.

So how are you going to know that the explicit sync value has changed, in order to call pthread_cond_broadcast/pthread_cond_signal so that pthread_cond_wait can be used? The same problem happens with a normal mutex.
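The wait/signal pairing can be sketched with Python's `threading.Condition`, used here only as an analogue of pthread_cond_wait and pthread_cond_broadcast. `FenceFlag` and its methods are hypothetical names invented for this illustration, and nothing here talks to a real sync object. The sketch shows that a waiter sleeps through a state change unless somebody also issues the notification.

```python
import threading


class FenceFlag:
    """Hypothetical stand-in for a sync point that a producer completes."""

    def __init__(self):
        self._cond = threading.Condition()
        self._done = False

    @property
    def done(self):
        with self._cond:
            return self._done

    def complete(self, notify=True):
        """Mark the work finished; optionally wake the waiters."""
        with self._cond:
            self._done = True
            if notify:
                # Counterpart of pthread_cond_broadcast: wake every waiter.
                self._cond.notify_all()

    def wait(self, timeout):
        """Single wait, like pthread_cond_timedwait.

        Returns True if a notification arrived, False on timeout. Real code
        would re-check the predicate in a loop; this sketch deliberately
        does not, to make the dependence on the notification visible.
        """
        with self._cond:
            return self._cond.wait(timeout)


def demo():
    # Case 1: the state changes but nobody signals, so the waiter times out.
    silent = FenceFlag()
    t1 = threading.Timer(0.05, silent.complete, kwargs={"notify": False})
    t1.start()
    woke_without_signal = silent.wait(0.3)
    t1.join()

    # Case 2: the state change is paired with a broadcast, so the waiter wakes.
    signalled = FenceFlag()
    t2 = threading.Timer(0.05, signalled.complete)
    t2.start()
    woke_with_signal = signalled.wait(5.0)
    t2.join()

    return woke_without_signal, silent.done, woke_with_signal


if __name__ == "__main__":
    print(demo())
```

In the first case the flag really is set, yet the waiter never learns of it. That is the gap being described: something has to observe the change and issue the signal.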

eventfd can get you close.

eventfd(2) - Linux manual page

https://man7.org/linux/man-pages/man2/eventfd.2.html

The POLLIN and POLLOUT that DMABUF/KMS have implemented follow the eventfd specification so that poll works correctly, and that is where implicit sync comes from. So, oh dear, implicit sync behaviour is defined by eventfd and epoll.

This is a case where the kernel side needs to be able to fill in the POLLIN and POLLOUT data somehow so that eventfd works right. The signal has to come from somewhere.

There is also a reason why this is wanted to come from the kernel: so that if you give a process very high priority you don't kill the signal source and deadlock the graphics stack for sure.

Remember how people said implicit sync comes from everything being a file? They are absolutely correct. Eventfd is a file event thing, so it has implicit sync, like it or not, if you implement it correctly. This is also why, when Nvidia says "we have explicit sync only", everyone else is like "What the ....": are you not using epoll for this, and are you ignoring how it is defined to be implemented?

**MorrisS.** · 30 May 2022, 11:23 AM

GitHub - st3r4g/swvkc: experimental Wayland Vulkan compositor

https://github.com/st3r4g/swvkc

swvkc is an experimental Wayland compositor meant to explore the possibility of using Vulkan as a rendering backend.
swvkc prioritizes direct scanout of client buffers when possible. When compositing needs to be done, it renders to the screen with simple copy commands from the Vulkan API.
Some goals/directions of the project:

- Do the minimal work necessary to display client buffers, do not introduce screen tearing/stuttering or input lag.
- Stick to minimal window management features.
- Try to write simple, easy to understand code.

NVIDIA's List Of Known Wayland Issues From SLI To VDPAU, VR & More
