KDE On The Importance Of Wayland Explicit Sync


  • oiaohm
    replied
    Originally posted by mdedetrich View Post
    This is false; an NVidia engineer stated precisely why it was difficult for their driver to implement implicit sync in the way being asked, and in that same post they also openly asked the community how to implement such a thing.
    An example has in fact been given: Nouveau does support implicit sync on Nvidia hardware. The open source world has been offering to do it themselves if they are allowed to.

    Originally posted by mdedetrich View Post
    You suggested this same solution you mentioned in this thread to them, they responded explaining why it's not possible, and from then on you (and no one else for that matter) ever responded to them. So from their side they were being told that a solution was "easy", they gave a technical explanation of why it isn't, and no one actually proposed how they could do it easily.
    Just like Windows, the Linux kernel has a stack of shared code to implement implicit sync on top of an explicit sync driver/hardware; this code is MIT licensed and shared with BSD.

    Notice that the claim Nvidia was never told how is absolute bogus garbage. Nvidia does not like the performance cost. AMD and Intel do pay the performance cost to have implicit sync.
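
    To make the shared-code point concrete, here is a minimal sketch of how implicit sync is layered on top of explicit fences in the kernel's dma_resv/dma_fence framework. It is a simplified illustration, not real driver code: locking context, error handling and the surrounding driver structure are omitted.

    Code:
    /* Sketch: implicit sync built from explicit fences via dma_resv. */
    #include <linux/dma-resv.h>
    #include <linux/dma-fence.h>
    #include <linux/sched.h>

    /* Producer: after submitting GPU work, attach its completion fence to
     * the buffer's reservation object so later users can find it. */
    static int publish_write_fence(struct dma_resv *resv, struct dma_fence *fence)
    {
        int ret = dma_resv_lock(resv, NULL);

        if (ret)
            return ret;
        ret = dma_resv_reserve_fences(resv, 1);
        if (!ret)
            dma_resv_add_fence(resv, fence, DMA_RESV_USAGE_WRITE);
        dma_resv_unlock(resv);
        return ret;
    }

    /* Consumer: before reading the buffer, wait for all writers attached by
     * earlier submissions. This is the "implicit" part: the consumer never
     * received the producer's fence explicitly, it finds it on the buffer. */
    static long wait_for_writers(struct dma_resv *resv)
    {
        return dma_resv_wait_timeout(resv, DMA_RESV_USAGE_WRITE,
                                     true, MAX_SCHEDULE_TIMEOUT);
    }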

    Originally posted by mdedetrich View Post
    Also to be blunt, given that X11 exists, early adoption of Wayland is way below priority versus getting things done properly, and it's ironic that on one hand people complain that NVidia is lazy in open source and on the other hand people now complain that NVidia took the proper but painful route, which also required their own engineers to spend time getting it through. It's a catch-22 where NVidia is always the bad guy.
    The reality here is that Nvidia, over and over again, chooses performance over doing things properly for system stability.

    The advantage of implicit sync is that a third party outside the application, like the kernel, can control when particular operations happen. For resource allocation and free/release, making sure that one program does not rug-pull another is important to system stability.

    The problem is that the safety of implicit sync does come with a performance cost. To be fair, using a protected-mode OS the way we do today technically comes with a performance cost as well.

    You have to remember that part of the reason Valve was willing to fund Zink development is that Nvidia's OpenGL implicit sync does not in fact work right, resulting in particular games not working correctly.

    In some ways Nvidia is tunnel-visioned, and that's the problem. To sell GPU cards you want bigger numbers in benchmarks. You don't see GPU reviewers running the OpenGL/Vulkan/DirectX conformance suites to make sure vendors are not cheating in their implementations. Yes, giving up system stability to make benchmarks go faster is a way to sell more cards. But there are markets, like the more server-based Linux world, where stability is more highly valued.

    Yes, EGLStreams is a classic example where the design in theory would be faster. Yes, it would require everything to be altered to support it, but when the Nvidia developer was asked by a lead KDE developer to prove stability, there was no way to do it. Look at EGLStreams ignoring/not including protections between OS processes' memory allocations: yes, this improves performance, but without those protections there is no way you can have stability.

    mdedetrich there are reasons why Nvidia, to the open source Linux world, is almost always the bad guy. Number 1 of these: the Linux world values stability over performance. You can see this in the history of how long the Linux kernel held on to the big kernel lock, among other things. Nvidia, on the other hand, values performance over stability, and that ends up in disputes.

    Yes, part of being sure of stability is also having open source drivers, so the code can be validated as not doing something stupid that causes stability problems.

    The wish for stability in the Linux world puts it into conflict with parties like Nvidia.



  • wertigon
    replied
    Originally posted by mdedetrich View Post

    You're doing an excellent job of creating bulls**t analogies, and early Wayland adoption has made close to zero actual dent in NVidia's market performance/demographics. Hate to break it to you, but there is no money in the Linux desktop.
    There was hardly any money in smartphones, or LCD screens, or heck, internet streaming services either, in the beginning. Netflix started as a DVD rental company, and Tesla almost went bankrupt creating the Model 3.

    But, sure. Bet the farm on the status quo remaining; the more arrogant the company, the harder the fall.

    That said, the fall will not come tomorrow. Give it a few decades.



  • mdedetrich
    replied
    Originally posted by wertigon View Post

    Also to be blunt, given that ICE cars exist, early adoption of electric cars is way below priority versus getting things done properly (said Toyota)

    Also to be blunt, given that dumbphones exist, early adoption of smartphones is way below priority versus getting things done properly (said Nokia)

    Also to be blunt, given that DVDs exist, early adoption of streaming is way below priority versus getting things done properly (said Blockbuster)

    Also to be blunt, given that camera film exists, early adoption of digital cameras is way below priority versus getting things done properly (said Kodak)

    Also to be blunt, how did that work out for these market leaders?
    You're doing an excellent job of creating bulls**t analogies, and early Wayland adoption has made close to zero actual dent in NVidia's market performance/demographics. Hate to break it to you, but there is no money in the Linux desktop.



  • wertigon
    replied
    Originally posted by mdedetrich View Post

    Also to be blunt, given that X11 exists, early adoption of Wayland is way below priority versus getting things done properly

    Also to be blunt, given that ICE cars exist, early adoption of electric cars is way below priority versus getting things done properly (said Toyota)

    Also to be blunt, given that dumbphones exist, early adoption of smartphones is way below priority versus getting things done properly (said Nokia)

    Also to be blunt, given that DVDs exist, early adoption of streaming is way below priority versus getting things done properly (said Blockbuster)

    Also to be blunt, given that camera film exists, early adoption of digital cameras is way below priority versus getting things done properly (said Kodak)

    Also to be blunt, how did that work out for these market leaders?



  • mdedetrich
    replied
    Originally posted by MrCooper View Post


    It most certainly would have required much less. They put in quite a lot of work for explicit sync across the stack.
    This is false; an NVidia engineer stated precisely why it was difficult for their driver to implement implicit sync in the way being asked, and in that same post they also openly asked the community how to implement such a thing.

    You suggested this same solution you mentioned in this thread to them, they responded explaining why it's not possible, and from then on you (and no one else for that matter) ever responded to them. So from their side they were being told that a solution was "easy", they gave a technical explanation of why it isn't, and no one actually proposed how they could do it easily.

    Also to be blunt, given that X11 exists, early adoption of Wayland is way below priority versus getting things done properly, and it's ironic that on one hand people complain that NVidia is lazy in open source and on the other hand people now complain that NVidia took the proper but painful route, which also required their own engineers to spend time getting it through. It's a catch-22 where NVidia is always the bad guy.

    Make up your god damn mind.
    Last edited by mdedetrich; 15 April 2024, 05:50 AM.



  • smitty3268
    replied
    Originally posted by MrCooper View Post
    It most certainly would have required much less. They put in quite a lot of work for explicit sync across the stack.
    I stand corrected.



  • oiaohm
    replied
    Originally posted by MrCooper View Post
    Because the kernel allows submitting GPU work using buffers which have other GPU work still in flight. This has always been the case, and Xorg has always relied on it, with upstream DRM drivers.
    Just because something has always been the case does not make it right.

    RCU was patented, so GPU designs could not use it, because GPU vendors want to keep code cross-platform and not pay for RCU patents. RCU is designed to allow in-flight processing as well. The thing about it is that the reader with RCU cannot see the new version being worked on; only when it is ready does it become visible.

    Look at RCU. RCU is in a lot of ways a form of dynamic multi-buffering. You take the read lock and get the current complete version of the item at the time of the read lock; it can be cleaned up when the read lock is released.

    Yes, the copy part of RCU is the in-flight processing.

    The question here is: does a Wayland compositor/display server really need to see other processes' in-flight items? Or should what is visible to a Wayland compositor/display server from other processes be behind RCU, where the only things visible are the current complete versions, with all the in-flight processing of other applications hidden?

    What you need to lock for two processes modifying the same buffer is different from what you need to lock for a process that is just reading a buffer. Also, in most cases the things reading a buffer really only need to see the last completed version.

    This is implicit sync for you. There are options to be selective about what is visible.

    RCU provides a method to hide in-flight processing from things that really don't need to know about it or wait on it. Fun part: due to the history of RCU, this has not been implemented in GPU drivers.

    This is the point: are you doing forms of waiting on items that should have been hidden from the compositor in the first place?

    The Linux kernel allows users to alter RCU-protected items all the time, with in-flight changes happening on RCU-protected items all the time. The big advantage of RCU is that as a reader you cannot see the in-flight version. To the reader the in-flight version does not exist; only completed versions exist with RCU.

    I strongly suspect that RCU would give a lot simpler code than using explicit sync, or than what you are doing now, working out in the compositor which buffers are not ready.

    In the historic forms of GPU implicit sync you can see items in flight. GPU implementations have historically not used forms of implicit sync where you cannot see in-flight work, even though those make sense for things like display servers/Wayland compositors.

    Basically, an option to say "with these buffers, only show me the ready-to-present stuff and let me pretend the stuff not ready to present does not exist" can be done with RCU methods.
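
    To make the analogy concrete, here is a minimal sketch of the RCU publish/read pattern being described, using the kernel's RCU primitives. struct frame and latest_frame are made up for illustration; the point is only that a reader (the compositor in this analogy) can only ever dereference the last fully published version, never the one still being worked on.

    Code:
    /* Sketch of the RCU publish/read pattern; types and names are illustrative. */
    #include <linux/rcupdate.h>
    #include <linux/slab.h>
    #include <linux/types.h>

    struct frame {
        void *pixels;
        u64   sequence;
    };

    static struct frame __rcu *latest_frame;

    /* Reader ("compositor"): only ever sees a fully published frame.
     * Whatever the writer is still working on is simply invisible here. */
    static u64 read_latest_sequence(void)
    {
        struct frame *f;
        u64 seq = 0;

        rcu_read_lock();
        f = rcu_dereference(latest_frame);
        if (f)
            seq = f->sequence;
        rcu_read_unlock();
        return seq;
    }

    /* Writer ("client"): prepares the new frame completely, then publishes
     * it with one pointer swap. Readers see either the old frame or the new
     * one, never a half-finished one. */
    static void publish_frame(struct frame *new_frame)
    {
        struct frame *old = rcu_dereference_protected(latest_frame, 1);

        rcu_assign_pointer(latest_frame, new_frame);
        synchronize_rcu();  /* wait for readers still using the old frame */
        kfree(old);
    }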



  • MrCooper
    replied
    Originally posted by oiaohm View Post
    An RCU form of implicit sync.
    It has nothing to do with RCU; the analogy makes no sense.

    My question is: why is it possible at all for the compositor to depend on unfinished client GPU work?
    Because the kernel allows submitting GPU work using buffers which have other GPU work still in flight. This has always been the case, and Xorg has always relied on it, with upstream DRM drivers.

    Yes, your solution is to wait in the compositor until the buffer is ready. My question is why the compositor has to waste CPU cycles doing that
    It doesn't waste CPU cycles. It's just file descriptors as part of the event loop. This works exactly the same with explicit sync in the protocol.
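
    As a rough illustration of what "file descriptors as part of the event loop" means here: assume the compositor holds a sync_file fd for the client's GPU work (whether exported from the dma-buf's implicit fences or handed over via the explicit-sync protocol, the loop looks the same). The fd simply becomes readable once the fence signals:

    Code:
    /* Sketch: a fence fd in the compositor's event loop. fence_fd is assumed
     * to be a sync_file descriptor that polls readable once the client's GPU
     * work has finished. */
    #include <poll.h>
    #include <stdbool.h>

    bool buffer_ready(int fence_fd, int timeout_ms)
    {
        struct pollfd pfd = {
            .fd = fence_fd,
            .events = POLLIN,   /* signaled fence -> fd becomes readable */
        };

        /* In a real compositor this fd sits in the main poll/epoll set next
         * to the Wayland socket and input fds, so no CPU is burned while the
         * GPU work is still in flight. */
        return poll(&pfd, 1, timeout_ms) == 1 && (pfd.revents & POLLIN);
    }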

    Originally posted by smitty3268 View Post
    It took multiple years for the explicit sync work to get into Wayland. There's zero chance Nvidia couldn't have accomplished that much quicker in their own driver, [...]

    It certainly would have required more development time in man hours for Nvidia.
    It most certainly would have required much less. They put in quite a lot of work for explicit sync across the stack.
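
    For readers wondering what "explicit sync across the stack" looks like on the compositor side: with the new protocol the client hands over a DRM syncobj timeline plus an acquire point, and the compositor waits on that point before touching the buffer. A hedged sketch using libdrm; drm_fd, syncobj_fd and acquire_point are assumed to come from the compositor's protocol handling, and error paths are trimmed.

    Code:
    /* Sketch: waiting on an explicit-sync acquire point with libdrm syncobjs. */
    #include <stdint.h>
    #include <xf86drm.h>

    int wait_for_acquire_point(int drm_fd, int syncobj_fd, uint64_t acquire_point)
    {
        uint32_t handle;
        int ret;

        /* Import the client's syncobj into our handle space. */
        ret = drmSyncobjFDToHandle(drm_fd, syncobj_fd, &handle);
        if (ret)
            return ret;

        /* Block until the timeline reaches the acquire point, i.e. the
         * client's rendering into this buffer has completed. A real
         * compositor would turn this into an event-loop wait instead of
         * blocking. */
        ret = drmSyncobjTimelineWait(drm_fd, &handle, &acquire_point, 1,
                                     INT64_MAX,
                                     DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
                                     NULL);

        drmSyncobjDestroy(drm_fd, handle);
        return ret;
    }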



  • smitty3268
    replied
    Originally posted by mdedetrich View Post

    It's nonsense because it doesn't agree with your world view, that's it. It's a fact that NVidia rebuilt their entire drivers from scratch in the Vista era based on explicit sync; this is already established knowledge, and if you look at how NVidia designs their own APIs (i.e. Tegra on ARM) it's also bloody obvious. The reason NVidia's current driver isn't an issue with X11/Xorg is that X11/Xorg is an end-to-end solution, which means that NVidia can avoid using things like GBM; this is not possible with Wayland (Wayland is just a protocol and everyone expects to use it with GBM).
    You're saying all this like I don't already know it. I wish you had actually read what I wrote and thought about it, rather than just throwing me into some "nvidia haters camp who must be an idiot".

    Of course it's possible for NVidia to put proper implicit sync support
    THANK YOU! For finally agreeing with what I've been trying to get across this whole time...

    but the amount of effort/work/validation required means it would have overall taken longer than just fixing explicit sync proper in Wayland, so technically speaking it's a really stupid idea.
    It took multiple years for the explicit sync work to get into Wayland. There's zero chance Nvidia couldn't have accomplished that much quicker in their own driver, if only because they can easily decide whatever they want about their own driver. The upstream Wayland/X solution required a bunch of collaboration with other vendors who all had different ideas about how to make it work best, which bogged things down for a long time.

    It certainly would have required more development time in man hours for Nvidia. I never said otherwise. As for whether it was a stupid or smart idea, well, that's an opinion different people can reasonably disagree on. I already said above in my first post that I bet Nvidia is really happy with their decision right now, so yep - I agree that from their perspective it was a smart idea to do it this way. The viewpoint of an end user who wanted to use Wayland earlier might be different.

    And the actual devs who weren't in the irrational NVidia-hating camp agreed. This is the thing you don't actually get: when the devs looked into the actual details, everyone relevant came to this conclusion.
    NVidia devs came to that conclusion. Because they didn't want to do the work. Which is exactly what I've been telling you the whole time.

    OSS devs never agreed with that assessment, and last time I pointed you directly to quotes from Faith Ekstrand pointing that out, precisely because you were trying to point to them as supporting your view instead of mine. MrCooper here is another OSS dev who clearly disagrees with your take. So don't try to claim they all support you.

    You also seem to have come to some kind of conclusion that I must be against explicit sync in general, which was never the case. Supporting backwards compatibility doesn't mean you are against the newer, better solution.

    Anyway, I do not hate nvidia. I even use an nvidia gpu in some of my machines. The fact that you call me some irrational nvidia hater kind of makes me want to ignore everything else you say, because you clearly aren't paying much attention to what I'm saying if that's your take.
    Last edited by smitty3268; 09 April 2024, 11:36 PM.



  • oiaohm
    replied
    Originally posted by MrCooper View Post
    Per my blog post, it's because the compositor makes its own GPU work depend on unfinished client GPU work. This means the compositor GPU work cannot finish before the client work, even if the compositor uses a higher priority GPU context. The solution is for the compositor to wait for client GPU work to finish before making use of the resulting buffer contents.
    An RCU form of implicit sync. With this form of sync you can only make work depend on finished work. You are fixing this problem in the compositor.

    If the sync between the client application and the compositor were RCU-based, the compositor would only ever see complete, ready-to-display work.

    My question is: why is it possible at all for the compositor to depend on unfinished client GPU work? There had better be a good reason, because if there is not, the issue is that the sync implementation needs to change.

    An RCU method would result in everything the compositor can see being finished, ready-to-present work.

    Remember, the patents on RCU only ran out in 2010. For the graphics stack that is recent enough that those doing cross-platform drivers have not fully considered using it, since they could not use it back in the day due to patents.

    Yes, your solution is to wait in the compositor until the buffer is ready. My question is why the compositor has to waste CPU cycles doing that; it would not need to if it could never see incomplete work. With an RCU design the compositor would not have to process the update at all: everything it sees is always ready to present, because the incomplete work is hidden by the way RCU operates.

    This might be a GPU memory management limitation or something, but it is really worth asking about.

    RCU-based implicit sync for display items like compositors might have a serious advantage over explicit sync by completely getting rid of compositors needing to wait for incomplete work to become complete: this form of implicit sync would make only completed work visible to the compositor and remove any means of stalling on incomplete work, because you cannot stall/wait on something you cannot see.
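
    As a hedged userspace approximation of that idea (all names here are made up for illustration): the client waits on its own completion fence first and only then publishes the buffer with an atomic pointer swap, so the compositor can never observe a buffer that still has work in flight.

    Code:
    /* Sketch: "publish only completed work", a userspace stand-in for the RCU
     * idea using C11 atomics. struct frame and the fence fd are illustrative. */
    #include <poll.h>
    #include <stdatomic.h>
    #include <stddef.h>

    struct frame {
        int buffer_fd;  /* dma-buf whose contents are finished */
    };

    /* The only thing the "compositor" side ever reads. */
    static _Atomic(struct frame *) latest_complete;

    /* Client side: block on the GPU completion fence first, publish second.
     * Anything still rendering is never reachable through latest_complete. */
    void client_publish(struct frame *f, int completion_fence_fd)
    {
        struct pollfd pfd = { .fd = completion_fence_fd, .events = POLLIN };

        poll(&pfd, 1, -1);  /* wait until the GPU work has finished */
        atomic_store_explicit(&latest_complete, f, memory_order_release);
    }

    /* Compositor side: whatever it loads is complete by construction. */
    struct frame *compositor_latest(void)
    {
        return atomic_load_explicit(&latest_complete, memory_order_acquire);
    }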

