NVIDIA 555.58 Stable Linux Driver Brings Wayland Explicit Sync, GSP Firmware Default


  • #61
    Originally posted by mdedetrich View Post
    It's not a Twitter post I am referring to, but an actual discussion on the Mesa GitLab. I don't know why I have to keep repeating this, but NVIDIA actually asked for advice there on how to implement their supposed implicit sync fix (without killing performance or recoding fundamental parts of their driver) and they never got a reply from anyone.
    Of course they did not get a reply to that question: the AMD, Intel and Arm developers had reworked fundamental parts of their kernel drivers to support both implicit and explicit sync at the same time.

    Also, mdedetrich, when they first asked that question the source code to Nvidia's driver was still hidden. In reality, the first time Nvidia asked the question they were told to open source and mainline their driver so it could use the shared framework for implicit and explicit sync that the Intel, AMD and Arm developers had been working on.

    It's the Valve-funded developer working on Zink who worked out how to do implicit sync on top of explicit sync without altering the Nvidia driver code. Zink can have Xwayland working without the added explicit sync support and without major overhead.

    mdedetrich, I have answered this before as well, yet you keep repeating it, hoping no one around is aware of the truth. There is a price to having a closed source kernel driver: you are not sharing development with other vendors. Yes, that is an advantage when you have tech the other vendors don't have, but a total disadvantage when the other vendors have tech you don't.



    • #62
      Originally posted by oiaohm View Post

      Not true. Simon reworked an existing protocol.

      Above is the existing one from 2016.



      The differences between these two are the following.

      1) The original was written before DRM syncobj was formally named, so it spends a lot of text describing what a DRM syncobj does.
      2) The original did not include a means to create a syncobj without informing the GPU that it was created.
      3) Xwayland was not altered to use the original.

      KDE has supported the original since 2017, and it only worked on GPUs with open source drivers.

      Sync objects via a Wayland protocol have been there for quite some time. AMD, Intel and Arm have no reason to make Xwayland support the explicit protocol, because their drivers support implicit and explicit sync side by side, with shared Linux kernel code for doing this.
      Notice it is part of unstable. So it was something proposed and so on, but it was part of unstable. Simon improved it and moved it to staging. And it took 3 years to get it merged, and there were tons of discussions about UMF.

      >Warning! The protocol described in this file is experimental and backward incompatible changes may be made.

      In the linked file. I will not get into a discussion of why Nvidia didn't implement a protocol almost no one was using and that could break at any time. I would say that if I were a GPU maker releasing stable drivers, I would want to limit my support to staging protocols at best. An unstable protocol could perhaps be tested in something like their Vulkan developer driver branch, but that unstable protocol would be useless in itself. To make it useful for resolving their problems (and their biggest problems were Present and Xwayland) you would have to make Present/Xwayland support it. But building a city on an unstable extension that can be broken without any guarantee of backwards compatibility would be an extremely bold move.



      • #63
        Originally posted by mdedetrich View Post

        It's not a Twitter post I am referring to, but an actual discussion on the Mesa GitLab. I don't know why I have to keep repeating this, but NVIDIA actually asked for advice there on how to implement their supposed implicit sync fix (without killing performance or recoding fundamental parts of their driver) and they never got a reply from anyone.
        They gave entire presentations at XDC about how unsuitable GBM was.

        Anyway, I really don't care what you believe so I'll leave it at that.



        • #64
          Originally posted by piotrj3 View Post
          Notice it is part of unstable. So it was something proposed and so on, but it was part of unstable. Simon improved it and moved it to staging. And it took 3 years to get it merged, and there were tons of discussions about UMF.
          This protocol enables explicit synchronization of asynchronous graphics operations on buffers on a per-commit basis. Support is currently limited to dmabuf buffers and dma_fence fence FDs. Explicit synchronization provides a...

          It was put in unstable in 2018. Here is the trap: for this merge to happen there had to be at least one implementation at the time. There were two: Chrome's own and KDE's.
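For readers who have not seen the per-commit model that the protocol summary above describes, here is a minimal sketch of it. This is a toy in Python, not Wayland or kernel code: `Fence`, `Commit`, `compositor_present` and `run_commit_demo` are hypothetical names, and a `threading.Event` stands in for a dma_fence FD. The names "acquire fence" and "release fence" are used here as assumptions about the roles involved.

```python
import threading


class Fence:
    """Stand-in for a dma_fence: signaled once, waitable by anyone."""

    def __init__(self):
        self._event = threading.Event()

    def signal(self):
        self._event.set()

    def wait(self, timeout=None):
        return self._event.wait(timeout)

    def is_signaled(self):
        return self._event.is_set()


class Commit:
    """One surface commit that carries its own synchronization state."""

    def __init__(self, buffer_name, acquire_fence):
        self.buffer_name = buffer_name
        # Signaled by the client side when rendering into the buffer is done.
        self.acquire_fence = acquire_fence
        # Signaled by the compositor when it no longer reads the buffer.
        self.release_fence = Fence()


def compositor_present(commit, log):
    # Explicit sync: the compositor waits on the fence it was handed,
    # rather than relying on the kernel to order the buffer accesses.
    commit.acquire_fence.wait()
    log.append("compositor sampled " + commit.buffer_name)
    commit.release_fence.signal()


def run_commit_demo():
    log = []
    acquire = Fence()
    commit = Commit("buffer-0", acquire)
    compositor = threading.Thread(target=compositor_present, args=(commit, log))
    compositor.start()
    # The commit was submitted before rendering finished; that is the point.
    log.append("client finished rendering")
    acquire.signal()
    compositor.join()
    return log, commit
```

The commit is handed to the compositor before the rendering is done, and the ordering is still correct because the fence travels with the commit.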

          >Warning! The protocol described in this file is experimental and backward incompatible changes may be made.
          In linked file.
          There is a reason why this is here, and it is not what you think. The reason for this is very important.

          Originally posted by piotrj3 View Post
          I will not get into discussion why Nvidia didn't implement protocol almost no one was using that could break at anytime.
          That nobody was using it is not true. You will find Mesa3D using it in 2020.

          Originally posted by piotrj3 View Post
          I would say that if I were a GPU maker releasing stable drivers, I would want to limit my support to staging protocols at best. An unstable protocol could perhaps be tested in something like their Vulkan developer driver branch, but that unstable protocol would be useless in itself.
          You need to see that it takes two to tango here. While Nvidia was messing around with EGLStreams, the "zwp_linux_explicit_synchronization_v1" extension could not be locked down as stable, because if Nvidia got EGLStreams to work, this protocol extension could have required changes for compatibility.

          Originally posted by piotrj3 View Post
          To make it useful for resolving their problems (and their biggest problems were Present and Xwayland) you would have to make Present/Xwayland support it. But building a city on an unstable extension that can be broken without any guarantee of backwards compatibility would be an extremely bold move.
          Did EGLStreams make changes that broke backwards compatibility? Yes, it did.

          In reality, the stabilization process for explicit sync support in the Wayland protocol could only start once Nvidia had either:
          1) made a working solution that was not GBM based, or
          2) adopted the GBM-based solution.

          Nvidia has taken the GBM-based solution, so now the stabilization process for Linux explicit synchronization could start.

          Do remember Nvidia added eglstream support to xwayland and never got that to work right.

          So Nvidia had no problem attempting to build a city on unstable foundations without any guarantee of backwards compatibility. EGLStreams was an extremely bold move on Nvidia's part that absolutely did not play out, and it caused major delays to the explicit synchronization support in the Wayland protocol being declared stable.

          piotrj3, what went wrong here is mutex vs spinlock all over again.

          People see Nvidia pushing for explicit sync as a good thing, but this is why EGLStreams does not work. Yes, Nvidia's push for explicit sync is what led them to make EGLStreams, which does not work.

          You could say the parties pushing for implicit sync were like the parties who classed the mutex as best because it provided stable results. The parties pushing for pure explicit sync, like Nvidia, are like the parties that pushed for spinlocks in userspace for performance. The problem here is that neither party is in fact right. The correct locking answer to the mutex vs spinlock debate in userspace, for best performance and stability, is a futex or WaitOnAddress; both sit partway between a mutex and a spinlock in function.
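The "partway between" idea can be sketched as a lock that tries a cheap non-blocking path first and only falls back to a blocking wait under contention. This is a toy in Python and not a real futex: `SpinThenBlockLock`, its counters and `run_lock_demo` are hypothetical names, `threading.Lock` stands in for the kernel wait, and spinning under Python's GIL says nothing about real performance. It only shows the control flow.

```python
import threading


class SpinThenBlockLock:
    """Toy hybrid lock: spin briefly, then fall back to a blocking wait."""

    def __init__(self, spin_limit=100):
        self._lock = threading.Lock()
        self.spin_limit = spin_limit
        self.fast_acquires = 0  # lock obtained on the non-blocking path
        self.slow_acquires = 0  # lock obtained only after blocking

    def acquire(self):
        # Fast path (spinlock-like): retry without ever sleeping.
        for _ in range(self.spin_limit):
            if self._lock.acquire(blocking=False):
                self.fast_acquires += 1  # safe: we hold the lock here
                return
        # Slow path (mutex-like): give up spinning and block until free.
        self._lock.acquire()
        self.slow_acquires += 1  # safe: we hold the lock here

    def release(self):
        self._lock.release()


def run_lock_demo(threads=4, iterations=1000):
    lock = SpinThenBlockLock()
    total = [0]

    def worker():
        for _ in range(iterations):
            lock.acquire()
            try:
                total[0] += 1
            finally:
                lock.release()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return total[0], lock
```

An uncontended `acquire()` never leaves the fast path; the blocking path only exists for the cases where spinning alone would burn CPU or never finish.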

          There is no formal name for the middle ground between implicit sync and explicit sync other than a DRI3 syncobj.

          The solution for explicit sync going forward with Wayland will be 95% what the Google developers made in 2016. We would have been there a lot sooner if Nvidia had not rejected the GBM/DMABUF solution and gone down the EGLStreams rabbit hole of attempting explicit sync only, with a protocol they were breaking all the time and that could never be made to work.

          Yes, the reason Nvidia rejected the GBM/DMABUF solution is that they could not see why a little bit of implicit sync was important for everything to run with stability. Without that little bit of implicit sync you have the broken mess of EGLStreams. The way Microsoft mandates that all GPU vendors make their drivers for Windows, there is no avoiding this little bit of implicit sync on Windows. The Linux world gave Nvidia enough freedom to shoot themselves in the foot and attempt a 100 percent explicit sync solution.

          Yes, it is like this: with 99 percent explicit sync and 1 percent implicit sync everything works fine. With 100 percent explicit sync everything is a broken mess.

          Yes, this 99 percent explicit sync with 1 percent implicit sync is like how a futex is 99% spinlock with 1% mutex. Getting the balance between stability and performance normally does not fit into 100 percent black and white answers.
          Last edited by oiaohm; 30 June 2024, 10:39 PM.



          • #65
            Originally posted by smitty3268 View Post

            They gave entire presentations at XDC about how unsuitable GBM was.
            Yes, because at the time it was designed around implicit sync.

            Originally posted by smitty3268 View Post
            Anyway, I really don't care what you believe so I'll leave it at that.
            It's not a matter of belief. There was a completely open conversation between Nvidia developers and the Linux kernel/graphics developers on the Mesa GitLab about precisely this, where the actual explicit sync features were being developed and the technical details were being discussed, as is the norm for open source development.

            I don't know why you keep on bringing up twitter or somewhat related 10 year old presentations.



            • #66
              Originally posted by mdedetrich View Post

              That's because the current implementation is not ideal due to having to support both implicit and explicit sync
              Just double-checked, that's not the case. The performance hit happens due to CPU overhead on the client side, without doing anything related to implicit sync.

              RADV even completely disables implicit sync for its WSI BOs, such that it can't work even in other contexts, which can break things: mesa#11294 (comment 2446719)

              as to not break userspace until the transition period is over
              There's no such transition. Mesa can never break implicit sync.

              Originally posted by mdedetrich View Post

              Yes because at the time it was designed around implicit sync
              Nothing has changed regarding GBM & synchronization. Implicit sync in Wayland isn't directly related to GBM anyway.



              • #67
                Originally posted by mdedetrich View Post
                Yes because at the time it was designed around implicit sync
                This is wrong; GBM was not in fact designed around implicit sync. GBM has implicit operations. It turns out these implicit operations are important so that the OS kernel knows which buffers the application is handling, so that in case of termination the correct cleanup can be performed.

                From the start, GBM included the means to be used in both explicit and implicit sync solutions. The GBM model for graphics is like the futex model for general locking: use enough implicit operations for stability, and be able to use explicit operations for the rest.

                Yes, these are two Nvidia developers.

                Yes, the first one says to stop asking for implicit sync support.


                This is the second one:
                X11 explicit sync does not require that the compositor support explicit sync. If it does not, Xwayland will use DMA_BUF_IOCTL_IMPORT/EXPORT_SYNC_FILE to translate between the two synchronization models.
                In reality, the X11 protocol design (this is not just Xwayland) turns out to mandate that implicit and explicit sync play ball with each other, or bad things will happen.
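The translation between the two synchronization models mentioned in the quote can be pictured with a toy model. To be clear about assumptions: this is plain Python, not the kernel ioctls. `Fence`, `Buffer`, `export_sync_file`, `import_sync_file` and `implicit_ready` are hypothetical stand-ins that only borrow the names of the real operations, and they ignore details such as read versus write fences.

```python
class Fence:
    """Toy fence: pending until someone marks it signaled."""

    def __init__(self, label):
        self.label = label
        self.signaled = False


class Buffer:
    """Toy dma-buf. Under implicit sync the fences live on the buffer itself."""

    def __init__(self):
        self.fences = []


def export_sync_file(buffer):
    """Implicit -> explicit: snapshot the buffer's pending fences into a
    value that can be passed around independently of the buffer."""
    return tuple(f for f in buffer.fences if not f.signaled)


def import_sync_file(buffer, fences):
    """Explicit -> implicit: attach explicit fences to the buffer so that
    implicit-sync users of the buffer end up waiting on them."""
    buffer.fences.extend(fences)


def implicit_ready(buffer):
    """What an implicit-sync consumer effectively checks before reading."""
    return all(f.signaled for f in buffer.fences)
```

For example, an explicit-sync client hands over a fence, the translation layer imports it into the buffer, and an implicit-sync compositor then sees the buffer as not ready until that fence signals. In the other direction, whatever is still pending on the buffer is exported as an explicit handle for the client.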

                The reality here is that legacy application support means implicit sync has to be implemented somewhere. If not in the kernel driver, it has to be implemented in the OpenGL libraries and in the kernel's Linux framebuffer emulation, for the particular framebuffer syscalls that expect implicit sync.

                In reality, the Linux kernel's legacy framebuffer interface and the X11 protocol both end up mandating the existence of implicit sync. At some point Nvidia is just going to have to accept that they cannot keep saying no to implicit sync, because the legacy part of Linux will keep demanding implicit sync so that things can work.

                There is a reason why Intel, AMD and Arm decided to start working on a generic solution in Linux kernel space to put implicit sync on top of their explicit sync drivers: they saw these limitations over a decade ago.

                Nvidia is being very stubbornly slow on the uptake. Heck, Nvidia were stubbornly slow on the uptake that some implicit operations were required even with explicit sync, to make sure the OS kernel could correctly clean up in case of application termination or crash; this is why Nvidia had to adopt GBM in the end.

                Yes, on the GitLab we see over and over again that the Nvidia developers internally don't agree with each other over this explicit-sync-only stance. You don't see Intel, AMD and Arm developers publicly disagreeing with members of their own company.

                Really, I expect Nvidia at some point in the future to add support for the shared implicit sync support in the Linux kernel and then claim they never had an explicit sync/explicit operations only policy.



                • #68
                  Can someone please explain to me why NVIDIA has not just deprecated their drivers and encouraged users to make the only sane choice, AMD?

                  NVIDIA is going to deprecate their graphics card stack soon anyway when they realize that AI is the only real user of their cards.



                  • #69
                    I've been running full Wayland now that NVIDIA supports explicit sync and so does kwin. I remember people thinking that Wayland would be only for phones before the vast security issues came to light.



                    • #70
                      Originally posted by AlanTuring69 View Post
                      Can someone please explain to me why NVIDIA has not just deprecated their drivers and encouraged users to make the only sane choice, AMD?

                      NVIDIA is going to deprecate their graphics card stack soon anyway when they realize that AI is the only real user of their cards.
                      Not all data center modeling is AI.



                      A good percentage of Nvidia's data center income is the A16. Remember you still need to visualize your AI-generated results at times.

                      Yes, that is some of the reason why Nvidia added a RISC-V CPU to their GPU: so they could run their trade secrets directly on the card and open source everything else.

                      Nvidia is now following the process AMD did when they took over ATI. AMD made a set of hardware alterations to the ATI GPU designs to split trade secrets away from the OS driver as well. Nvidia is just over a decade behind. At some point I do see the closed source Nvidia Linux driver disappearing in the same way the ATI/AMD one did.

