NVIDIA Transitioning To Official, Open-Source Linux GPU Kernel Driver


  • Originally posted by birdie View Post
    Anecdotal evidence is anecdotal. I've been using NVIDIA drivers for Linux for over two decades now and I've spent maybe 5 hours "fixing" them in this time period (most fixes being applying patches for new unsupported kernels which has happened at most a couple of times).

    Meanwhile I've had critical issues with open source Intel and AMD drivers some of which are unresolved to this fucking day.
    I removed the links, but I will answer them.

    https://forums.tomshardware.com/thre...-bios.3711200/ I am sorry, but those AMD fan speed issues are not the open source driver; they are firmware/BIOS on the AMD cards. Some vendors' AMD cards are good at reporting fan speed, others are bad. And no, there is no hardware setting on AMD cards to disable the 0 RPM mode; that is a pure software override, which is why you use fancontrol. Fancontrol appears error-prone because, depending on the card vendor, you may need to set a manual or automatic fan curve in the BIOS to get correct reporting, and worse, sometimes it has to be a particular manual fan curve before the fan speed reporting is right.

    Yes, the failure to get correct fan speeds out of AMD-based graphics cards happens under Windows as well, and a software override to 0 RPM is how it is done under Windows too. Here is a horrible fact: AMD reference cards don't have a 0 RPM mode at all; it is added by MSI and the other board partners. We have developers from AMD working on Linux, but we don't have the ODMs who made the cards providing developers to fix up the quirks they themselves introduced.

    Sorry, fancontrol trouble under AMD will not be improved by attempting to implement it in the kernel. The AMD hardware itself has bugs.
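    If you want to see exactly what fancontrol sees, the tachometer is just a sysfs file. A minimal sketch, assuming an amdgpu hwmon device; the hwmon index and fan channel below are placeholders that vary per machine and card, so enumerate /sys/class/hwmon/ on your own system:

    #include <stdio.h>

    int main(void)
    {
        /* amdgpu registers a hwmon device; "hwmon0" and "fan1" are
         * placeholders -- the index varies per machine and card. */
        const char *path = "/sys/class/hwmon/hwmon0/fan1_input";
        FILE *f = fopen(path, "r");
        long rpm;

        if (!f) {
            perror(path);
            return 1;
        }
        if (fscanf(f, "%ld", &rpm) == 1)
            printf("%s: %ld RPM\n", path, rpm);
        fclose(f);
        return 0;
    }

    A reading of 0 here may simply be the board partner's zero-RPM mode rather than a driver bug, which is exactly the reporting quirk described above.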

    You need to go back and read that Intel bug report: you have been asked to do a particular thing that you have not done. I have had the same issue with Intel and 1440p modes before, but it turned out to be one particular monitor's screwed-up EDID; that monitor works fine when you use a third-party box to replace the EDID with an equal but correctly formatted one, and the one log that was asked for will show up that problem. Windows is a little more lax about EDID. So I am not sure whether your Intel problem is really an Intel driver problem or simply a broken monitor. Even with Nvidia you sometimes need to override the monitor's EDID to get the correct monitor options. Annoyingly, Intel, AMD and Nvidia each tolerate different forms of screwed-up EDID; all of them work when the monitor sends an EDID that is exactly to specification.

    So that Intel graphics refusing to do a particular output with a particular monitor: pair AMD or Nvidia with the right screwed-up monitor and the same fault will show up. As I openly admitted, graphics drivers in general are not bug-free. Some classes of bugs are universal no matter the GPU vendor, and EDID issues that prevent particular output modes when you pair X driver with Y monitor are one of them.
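    You can sanity-check a monitor's EDID yourself before blaming any driver. A minimal sketch, assuming the connector exposes its EDID via sysfs (the connector name below is a placeholder): each 128-byte EDID block must start with the fixed 8-byte header, and its bytes must sum to 0 mod 256.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* Connector name varies; list /sys/class/drm/ to find yours. */
        const char *path = "/sys/class/drm/card0-HDMI-A-1/edid";
        static const unsigned char hdr[8] =
            { 0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00 };
        unsigned char blk[128];
        unsigned sum = 0;
        FILE *f = fopen(path, "rb");

        if (!f) {
            perror(path);
            return 1;
        }
        if (fread(blk, 1, sizeof blk, f) != sizeof blk) {
            fprintf(stderr, "could not read a full 128-byte EDID block\n");
            fclose(f);
            return 1;
        }
        fclose(f);

        if (memcmp(blk, hdr, sizeof hdr) != 0)
            fprintf(stderr, "bad EDID header -- malformed EDID\n");
        for (size_t i = 0; i < sizeof blk; i++)
            sum += blk[i];
        printf("checksum %s (sum mod 256 = %u)\n",
               sum % 256 ? "BAD" : "ok", sum % 256);
        return 0;
    }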


    Originally posted by birdie View Post
    Also, unlike with AMD and Intel, with NVIDIA I can use and switch between a couple of driver releases freely.

    With AMD and Intel? You either have the latest kernel and its driver bugs, or a not-so-new kernel which is itself buggy but whose graphics drivers work. No fucking middle ground.

    The fact that drivers are coupled to the kernel is an abomination no other serious commercial OS does. Period. Even RHEL and Android don't feature this brain damaged decision. Drivers are too fucking big and complicated to be inlined. They must be developed separately, and there must be a way to shuffle them. The Linux kernel project does not understand this and so we have it. Either a stable kernel with outdated drivers or a bleeding edge kernel with bleeding edge regressions and bugs. F this crap. This is why Linux is not going anywhere aside from its uses where people make it play by the rules: RHEL/Android - the only two reliable kernel distributions, with proper CI/QA/QC.
    There is more to this. With the AMD and Intel open source drivers I can use multiple versions of Mesa at the same time with the same kernel driver. With Nvidia closed source, yes, you can reboot and switch between driver versions, but the Nvidia driver is not designed to have a stable ABI between userspace and kernel space the way AMD, Intel and all the other open source drivers included in the kernel are. Did you miss that you can change complete kernels yet you don't need to change the userspace of the open source graphics drivers to match?

    Remember, not all your graphics issues come from the kernel driver. Horrible as it sounds, the open source drivers give you broader mix-and-match options, because you can mix and match the userspace part of the driver with the kernel part. Yes, I will give you that it is a pain to replace the complete kernel.
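    Here is a minimal sketch of that stable boundary, using libdrm (build with cc drmver.c $(pkg-config --cflags --libs libdrm)); the /dev/dri/card0 node is an assumption, as the number varies. Any userspace, whatever the Mesa version, queries the kernel driver through the same stable ioctl:

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <xf86drm.h>

    int main(void)
    {
        int fd = open("/dev/dri/card0", O_RDWR); /* node number varies */
        drmVersionPtr v;

        if (fd < 0) {
            perror("/dev/dri/card0");
            return 1;
        }
        /* drmGetVersion() wraps the stable DRM_IOCTL_VERSION ioctl. */
        v = drmGetVersion(fd);
        if (v) {
            printf("kernel driver: %s %d.%d.%d\n", v->name,
                   v->version_major, v->version_minor, v->version_patchlevel);
            drmFreeVersion(v);
        }
        close(fd);
        return 0;
    }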

    Also, you have claimed something false. https://backports.wiki.kernel.org/index.php/Main_Page "Either a stable kernel with outdated drivers or a bleeding edge kernel with bleeding edge regressions and bugs" is a false statement. A stable kernel has a backported-driver option: a stable kernel with the drivers it released with, or the same stable kernel with backported drivers. Yes, RHEL and Ubuntu ship backported drivers under different names; Ubuntu's HWE makes quite a bit of use of the backports tooling.

    I will give you that the upstream kernel's included documentation does not give the clearest instructions on all the steps needed to backport individual drivers, which results in people like you, birdie, thinking it is not a feature. It is a feature that RHEL, Ubuntu, SUSE and many other distribution kernel makers are using.

    So your statement that drivers must be developed separately is based on the idea that upstream drivers cannot be shuffled back into stable-branch kernels, right? If that is the case you are badly wrong, because a system to do exactly that exists and is maintained, and the backports patch set is a small part of it.

    Yes, Android and RHEL kernels do backport drivers from the bleeding edge to their stable kernels. A large percentage of the funding for the backports code comes from IBM/Red Hat.

    I would not say that drivers are too big or complicated to be inlined, in operating system development terms. One consequence of inlining is that userspace and kernel space have to keep a stable ABI between the two: a driver mainlined in Linux basically has to let you mix and match kernel modules with userspaces.

    The issue of having to use the latest bleeding-edge Linux kernel to test the newest version of a driver is more a case of taking the path of least effort than of what is really possible. Yes, the effort to make backporting drivers from bleeding edge to stable kernels as simple as possible has not been done. But making a stable kernel ABI for drivers does not solve the issue either.


    You see these same failures with Windows fairly much every time Microsoft does a major kernel update; Linux distributions just do kernel updates more often, so they break out-of-tree drivers more often. Do drivers not included with Windows break commonly? Yes, they do. Does Nvidia end up having to release new Windows drivers adjusted for new kernel releases? Yes, it does. Do end users end up suffering broken systems under Windows when a kernel update leaves their drivers misaligned with their kernel? Yes, they do.

    birdie, like it or not, drivers developed separately from the kernel, as Windows does it, have some serious stability downsides. I will give you that a more developed middle ground is needed: drivers developed inline with the kernel, plus a user-friendly backport system to older kernel versions, is what I would see as the ideal solution.

    The reality here is that I don't see any advantage in purely independent driver development; there is more than enough documentation of the issues, be it on Linux, Windows or FreeBSD, to show it is problematic. The fact that it is problematic is why the stable kernel driver ABI falls flat on its face under closer inspection.

    birdie, think about this: when do you want driver developers to find out about an internal kernel ABI/API change? Your choices are:
    1) while the kernel is being developed (this is inline driver development; FreeBSD and Linux include drivers), or
    2) when a user attempts to use the driver and then reports the issue (this is your "developed separately"; this is Windows and the closed-source Nvidia driver).

    Remember, we are talking about millions of lines of code in some drivers, and with option 2 the user may be left with a machine that has basically no working interface. I will give you that you want more options to fall back to other driver versions when you run into trouble, but inlined development reduces the amount of trouble you have to deal with in the first place, because it addresses the particular problem of ABI/API changes.

    birdie, you came to this with the idea that the Linux way is wrong, without considering what the Linux way does right because of inlining drivers. The amount of API/ABI change the Linux kernel goes through is massive, yet it is only the out-of-tree drivers that are hit by it; that should have had you asking yourself how, and what the particular advantage here is. Remember, API/ABI changes happen, intentionally and unintentionally, in all the commercial operating systems that have attempted the stable-kernel-ABI-for-drivers route, resulting in driver failures.

    A middle ground between inlining drivers and multiple driver versions would be useful: all drivers inlined into kernel development, so API/ABI issues turn up early for developers to address, hopefully before hitting end users, plus a functional and user-friendly backport system to allow those on older kernels to use the newer mainline drivers. Yes, there is a functional backport system for Linux kernel drivers from mainline to older kernels; is it user-friendly to those without major kernel development skills? No, it is not, and that is where I see the problem.



    • Originally posted by billyswong View Post
      On the surface, the inline-ness of the Linux driver model is bad design. But paradoxically, it is this design that pushed many drivers in Linux to go open source, which is probably one of the reasons why Linux became the most supported open source OS.


      The problem here is that the inline-ness is a double-edged sword. API/ABI changes happen no matter what your goal is; this happens with Windows too, breaking drivers. Every third-party driver has an example of something changing in the kernel and the driver critically failing.

      It is really easy to ignore the history of the Linux kernel's inlined drivers and how successfully they function. Being inlined with core kernel development means kernel function changes are visible to all driver developers. You see it on the Linux kernel mailing list: some function is going to be changed, driver developers come out and say "hell no, that is going to break us here", and then the driver and core developers work together to resolve the problem before it gets to end users.

      Splitting kernel and driver developers, as Windows and other commercial OSes try to, results in driver developers not knowing what the kernel developers are up to, and the reverse. Driver developers presume that something is defined functionality when it is not, and kernel developers think something is an unused feature that can be changed when it is in fact used; both of these errors combined result in driver failures.

      There are downsides to both models. Something people forget from early Linux kernel history: the drivers and the kernel were once in separate CVS repositories, and yes, issues with ABI/API alignment turned up back then. Was that the Linux inlined-driver model? No, it was not, and the fact that it was abandoned should make you wonder why.



      Yes, this includes examples where operating systems trying to keep a stable kernel ABI end up having to keep old and broken ABIs around, which results in bad operating system stability. Yes, security issues also trace back to locking down the kernel driver ABI.

      The idea that the inline-ness of Linux driver development is bad design is not really backed by facts. There are valid design reasons to go the inline driver route. Both routes, out-of-tree driver development and inline driver development, have their own particular downsides. That is the problem.

      The reality is you want to somehow do both: inline development, so driver developers know about core kernel changes before end users end up on the wrong side of them, plus a system for using multiple versions of drivers. Windows is the example of purely non-inline development, because driver developers on Windows cannot see what the kernel developers are doing, and there is failure after failure caused by it. Linux is the example of inline development, and its biggest problem is not having a simple system for keeping multiple versions of a driver.



      • I don't think a "back-port system" for inline kernel drivers can ever be user-friendly, as you are talking about recompiling a kernel here.

        eBPF looks like an interesting and promising "middle ground" in this regard. oiaohm, as you mentioned, many hardware support issues are due to hardware quirks. Putting that data outside the kernel while keeping the common functionality inside the kernel may be the path forward for all drivers.
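        As a hedged sketch of what that middle ground looks like mechanically (not a real driver-quirk interface; the tracepoint below is just an illustrative hook point), here is a minimal libbpf-style program that the kernel verifies and attaches at runtime, compiled with clang -O2 -target bpf:

        #include <linux/bpf.h>
        #include <bpf/bpf_helpers.h>

        /* One shared counter, readable from userspace via the map. */
        struct {
            __uint(type, BPF_MAP_TYPE_ARRAY);
            __uint(max_entries, 1);
            __type(key, __u32);
            __type(value, __u64);
        } event_count SEC(".maps");

        SEC("tracepoint/power/cpu_frequency") /* illustrative hook point */
        int count_event(void *ctx)
        {
            __u32 key = 0;
            __u64 *val = bpf_map_lookup_elem(&event_count, &key);

            if (val)
                __sync_fetch_and_add(val, 1);
            return 0;
        }

        char LICENSE[] SEC("license") = "GPL";

        The interesting property is the loading model: the logic lives outside the kernel image and is verified at attach time, so quirk data could in principle be shipped and updated without a kernel rebuild.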



        • Originally posted by oiaohm View Post



          The reality is you want to somehow do both: inline development, so driver developers know about core kernel changes before end users end up on the wrong side of them, plus a system for using multiple versions of drivers. Windows is the example of purely non-inline development, because driver developers on Windows cannot see what the kernel developers are doing, and there is failure after failure caused by it. Linux is the example of inline development, and its biggest problem is not having a simple system for keeping multiple versions of a driver.
          To fix the Windows issue (which happens on a time frame of years, compared to mere months on Linux, as they guarantee API/ABI stability for drivers within a given major release of Windows) it suffices to have an open development model. I, as a driver developer, should follow the kernel mailing list for breakages. There's no requirement to have the drivers in-tree. That, paired with a commitment to keep interfaces stable, should suffice.

          I see there are some advantages in not having those stable interfaces (and prefer that model), but denying that other ways to get most of the benefits exist is, well, denial.
          Regarding open sourcing as an alleged consequence of this... I have yet to see proof that its effect was really significant and positive in terms of getting vendors to open source their drivers. The biggest users of Linux are embedded manufacturers, and most of those have a pretty awful history of making a single code drop of closed source and forcing you to stick to a decades-old kernel. Several brands of consumer products share similar problems, or directly don't support Linux (take a look at how many WiFi drivers were reverse engineered because of this). Stable interfaces would have made it much cheaper for them to at least keep their proprietary drivers up to date. Unstable ones certainly didn't encourage them to open them anyway, so the end result was simply poor support for users.

          In the end, it may or may not be a case of perfect being the enemy of good enough. You won't change the fact some companies will not do the work of upstreaming and open sourcing. It's expensive. It's an additional cost. Many of those have a single dev that moves from device to device and barely does any hardware bring up before moving on to the next project.



          • Originally posted by birdie View Post

            Kinda sums up Open Source/Linux zealots' attitude: WE CAN HATE, WE HATE, WE WILL HATE.
            I have also used AMD and NVidia GPUs for two decades in Linux for work and in Windows for gaming.

            AMD Windows drivers have always had inconsistent, hit-and-miss performance in different games (due to poor implementations of DirectX 11 and OpenGL, which AMD said it won't fix), and driver updates would often break something. NVidia drivers have had top performance in every game I played; they just always work with no issues.

            For work in Linux I do scientific computing and machine learning - the domains NVidia innovated with CUDA, created a flourishing ecosystem, established itself as the industry leader and is the unrivalled best choice - everything just works on NVidia. AMD GPUs, on the other hand, have limited compatibility with scientific computing and machine learning software, so that every time I give AMD a try I run into one problem or another, ending up wasting days investigating and eventually hitting some currently unsupported feature or known unresolved issue.

            From my perspective, the only good thing about AMD GPUs is the open-source Linux driver, which is irrelevant for me because I don't do Linux kernel development or compile custom kernels*. Software compatibility, performance and robustness of AMD GPUs are unsatisfactory and inadequate for me, both in Linux and Windows.

            It is rather cringey to read all these open-source zealot nonsense claims which aren't grounded in facts and reek of ignorance. Hating feels right and good, makes the heart pump faster, which makes it very addictive. That's why the haters keep clutching at straws, making up ludicrous reasons to justify their hate for NVidia. They aren't interested in facts, they don't share anything informative; they'd rather look for other haters to resonate with.

            * I do build the zenpower3 module for monitoring temps and voltages of Zen 3, because open-source pioneer AMD doesn't provide that in the Linux kernel.
            Last edited by max0x7ba; 17 May 2022, 06:01 PM.



            • Originally posted by billyswong View Post
              I don't think a "back-port system" for inline kernel drivers can ever be user-friendly, as you are talking about recompiling a kernel here.
              It for sure can be made more user-friendly. RHEL does backport drivers without rebuilding complete kernels. Yes, this would require integrating the backport system's new-to-old abstraction with DKMS. It is a case of having to rebuild the kernel module with a suitable abstraction layer. This of course requires resources.

              Remember, the Linux kernel does support building third-party modules; it is just a matter of having the abstraction between old and new.
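              For scale, this is the kind of out-of-tree source DKMS rebuilds on every kernel update; a hello-world sketch (the module name and messages are mine, not from any real package), built with the usual obj-m Makefile against /lib/modules/$(uname -r)/build:

              #include <linux/module.h>
              #include <linux/init.h>

              /* Minimal out-of-tree module: the unit DKMS recompiles per kernel.
               * A backport layer would sit between this source and older headers. */

              static int __init backport_demo_init(void)
              {
                  pr_info("backport_demo: loaded\n");
                  return 0;
              }

              static void __exit backport_demo_exit(void)
              {
                  pr_info("backport_demo: unloaded\n");
              }

              module_init(backport_demo_init);
              module_exit(backport_demo_exit);

              MODULE_LICENSE("GPL");
              MODULE_DESCRIPTION("Minimal DKMS-style module sketch");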

              Originally posted by billyswong View Post
              eBPF looks like an interesting and promising "middle ground" in this regard. oiaohm, as you mentioned, many hardware support issues are due to hardware quirks. Putting that data outside the kernel while keeping the common functionality inside the kernel may be the path forward for all drivers.
              In some areas eBPF is the solution; in other areas it is not.

              Originally posted by sinepgib View Post
              To fix the Windows issue (which happens on a time frame of years, compared to mere months on Linux, as they guarantee API/ABI stability for drivers within a given major release of Windows) it suffices to have an open development model. I, as a driver developer, should follow the kernel mailing list for breakages. There's no requirement to have the drivers in-tree. That, paired with a commitment to keep interfaces stable, should suffice.
              There is a false claim here: that Microsoft in fact guarantees API/ABI driver stability. They write that they do, but in the real world, as the Nvidia issue linked before shows, that ABI stability does not in fact hold, no matter how hard Microsoft tries. There are tons of examples where it just does not happen, and with that many examples you do need to look at why.

              Windows in reality only ships what the Linux world would call longterm kernels; this is confirmable from the Windows kernel version numbers, which do exist. Windows has fewer kernel-caused driver issues because it releases kernels less often. The horrible reality is that every time Microsoft releases a new kernel to end users, some drivers do in fact break, even though API/ABI stability has been promised. The cause is simple: the Windows kernel developers don't know how driver developers are actually using the ABI, and the driver developers presume something is defined behaviour when it is not. Inline development means those making core kernel changes can look at the driver code actually using those functions and see the issues. Even with the total control Microsoft has over its kernel space, it cannot keep the driver ABI stable enough not to cause problems.

              Note, you said driver developers should follow the kernel mailing list for breakages. There are many drivers in the Linux kernel with zero active developers that keep on working. There are also many cases of Windows drivers completely failing on end users, leaving them either disabling Windows updates or abandoning the hardware.

              There is a trick to why old upstream drivers with no active developers keep on working:
              https://en.wikipedia.org/wiki/Coccinelle_(software)
              Semantic patches. Because all the code is inline and upstream, it is possible to use semantic patches to correct incorrectly coded developer presumptions. They can also be used to locate code that does not match expected usage.
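              To make that concrete, here is the kind of mechanical rewrite a semantic patch automates, sketched in plain userspace C with hypothetical function names standing in for a real kernel API change; spatch applies the same before-to-after transformation to every in-tree call site at once:

              #include <stdlib.h>

              /* Hypothetical driver-facing API before and after a tree-wide change. */
              struct widget { int ready; };

              struct widget *widget_alloc(void) { return calloc(1, sizeof(struct widget)); }
              void widget_init(struct widget *w) { w->ready = 1; }

              /* The new combined helper the kernel developers introduced. */
              struct widget *widget_alloc_init(void)
              {
                  struct widget *w = widget_alloc();
                  if (w)
                      widget_init(w);
                  return w;
              }

              /* Before: the open-coded pattern a semantic patch matches in drivers. */
              struct widget *driver_probe_old(void)
              {
                  struct widget *w = widget_alloc();
                  if (w)
                      widget_init(w);
                  return w;
              }

              /* After spatch runs: the same call site, rewritten mechanically. */
              struct widget *driver_probe_new(void)
              {
                  return widget_alloc_init();
              }

              int main(void)
              {
                  struct widget *w = driver_probe_new();
                  free(w);
                  return 0;
              }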

              Originally posted by sinepgib View Post
              I see there are some advantages in not having those stable interfaces (and prefer that model), but denying that other ways to get most of the benefits exist is, well, denial.
              Regarding open sourcing as an alleged consequence of this... I have yet to see proof that its effect was really significant and positive in terms of getting vendors to open source their drivers. The biggest users of Linux are embedded manufacturers, and most of those have a pretty awful history of making a single code drop of closed source and forcing you to stick to a decades-old kernel. Several brands of consumer products share similar problems, or directly don't support Linux (take a look at how many WiFi drivers were reverse engineered because of this). Stable interfaces would have made it much cheaper for them to at least keep their proprietary drivers up to date. Unstable ones certainly didn't encourage them to open them anyway, so the end result was simply poor support for users.

              In the end, it may or may not be a case of perfect being the enemy of good enough. You won't change the fact some companies will not do the work of upstreaming and open sourcing. It's expensive. It's an additional cost. Many of those have a single dev that moves from device to device and barely does any hardware bring up before moving on to the next project.
              This contains a stack of bad presumptions, and you have ignored what you yourself wrote. Proprietary or out-of-tree drivers require ongoing work to remain working, under Windows or Linux. The Linux kernel supports a lot of legacy hardware that Windows does not, because inlined drivers require less work, thanks in part to the semantic patches the Linux kernel started using in 2007.

              Originally posted by sinepgib View Post
              Many of those have a single dev that moves from device to device and barely does any hardware bring up before moving on to the next project.
              Let's say this person is a driver developer who does not upstream their driver: who is going to be there to maintain the driver so it works with newer kernels? Nobody, if they don't release the source code at all. If the source code is released, someone else can choose to pick up the code base and either maintain it or upstream it. The issue of being stuck on old OS versions happened in the embedded space with Windows CE and the other closed-source embedded operating systems of history; being stuck on a decades-old kernel/OS is a problem that existed before Linux was heavily used in embedded, and the so-called stable kernel driver ABI did nothing to address it. The core of this problem is that the developer moves on and there is no one left to maintain the driver, unless it is inlined into the kernel tree.

              A stable kernel driver ABI is never 100 percent stable while the operating system's kernel is still being developed.


              The reality is a lot more complex. Take Qualcomm: they did the code dumps you describe, and third parties worked on getting those code bases up to the quality needed to inline them into the Linux kernel. Notice that this stalls out around 2016, because then there is a change at Qualcomm.

              In this blog, Vinod Koul shares detailed instructions to get started with the mainline Linux kernel on arm64 Qualcomm Snapdragon based devices.


              That is because around 2014 Qualcomm started noticing that, on their old boards, the new mainline kernel worked better than their custom kernels. Yes, they then started working on upstreaming their drivers directly.

              I will not dispute the awful history of vendors just doing code drops, but vendor after vendor, like Qualcomm, has worked out that it is more cost-effective to upstream: it gets them a better quality product at a lower cost to keep the drivers functional. And there is a long history of third parties picking up those code drops and upstreaming what they can.

              The additional-cost argument in reality does not check out. There is an additional cost to be saved only if you are going to make something, walk away, and abandon it without upstreaming your drivers or releasing the source code. The problem is that if that is your objective, the end user is going to be stuck in the past, whether on an OS that claims driver ABI stability or not, because there will be no developer to update the driver when it becomes incompatible. And the driver becoming incompatible is not an if but a when.

              If you are going to keep a developer maintaining the driver so it keeps working, that is an ongoing cost, right? Think about it: once you have released the code and it is upstreamed, the semantic patches do some of the required updating work (when kernel developers change something, they write a semantic patch to update all the drivers inlined in the Linux kernel), and third parties can decide to take up driver development.

              Take old Intel graphics cards: recently, new drivers have been made by non-Intel people, because they are still using the hardware. That is only possible without major reverse engineering because they had the old drivers to look at, so those source code releases are important.

              This is the problem: the stable kernel driver ABI promise really does not change a thing. It is raised as a solution, but as soon as you look properly, it does not check out.

              The reality is that the stable kernel driver ABI promises made by all operating systems, not just Windows, do not save money while the operating system is still being developed/maintained, because developers are still required to maintain the drivers as the failures turn up. Linux and FreeBSD inlined drivers in fact have a lower maintenance cost. That lower maintenance cost, documented over and over again, leaves many long-term kernel developers not understanding why it is like pulling teeth to get vendors to upstream drivers, when not upstreaming is just costing them over the long term.

              Yes, this is the problem: the short-term cost of upstreaming is slightly higher, with the long term being far cheaper, due to how much is simply auto-fixed with semantic patches and to third parties being able to submit patches; overall it is cheaper to inline the code into the Linux and FreeBSD kernel trees.

              The additional-cost argument is something people bring up as a case for a stable kernel driver ABI without being aware that it has been disproved over and over again.



              • Originally posted by max0x7ba View Post
                AMD Windows drivers have always had inconsistent, hit-and-miss performance in different games (due to poor implementations of DirectX 11 and OpenGL, which AMD said it won't fix), and driver updates would often break something. NVidia drivers have had top performance in every game I played; they just always work with no issues.
                Linus Tech Tips and others have found Nvidia's Linux drivers are nowhere near their Windows drivers for gaming. Those of us who are more desktop Linux users worry more about the system starting up right, and about not wasting ages messing around on whatever gaming we do.

                Originally posted by max0x7ba View Post
                For work in Linux I do scientific computing and machine learning - the domains NVidia innovated with CUDA, created a flourishing ecosystem, established itself as the industry leader and is the unrivalled best choice - everything just works on NVidia. AMD GPUs, on the other hand, have limited compatibility with scientific computing and machine learning software, so that every time I give AMD a try I run into one problem or another, ending up wasting days investigating and eventually hitting some currently unsupported feature or known unresolved issue.
                This is true, but I am not a person doing a lot of CUDA; in fact I don't do any CUDA. Web development and other development, so no CUDA. The means to start my system and get X done is very important.

                Originally posted by max0x7ba View Post
                From my perspective, the only good thing about AMD GPUs is the open-source Linux driver, which is irrelevant for me because I don't do Linux kernel development or compile custom kernels*. Software compatibility, performance and robustness of AMD GPUs are unsatisfactory and inadequate for me, both in Linux and Windows.
                Distribution kernel updates also put you on the wrong side of the Nvidia driver at times. Software compatibility? It was good fun having Nvidia cause issues with Firefox. There are various OpenGL applications that are fine with the open source graphics stack but not fine with the Nvidia stack under Linux. So there are issues in Nvidia driver quality for more general desktop usage with Linux.

                Mesa does properly check that its drivers correctly pass the OpenGL test suite. The Nvidia OpenGL driver on Linux will claim to support more OpenGL than it properly passes when put against that test suite: over-claiming and under-delivering.

                Originally posted by max0x7ba View Post
                It is rather cringey to read all these open-source zealot nonsense claims which aren't grounded in facts and reek of ignorance. Hating feels right and good, makes the heart pump faster, which makes it very addictive. That's why the haters keep clutching at straws, making up ludicrous reasons to justify their hate for NVidia. They aren't interested in facts, they don't share anything informative; they'd rather look for other haters to resonate with.
                There is a list of valid reasons, and they break into two major camps:
                1) Lack of proper integration by Nvidia. This links to problems with Wayland development and to issues with laptops and proper power management. It also causes systems to start up to a black screen, which is not good.
                2) The legal side of it. This causes different distributions lots of complex problems.

                Open-sourcing the Nvidia kernel module could, over time, fix both of these.



                • oiaohm, when you say RHEL or Canonical can do the driver backport, it only shows they are developer-friendly. That doesn't count as user-friendly. In your words:
                  Yes, there is a functional backport system for Linux kernel drivers from mainline to older kernels; is it user-friendly to those without major kernel development skills? No, it is not, and that is where I see the problem.
                  We are talking about discovering ABI breakage and adapting the new driver module backward for older kernels here. When instructions and a binary package in the form of DKMS appear for users, the backport has already been done by developers and is waiting for users to apply it. That is not the backport operation itself.



                  • Originally posted by birdie View Post

                    Anecdotal evidence is anecdotal. I've been using NVIDIA drivers for Linux for over two decades now and I've spent maybe 5 hours "fixing" them in this time period (most fixes being applying patches for new unsupported kernels which has happened at most a couple of times).

                    Meanwhile I've had critical issues with open source Intel and AMD drivers some of which are unresolved to this fucking day.

                    Do not BS me with open source drivers "quality", "integration" and "seamlessness", PLEASE. This is all pure crap if basic features do not work.

                    Also, unlike with AMD and Intel, with NVIDIA I can use and switch between a couple of driver releases freely.

                    With AMD and Intel? You either have the latest kernel and its driver bugs, or a not-so-new kernel which is itself buggy but whose graphics drivers work. No fucking middle ground.

                    The fact that drivers are coupled to the kernel is an abomination no other serious commercial OS does. Period. Even RHEL and Android don't feature this brain damaged decision. Drivers are too fucking big and complicated to be inlined. They must be developed separately, and there must be a way to shuffle them. The Linux kernel project does not understand this and so we have it. Either a stable kernel with outdated drivers or a bleeding edge kernel with bleeding edge regressions and bugs. F this crap. This is why Linux is not going anywhere aside from its uses where people make it play by the rules: RHEL/Android - the only two reliable kernel distributions, with proper CI/QA/QC.
                    You're not wrong: it's often critical to keep certain drivers fully up to date, graphics drivers being among the most common, and it is often equally critical to stick with an LTS kernel to avoid bugs, so those needs are in conflict.

                    Drivers should not be part of the kernel; or maybe they should, since it is convenient to install your OS and have everything work out of the box. But there should be an easy way to use a more recent version of a specific driver without needing a newer or custom kernel to achieve it. Similarly, driver updates should not need to meet kernel merge windows to go through; long times pass between kernel releases, and there is no sense in a driver waiting that long for an update.



                    • Originally posted by billyswong View Post
                      oiaohm, when you say RHEL or Canonical can do the driver backport, it only shows they are developer-friendly. That doesn't count as user-friendly. In your words:

                      We are talking about discovering ABI breakage and adapting the new driver module backward for older kernels here. When instructions and a binary package in the form of DKMS appear for users, the backport has already been done by developers and is waiting for users to apply it. That is not the backport operation itself.
                      The big thing here is that DKMS builds modules from source, so you are not looking at ABI breakage at all, just API breakage. Again, for this to function well the driver needs to be open source, not a binary blob.

                      https://github.com/dell/dkms Yes, with DKMS it is optional to include prebuilt kernel modules for selected kernels. And yes, this is why you run into the UEFI Secure Boot issue of unsigned modules unless you put a module-signing solution on your system; leaving modules unsigned in effect downgrades your OS security.

                      The thing is, DKMS could be extended to be aware of which kernel version the source in a DKMS package was written for, and from there choose a backport-compatible set of headers, so the old-to-new abstraction is performed when the module is built for the targeted kernel.
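                      A sketch of what such an abstraction header could look like, in the style the kernel backports project uses; the version cutoff and helper names here are hypothetical:

                      #ifndef COMPAT_H
                      #define COMPAT_H

                      #include <linux/version.h>

                      /* Shipped inside the DKMS package so one driver source builds on
                       * both old and new kernels. Cutoff and helpers are hypothetical. */
                      #if LINUX_VERSION_CODE < KERNEL_VERSION(5, 15, 0)
                      /* Older kernels lack the new helper; fall back to the old one. */
                      #define new_style_helper(dev) old_style_helper(dev)
                      #endif

                      #endif /* COMPAT_H */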

                      ABI breakage is basically a moot point when you are dealing with open source drivers; API breakage is the problem.

