
AMD GPU Linux Driver Becoming "Really Really Big" That It's Starting To Cause Problems


  • #61
    Originally posted by user1

    So is a hybrid kernel. Its difference from a monolithic kernel is mainly in the internal kernel structure. And microkernels are basically nonexistent in general-purpose OSes.

    I know that Windows has had hybrid shutdown since Windows 8, which speeds up boot time after shutdown. I don't think even macOS has something like this.

    Originally posted by Ironmask

    While I prefer microkernel designs as well, this has nothing to do with kernel architecture. Linux GPU drivers run in usermode, and this specific problem is because it's taking a long time to load the GPU driver *after* the kernel is loaded and initialized, and the boot screen is waiting for the GPU driver to load.
    I did not sleep at all yesterday and what I wanted to say was different.

    I used the word "monolithic" not in the context of microkernel vs. monolithic kernels, and you were both correct in your rebuttals. What I meant is that the Linux kernel is monolithic in the sense that device drivers are part of it, whereas on Windows, for example, third-party drivers are external, can be updated at any time, and can be properly split for certain GPU uArchs.

    This is unlikely to ever happen to Linux though.



    • #62
      Originally posted by _r00t-
      The amdgpu driver is one of the biggest projects ever in the Linux kernel in terms of lines of code. I don't want to be ungrateful, and I would like to thank the amdgpu developers for their work, but some parts probably need to be reworked. The issue with Plymouth is minor compared to the GPU lockups with ring timeouts. I have had this issue for at least 4 years, sometimes rarely, sometimes often. I hope this will be solved in Linux kernel 6.12, as is mentioned here.
      KDE should not advertise the adaptive sync feature at the moment. It took me months to realize that this feature alone was responsible for most of the GPU lockups. Until then, it was a daily GPU lockup horror. To all who suffer: just set "MESA_GLTHREAD=single" system-wide and disable "adaptive sync" in the desktop of your choice. It works for me (ATM).



      • #63
        Originally posted by sbin

        There are many people with much older hardware that use Linux thanks to the flexibility and performance that can be achieved on an optimised installation, unlike on some other OSes. It is a valid use case.
        [CLENCHES FIST STAMPS FOOT]

        FTR, I've no side to take here on the subject matter. Just pointing out yet another overarching, infantile generalization that is very likely to be mostly untrue for one reason or another.



        • #64
          I have 5 different devices running Linux, with AMD and Intel CPUs and all with AMD GPUs. Three of them are gaming systems. I have always built and run the latest stable Linux kernel for more than a decade now, and the amdgpu driver has no problems during boot or any other task, including gaming.

          The only problem is that the header files take up too much space and maybe add some extra time to the kernel build. There is definitely a need to sort out that header file mess. This is not a concern for any user who does not build the kernel.



          • #65
            Originally posted by user556

            Michael just said the bulk of the six million lines, which he'd linked, is from the auto-generated headers. But those headers could very much be holding a large amount of config data for the wide swath of GPUs supported, which then all ends up in the compiled binary.
            The size of the register header files has no effect on the compiled binary size.



            • #66
              I still hear about AMD open-source good times and all that, but I probably see 5 new AMDGPU bug reports a week on the Fedora, openSUSE, and even FreeBSD forums (their drm-kmod uses Linux GPU drivers, and AMD's ordeal even stretches there).

              Sounds like another radeonsi -> amdgpu trim is needed (and maybe guidelines to stop bulking the driver so much to not have to continue doing that). Aren't the GPUs supposed to have all major operations black-box'd in the GPU firmware, on the GPU itself? Isn't that what NV is doing to have GSP and separate open drivers?



              • #67
                People having issues can force load the amdgpu driver early during the boot process. Both dracut and mkinitcpio support this. Plymouth could also be configured with a bigger timeout value.
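A minimal sketch of the early-loading approach mentioned above, assuming a stock mkinitcpio or dracut setup. The drop-in file name `amdgpu.conf` is hypothetical, and the Plymouth `DeviceTimeout` key and its value of 15 seconds are assumptions to verify against the Plymouth version your distro ships:

```
# --- /etc/mkinitcpio.conf (mkinitcpio-based distros) ---
# Put amdgpu into the initramfs so it loads early in boot.
MODULES=(amdgpu)
# Then rebuild the initramfs:  mkinitcpio -P

# --- /etc/dracut.conf.d/amdgpu.conf (dracut-based distros; file name is hypothetical) ---
# The surrounding spaces inside the quotes are required by dracut.
force_drivers+=" amdgpu "
# Then rebuild the initramfs:  dracut --force

# --- /etc/plymouth/plymouthd.conf (assumed key; check your Plymouth docs) ---
[Daemon]
DeviceTimeout=15
```

Only one of the two initramfs fragments applies to a given system, depending on which tool the distro uses.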

                IMO, even thinking about splitting the amdgpu kernel driver is insane as a reaction to this purely cosmetic problem. Further, the issue seems to occur on Fedora, but I haven't seen a report from openSUSE or Ubuntu yet. They all use Plymouth though, so this is likely a distro- or packaging-specific issue.

                And yes, the kernel source tree is approx. 1.6 GB extracted now, with around 480 MB being amdgpu register documentation headers (for GCN1-5, RDNA1, 2, 3, 3.5, 4, CDNA1-3 and all APU derivatives of them!) that collapse in size in the compiled driver. So what?! It still easily fits on a computer from the last 20 or so years and on the cheapest of SD cards and USB sticks…

                The compiled amdgpu module is 20 MB uncompressed and 4.5 MB compressed. The proprietary nvidia driver .ko is 50 MB+ compressed(!), so why don't people report Plymouth issues en masse there? Because the issue isn't code size or dynamic module linking!
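To put the quoted sizes in proportion, here is a rough back-of-the-envelope sketch. It uses only the approximate figures stated in this post; the variable and function names are made up for illustration:

```python
# Approximate figures as quoted in the post (not measured here).
KERNEL_TREE_MB = 1600          # "approx. 1.6GB extracted"
AMDGPU_HEADERS_MB = 480        # amdgpu register headers in the source tree
MODULE_UNCOMPRESSED_MB = 20    # compiled amdgpu module, uncompressed
MODULE_COMPRESSED_MB = 4.5     # compiled amdgpu module, compressed


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the extracted source tree taken up by the register headers.
header_share = percent(AMDGPU_HEADERS_MB, KERNEL_TREE_MB)

# How many times larger the headers are than the uncompressed module.
headers_to_module = AMDGPU_HEADERS_MB / MODULE_UNCOMPRESSED_MB

# Compression ratio of the compiled module.
compression_ratio = MODULE_UNCOMPRESSED_MB / MODULE_COMPRESSED_MB

print(f"headers are {header_share:.0f}% of the source tree")
print(f"headers are {headers_to_module:.0f}x the uncompressed module")
print(f"module compresses about {compression_ratio:.1f}:1")
```

On these numbers the headers are a large slice of the source tree but dwarf the module that is actually loaded at boot, which is the point being made: source size and loaded binary size are largely unrelated here.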

                Think about what splitting the driver would mean: libdrm, Mesa, as well as amdgpu-pro would need to support those extra interfaces. Boom - code duplication, separate bug trackers and split development resources in several downstream projects.

                In case of the headers being pushed to an extra repo: this would make it difficult to build the kernel on systems with irregular internet access and generally add an annoying extra step for packagers and people wanting to build it.

                Just leave things as they are and let Fedora sort out things using far less invasive means. The debate about this issue is utterly ridiculous and the problem has been inflated way out of proportion.



                • #68
                  Just FYI, this is not an issue with the size of amdgpu, it's an issue in ftrace. Disabling ftrace fixes the long load times. See:

                  For the discussion on fixing ftrace.
