Idea Raised For Reducing The Size Of The AMDGPU Driver With Its Massive Header Files

  • LtdJorge
    replied
    Originally posted by nerdistmonk:

    My machine boots almost as fast as an Apple ][e: you switch it on, you hear a beep (you can add in the clunk clunk clunk clunk sound in your head),
    and the thing is already sitting at the Ly login screen before the monitor can finish initializing. I don't use Plymouth, just Ly for the display manager
    and sway for the desktop.
    I use greetd with gtkgreet and sway. Systemd as init and login manager.

    And yeah, in my case the motherboard POST, with every hardware option disabled except the ones I actually use, takes longer than the rest of the boot. Sadly, I cannot optimize that further.


  • nerdistmonk
    replied
    Originally posted by LtdJorge:

    True, although on modern SSDs Plymouth actually slows down the boot process.
    My machine boots almost as fast as an Apple ][e: you switch it on, you hear a beep (you can add in the clunk clunk clunk clunk sound in your head),
    and the thing is already sitting at the Ly login screen before the monitor can finish initializing. I don't use Plymouth, just Ly for the display manager
    and sway for the desktop.


  • LtdJorge
    replied
    Originally posted by bridgman:
    Just a reminder that there are two separate discussions going on - one is about header file size (which does not affect compiled driver size) and the other is about supporting ~14 years of hardware with a single (amdgpu) kernel driver, which makes the compiled driver larger. Splitting the support across two or more drivers would involve duplicating a bunch of code.

    It's still not clear to us why amdgpu's compiled binary size is a problem if the much larger NVidia driver's size is not, but hopefully that will become more clear with time.

    Separately I have seen a few comments about the code becoming too complex because of all the conditionals required to support many generations of hardware but that is not how amdgpu is implemented - we have a separate chunk of code for each HW block generation (or range of very similar generations), load up a pointer array during initialization based on the HW blocks in the GPU then call into the appropriate code at runtime via that pointer array.

    For that matter it is also not clear why header file size is a problem other than maybe being aesthetically displeasing - the header text compresses very efficiently and is downloaded in compressed form, and uncompressed size on disk is tiny compared to the smallest SSDs available today.

    That said, if we were to split amdgpu into two or more drivers the associated header files would automatically be split as well (I say "automatically" but it would be a bunch of developer work) but it would bring an ongoing burden of duplicating common code and bug fixes in that common code across the two drivers. We could split the code in three so we only have a single copy of common code but that would make partitioning the header files more difficult.

    Not sure what the DRM maintainers (airlied, Daniel) would prefer but worth revisiting (it has been discussed before IIRC).
    Agreed. If you guys working on it are OK with those massive autogenerated definitions, why change it? It's not like it's going to have an effect on us, the users.


  • LtdJorge
    replied
    Originally posted by Blisterexe:

    Plymouth's fancy splash screens contribute considerably to Linux feeling polished or good to the average person, as silly as that might seem.
    True, although on modern SSDs Plymouth actually slows down the boot process.


  • bridgman
    replied
    Just a reminder that there are two separate discussions going on - one is about header file size (which does not affect compiled driver size) and the other is about supporting ~14 years of hardware with a single (amdgpu) kernel driver, which makes the compiled driver larger. Splitting the support across two or more drivers would involve duplicating a bunch of code.

    It's still not clear to us why amdgpu's compiled binary size is a problem if the much larger NVidia driver's size is not, but hopefully that will become more clear with time.

    Separately I have seen a few comments about the code becoming too complex because of all the conditionals required to support many generations of hardware but that is not how amdgpu is implemented - we have a separate chunk of code for each HW block generation (or range of very similar generations), load up a pointer array during initialization based on the HW blocks in the GPU then call into the appropriate code at runtime via that pointer array.
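
    As a rough illustration of the pattern described above (a minimal sketch, not the actual amdgpu code): each hardware block generation gets its own table of function pointers, the device's pointer array is filled once at initialization, and runtime calls go through that array without any per-generation conditionals. Every name below (ip_block_funcs, gpu_device, gfx_v9_hw_init and so on) is hypothetical and chosen only for the example.

```c
#include <stddef.h>

/* Hypothetical per-block function table; NOT the real amdgpu structures. */
struct ip_block_funcs {
    const char *name;
    int (*hw_init)(void);
};

/* One separate chunk of code per HW block generation (stubs for the sketch). */
static int gfx_v9_hw_init(void)  { return 9; }
static int gfx_v10_hw_init(void) { return 10; }

static const struct ip_block_funcs gfx_v9_funcs  = { "gfx_v9",  gfx_v9_hw_init };
static const struct ip_block_funcs gfx_v10_funcs = { "gfx_v10", gfx_v10_hw_init };

#define MAX_IP_BLOCKS 4

/* Hypothetical device: just the pointer array and its length. */
struct gpu_device {
    const struct ip_block_funcs *blocks[MAX_IP_BLOCKS];
    size_t num_blocks;
};

/* Initialization: the only place that looks at the hardware generation. */
static int gpu_device_setup(struct gpu_device *dev, int gfx_generation)
{
    dev->num_blocks = 0;
    switch (gfx_generation) {
    case 9:
        dev->blocks[dev->num_blocks++] = &gfx_v9_funcs;
        break;
    case 10:
        dev->blocks[dev->num_blocks++] = &gfx_v10_funcs;
        break;
    default:
        return -1; /* unknown generation in this sketch */
    }
    return 0;
}

/* Runtime: call through the pointer array, no generation checks.
 * Returns the result of the last block's hw_init (0 if there are none). */
static int gpu_device_hw_init(const struct gpu_device *dev)
{
    int last = 0;
    for (size_t i = 0; i < dev->num_blocks; i++)
        last = dev->blocks[i]->hw_init();
    return last;
}
```

    Calling gpu_device_setup(&dev, 10) and then gpu_device_hw_init(&dev) would run only the gfx_v10 stub; code for other generations is still compiled in but never called on that device.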

    For that matter it is also not clear why header file size is a problem other than maybe being aesthetically displeasing - the header text compresses very efficiently and is downloaded in compressed form, and uncompressed size on disk is tiny compared to the smallest SSDs available today.

    That said, if we were to split amdgpu into two or more drivers the associated header files would automatically be split as well (I say "automatically" but it would be a bunch of developer work) but it would bring an ongoing burden of duplicating common code and bug fixes in that common code across the two drivers. We could split the code in three so we only have a single copy of common code but that would make partitioning the header files more difficult.

    Not sure what the DRM maintainers (airlied, Daniel) would prefer but worth revisiting (it has been discussed before IIRC).
    Last edited by bridgman; 23 September 2024, 04:34 PM.


  • rhbvkleef
    replied
    Originally posted by coder:
    I think some of what he was proposing to prune was the documentation they contain. If that's needed to figure out what the definitions mean and how to use them, then you'd want that information to be maintained somewhere. If just the needed definitions are in the kernel tree, then these should be derived from the full headers that should live elsewhere.
    I guess that would be fine.

    Originally posted by coder:
    The headers are generated from their hardware design sources. They're not going to release those, and they'd bloat the tree way worse than the existing headers, if that did happen.
    If this is the case, we might have to ask whether these sources are compatible with GPL-2.0. The generated headers would then not be the "original" sources; the original sources would be the hardware design sources. I guess this would open a huge can of worms regarding closed hardware that we might not want to open, though...

    Originally posted by jsbiff:
    From my reading of the article, even though the headers aren't used by the driver, some of them are used by other projects, like Mesa. It sounds like if you delete them totally, you break Mesa and maybe some other stuff?
    I've revisited my opinion on the matter since I posted my original comment. Storing the full AMDGPU headers in a separate repository, and then generating pruned headers from those, which are then stored in the kernel tree seems to me like the best option. That way, the full headers are at least available, albeit with a slight bit more work, and building the kernel would not necessarily require pulling that separate repository.


  • WileEPyote
    replied
    Originally posted by ahrs:

    Me too, but I barely get to see them. Even my old dual-core laptop boots so fast since I upgraded it with a Samsung SSD. Anyone still using a hard drive in 2024 has no excuse.

    This is a Gentoo system with the bare minimum, mind you. I think udev and NetworkManager form the bulk of the time spent booting. Distributions like Ubuntu that do stupid things like mount a million squashfs mounts for Snap at boot are going to take longer; there's no way around that.
    Adjust your parameters for a verbose output, and you get to see extra text, and feel extra cool.


  • ahrs
    replied
    Originally posted by WileEPyote:

    I prefer my lines of scrolling text. It feels all technical and Matrix-y.
    Me too, but I barely get to see them. Even my old dual-core laptop boots so fast since I upgraded it with a Samsung SSD. Anyone still using a hard drive in 2024 has no excuse.

    This is a Gentoo system with the bare minimum, mind you. I think udev and NetworkManager form the bulk of the time spent booting. Distributions like Ubuntu that do stupid things like mount a million squashfs mounts for Snap at boot are going to take longer; there's no way around that.


  • WileEPyote
    replied
    Originally posted by ahrs:

    The average Linux boot on a system with good hardware (reasonable CPU/GPU and a fast PCIe Gen4/5 NVMe drive) probably spends more time in Plymouth than actually booting. Splash screens were great when people had slow hard drives and needed something pretty to look at while the rest of the system caught up. Not so much anymore; a quiet black screen that boots instantly to GDM or SDDM is enough.
    I prefer my lines of scrolling text. It feels all technical and Matrix-y.


  • kiffmet
    replied
    People having issues can force load the amdgpu driver early during the boot process. Both dracut and mkinitcpio support this. Plymouth could also be configured with a bigger timeout value.

    IMO, even thinking about splitting the amdgpu kernel driver is insane as a reaction to this purely cosmetic problem. Further, the issue seems to occur on Fedora, but I haven't seen a report from openSUSE or Ubuntu yet. They all use Plymouth though -> likely a distro/packaging-specific issue.

    And yes, the kernel source tree is approx. 1.6 GB extracted now, with around 480 MB being amdgpu register documentation headers (for GCN1-5, RDNA1, 2, 3, 3.5, 4, CDNA1-3 and all APU derivatives of them!) that collapse in size in the compiled driver. So what?! - it still easily fits on a computer from the last 20 or so years and the cheapest of SD cards and USB sticks…

    The compiled amdgpu module is 20 MB uncompressed and 4.5 MB compressed. The proprietary nvidia driver .ko is 50 MB+ compressed(!) - why don't people report Plymouth issues en masse there? Because the issue isn't code size or dynamic module linking!

    Think what the driver getting split would mean: libdrm, mesa, as well as amdgpu-pro would need to support those extra interfaces. Boom - code duplication, separate bug trackers and split development resources in several downstream projects.

    In case of the headers being pushed to an extra repo: this would make it difficult to build the kernel on systems with irregular internet access and generally add an annoying extra step for packagers and people wanting to build it.

    Just leave things as they are and let Fedora sort out things using far less invasive means. The debate about this issue is utterly ridiculous and the problem has been inflated way out of proportion.
