Hi everyone. This is a question I have had for a while. Most servers run Linux, and lots of these servers use Nvidia cards for all kinds of GPU-accelerated tasks. Especially now with the boom in AI, GPUs are in use practically all the time so that someone can use ChatGPT. So one would think that Nvidia would put lots of effort into making their drivers work very well on Linux. Or are the drivers only bad for consumer cards like the 3090 and 1660, or is Nvidia just supplying the big companies with special drivers? Or am I just missing something?
Why are Nvidia drivers so bad if most servers use Linux?
-
Originally posted by Lests1986
Hi everyone. This is a question I have had for a while. Most servers run Linux, and lots of these servers use Nvidia cards for all kinds of GPU-accelerated tasks. Especially now with the boom in AI, GPUs are in use practically all the time so that someone can use ChatGPT. So one would think that Nvidia would put lots of effort into making their drivers work very well on Linux. Or are the drivers only bad for consumer cards like the 3090 and 1660, or is Nvidia just supplying the big companies with special drivers? Or am I just missing something?
I believe issues with drivers come down to how distros provide them, or how a distro's setup handles NVIDIA's .run installer script (you likely shouldn't use that unless your distro doesn't package the NVIDIA drivers, which to me would imply it'd go badly on that distro, and I'd use some other distro).
-
Originally posted by Espionage724
I used an RTX 3060, and drivers and general GPU control were easier than anything with an AMD RX 580 and 6600 XT.
I believe issues with drivers come down to how distros provide them, or how a distro's setup handles NVIDIA's .run installer script (you likely shouldn't use that unless your distro doesn't package the NVIDIA drivers, which to me would imply it'd go badly on that distro, and I'd use some other distro).
-
Originally posted by Panix
If it's better than 2 AMD GPUs, that seems to conflict with the AMD fanboys on here who insist everything is peachy with whatever they do. AMD GPUs/drivers on Linux are probably fine, since the driver is integrated in the kernel, but once you try to use software that AMD has little support in (which is most software), that's probably where performance goes downhill?
My issues with AMD were just trying to get an OpenCL stack working. Mesa's OpenCL implementation didn't work. AMDGPU-PRO worked, but its full stack was sub-par compared to Mesa, and getting only OpenCL wasn't just copy two files and win; it needed its own complicated process (the wackiness of Fedora needing an unofficial shim to prevent newer Fedora Mesa packages from clashing with the AMDGPU-PRO CL libs, and even that shim breaking, thus black screens at boot after updates at random).
ROCm was somehow even worse: it quietly didn't support the RX 580 even though the notes said it did, leading to hours of nonsense troubleshooting (because the ROCm logs were useless at straight-up mentioning that), and it wasn't officially packaged anywhere reasonable (not RHEL/server/old) until somewhat recently. At some point it worked pretty well on my RX 6600 XT, though, before I sold it.
There was also a period of about a year where VA-API would hard-crash an RX 580; it got fixed eventually.
Meanwhile, NVIDIA (EVGA RTX 3060 LHR) was as easy as installing 3-5 packages on Fedora (driver, GL, VK, VDPAU, CUDA/CL/compute), and it just worked, always, even after Fedora updates (and with LHR I was even able to bypass that).
They have a GPU control panel with an option to change the HDMI color space. That option was such bs up until recently; practically everyone defaults to Limited RGB for whatever reason. Intel can toggle it via an xrandr prop, no problem. NVIDIA had the control panel. AMD, the leaders of open-source graphics, were amazingly awful:
- radeonsi had the xrandr prop (like Intel); all was ok
- radeonsi -> AMDGPU ("the future") lost the prop
- Nothing user-space to toggle it either (no control panel, no sysfs toggle, nothing)
- And of course, defaulting to Limited RGB meant that I either dealt with washed-out colors or had to figure something else out
- The only other option (until recently) was an EDID override; I figured it out, but that was its own wild ordeal of having to dump it, edit it to remove Limited RGB, and load it
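For anyone attempting the dump/edit/load EDID override described above, here is a minimal Python sketch of one step that is easy to miss: after any manual edit, each 128-byte EDID block needs its final checksum byte recomputed so that the block's bytes sum to 0 modulo 256. The function names, the output filename and the connector path are assumptions for illustration only. Which bytes to change to drop Limited RGB depends on the monitor's EDID and is not shown here.

```python
from pathlib import Path

# Fixed 8-byte pattern that starts every base EDID block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
BLOCK_SIZE = 128


def block_checksum_ok(block: bytes) -> bool:
    """A 128-byte EDID block is valid when its bytes sum to 0 modulo 256."""
    return len(block) == BLOCK_SIZE and sum(block) % 256 == 0


def fix_checksums(edid: bytes) -> bytes:
    """Return a copy of the EDID with the last byte of every block recomputed."""
    if len(edid) == 0 or len(edid) % BLOCK_SIZE != 0:
        raise ValueError("EDID length must be a non-zero multiple of 128 bytes")
    if edid[:8] != EDID_HEADER:
        raise ValueError("missing EDID header")
    out = bytearray(edid)
    for start in range(0, len(out), BLOCK_SIZE):
        body = out[start:start + BLOCK_SIZE - 1]
        # Choose the final byte so the whole block sums to 0 mod 256.
        out[start + BLOCK_SIZE - 1] = (-sum(body)) % 256
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical connector name; the real one depends on the machine.
    src = Path("/sys/class/drm/card0-HDMI-A-1/edid")
    raw = src.read_bytes() if src.exists() else b""
    if raw:
        # Manual edits to the raw bytes would go here, before the fix-up.
        Path("edid-fixed.bin").write_bytes(fix_checksums(raw))
```

The resulting file is what would then be loaded as the override; on many setups that is done with the kernel's `drm.edid_firmware=` parameter, but check your distro's documentation for the exact procedure.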
If I wanted a no-nonsense GPU for anything other than Windows, it'd be NVIDIA, but I'd also look at Intel Arc depending on what OS I want to use (I'd really prefer Arc, but I'm also liking FreeBSD and would need to figure out the graphics situation there).
Last edited by Espionage724; 29 September 2024, 04:24 PM.
-
Originally posted by Espionage724
openSUSE and Fedora forums have mentions of AMDGPU issues (rendering, black screens) multiple times a week. AMD is clearly beta-testing on users lately, while distros play along.
Thanks for the detailed reply. Pretty disappointing if AMD's Linux drivers are problematic on those distros; ironically, those 2 are the ones I plan to use. The only other possibility is some flavour of Ubuntu or Debian. I thought about an Arch derivative like CachyOS, but I don't want to distro-hop too much or have multiple distros installed, so I'm gonna try to limit it to around 2. I think Arch might be too much maintenance... dunno.
I was flip-flopping on which GPU to buy next, but the constant issues with AMD drivers, especially when using productivity software, mean I'm almost certainly going Nvidia for my next GPU purchase. Still debating which one, though: 4070 Ti Super, 4080 or 3090 (probably buying used).
AMD GPUs are supposedly subpar (mediocre at best) at DaVinci Resolve, Blender and SD, and it sounds like even in gaming (for some things). It sounds like they are problematic even for AMD GPU users, if there are black screens and rendering issues.
Oh yeah, to answer the original question of the thread: the GPU isn't really used or stressed that much on a server, so the administrator just needs to use the Nouveau (?) driver or install the Nvidia driver once (probably), since the server is probably running a fairly stable distro like Debian or Ubuntu? If it's running Red Hat, the machine is probably using integrated graphics (?), e.g. Intel, so there won't be a concern if Nvidia drivers are problematic? I don't think the hypothetical of problematic Nvidia drivers makes a difference in using Linux for a server. There would be ways around that issue, if it were an issue.
-
The drivers are by far the best on Linux, but the various display technologies, desktops and open source in general were really behind until recently. Wayland didn't support explicit sync, it was very difficult to ship your own libopengl, and many distros had extremely anti-user packaging policies that forced users to install the driver manually or enable some sketchy repo. Once installed, though, they are the pinnacle of driver development and are basically flawless.
The only thing missing now is Linux having a stable API that would allow significant contributors like NVIDIA to instantly support new kernel versions, instead of the kernel continuing to be petty and changing the API every release because Linus hates NVIDIA.