**microcode** · 07 May 2021, 11:50 AM

Originally posted by milkylainen

Pissing contest. All this just needs to be called what it is.

Does this mean that we invalidate all drivers for hardware that cannot be bought yet?
But this was NvLink2. And it is available in Power9?
Or what the hell does "open source user/client" in the hardware space mean?

Also, the kernel community does not have to hide anything or be coy about it.
This should be called: "We like to take a dump on NVidia because we can."

(No, I do not like NVidia.)

It means that if you want to maintain these drivers and APIs, then you can do it in your own branch. If they couldn't test the driver even if they had the hardware, then why is it upstream?

**DanL** · 07 May 2021, 02:18 PM

Originally posted by milkylainen

This should be called: "We like to take a dump on NVidia because we can."

Yeah, I'm sure this made Hellwig's day. But the good news for people using this is that they can use a 5.10.x LTS kernel until 2026 if they have to.

**MaxToTheMax** · 07 May 2021, 03:16 PM

I agree with this decision, kernel developers should not be expending labor on behalf of features that require proprietary software to use.

Originally posted by dh04000

So say for a user that had hardware that depends on that code. Did that permanently break that hardware for future kernels? Or can that be re-added on the user side when, say, installing the Nvidia driver?

You could patch the driver back in if you needed it.

**pal666** · 07 May 2021, 04:04 PM

Originally posted by milkylainen

Or what the hell does "open source user/client" in the hardware space mean?

it means software which will use that hardware. you can't use hardware without software

**pal666** · 07 May 2021, 04:06 PM

Originally posted by dh04000

When installing* the

you can do anything to opensource software, but upstream kernel isn't obliged to help you

**pal666** · 07 May 2021, 04:08 PM

Originally posted by chuckatkins

So if the "client" is the application using GPUs across NVLink then the assertion is simply wrong. However, if they are instead viewing the GPU driver as the client then the assertion holds.

obviously they are viewing closest client, not someone on the other side of internet, viewing generated picture from opensource browser

**pal666** · 07 May 2021, 04:09 PM

and on topic https://www.youtube.com/watch?v=JbovJbKALzA

**chuckatkins** · 07 May 2021, 04:21 PM

Originally posted by pal666

obviously they are viewing closest client, not someone on the other side of internet, viewing generated picture from opensource browser

I don't mean a remote browser client / far-away backend server. In the context of HPC it'd be something like an MPI job running with one process per GPU on the same machine, sending MPI messages directly between the GPUs via NVLink. You could use the benchmark suite that ships with MVAPICH (an open source MPI implementation), you could use OpenFOAM (a hugely popular open source CFD package), etc. These are open source applications and test suites that run directly on the machine with the GPUs.

**oiaohm** · 07 May 2021, 07:40 PM

Originally posted by chuckatkins

The issue in this case I suppose is what defines "client". HPC and AI are the biggest market for NVLink I believe and most of the widely used HPC codes using NVIDIA GPUs are Open Source. So if the "client" is the application using GPUs across NVLink then the assertion is simply wrong. However, if they are instead viewing the GPU driver as the client then the assertion holds.

The client is the code you need to test that the kernel driver functions. An application using GPUs across NVLink through closed source drivers does not count as far as the Linux kernel is concerned. Think about it: with closed source drivers, how are you going to look inside to make sure you have proper test suite coverage for the kernel driver?

An open source client application that used the driver to validate all the functions of the NVLink driver, even one that was not itself a driver, would still have counted as an open source client/user. Yes, they mean a direct open source user/client. If it's an open source application using a closed source library, that does not count toward keeping the feature in the Linux kernel. Nvidia did promise, when they got the driver into the Linux kernel, that there would be an open source validation tool for NVLink, and it has never come.

The reality here is that users affected by this change need to be up Nvidia's ribs, because the reason the driver was removed is that Nvidia did not do what they said they would. Since then the Linux kernel has become stricter: you don't get in on a promise any more. The stuff has to exist at merge into mainline, because too many parties like Nvidia have not kept their promise to provide the open source user space parts required for testing later.

**oiaohm** · 07 May 2021, 07:48 PM

Originally posted by MaxToTheMax

I agree with this decision, kernel developers should not be expending labor on behalf of features that require proprietary software to use.

That's not the problem, in fact. There are many parts the Linux kernel maintains that are highly useful to proprietary software. The problem is how to test the Linux kernel if you don't have a decent reference to see how different drivers should be working. As Valve stated about developing on Linux, being able to see inside the Mesa drivers allowed their developers to understand why particular things would not work the way they expected, even on Windows. A kernel driver developer needs source access to at least one user space that can use the driver.

So it's not that the feature requires proprietary software to use. It's that proprietary software gets in the way of validating whether a driver is working right and making sure the test suite covers what it needs to. The reality here is that kernel developers are unable to expend labour on making test suites without an open source user space implementation to look inside. So it's not that Linux kernel developers are expending labour on items with these problems; it's that they cannot properly put labour into these parts even if they want to.

A reference open source userspace implementation is required for practical Linux kernel development reasons for all drivers.

Linux 5.13 Yanks A NVIDIA NVLink Driver For Lack Of Open-Source User