**davidhendricks** · 26 July 2022, 11:38 AM

Originally posted by coder

My assumption is that BIOS/UEFI triggers devices to reset themselves in some fashion. Then, when the kernel's device driver starts interacting with the device, it's essentially starting from a blank slate. I don't know if this is true, however.

In contrast, if you merely reset the kernel, then the various devices in the system could be left essentially in their previous state. This could tickle bugs in device drivers that you wouldn't hit in a full reboot.

Again, all of this is conjecture. If anyone has actual knowledge to share on the subject, please do.

If the kernel does not shut a device down cleanly then it is a bug that needs to be fixed. That said, I've seen the issue you describe, but fortunately it's not too common these days.

**coder** · 26 July 2022, 12:38 PM

Originally posted by lowflyer

Most seem to be ok with just about anything "as long as it's good"

The kernel's contributions policy needs to have a fair degree of neutrality. If a contribution meets the standards, it shouldn't be discriminated against, unless it's from a submitter with a track record of bad behavior (e.g. those UMN students supposedly researching faithless actors).

Originally posted by lowflyer

Mistakes happen and that's the reason why I think it is important to look at the intent of this "fix".

Agreed. However, any rejections should ultimately follow a fair and sound appraisal of the patch and be rooted in established policies & conventions.

Apart from that, the Linux Foundation & kernel development community should continue pushing the envelope in security tools, testing, & practices.

**coder** · 26 July 2022, 12:42 PM

Originally posted by sinepgib

The use case is pretty much the same as all other boot time optimizations we've been seeing from western companies such as Amazon and Google.

Don't forget IBM, here!

**coder** · 26 July 2022, 12:45 PM

Originally posted by davidhendricks

Why settle for a mediocre solution when you can do as the Bytedance folks have done and make it work to best suit your business needs?

The problem is non-local. We're talking about any given hardware device in your system not being properly reset by its driver. That doesn't have a centralized solution, such as this patch.

If you want to use kexec for fast reboots on bare hardware, that's up to you. Personally, I'll take the hit on reboot time in the interest of better stability.

**yump** · 26 July 2022, 02:50 PM

After learning about "systemctl kexec" a while back, I read the docs and wrote a script to present a menu of kernels, load up the correct initramfs, and do a kexec reboot. But it never gets used, because I came to the conclusion that half the benefit of a reboot is verifying that the machine is still capable of it.
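For anyone curious what such a script involves, here is a minimal sketch in Python. It is not the script described above. The `/boot` layout and the file name patterns (`vmlinuz-<version>`, `initrd.img-<version>`, `initramfs-<version>.img`) are assumptions that vary by distribution, and the helper names are made up for illustration. It only builds the `kexec -l` command line and does not run anything.

```python
from pathlib import Path

KERNEL_PREFIX = "vmlinuz-"
# Assumed initramfs naming patterns; "{v}" stands for the kernel version string.
INITRD_PATTERNS = ("initrd.img-{v}", "initramfs-{v}.img")


def find_kernels(boot_dir):
    """Return (version, kernel_path, initrd_path) for each kernel that has a matching initramfs."""
    boot = Path(boot_dir)
    entries = []
    for kernel in sorted(boot.glob(KERNEL_PREFIX + "*")):
        version = kernel.name[len(KERNEL_PREFIX):]
        for pattern in INITRD_PATTERNS:
            initrd = boot / pattern.format(v=version)
            if initrd.is_file():
                entries.append((version, kernel, initrd))
                break  # first matching pattern wins
    return entries


def choose(entries, reply):
    """Map a 1-based menu reply to an entry, or return None if the reply is invalid."""
    if reply.isdecimal() and 1 <= int(reply) <= len(entries):
        return entries[int(reply) - 1]
    return None


def kexec_load_command(kernel, initrd):
    """Build the argv that stages a kernel for kexec, reusing the running kernel's command line."""
    return ["kexec", "-l", str(kernel), f"--initrd={initrd}", "--reuse-cmdline"]
```

A wrapper would print the versions from `find_kernels("/boot")` as a numbered menu, pass the user's reply to `choose`, run the resulting command as root, and then call `systemctl kexec` so that services are stopped cleanly before the jump into the new kernel.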

**davidhendricks** · 26 July 2022, 03:35 PM

Originally posted by coder

The problem is non-local. We're talking about any given hardware device in your system not being properly reset by its driver. That doesn't have a centralized solution, such as this patch.

The centralized solution is to fix broken drivers. Large companies such as Bytedance have very tight control over their hardware and use a limited number of components. Problems with the general PC industry often do not apply. They also have a small army of kernel developers and can get vendors to fix things as needed.

Originally posted by coder

If you want to use kexec for fast reboots on bare hardware, that's up to you. Personally, I'll take the hit on reboot time in the interest of better stability.

Fair enough, do what works best for you. Large companies with datacenters full of servers tend to want maximum ROI (e.g. minimum downtime), so spending some effort to make kexec fast and reliable is well worth it if it means kernel/OS updates, error handling, etc. take seconds instead of minutes.

**erniv2** · 26 July 2022, 05:10 PM

Originally posted by davidhendricks

The centralized solution is to fix broken drivers. Large companies such as Bytedance have very tight control over their hardware and use a limited number of components. Problems with the general PC industry often do not apply. They also have a small army of kernel developers and can get vendors to fix things as needed.

Fair enough, do what works best for you. Large companies with datacenters full of servers tend to want maximum ROI (e.g. minimum downtime), so spending some effort to make kexec fast and reliable is well worth it if it means kernel/OS updates, error handling, etc. take seconds instead of minutes.

In the enterprise case, where you know what hardware is used, only a small number of drivers is loaded, and you know they are in a good working state and can handle the reset and do the necessary writes to set up the hardware correctly, then yes, it's fine.

Then comes the security aspect. Let's say you use a dGPU for compute tasks: you also need to make sure that the GPU driver zero-fills the dGPU RAM, so no old code is still there. Otherwise graphical artifacts can occur because you suddenly see a ghost image of the game you played a few hours ago, or you get some funny compute results from the previous kernel.

You need some kind of driver quality standard when kexec is used, to ensure all device registers get rewritten with sane values and all dedicated RAM gets zeroed.

You even need to ensure that the kexec kernel is not loaded into the previous kernel's memory space and that the previous kernel gets zeroed.

Copied from the wiki:

While feasible, implementing a mechanism such as kexec raises two major challenges:

- Memory of the currently running kernel is overwritten by the new kernel, while the old one is still executing.
- The new kernel will usually expect all hardware devices to be in a well defined state, in which they are after a system reboot because the system firmware resets them to a "sane" state. Bypassing a real reboot may leave devices in an unknown state, and the new kernel will have to recover from that.

**coder** · 26 July 2022, 05:27 PM

Originally posted by davidhendricks

The centralized solution is to fix broken drivers.

That's a moving target, as is the hardware. Like I said, if you want to roll the dice, go for it! I will stick to the simplest and best-tested path.

Originally posted by davidhendricks

Large companies such as Bytedance have very tight control over their hardware and use a limited number of components.

As previously noted by others, ByteDance might even be using it only within a VM, in which case the virtual devices should provide a much smaller surface area for things to get into a bad state.

**coder** · 26 July 2022, 05:38 PM

Originally posted by erniv2

Then comes the security aspect. Let's say you use a dGPU for compute tasks: you also need to make sure that the GPU driver zero-fills the dGPU RAM, so no old code is still there. Otherwise graphical artifacts can occur because you suddenly see a ghost image of the game you played a few hours ago, or you get some funny compute results from the previous kernel.

...or let's say a cloud user gets a couple stray images from a pr0n site, which was using that GPU previously.

Realistically, I expect all GPU compute runtimes are good about ensuring memory is zero'd, when first allocated to a process. I think they all now support memory encryption, as well.

Originally posted by erniv2

You need some kind of driver quality standard when kexec is used to ensure all device registers get rewritten with sane values,

For me, this is the main thing. Not just the registers, but all of the embedded micro-controllers, engines, and buses have to get reset and re-initialized in the proper order. I trust most drivers to get it right, but you really don't want any exceptions to that. Your system is only as stable as its most unstable device/driver.

**erniv2** · 26 July 2022, 05:51 PM

Originally posted by erniv2

You need some kind of driver quality standard when kexec is used to ensure all device registers get rewritten with sane values

Quoting myself: I just realized my error. That will never happen, because then you would need an engineering sheet which details all registers, and those are never going to be revealed to the Linux kernel. Otherwise everybody would start hacking PCI registers: "hey, if I haxxor this, my card suddenly supports BAR, and if I set that, it runs another cache", etc.

ByteDance Working To Make It Faster Kexec Booting The Linux Kernel