
ByteDance Working To Make It Faster Kexec Booting The Linux Kernel


  • #41
    Originally posted by yump View Post
    After learning about "systemctl kexec" a while back, I read the docs and wrote a script that presents a menu of kernels, loads the matching initramfs, and does a kexec reboot. But it never gets used, because I came to the conclusion that half the benefit of a reboot is verifying that the machine is still capable of it.
    Do you mind sharing the script?



    • #42
      Originally posted by coder View Post
      The kernel's contributions policy needs to have a fair degree of neutrality. If a contribution meets the standards, it shouldn't be discriminated against, unless it's from a submitter with a track record of bad behavior (e.g. those UMN students supposedly researching faithless actors).

      Agreed. However, any rejections should ultimately follow a fair and sound appraisal of the patch and be rooted in established policies & conventions.

      Apart from that, the Linux Foundation & kernel development community should continue pushing the envelope in security tools, testing, & practices.
      We are pretty much on the same page here. I am quite pleased with the way the discussion on the kernel mailing list progresses. However, I have lost quite a bit of my trust in the Linux Foundation since they introduced their "inclusivity stance".



      • #43
        Originally posted by lowflyer View Post

        You telling me what I should care about is blowing the horn of the CCP.

        I am amazed that so many of you are ready to pick up and run with an argument without having read it because it seems to criticize China. I picked China because the original article was about TikTok™. Last time I checked this company was from China, and I'm ready to bet that it is still Chinese at the time of writing this post.

        Apparently only one guy seems to take the point seriously: "Why on earth do they need that?". Most seem to be ok with just about anything "as long as it's good":





        May I remind you of Heartbleed? That was also a commit that was checked 10 times before it was merged. Mistakes happen, and that is why I think it is important to look at the intent of this "fix". We are speculating here about a possible use case, and the comments of the author of the changes (Albert Huangtjie) do not shed much light on it. My imagination runs wild thinking of possible malicious uses of this function.
        I consider these contributions to be good too, in some way.

        Every bad piece of code that gets merged contributes to improving code review and to not making the same mistakes again.
        Last edited by timofonic; 27 July 2022, 05:09 AM.



        • #44
          Originally posted by sinepgib View Post
          It's not about whether or not it criticizes China, but whether or not it implies China is any different than any other country in that regard. Note nobody is assuming any more good will from China than they are assuming from western governments or companies. You explicitly said you don't trust it not because you don't understand the use case (which is a very valid reason to not use the feature), but because of where it comes from.
          My reaction would be the same if you had said you don't trust it because it comes from Intel or Microsoft, and for the latter you can actually check my previous comments in the forum to see I'm honest about it.
          It is about China because it comes from China. You are correct that it is not always "just China". But China has its issues and we're not discussing Microsoft or Intel here.
          I consider it reckless to ignore the intent when the use case is not understood.


          Originally posted by sinepgib View Post
          No, it wasn't. There were exactly two people looking at commits for OpenSSL, there was a huge discussion about relying on the voluntary work of a few people for critical infrastructure at the time because of that.
          This was a reply to RejectModernity's claim that all commits are checked 10 times before merging. You correctly say: "it wasn't" (past tense). The late insight that only two people had actually verified the Heartbleed commit came only *after* the hearts were bleeding.


          Originally posted by sinepgib View Post
          So now it's about mistakes. I'll give you the benefit of doubt and say it was just the way you expressed it that made it seem like you were accusing someone of malice just for their provenance, when in reality it was just an assumption about the quality of their programmers.
          I don't doubt the quality of Chinese programmers. I think I was pretty explicit about assuming malice.
          Ignoring possible malice *after* seeing that the code has no or very little real world value is, again, recklessness. Things like that *have* happened in the past. However, mistakes happen (for Heartbleed at least two times).


          Originally posted by sinepgib View Post
          The use case is pretty much the same as all other boot time optimizations we've been seeing from western companies such as Amazon and Google. Maybe questionable, but booting machines on the fly is currently valuable, and doing it fast when the actual spikes in usage appear more so.
          No, this is different.

          In general I would agree that rebooting on the fly is important. I am one of the most annoyed and impatient people when booting up. But kexec reboot seems to be only a thing on servers. On servers nobody really cares about hours of rebooting time since it happens so seldom.


          Originally posted by sinepgib View Post
          Care to give an example?
          I already gave one in the picture I posted along with my first post in this thread.



          I am pleased about the discussion over this pull request on the kernel mailing list. It's about possible savings of at most 500 ms (half a second) at the expense of possibly rebooting into an already failed (or tainted, or ...) kernel. At the same time, I notice that Albert Huangtjie just ignored the question about embedded systems.



          • #45
            Originally posted by timofonic View Post

            I consider these contributions to be good too, in some way.

            Every bad piece of code that gets merged contributes to improving code review and to not making the same mistakes again.
            Well, - agreed. - No. Nope. Most definitely not.

            The current observable efforts in the industry all move towards preventing a "bad piece of code" from being written in the first place (static code analysis), before it is merged.

            I am with you that bad code can stir up "more effort". But with the Linux kernel we should be past that phase. Such mistakes have happened in the past. We should not aim to repeat them.



            • #46
              Originally posted by S.Pam View Post

              Do you mind sharing the script?
              Only tested on Fedora... 35. Initramfs and kernel paths may have to be corrected for other distros.

              Code:
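              #!/usr/bin/env bash
              # Pick a kernel from /boot and reboot into it via kexec.

              # Load the kernel together with its matching initramfs, then let
              # systemd shut down cleanly and jump into the loaded kernel.
              kexec_reboot_to_kernel () {
                  local kernel="$1"
                  # Fedora naming: vmlinuz-<version> pairs with initramfs-<version>.img
                  local initramfs="/boot/initramfs-${kernel#*vmlinuz-}.img"
                  if ! [ -f "$initramfs" ]; then
                      echo "can't find initramfs $initramfs" 1>&2
                      exit 1
                  fi
                  # --reuse-cmdline passes the running kernel's command line on
                  kexec -l "$kernel" --initrd="$initramfs" --reuse-cmdline \
                      && systemd-run systemctl kexec
              }

              main () {
                  local kernel
                  if [ "$EUID" -ne 0 ]; then
                      echo "This script requires root." 1>&2
                      exit 3
                  fi
                  if [ "$#" -eq 1 ]; then
                      # Kernel image given on the command line
                      kernel="$1"
                  else
                      # Otherwise show a numbered menu of installed kernels
                      echo "Please choose a kernel, or ^C to cancel:"
                      select kernel in /boot/vmlinuz-*; do
                          if [ -n "$kernel" ]; then
                              break
                          fi
                      done
                  fi
                  if [ -e "$kernel" ]; then
                      kexec_reboot_to_kernel "$kernel"
                  else
                      echo "file $kernel does not exist" 1>&2
                      exit 2
                  fi
              }

              main "$@"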
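
For other distros, the initramfs lookup could be made a bit more tolerant. What follows is only a minimal sketch, assuming just two naming schemes: Fedora's initramfs-<version>.img and the initrd.img-<version> form used by Debian and Ubuntu. The name find_initramfs is a hypothetical helper, not part of the script above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the initramfs that belongs to a kernel image.
# Looks in the same directory as the kernel and tries two naming schemes.
# Returns 1 (and prints nothing) if no candidate exists.
find_initramfs () {
    local kernel="$1"
    local boot_dir name version candidate
    boot_dir="$(dirname "$kernel")"
    name="${kernel##*/}"          # strip the directory part
    version="${name#vmlinuz-}"    # strip the "vmlinuz-" prefix
    for candidate in \
        "$boot_dir/initramfs-$version.img" \
        "$boot_dir/initrd.img-$version"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

Inside kexec_reboot_to_kernel, the hard-coded path could then become `initramfs="$(find_initramfs "$kernel")"`, with the existing error branch taken when the call fails.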



              • #47
                Originally posted by yump View Post
                Only tested on Fedora... 35. Initramfs and kernel paths may have to be corrected for other distros.
                Wowzers! Just tried it on Debian and it works!

                Just had to adjust the initramfs filename and install missing package kexec-tools.
                Nice trick


                (won't repeat often... fear sits deep)
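
For anyone else trying this, a rough pre-flight check along these lines might save a surprise. It is only a sketch: have_kexec and kexec_image_loaded are hypothetical helper names, and it assumes the kernel exposes /sys/kernel/kexec_loaded, which reads 1 once `kexec -l` has loaded an image.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if the kexec binary (package kexec-tools)
# can be found on PATH.
have_kexec () {
    command -v kexec >/dev/null 2>&1
}

# Hypothetical helper: succeed if a kexec image is currently loaded.
# The optional argument overrides the sysfs path, which makes the
# function testable without root.
kexec_image_loaded () {
    local flag_file="${1:-/sys/kernel/kexec_loaded}"
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}
```

Calling `have_kexec || echo "install kexec-tools first" 1>&2` before the load step, and `kexec_image_loaded` before `systemctl kexec`, would catch the two problems mentioned here without changing what the script does.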



                • #48
                  Originally posted by lowflyer View Post
                  It is about China because it comes from China. You are correct that it is not always "just China". But China has its issues and we're not discussing Microsoft or Intel here.
                  I consider it reckless to ignore the intent when the use case is not understood.
                  Fair enough. I just don't consider the origin as reason enough to reject something. The lack of use case may be a different story.

                  Originally posted by lowflyer View Post
                  This was a reply to RejectModernity's claim that all commits are checked 10 times before merging. You correctly say: "it wasn't" (past tense). The late insight that only two people had actually verified the Heartbleed commit came only *after* the hearts were bleeding.
                  But that claim wasn't about open source in general, only about the Linux kernel. Whether or not it's true depends mostly on the subsystem a patch affects (only two people reviewed a patch I sent about 5 years ago, but it was for a rather niche case that doesn't affect the general population), but stuff affecting the core tends to get many more eyeballs in the Linux kernel. OpenSSL was very different.

                  Originally posted by lowflyer View Post
                  I don't doubt the quality of Chinese programmers. I think I was pretty explicit about assuming malice.
                  Ignoring possible malice *after* seeing that the code has no or very little real world value is, again, recklessness. Things like that *have* happened in the past. However, mistakes happen (for Heartbleed at least two times).
                  Again, Heartbleed, different project with different implications and a different community involved. Regarding assuming malice, I will assume many other things before, such as:
                  - The misconception from the author that its use is obvious;
                  - Sheer incompetence in proper communication.

                  Only after questioning and a refusal to explain will I assume malice.

                  Originally posted by lowflyer View Post
                  No, this is different.

                  In general I would agree that rebooting on the fly is important. I am one of the most annoyed and impatient people when booting up. But kexec reboot seems to be only a thing on servers. On servers nobody really cares about hours of rebooting time since it happens so seldom.
                  Need I remind you of Facebook, Amazon and IBM actually clearly caring? Besides, that doesn't sound true if we consider there are at least three competing solutions for live patching around, all from western companies. Security updates happen and redundancy is an extra cost, so reducing downtime by other means is an interesting case.

                  Originally posted by lowflyer View Post
                  I already gave one in the picture I posted along with my first post in this thread.
                  I just checked the tiny image and I get nothing out of it.

                  Originally posted by lowflyer View Post
                  I am pleased about the discussion over this pull request on the kernel mailing list. It's about possible savings of at most 500 ms (half a second) at the expense of possibly rebooting into an already failed (or tainted, or ...) kernel. At the same time, I notice that Albert Huangtjie just ignored the question about embedded systems.
                  I'll check that one later, I should be working right now.



                  • #49
                    Originally posted by sinepgib View Post
                    I just checked the tiny image and I get nothing out of it.
                    The Phoronix forums software started defaulting to storing a cached version of embedded images, a while ago. Seems like a useful feature, except the images tend to get downsampled and the total amount of disk space per user is quite limited.



                    • #50
                      Originally posted by sinepgib View Post
                      But that claim wasn't ... the Linux kernel. ... <snip>
                      ... and the other comments

                      Why is it these days not enough to say what you mean? Why is it necessary to *always* emphasize what you *did not mean*? (are you assuming malice?)
                      • Ignoring the origin is recklessness. It's only prudent to look a little bit closer, given the track record of China. (I'm getting criticized for assuming malice)
                      • I don't buy the stance that "oh - on Linux *everything* is different and better". Linux has its own share of very similar issues. coder mentioned the UMN students.
                      • This *does not* mean that all other reasons are not worth looking at. The reasons you mention are spot on!
                      • Albert has already left one explicit question unanswered. Could be an oversight. Could be a language issue. But could also be a ...

                      Originally posted by sinepgib View Post
                      Need I remind you of Facebook, Amazon and IBM actually clearly caring? Besides, that doesn't sound true if we consider there are at least three competing solutions for live patching around, all from western companies. Security updates happen and redundancy is an extra cost, so reducing downtime by other means is an interesting case.
                      Shaving off less than 500 ms is not what the big companies are looking for. Live patching is a technique that *avoids* going through a reboot. And which security patch needs reboots in rapid succession? This does not negate that shorter boot times are a good thing! I never said anything like that.


                      Originally posted by sinepgib View Post
                      I just checked the tiny image and I get nothing out of it.
                      A hint is the title of the image: "Click Farm".

                      Another hint is in the last line of my previous post:
                      Originally posted by lowflyer View Post
                      ... possibly rebooting into an already failed (or tainted, or ...) kernel. ...
