VMware Hits A Nasty Performance Regression With Linux 5.13

  • VMware Hits A Nasty Performance Regression With Linux 5.13

    Phoronix: VMware Hits A Nasty Performance Regression With Linux 5.13

    VMware has found the Linux 5.13 kernel that was released as stable one month ago has led to a serious performance regression for their ESXi enterprise hypervisor...

    https://www.phoronix.com/scan.php?pa...sty-Linux-5.13

  • #2
    Interesting that the title of the offending commit is not really disclosing the substantive change that causes the problem.

    When I custom-compile a kernel I get the option to change the default time slice to 1ms/3.333ms/10ms, but I think this scheduler configuration goes beyond even those timeslices...


    • #3
      Not related to VMs, but I got a severe regression with an Intel 10Gbps NIC (x710-T2L) on Linux 5.13.5. The Ethernet speed drops drastically when copying a few files at the same time (which is my default use case with this setup). I finally had to roll back to 5.12.


      • #4
        Not on my watch it won't

        $ gzip -cd /proc/config.gz | grep -i SCHED_DEBUG
        # CONFIG_SCHED_DEBUG is not set

        Ubuntu's kernel:

        $ grep -i SCHED_DEBUG /boot/config-5.4.0-81-generic
        CONFIG_SCHED_DEBUG=y
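
The two checks above can be folded into one helper. This is only a minimal sketch, assuming a POSIX shell with gzip and grep available; the function name kconfig_lookup is hypothetical, and it just reads whichever of the given config files is readable first (decompressing names that end in .gz).

```shell
# kconfig_lookup SYMBOL FILE...
# Hypothetical helper: print the line that sets (or explicitly unsets)
# CONFIG_<SYMBOL>, taken from the first readable config file given.
# Returns non-zero if the symbol is not mentioned or no file is readable.
kconfig_lookup() {
    symbol=$1
    shift
    for file in "$@"; do
        [ -r "$file" ] || continue
        case $file in
            *.gz) gzip -cd "$file" ;;   # e.g. /proc/config.gz
            *)    cat "$file" ;;        # e.g. /boot/config-<release>
        esac | grep -E "^(# )?CONFIG_${symbol}(=| is not set)"
        return                          # propagate grep's exit status
    done
    echo "no readable kernel config found" >&2
    return 1
}

# Example call (paths are the ones used in the post above):
#   kconfig_lookup SCHED_DEBUG /proc/config.gz "/boot/config-$(uname -r)"
```

Anchoring the pattern on "=" or " is not set" keeps longer symbol names that merely start with the same prefix out of the output.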


        • #5
          Originally posted by Ealrann
          Not related to VMs, but I got a severe regression with an Intel 10Gbps NIC (x710-T2L) on Linux 5.13.5. The Ethernet speed drops drastically when copying a few files at the same time (which is my default use case with this setup). I finally had to roll back to 5.12.
          I suspect that I am experiencing the same problem. I have Intel 10GbE NICs, and I noticed the throughput went down at roughly the same time I upgraded the kernel to 5.13.x. I will downgrade at some point in the future to see if it improves performance. Thanks for the heads up.


          • #6
            In my kernel testing I encountered quite a few stability issues with 5.13, although I suspect that these are partly compiler-related. GCC 10.3 at -O2 (with my usual optimizations) is the most stable, but CPU utilization roughly doubled, from 6% to 11%, with the network under full load at 1 Gbit (the NIC is an Intel i350-T2V2); gaming performance is okay, albeit a tad slower. GCC 11.1 at -O3 has fundamental issues, but these are known and won't be fixed in the upstream kernel. My Polly+LTO-optimized build with Clang 12.0.1 was in between, performance- and stability-wise. Clang 13 had some issues, but it is still in development, so that is to be expected from an experimental compiler.


            • #7
              The article should be updated to reflect Valentin Schneider's reply:

              sysctl_sched_wakeup_granularity's default value hasn't been touched since
              2009:

              172e082a9111 ("sched: Re-tune the scheduler latency defaults to decrease worst-case latencies")

              and the automagic scaling (see kernel/sched/fair.c::update_sysctl()) hasn't
              changed much either.

              What's likely to happen here is that you have a service in your distro (or
              somesuch) tweaking those values, and since the incriminated commit moves
              those files to /sys/kernel/debug/sched/, said service doesn't do anything
              anymore.
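
A quick way to see which of the two locations a given kernel exposes. This is a rough sketch only: it assumes debugfs is mounted at /sys/kernel/debug, that the pre-5.13 file was /proc/sys/kernel/sched_wakeup_granularity_ns, and that the debugfs file drops the sched_ prefix. The function name sched_tunable_path and its optional ROOT prefix argument are hypothetical; the prefix is only there so the logic can be tried against a scratch directory.

```shell
# sched_tunable_path NAME [ROOT]
# Hypothetical helper: print the path where scheduler tunable NAME
# (e.g. wakeup_granularity_ns) is exposed, checking the old sysctl
# location first and the debugfs location second.
# Note: debugfs is normally readable by root only, so run this as root.
sched_tunable_path() {
    name=$1
    root=${2:-}
    old="$root/proc/sys/kernel/sched_${name}"      # assumed pre-5.13 path
    new="$root/sys/kernel/debug/sched/${name}"     # path named in the reply above
    if [ -e "$old" ]; then
        echo "$old"
    elif [ -e "$new" ]; then
        echo "$new"
    else
        echo "sched tunable '$name' not found (CONFIG_SCHED_DEBUG unset, or debugfs not mounted/readable?)" >&2
        return 1
    fi
}

# Example call, as root:
#   path=$(sched_tunable_path wakeup_granularity_ns) && cat "$path"
```

A service that still writes to the old sysctl name would find nothing to write to on a kernel where only the second path exists, which matches the failure mode described in the reply.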


              • #8
                Originally posted by perpetually high
                Not on my watch it won't

                $ gzip -cd /proc/config.gz | grep -i SCHED_DEBUG
                # CONFIG_SCHED_DEBUG is not set

                Ubuntu's kernel:

                $ grep -i SCHED_DEBUG /boot/config-5.4.0-81-generic
                CONFIG_SCHED_DEBUG=y
                Yes. I run Gentoo... double-checked my config to be sure, and it's not set on any of my servers or my desktop. Although I haven't updated to 5.13 yet anyway. No one cares, but I'm posting it anyway. Thank you for your time


                • #9
                  What the VMware guys did, explained... https://www.youtube.com/watch?v=MS2aEfbEi7s


                  • #10
                    Originally posted by sandy8925

                    You can just use zgrep
                    You're right. I didn't realize I was using gzip with --decompress. I had that shortcut with zgrep set up at some point, and must have just used my bash history to keep using the command above. Thanks for the heads up
