VMware: ESXi VM Performance Tanks Up To 70% Due To Intel Retbleed Mitigation

  • VMware: ESXi VM Performance Tanks Up To 70% Due To Intel Retbleed Mitigation

    Phoronix: VMware: ESXi VM Performance Tanks Up To 70% Due To Intel Retbleed Mitigation

    VMware's performance engineering team today announced a performance regression in Linux 5.19 affecting compute performance up to -70%, networking up to -30%, and storage up to -13%. But the unfortunate thing is the heavy hitting regressions are known and a side effect of the Intel Retbleed mitigation for older processors...

    https://www.phoronix.com/news/VMware...-ESXi-Retbleed

  • #2
    Ouch. I don't use ESXi, but it's probably safe to assume that KVM takes similar hits.


    • #3
      That’s rough.

      Good response from the Intel engineer. That got a good laugh out of me.


      • #4
        So if anybody is wondering why "compute" performance decreased by "up to 70 percent", that's because they are counting raw thread creation time as a compute benchmark.
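For anyone who wants a rough feel for what a number like that measures, here is a minimal Python sketch. It is not VMware's benchmark; the function names `time_thread_creation` and `percent_increase` are made up for illustration, and Python's interpreter overhead means the absolute timings will not match a native benchmark. The second helper only does the arithmetic: assuming the 16 ms and 27 ms creation times quoted elsewhere in this thread belong to the same benchmark, that works out to roughly a 69% increase.

```python
import threading
import time


def time_thread_creation(n_threads: int = 1000) -> float:
    """Return the mean wall-clock seconds to create, start, and join one no-op thread."""
    start = time.perf_counter()
    for _ in range(n_threads):
        t = threading.Thread(target=lambda: None)  # thread body does nothing
        t.start()
        t.join()
    return (time.perf_counter() - start) / n_threads


def percent_increase(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    mean_s = time_thread_creation()
    print(f"mean thread create+join time: {mean_s * 1e6:.1f} us")
    # Arithmetic on the figures quoted in this thread (16 ms -> 27 ms):
    print(f"{percent_increase(16, 27):.2f}% slower")  # prints "68.75% slower"
```

Running the same script on an affected kernel with the mitigation on and off, on bare metal and inside a VM, would show how much of the slowdown shows up in a creation-heavy loop like this, as opposed to actual number crunching.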


        • #5
          Michael

          Wording error "creation times dropping from 16 to 27 ms" should be something like "creation times increasing from 16 to 27 ms"


          • #6
            So... "Well, duh.. :-)" is now considered a professional response to a very intelligently written and well-documented post on the LKML?

            Ranks right up there with the picture of Linus flipping the birdie to Nvidia...
            ...and Lennart Poettering responding to obvious bugs in SystemDeath as "WONTFIX".
            Last edited by NotMine999; 09 September 2022, 08:09 PM. Reason: Why not?


            • #7
              Originally posted by NotMine999
              So... "Well, duh.. :-)" is now considered a professional response to a very intelligently written and well-documented post on the LKML?
              Yes, when that performance loss had in fact already been noted:
              https://lore.kernel.org/lkml/f9fd86a...MS.aculab.com/

              Yes, that was July 2022, roughly two months ago: performance on 5.19 is horrible on particular hardware. Nothing VMware has found is new. It is the same as what was found in July 2022, and there is a patch under way that will hopefully reduce the damage.
              https://lore.kernel.org/lkml/2022090...infradead.org/

              This is not a Lennart Poettering WONTFIX. This is a pure "Well, duh" event. We know the problem, we have been writing about it for months while attempting to fix it, and now you turn up and spot it as well.

              First question: how professionally written is the VMware post? It turns out, not that well. It looks very professional until you remember that software development is very much a science.

              Repeatability and reproducibility are important. How useful are those VMware metrics? They turn out to be fairly close to useless to anyone who does not own ESXi, because you will not be able to reproduce them cleanly.

              Could the VMware developer have run a few benchmarks on bare metal to show whether the problem is independent of or dependent on ESXi? Yes, he could have, but did he? No. Could he have run a few benchmarks with Linux kernel KVM to see if it had the same problem? Yes, that would have cost more time. But think: if the problem is repeatable with KVM, people without ESXi could repeat it.

              Think about this based on what was presented by the VMware developer, and only what was presented. Could the problem in fact be an ESXi problem rather than the known Linux problem? The answer is yes. Remember, if it is an ESXi problem, this may not be a Linux kernel problem.

              There are a lot of ways in which the VMware developer deserved the "Well, duh" response: 1. Reporting an already known problem that is being worked on. 2. Not really providing anything useful to dig into the problem or truly isolate it. Yes, on that second part they failed to correctly isolate the problem and provide the information showing that. Failure to isolate the problem means that even when the new patch is applied, their issue may not be fully fixed, because there might be some corner case with ESXi that the VMware developer failed to look for.

              Yes, "Well, duh" has been a valid professional response to a person stating the known without any useful facts about it. The "Well, duh" response from Intel really should cause the VMware developer to look at what they posted and ask "how did I screw this up?" Please note this is not the first time that this VMware developer has posted stuff without properly isolating it to the Linux kernel. Yes, this same Intel developer, the past 6 times the VMware developer did this, told them to go back and do better testing. At this point the VMware developer is not worth the long answer anymore.

              A poor-quality post getting a poor-quality professional response does make sense when you know the history: this is not the first time the VMware developer has made this mistake, and they have already been told not to.


              • #8
                Well, duh.. :-)
                Priceless.


                • #9
                  I have been holding off on purchasing new hardware for a very long time and I guess that will continue. I'm almost 10 years behind at this point.

                  I wish Intel would open-source the QAT firmware, as I'm sure it could be used for a whole lot more than Intel has implemented. That would help offset all these weak points.


                  • #10
                    Originally posted by Lbibass
                    That’s rough.

                    Good response from the Intel engineer. That got a good laugh out of me.
                    I still think it's not that funny.
                    Imagine you have a farm of servers with Intel processors. You bought them because of their performance, probably due to a very good price-to-performance ratio.
                    Suddenly that changes and those processors are a very bad deal. You would feel a bit tricked, because that is purely the fault of the company and there is not really an alternative except exposing yourself to security issues, which, again, you didn't know in advance were part of the deal.

                    Are refunds possible based on these regressions? "Well, duh, yes" would be a good answer to this one.
