The "What If" Performance Cost To Kernel Page Table Isolation On AMD CPUs


  • #21
    Originally posted by Teggs View Post

    They pay a lot of attention to these problems, but I would say only sometimes is the mitigation 'quick'. A lot of them don't resolve until or after the researcher forces the vendor's hand.

    If I recall the sequence, Spectre was made known at Black Hat 2016 (beginning of August), Google revealed Meltdown and such to x86 vendors in July 2017, and no public disclosure or patch was made available for them until January 2018.

    There are a lot of variables in how serious or credible a vulnerability is, how difficult it is to mitigate, how many organisations have to cooperate... but also some people still try to bury things under the rug.
    I think it's just the engineers taking their time to investigate the problem and come up with fixes at both the hardware and the software level.

    I think all the Intel CPUs released after the official Spectre disclosure in 2018 have hardware-level mitigations for Spectre to reduce the performance cost of the mitigation, and they also submitted patches to Linux and Windows for the fix in addition to releasing new microcode for old CPUs.

    This is going to take a lot of time, considering modern CPUs are extremely complex.

    Comment


    • #22
      Kernel Self Protection Project - recommended settings

      I can still do VR in a VM with GPU passthrough using 6 cores of a 3700X with these enabled on linux-hardened.
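
      If you want to double-check what the kernel itself reports as mitigated on your box, the status for every known vulnerability is exposed under /sys/devices/system/cpu/vulnerabilities. A quick C sketch that just dumps it (the sysfs directory is a real, stable kernel interface; the rest is my own plumbing):

```c
/* Dump the kernel's reported mitigation status for each known CPU
 * vulnerability, as exposed under sysfs. Plain file plumbing, no
 * special privileges needed. */
#include <stdio.h>
#include <dirent.h>

int main(void)
{
    const char *dir = "/sys/devices/system/cpu/vulnerabilities";
    DIR *d = opendir(dir);
    if (!d) {
        perror(dir);
        return 1;
    }

    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;                 /* skip "." and ".." */

        char path[512], line[256];
        snprintf(path, sizeof path, "%s/%s", dir, e->d_name);

        FILE *f = fopen(path, "r");
        if (!f)
            continue;
        if (fgets(line, sizeof line, f))
            printf("%-24s %s", e->d_name, line);  /* line keeps its '\n' */
        fclose(f);
    }
    closedir(d);
    return 0;
}
```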

      Comment


      • #23
        Originally posted by NobodyXu View Post

        Researchers don't need to investigate every single device/CPU.

        Once they find out a specific CPU model is vulnerable, they can conclude that all CPUs using the same architecture are vulnerable.
        And they will also be able to easily test other architectures by reusing the scripts/code they wrote.
        You are right. I should have said "years to research a single model" not "a single chip".

        IMHO I have the exact opposite view on this when it comes to how the manufacturers treat hardware-level bugs.

        I think they really put security over performance.

        Both Spectre and Meltdown are extremely hard for hackers to exploit.
        Measuring memory access times is hard, the results are quite noisy, and as with any hacking, it involves guessing.

        So constructing a program to exploit these CVEs is no easy task.
        I don't agree. IMO a hacker isn't always some faraway entity trying to arbitrarily guess your port-knocking sequence before moving on to guessing your private keys. It can be someone who shares a cloud provider's CPU, or even shares a core if you are on a cheap VPS hosting platform. A hacker just needs to leak a few bytes to cut their guessing time from hundreds of years to a few hours or minutes if they are lucky. It might not be an easy task, but it's not impossible either, and it is most definitely easier than guessing private keys.
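
        And the timing measurement itself is the easy part; the basic probe is only a few lines. A minimal Flush+Reload-style sketch (my own illustration for x86-64 with GCC/Clang, not code lifted from any real exploit):

```c
/* Minimal sketch of the cache-timing primitive behind Flush+Reload.
 * Illustration only: a real attack needs many samples, calibration,
 * and a way to make the victim touch the probed line. */
#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>          /* _mm_clflush, __rdtscp */

static uint8_t probe[4096];

/* Time a single load of *addr in TSC cycles. */
static uint64_t time_access(volatile uint8_t *addr)
{
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*addr;                /* the load being timed */
    uint64_t end = __rdtscp(&aux);
    return end - start;
}

int main(void)
{
    volatile uint8_t *line = probe;

    (void)*line;                         /* warm the line into cache */
    uint64_t hit = time_access(line);

    _mm_clflush((const void *)line);     /* evict it */
    uint64_t miss = time_access(line);

    /* Typically tens of cycles cached vs hundreds uncached; in practice
     * you threshold over many runs because single samples are noisy. */
    printf("cached: %llu cycles, flushed: %llu cycles\n",
           (unsigned long long)hit, (unsigned long long)miss);
    return 0;
}
```

        The hard part is everything around that probe (getting a victim to touch the right line, filtering the noise), not the timing itself.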

        IMO if they had placed security equal to performance, perhaps 1 mitigation would have been needed. If they had placed security over performance, 0 mitigations would have been needed. Instead we see ~30 hardware vulnerabilities being fixed in software. I'm sure we will see more over the next few years.

        And unlike CVEs at the software level, such as buffer overflows, use-after-free, etc., which were widely exploited in the real world before they were mitigated with technologies such as ASLR, KFENCE, initialised stack variables, better C standard library functions, etc., these hardware bugs are really hard to exploit, and there are no reports of real-world attacks based on them.
        I think many of the software changes have been useful in catching bad software. I would like to focus on hardware though, with the exception of (k)ASLR since it's mentioned in the study that we are discussing. The study stated: "We demonstrated the applicability in real-world scenarios to break (fine-grained) KASLR, monitor kernel activity, establish a covert channel, and even leak kernel memory with Spectre gadgets".

        So I would say Intel, AMD and other manufacturers actually paid a lot of attention to this, fixed it quickly (providing microcode updates, kernel patches and compiler patches), and put security over performance (the initial kernel patches were actually more conservative and lost more performance; you can check the benchmarks on this site).
        I understand something like Spectre V2 is challenging to solve, so I'm not going to say Intel or AMD should throw out performance and go in-order, but given the number of critical flaws found... it's clear as day that security was not at the top of the priority list. Neither Intel nor AMD knew about the problems; they depended on third-party research. If it wasn't for the researchers, they still would not know about the flaws today. Again, it's a matter of where you spend your engineers' time. For Intel and AMD it has always been winning the performance race.

        Security-sensitive applications stayed away from x86 implementations even before these vulnerabilities were discovered. Using a modern x86 CPU, one cannot deterministically know how code will be executed in hardware. If you value security over performance you want a more deterministic hardware execution path and isolated caches. You achieve that by doing fewer tasks in parallel. AMD and Intel CPUs are not designed to work in such a way; it would make them way too slow. Hence performance over security.

        Comment


        • #24
          Originally posted by Linuxxx View Post

          Then how on earth did Intel achieve these results with Tiger Lake?

          https://www.phoronix.com/scan.php?pa...igations&num=1
          You have to be joking. You really can't see that there's something drastically wrong there?

          The kernel could be broken, Michael could have made a mistake, or Intel profiled and optimised the mitigations. Knowing Intel and their numerous instances of putting performance over security... I'm putting my money on profiling mitigations. Just like some other people in the comments have said, by doing that they likely opened up new security vulnerabilities.

          If that gives you peace of mind then, well, I won't be able to convince you of anything.

          Comment


          • #25
            Originally posted by Jabberwocky View Post

            You have to be joking. You really can't see that there's something drastically wrong there?

            The kernel could be broken, Michael could have made a mistake, or Intel profiled and optimised the mitigations. Knowing Intel and their numerous instances of putting performance over security... I'm putting my money on profiling mitigations. Just like some other people in the comments have said, by doing that they likely opened up new security vulnerabilities.

            If that gives you peace of mind then, well, I won't be able to convince you of anything.
            Care to elaborate how profiling & optimizing for mitigations is considered bad practice & "opened up new security vulnerabilities"?

            And still, shouldn't AMD try to achieve the same, so end-users aren't forced to choose between performance and security?

            Comment


            • #26
              Originally posted by Linuxxx View Post

              Care to elaborate how profiling & optimizing for mitigations is considered bad practice & "opened up new security vulnerabilities"?

              And still, shouldn't AMD try to achieve the same, so end-users aren't forced to choose between performance and security?
              Have you ever heard Richard Feynman talking about magnets? If you have, then you will understand why I have trouble explaining this to you. I don't have remotely as much CS knowledge as Richard Feynman had about physics, but the basic concept still applies.

              If I oversimplify it by saying it's a band-aid vs a cure then I'm not explaining it properly... Why do I think it's not a cure, and why is it more likely to be a band-aid? Why can't Intel just hire 100 different teams to fix these problems faster; they can afford it, can't they? Why is doing 3 things faster than doing 2 of those 3, and why is that potentially a bad thing? Why are CPUs still leaking information after so many teams from the top companies in the world have tried to solve these problems?

              I honestly think the best thing would just be to wait and see what comes from the benchmarks.

              Comment


              • #27
                Originally posted by Jabberwocky View Post

                Have you ever heard Richard Feynman talking about magnets? If you have, then you will understand why I have trouble explaining this to you. I don't have remotely as much CS knowledge as Richard Feynman had about physics, but the basic concept still applies.

                If I oversimplify it by saying it's a band-aid vs a cure then I'm not explaining it properly... Why do I think it's not a cure, and why is it more likely to be a band-aid? Why can't Intel just hire 100 different teams to fix these problems faster; they can afford it, can't they? Why is doing 3 things faster than doing 2 of those 3, and why is that potentially a bad thing? Why are CPUs still leaking information after so many teams from the top companies in the world have tried to solve these problems?

                I honestly think the best thing would just be to wait and see what comes from the benchmarks.
                Spectre and Meltdown are rooted in out-of-order execution (OoOE).

                If you want to get rid of them, you need to get rid of OoOE.
                And OoOE is very important for the performance of modern CPUs; virtually all consumer-level CPUs have it.
                Remove it and the performance will suck.
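
                To make it concrete: the classic Spectre v1 gadget is just a bounds check that the CPU speculates past. A textbook-shaped sketch (array names and sizes are made up for illustration):

```c
/* Textbook-shaped Spectre v1 (bounds-check bypass) gadget.
 * Sketch only; not taken from any real codebase. */
#include <stddef.h>
#include <stdint.h>

uint8_t array1[16];
size_t  array1_size = 16;
uint8_t array2[256 * 4096];     /* probe array: one page per byte value */

void victim(size_t x)
{
    if (x < array1_size) {
        /* Architecturally safe. But train the branch predictor and the
         * CPU will speculatively run this with an out-of-bounds x; the
         * secret byte array1[x] then selects which line of array2 gets
         * cached, and cache timing reads that selection back. */
        volatile uint8_t tmp = array2[array1[x] * 4096];
        (void)tmp;
    }
}
```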

                Comment


                • #28
                  Originally posted by NobodyXu View Post
                  I think they really put security over performance.
                  Originally posted by NobodyXu View Post
                  Spectre and Meltdown are rooted in out-of-order execution (OoOE).

                  If you want to get rid of them, you need to get rid of OoOE.
                  And OoOE is very important for the performance of modern CPUs; virtually all consumer-level CPUs have it.
                  Remove it and the performance will suck.
                  Earlier you said they valued security over performance, and now you're saying that if you give security too high a priority then performance will suck.

                  I agree, market research shows that no consumer would want a CPU designer to remove OoOE. Instead of throwing out the baby with the bathwater we should look at CPU designs. If we do that we will see that some CPUs using OoOE did not have as many vulnerabilities as others. Why is that? What is different? Fortunately we have top CPU designers who have said that Intel does not like to redesign from scratch, and that that's not a good approach. Branch prediction in x86 is insanely good and complex. I'm sure that even if you did something extreme like flushing caches after each call, OoOE would still be faster than IOE. But flushing caches was already too expensive for Intel, and they valued performance too much to do something like that. That cost is insane, hence they did not value security over performance.
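
                  To give a feel for why per-call cache flushing is a non-starter, here is a rough microbenchmark sketch (entirely my own illustration: the 64 KiB working set and loop counts are arbitrary, and absolute numbers vary wildly by CPU):

```c
/* Compare re-reading a warm working set against flushing it before
 * every "call". Hypothetical illustration of the cost of a blanket
 * flush-per-call policy on x86-64. */
#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>          /* _mm_clflush, _mm_mfence, __rdtscp */

#define WS (64 * 1024)          /* pretend 64 KiB working set */
static uint8_t ws[WS];

static uint64_t sum_ws(void)    /* stand-in for "a function call" */
{
    uint64_t s = 0;
    for (size_t i = 0; i < WS; i += 64)   /* one read per cache line */
        s += ws[i];
    return s;
}

static void flush_ws(void)      /* the hypothetical per-call flush */
{
    for (size_t i = 0; i < WS; i += 64)
        _mm_clflush(&ws[i]);
    _mm_mfence();
}

int main(void)
{
    unsigned aux;
    volatile uint64_t sink = 0;

    uint64_t t0 = __rdtscp(&aux);
    for (int i = 0; i < 1000; i++)
        sink += sum_ws();
    uint64_t warm = __rdtscp(&aux) - t0;

    t0 = __rdtscp(&aux);
    for (int i = 0; i < 1000; i++) {
        flush_ws();
        sink += sum_ws();
    }
    uint64_t cold = __rdtscp(&aux) - t0;

    printf("warm: %llu cycles, flush-per-call: %llu cycles (~%.1fx)\n",
           (unsigned long long)warm, (unsigned long long)cold,
           (double)cold / (double)warm);
    return 0;
}
```

                  The flushed loop should come out many times slower, which is exactly why nobody ships a blanket flush like that.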

                  On the other side of the conversation: I wouldn't mind having some low-power device using the Cortex-A510. Something like a cheap phone could still be a good product using IOE.

                  Comment


                  • #29
                    Originally posted by Jabberwocky View Post

                    Earlier you said they valued security over performance, and now you're saying that if you give security too high a priority then performance will suck.

                    I agree, market research shows that no consumer would want a CPU designer to remove OoOE. Instead of throwing out the baby with the bathwater we should look at CPU designs. If we do that we will see that some CPUs using OoOE did not have as many vulnerabilities as others. Why is that? What is different? Fortunately we have top CPU designers who have said that Intel does not like to redesign from scratch, and that that's not a good approach. Branch prediction in x86 is insanely good and complex. I'm sure that even if you did something extreme like flushing caches after each call, OoOE would still be faster than IOE. But flushing caches was already too expensive for Intel, and they valued performance too much to do something like that. That cost is insane, hence they did not value security over performance.

                    On the other side of the conversation: I wouldn't mind having some low-power device using the Cortex-A510. Something like a cheap phone could still be a good product using IOE.
                    Having an absolutely safe and simple backup CPU is always good, in case the complex one fails.

                    Comment
