Benchmarking AMD Zen 3 With Predictive Store Forwarding Disabled

    Phoronix: Benchmarking AMD Zen 3 With Predictive Store Forwarding Disabled

    This past week AMD published a security analysis of AMD Zen 3's new Predictive Store Forwarding (PSF) functionality. In it they acknowledged that an incorrect PSF prediction could lead to a side-channel attack, although the real-world exposure would be quite low. In any case they are letting interested users disable Predictive Store Forwarding, but what the paper did not comment on was the performance overhead to expect from disabling PSF. So my Easter weekend turned into AMD Zen 3 PSF benchmarking.

  • #2
    Phew. Dodged that one by a hair.
    Nice to see that performance wasn't shot to bits.


    • #3
      One can reverse the question: what is the point of enabling PSF if it does not provide an advantage?


      • #4
        Yeah, I was wondering that, too. Either I'm making a mistake in my thinking, or PSF doesn't really help all that much in most scenarios. Or it wasn't actually being switched on and off between runs. Or the kernel parameter did something different. Maybe we need something that definitely calls SECCOMP and forces the kernel to run this code in a safe mode?
        Stop TCPA, stupid software patents and corrupt politicians!


        • #5
          AMD: where the mitigation comes for free.

          Also, a box for Epyc? Server processors never had fancy boxes before..


          • #6
            Originally posted by tildearrow
            Also, a box for Epyc? Server processors never had fancy boxes before..
            Xeons were available in boxed versions since their beginning

            And even their newest versions share the same design, which I find neat.


            • #7
              Cue birdie crying in the fetal position after seeing those benchmarks.


              • #8
                Well, if PSF is that ineffective it's currently a waste of silicon space, and AMD should either improve it or remove it. I can't help but wonder if something wasn't set up correctly for these tests, or if there's some other unintended anomaly, but if it's really this bad PSF doesn't seem to add any real value to a processor.
                Last edited by muncrief; 04 April 2021, 11:32 PM.


                • #9
                  Originally posted by muncrief
                  Well, if PSF is that ineffective it's currently a waste of silicon space, and AMD should either improve it or remove it. I can't help but wonder if something wasn't set up correctly for these tests, or if there's some other unintended anomaly, but if it's really this bad PSF doesn't seem to add any real value to a processor.
                  Maybe it requires not yet enabled GCC optimizations?


                  • #10
                    It seems likely that this is a feature that becomes more effective in longer-running processes. A short benchmark might not be affected nearly as much as a long-running server process.

                    PREDICTIVE STORE FORWARDING

                    It is common for a CPU to execute a load instruction to an address that was recently written by a store. Many modern processors implement a technique known as Store-To-Load-Forwarding (STLF) to improve performance in such cases. With STLF, data from the store is forwarded directly to the load without having to wait for it to be written to memory. In a typical CPU, STLF occurs after the address of both the load and store are calculated and determined to match.

                    PSF expands on this by speculating on the relationship between loads and stores without waiting for the address calculation to complete. With PSF, the CPU learns over time the relationship between loads and stores. If STLF typically occurs between a particular store and load, the CPU will remember this. When the CPU sees the store/load pair again, it may predict that STLF will occur and speculatively forward the data from the store to the load. This is done before confirming that the store and load are in fact to the same address.
                    The PSF is limited to training about store/load dependencies within the same context. A context is defined by the current values of CPL, ASID, PCID, CR3, and SMM status. Training only occurs if both the store and load execute in the same context. Any time that any piece of the context state changes (e.g. system call) existing training information is flushed. In particular, this flushing occurs on all far control transfers which includes all CPL changes, system call and return, interrupt/exceptions, SMM entry/exit, and VM entry/exit. Note that the PSF predictor is partitioned amongst SMT threads so the activity of one SMT thread does not influence the PSF predictions of the sibling thread. Finally, the store and load used to train the PSF must be relatively close together in the instruction stream and there cannot be any pipeline flushes (such as due to a mis-predicted branch) between the store and the load.
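To make the quoted description a bit more concrete, here is a rough toy model in Python of the behaviour it describes: a predictor that learns store/load pairs, speculatively forwards once a pair has been seen often enough, and throws away its training whenever the context changes. This is only a sketch for intuition, not how the hardware works internally. The names `Context`, `ToyPSF`, `observe` and the `threshold` of 2 are all made up for illustration; AMD does not document the training threshold, and the model ignores SMT partitioning, instruction distance and pipeline flushes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Context:
    """The state the quoted text says defines a PSF context."""
    cpl: int
    asid: int
    pcid: int
    cr3: int
    smm: bool


@dataclass
class ToyPSF:
    """Hypothetical predictor keyed by (store instruction, load instruction)."""
    threshold: int = 2  # assumed value, purely illustrative
    context: Optional[Context] = None
    counts: Dict[Tuple[int, int], int] = field(default_factory=dict)

    def switch_context(self, ctx: Context) -> None:
        # Any change in context state flushes existing training information.
        if ctx != self.context:
            self.counts.clear()
            self.context = ctx

    def predicts_forward(self, store_pc: int, load_pc: int) -> bool:
        return self.counts.get((store_pc, load_pc), 0) >= self.threshold

    def observe(self, store_pc: int, load_pc: int,
                store_addr: int, load_addr: int) -> str:
        """Run one store/load pair through the model and report the outcome."""
        pair = (store_pc, load_pc)
        predicted = self.predicts_forward(store_pc, load_pc)
        matched = store_addr == load_addr  # would real STLF have happened?
        if matched:
            self.counts[pair] = self.counts.get(pair, 0) + 1
        else:
            self.counts.pop(pair, None)  # drop training for this pair
        if not predicted:
            return "no-prediction"
        return "correct" if matched else "mispredict"


user = Context(cpl=3, asid=0, pcid=1, cr3=0x1000, smm=False)
kernel = Context(cpl=0, asid=0, pcid=1, cr3=0x1000, smm=False)

psf = ToyPSF()
psf.switch_context(user)
for _ in range(3):
    print(psf.observe(0x40, 0x48, 0x2000, 0x2000))
psf.switch_context(kernel)  # e.g. a system call: training is flushed
print(psf.predicts_forward(0x40, 0x48))
```

In this toy model the pair has to be seen a couple of times before anything is forwarded speculatively, and every context change resets that training. If the real predictor behaves anything like this, code that makes frequent system calls or takes many interrupts would spend more of its time retraining, which is one possible reason a benchmark might show little difference with PSF disabled.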
