Design of a SPECTRE-Resistant High-Performance CPU

  • #21
    Bad news: I just found a nasty error in my implementation of the "speculation cache buffer".

    It goes like this: a store instruction can evict an entry from L1. The "speculation cache buffer" must emulate this eviction in order to match nonspeculative behaviour, but my implementation cannot do so. Any ideas for solutions?

    Or maybe this is one of those small imperfections that can be ignored. Risky, but maybe.


    • #22
      No, I think it is OK.

      What matters (hopefully) is not the behavior during a speculative window, but only the system state that remains after the speculation window is closed.


      • #23
        Here is a thought on the difference between speculative and nonspeculative CPUs.

        First, the "CPU" must be defined with this property:

        Its information storage capacity is exactly equal to the amount of information that can be transferred from it to the memory by any instruction sequence.
        This amount of information can actually vary over time, but that is not particularly important for the analysis, so maybe it can also be postulated that the storage capacity of a particular CPU model is constant in time.


        • #24
          Since in this analysis the program counter register / instruction pointer register is not part of the CPU information storage capacity, the speculative CPU (which has multiple (or infinite) PC/IP registers) actually has more storage capacity than a nonspeculative CPU.

          Then, the storage capacity of a speculative CPU can be defined like this:
          At each endpoint of any speculated execution path, the information storage capacity is exactly equal to the amount of information that can be transferred from the CPU to the memory by any instruction sequence appended to the end of the speculated execution path.
          Last edited by xfcemint; 23 September 2021, 03:39 AM.


          • #25
            I think that I am getting incomprehensible, so let's simplify it like this:

            Imagine that at every branching point the speculative CPU multiplies into two CPUs, both having identical information in their internal storage when the "multiplication" happens.


            • #26
              So, each "instance" of a speculative CPU is actually the same as a non-speculative CPU, and it can continue executing the same way. The difference is that the speculative CPU can delay the decision on a branch direction, in the sense that speculative "instances" are allowed to kill themselves.

              Then, the question is: can the speculative CPU instances communicate between themselves? If they can, you get SPECTRE.


              • #27
                Oh, I forgot to mention: when the speculative CPU encounters a branching point, it "multiplies" into two instances, and also the attached memory multiplies into two. So, each CPU+memory instance has the same "system state" after the "multiplication".
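
                The "multiplication" can be sketched in a few lines of Python. This is only an illustration of the thought experiment, not a model of real hardware; the class name `Instance` and its fields are made up for this example.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One hypothetical CPU+memory instance (names are illustrative only)."""
    registers: dict = field(default_factory=dict)
    memory: dict = field(default_factory=dict)

    def fork(self):
        """Return two independent deep copies, one per branch direction."""
        return copy.deepcopy(self), copy.deepcopy(self)


parent = Instance(registers={"r0": 7}, memory={0x10: 42})
yes_inst, no_inst = parent.fork()

# Right after the fork, both instances hold the same system state.
same_after_fork = (yes_inst == no_inst == parent)

# A later write in one instance is invisible to the other one.
yes_inst.memory[0x10] = 99
isolated = (no_inst.memory[0x10] == 42)
```

                The deep copy is what keeps the two instances from sharing any state after the branching point.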


                • #28
                  There is another thing that is causing confusion here, so let's get it out of the way:

                  The program (instruction sequence) is not in the main memory that is attached to the CPU. Instructions are fed to the CPU from some separate source. The program counter(s) (i.e. information on the location of execution endpoints) are also not part of the CPU.

                  I postulated earlier that all the possible execution paths form a tree, to simplify the analysis. So, let's say that loops don't matter. In other words: there is no "jump backwards" instruction, no "for" loops or anything similar. Let's also say that there is only one type of branching instruction and that it is a yes/no decision.


                  • #29
                    The CPU has two lights (like LEDs) for indicating branch direction: one is for the "YES" decision, one is for the "NO" decision. The external unit that supplies instructions uses that information to select an execution path.

                    When the speculative CPU encounters a branching instruction, it creates two new CPU+memory instances. One new CPU+memory instance turns on the "YES" light, and the other one turns on the "NO" light. At that moment, the external instruction-supplying unit also creates two instances of itself, but these two instances are not the same: each instruction-supply unit instance only contains the instructions relevant to the selected branching direction. In other words, it keeps the part of the execution tree that runs from the accepted execution endpoint, in the direction indicated by the LED, to the leaves of the execution tree (let's call the leaves "program end").
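
                    A rough Python sketch of the instruction-supply unit splitting along the execution tree. `Node` and `SupplyUnit` are hypothetical names for this example, and the sketch assumes a branch node always has both a "YES" and a "NO" subtree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Execution tree node: straight-line code, then an optional yes/no branch."""
    instructions: list
    yes: Optional["Node"] = None   # subtree selected by the "YES" light
    no: Optional["Node"] = None    # subtree selected by the "NO" light


class SupplyUnit:
    """Hypothetical external unit feeding instructions to one CPU instance."""

    def __init__(self, root):
        self.root = root

    def split(self):
        """At a branch, create one supply unit per direction."""
        return SupplyUnit(self.root.yes), SupplyUnit(self.root.no)

    def remaining_paths(self):
        """List every instruction sequence from here to a "program end" leaf."""
        node = self.root
        if node.yes is None and node.no is None:
            return [list(node.instructions)]
        paths = []
        for child in (node.yes, node.no):
            for tail in SupplyUnit(child).remaining_paths():
                paths.append(list(node.instructions) + tail)
        return paths


tree = Node(["a"],
            yes=Node(["b"], yes=Node(["d"]), no=Node(["e"])),
            no=Node(["c"]))
yes_unit, no_unit = SupplyUnit(tree).split()
yes_paths = yes_unit.remaining_paths()   # only paths below the "YES" direction
no_paths = no_unit.remaining_paths()     # only paths below the "NO" direction
```

                    After the split, neither supply unit can hand out instructions from the other direction's subtree.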
                    Last edited by xfcemint; 23 September 2021, 07:13 AM.


                    • #30
                      To make a speculative CPU+memory instance behave as a compatible nonspeculative CPU, it needs one additional bit of data: the correct decision on the branching point where the CPU+memory instance was created. Let's call this bit "DIRECTION", with two states: CORRECT and INCORRECT. This bit is created at the same moment as the CPU+memory instance.

                      Therefore, a CPU+memory instance can contain multiple "DIRECTION" bits, where the number of direction bits corresponds to the number of undecided branching points.

                      The CPU must be able to process an instruction "SEKILL", with one operand that selects one of the previous branching points (it is an integer, like: how many branching points backwards).

                      When a nonspeculative CPU encounters this SEKILL instruction, it is a no-op (does nothing).

                      When a speculative CPU+memory instance encounters the "SEKILL" it reads the appropriate DIRECTION bit. If it says "CORRECT", it does nothing (no-op). If it says "INCORRECT", the speculative CPU+memory instance kills itself.
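
                      The SEKILL semantics described so far could look roughly like this in Python. The operand encoding (1 = most recent branching point) and the use of an exception to model "kills itself" are assumptions made for this sketch.

```python
class Killed(Exception):
    """Raised when a speculative instance kills itself."""


CORRECT, INCORRECT = True, False


def sekill(direction_bits, operand, speculative=True):
    """Model SEKILL; operand = how many branching points backwards (1 = latest)."""
    if not speculative:
        return None  # no-op on a nonspeculative CPU
    if not 1 <= operand <= len(direction_bits):
        raise ValueError("operand does not select an undecided branching point")
    if direction_bits[-operand] is INCORRECT:
        raise Killed(f"branching point {operand} back was taken incorrectly")
    return None  # DIRECTION says CORRECT: no-op


bits = [CORRECT, INCORRECT]      # oldest undecided branch first
sekill(bits, 2)                  # older branch was CORRECT: nothing happens
sekill(bits, 1, speculative=False)  # nonspeculative CPU: always a no-op
try:
    sekill(bits, 1)              # latest branch was INCORRECT
    survived = True
except Killed:
    survived = False
```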

                      If the execution semantics of a speculative CPU contains the described SEKILL instruction and the DIRECTION bits, and the program contains the appropriate SEKILL instructions (this is very easy), and the speculative CPU instances cannot communicate, then the results of the program execution are going to be identical to the results of a nonspeculative CPU. So, in a sense, such a speculative CPU is 100% compatible with a nonspeculative CPU.

                      If, under the same conditions, the speculative CPU instances can communicate, then there is a possibility that a speculative CPU and its nonspeculative CPU counterpart are not 100% compatible.
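
                      A toy Python simulation of both cases. It is a sketch under several assumptions: the program is a tiny tree of `store`, `touch` and `sekill` operations invented for this example, and the "channel" is a plain set standing in for any state that instances might share (something like cache contents). With `shared=False` each instance gets its own copy of the channel; with `shared=True` all instances write to the same one.

```python
import copy


def run(node, mem, chan, bits=(), speculative=True, shared=False):
    """Run one instance; return (memory, channel) for each surviving instance."""
    ops, branch = node
    for op in ops:
        if op[0] == "store":
            mem[op[1]] = op[2]
        elif op[0] == "touch":
            chan.add(op[1])
        elif op[0] == "sekill" and speculative and not bits[-op[1]]:
            return []          # DIRECTION is INCORRECT: the instance kills itself
    if branch is None:
        return [(mem, chan)]
    key, yes_node, no_node = branch
    taken = bool(mem[key])
    if not speculative:
        return run(yes_node if taken else no_node, mem, chan, bits, False, shared)
    survivors = []
    for direction, child in ((True, yes_node), (False, no_node)):
        child_mem = copy.deepcopy(mem)               # memory is always duplicated
        child_chan = chan if shared else set(chan)   # channel only when isolated
        child_bits = bits + (direction == taken,)    # new DIRECTION bit
        survivors += run(child, child_mem, child_chan, child_bits, True, shared)
    return survivors


# Each direction touches a different "line" before its SEKILL executes.
program = (
    [("store", "flag", 1)],
    ("flag",
     ([("touch", "line_yes"), ("sekill", 1), ("store", "out", "yes")], None),
     ([("touch", "line_no"), ("sekill", 1), ("store", "out", "no")], None)),
)

reference = run(program, {}, set(), speculative=False)
isolated = run(program, {}, set(), speculative=True, shared=False)
leaky = run(program, {}, set(), speculative=True, shared=True)
```

                      In this toy model `isolated` matches `reference` exactly, while `leaky` ends with the same memory but a channel that also contains `line_no`, the trace left behind by the instance that killed itself.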
                      Last edited by xfcemint; 23 September 2021, 01:50 PM.