Intel Revs Its Linear Address Masking Patches For Linux


  • Intel Revs Its Linear Address Masking Patches For Linux

    Phoronix: Intel Revs Its Linear Address Masking Patches For Linux

    Linear Address Masking (LAM) was added to Intel's documentation in late 2020, with initial kernel patches out since early 2021, and Intel has been slowly working on LAM support for the Linux kernel ever since. Out this past week was finally the latest iteration of this work, which lets the untranslated address bits of 64-bit linear addresses be used for storing arbitrary software metadata...


  • #2
    What would be possible use cases for this? I vaguely remember something like this being present in the i960MX ...


    • #3
      -----


      • #4
        The documentation and commit message say "metadata", but imagine the struct packing possibilities with -march=native in languages that allow the compiler to freely arrange member variables. Anything that holds a pointer along with some field(s) that fit in 6/15 bits can be made smaller, without having to insert code to clear the upper bits before using the pointer.
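
        A rough sketch of what that packing looks like when done by hand today, in C. The layout is an assumption (a 6-bit field in bits 57 to 62, i.e. the smaller of the two widths mentioned above), and `tagged_ptr`, `tag_pack`, `tag_get` and `tag_ptr` are made-up names for illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 6 metadata bits stored in bits 57..62 of a 64-bit pointer. */
#define TAG_SHIFT 57
#define TAG_MASK  ((uint64_t)0x3F)
#define TAG_FIELD (TAG_MASK << TAG_SHIFT)

/* Unpacked version: the one-byte field costs a whole extra word of padding. */
struct plain_node {
    void   *p;
    uint8_t kind;
};

/* Packed version: pointer and small field share a single 64-bit word. */
typedef struct { uint64_t bits; } tagged_ptr;

static tagged_ptr tag_pack(void *p, unsigned tag) {
    uint64_t raw = (uint64_t)(uintptr_t)p;
    assert(tag <= TAG_MASK);          /* field must fit in 6 bits */
    assert((raw & TAG_FIELD) == 0);   /* pointer must not already use those bits */
    tagged_ptr t = { raw | ((uint64_t)tag << TAG_SHIFT) };
    return t;
}

static unsigned tag_get(tagged_ptr t) {
    return (unsigned)((t.bits >> TAG_SHIFT) & TAG_MASK);
}

/* Without hardware help, the tag has to be cleared before every dereference. */
static void *tag_ptr(tagged_ptr t) {
    return (void *)(uintptr_t)(t.bits & ~TAG_FIELD);
}
```

        The masking in `tag_ptr` is the extra code referred to above; the idea is that with the hardware ignoring those bits on access, that step could be dropped.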


        • #5
          Mainly for atomic operations. Since there is no double compare-and-swap (that is, compare two 64-bit registers and swap if both are equal) in AMD64, you can't write some lock-free structures without this (like a doubly linked list or a binary tree). Using these free bits means you can now have those, since you can store a "revision" counter in these 7 bits (9 bits if you count the 2 low bits too), so you can be safe with up to 2^9 simultaneous operations on a shared 64-bit value (as with an A-B-A problem that would corrupt things after 1 swap: you can still be safe after 2^8 swaps).
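
          To make the revision-counter idea concrete, here is a minimal sketch in C11. It assumes a 6-bit counter in bits 57 to 62 (so it wraps after 64 updates, fewer than the figures above), and the helper names `vp_ptr`, `vp_version` and `vp_replace` are hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 6-bit revision counter in bits 57..62 of the pointer word. */
#define VER_SHIFT 57
#define VER_MASK  ((uint64_t)0x3F)
#define PTR_MASK  (~(VER_MASK << VER_SHIFT))

/* Strip the counter to recover the plain pointer. */
static void *vp_ptr(uint64_t word) {
    return (void *)(uintptr_t)(word & PTR_MASK);
}

/* Extract the revision counter. */
static unsigned vp_version(uint64_t word) {
    return (unsigned)((word >> VER_SHIFT) & VER_MASK);
}

/* Install `next` only if the slot still holds exactly `seen`, counter included.
 * Every successful update bumps the counter (mod 64), so a pointer that went
 * A -> B -> A no longer compares equal to a stale snapshot of A. */
static bool vp_replace(_Atomic uint64_t *slot, uint64_t seen, void *next) {
    uint64_t ver = (vp_version(seen) + 1) & VER_MASK;
    uint64_t desired = ((uint64_t)(uintptr_t)next & PTR_MASK) | (ver << VER_SHIFT);
    return atomic_compare_exchange_strong(slot, &seen, desired);
}
```

          A thread that snapshots the slot, then loses the race to two other updates that put the same address back, will see its own `vp_replace` fail because the counter has moved on. The protection only lasts until the counter wraps.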


          • #6
            i960MX had this "tag bit" thing that made it possible to implement memory protection in hardware. Could this be used in the same way? These "free" bits could represent bitmasks that would map to processes (or users) and would represent ownership of memory addresses ...


            • #7
              Originally posted by bob l'eponge
              Mainly for atomic operations. Since there is no double compare-and-swap (that is, compare two 64-bit registers and swap if both are equal) in AMD64, you can't write some lock-free structures without this (like a doubly linked list or a binary tree). Using these free bits means you can now have those, since you can store a "revision" counter in these 7 bits (9 bits if you count the 2 low bits too), so you can be safe with up to 2^9 simultaneous operations on a shared 64-bit value (as with an A-B-A problem that would corrupt things after 1 swap: you can still be safe after 2^8 swaps).
              While it's presumably more expensive than a 64-bit operation, "lock cmpxchg16b" (the 64-bit variant of the infamous "lock cmpxchg8b") is a double compare and swap.


              • #8
                Originally posted by archkde

                While it's presumably more expensive than a 64-bit operation, "lock cmpxchg16b" (the 64-bit variant of the infamous "lock cmpxchg8b") is a double compare and swap.
                It's a double-width compare and swap, not a double compare and swap. It's useful, but it forces the layout of the pointer and its companion word to be contiguous, which is harder.
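
                To pin down the terminology, here is a plain C reference model of the two primitives. It is deliberately NOT atomic and does not use cmpxchg16b; it only spells out what each operation compares and writes. The names `dw_pair`, `dwcas_model` and `dcas_model` are made up for this sketch, and a 64-bit target is assumed:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* DWCAS operand: both words sit side by side in one aligned 16-byte unit. */
typedef struct {
    _Alignas(16) void *ptr;
    uint64_t version;
} dw_pair;

_Static_assert(sizeof(dw_pair) == 16, "two adjacent 64-bit words");
_Static_assert(offsetof(dw_pair, version) == 8, "counter directly follows the pointer");

/* Double-WIDTH CAS: one location, twice as wide.
 * Reference semantics only; hardware performs this as a single indivisible step. */
static bool dwcas_model(dw_pair *dst, dw_pair expected, dw_pair desired) {
    if (dst->ptr != expected.ptr || dst->version != expected.version)
        return false;
    *dst = desired;
    return true;
}

/* DOUBLE CAS: two independent locations that may live anywhere in memory.
 * Reference semantics only; both compares and both stores would be one step. */
static bool dcas_model(uint64_t *a, uint64_t exp_a, uint64_t new_a,
                       uint64_t *b, uint64_t exp_b, uint64_t new_b) {
    if (*a != exp_a || *b != exp_b)
        return false;
    *a = new_a;
    *b = new_b;
    return true;
}
```

                The layout constraint shows up in the types: `dwcas_model` takes a single `dw_pair`, so the pointer and its counter have to be neighbours, while `dcas_model` takes two unrelated addresses.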


                • #9
                  Originally posted by bob l'eponge

                  It's a double-width compare and swap, not a double compare and swap. It's useful, but it forces the layout of the pointer and its companion word to be contiguous, which is harder.
                  Makes sense, but don't your examples have the same problem?


                  • #10
                    It depends. If the CPU actually ignores those bits when accessing the pointed-to item but still compares them in the CAS operation, then it behaves like a DCAS, since it's comparing X and Y at unrelated addresses, and you can fiddle with the bits in X and Y however you like. The operation is something like
                    Code:
                    X = X | some marker in high bits; Z = Y & high bits mask; DCAS(X, Y, Z, new value)

                    If instead it ignores the high bits in all instructions, then it's a regular DWCAS, as you said.
                    I don't know this technology, so I can't say. We'll certainly see people use it, though.
