Intel Makes Cryptography Faster On Linux


  • Intel Makes Cryptography Faster On Linux

    Phoronix: Intel Makes Cryptography Faster On Linux

    The Linux 3.8 kernel is continuing to pull in massive amounts of new code as shown by all of the noteworthy pull requests that have been highlighted on Phoronix in the past few days. The latest pull request to catch my interest has been the crypto work, thanks to performance-enhancing additions by Intel...

    http://www.phoronix.com/vr.php?view=MTI1MjE

  • #2
    ISA bloat

    If you add new instructions to the instruction set architecture every time there is a new algorithm then you eventually end up with a pretty bloated instruction set architecture.


    • #3
      I was also wondering why not just implement lower level instructions and let the cryptography algorithms use them.
      Same with video: they put codec decoders right into silicon, which imo only makes sense on small devices where being paranoid about power usage is ok.


      • #4
        AFAIK you don't get the performance gain if the instructions are any lower level. Might be possible to generalize them a bit more I guess...


        • #5
          Don't forget the LZO improvement:
          https://git.kernel.org/?p=linux/kern...1915e5ae057826


          • #6
            100% wrong

            Originally posted by uid313
            If you add new instructions to the instruction set architecture every time there is a new algorithm then you eventually end up with a pretty bloated instruction set architecture.
            If you DO NOT add new instructions to the instruction set every time there is a new algorithm, then you eventually end up with "Microsoft" style APIs where you call the one single "Do everything" API and then you have to pass in a struct with the actual commands that you want to execute and then you end up with EVEN MORE BLOAT.

            YOU DO KNOW that processors have been invented that have ONE move instruction and no other instructions at all! NOP is just moving a register onto itself. Everything you do depends on which register you choose as the source or target. NATURALLY there is JUST AS MUCH BLOAT in this implementation as any other.
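            A minimal sketch of such a move-only machine, assuming a toy design in which an adder is exposed through memory-mapped ports. The class name, the register names, and the `add.a` / `add.b` / `add.out` ports are all hypothetical, and this is not a model of any real processor. It shows where the complexity goes: the single instruction is trivial, but every operation now hides in the choice of source and destination.

```python
class MoveMachine:
    """Toy move-only machine: the sole instruction is move(src, dst)."""

    def __init__(self):
        # Plain registers plus the ports of a hypothetical memory-mapped adder.
        self.cells = {
            "r0": 0, "r1": 0, "r2": 0,
            "add.a": 0, "add.b": 0, "add.out": 0,
        }

    def move(self, src, dst):
        self.cells[dst] = self.cells[src]
        # Writing to an adder input port triggers the addition as a side effect.
        if dst in ("add.a", "add.b"):
            self.cells["add.out"] = self.cells["add.a"] + self.cells["add.b"]

    def run(self, program):
        for src, dst in program:
            self.move(src, dst)


def demo():
    machine = MoveMachine()
    machine.cells["r0"], machine.cells["r1"] = 2, 3
    program = [
        ("r0", "add.a"),    # feed first operand to the adder
        ("r1", "add.b"),    # feed second operand to the adder
        ("add.out", "r2"),  # collect the sum
        ("r2", "r2"),       # NOP: move a register onto itself
    ]
    machine.run(program)
    return machine.cells["r2"]
```

            Here the "instruction set" has one entry, yet the adder still has to exist and still has to be addressed somehow, so the set of special source and destination names grows with every feature instead.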

            TELL US is there MORE bloat in a wide API where each API function is cleanly written with its own entry point, or in YOUR scenario where the library must have a big switch statement and logic decoding just to figure out what the user wanted to do?
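
            The two styles contrasted above can be sketched roughly as follows. All function names and the layout of the request dictionary are hypothetical, chosen only to make the comparison concrete; the operations are deliberately trivial stand-ins.

```python
# Style 1: a wide API, one cleanly separated entry point per operation.
def op_add(a, b):
    return a + b


def op_xor(a, b):
    return a ^ b


# Style 2: a single "do everything" entry point that must first decode
# what the caller wanted before it can do any real work.
def do_everything(request):
    kind = request["op"]
    if kind == "add":
        return request["a"] + request["b"]
    elif kind == "xor":
        return request["a"] ^ request["b"]
    raise ValueError(f"unknown op: {kind!r}")
```

            Both styles carry the same two operations. The second one additionally carries the decoding branch and the error path for unknown commands, and the caller has to build the request structure on every call.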


            • #7
              Originally posted by frantaylor
              If you DO NOT add new instructions to the instruction set every time there is a new algorithm, then you eventually end up with "Microsoft" style APIs where you call the one single "Do everything" API and then you have to pass in a struct with the actual commands that you want to execute and then you end up with EVEN MORE BLOAT.

              YOU DO KNOW that processors have been invented that have ONE move instruction and no other instructions at all! NOP is just moving a register onto itself. Everything you do depends on which register you choose as the source or target. NATURALLY there is JUST AS MUCH BLOAT in this implementation as any other.

              TELL US is there MORE bloat in a wide API where each API function is cleanly written with its own entry point, or in YOUR scenario where the library must have a big switch statement and logic decoding just to figure out what the user wanted to do?
              WHAT'S up WITH your CAPS lock? NEED new KEYBOARD?


              • #8
                Originally posted by frantaylor
                If you DO NOT add new instructions to the instruction set every time there is a new algorithm, then you eventually end up with "Microsoft" style APIs where you call the one single "Do everything" API and then you have to pass in a struct with the actual commands that you want to execute and then you end up with EVEN MORE BLOAT.

                YOU DO KNOW that processors have been invented that have ONE move instruction and no other instructions at all! NOP is just moving a register onto itself. Everything you do depends on which register you choose as the source or target. NATURALLY there is JUST AS MUCH BLOAT in this implementation as any other.

                TELL US is there MORE bloat in a wide API where each API function is cleanly written with its own entry point, or in YOUR scenario where the library must have a big switch statement and logic decoding just to figure out what the user wanted to do?
                I guess you can design nice APIs with object-oriented programming and pluggable modules that implement interfaces or inherit from another class.

                The more instructions you add to the instruction set architecture, the more it becomes CISC, and it eventually turns into VAX. Then those algorithms get old, are superseded by newer and improved algorithms, and just become legacy baggage on modern hardware.


                • #9
                  Originally posted by mark45
                  WHAT'S up WITH your CAPS lock? NEED new KEYBOARD?
                  YOU seem to be suffering from the SAME problem!

                  Do YOU speak in monotone ALL THE TIME? DO YOU EVER USE INFLECTION?

                  WHY do you CHANGE THE LANGUAGE YOU SPEAK when you sit in front of the computer? Why do you REMOVE ALL THE INFLECTION? Is there some sort of bandwidth issue with processing ALL THOSE EXTRA BITS?

                  GO read ZIPPY THE PINHEAD and perhaps you can COMPLAIN TO HIM TOO!
                  Last edited by frantaylor; 12-14-2012, 12:23 PM.


                  • #10
                    Dude, only stupid people use caps lock all the time, and take a hike to calm down.


                    • #11
                      Originally posted by mark45
                      WHAT'S up WITH your CAPS lock? NEED new KEYBOARD?
                      ymmd!


                      • #12
                        ARMv8 chips will also have 10x faster cryptography than current ARM chips.


                        • #13
                          Originally posted by uid313
                          I guess you can design nice APIs with object-oriented programming and pluggable modules that implement interfaces or inherit from another class.

                          The more instructions you add to the instruction set architecture, the more it becomes CISC, and it eventually turns into VAX. Then those algorithms get old, are superseded by newer and improved algorithms, and just become legacy baggage on modern hardware.
                          NO they do not! The code to execute these older instructions is moved OUT of the core and into microcode so the older instructions can be emulated using the new core. Thus the core can be clean of the old architectural decisions.

                          HOW do you think Intel has managed to keep up performance while maintaining compatibility with 8086? Do you honestly assert that all our modern Intel CPUs are carrying around an 8086 core so they can execute those instructions? NO, they are executed in microcode.

                          BESIDES ALL OF THIS, the actual instruction set executed by the core has ZILCH to do with the instruction set exposed to software. When your code is executed on a modern CPU the instructions are pipelined and re-ordered and re-written and the code executed by the core is NOT THE SAME as the code you wrote. So the API you see, even from assembly language, is all just an abstraction anyway. The Intel 32-bit instruction set runs on both Pentium Pro and on Ivy Bridge despite the fact that those two processors have very very little in common.

                          You really have to ask if "legacy baggage" has any relevance when it has no practical meaning. IBM mainframes are still emulating the 360 instruction set from the 1960s, and yet they are not slowed down. Heck, your modern IBM mainframe is emulating about a dozen different old IBM architectures, and it does it all without slowing down modern code by even a cycle. IBM mainframes? You mean those systems that have been running Linux in a VM environment long before Intel?


                          • #14
                            missing out on the advantage!

                            Originally posted by uid313
                            I guess you can design nice APIs with object-oriented programming and pluggable modules that implement interfaces or inherit from another class.

                            The more instructions you add to the instruction set architecture, the more it becomes CISC, and it eventually turns into VAX. Then those algorithms get old, are superseded by newer and improved algorithms, and just become legacy baggage on modern hardware.
                            "Reduced Instruction Set" means increased code size. Instructions that carry less information mean that you need more of them.

                            CISC actually does quite well performance-wise because one of the execution bottlenecks is the fetching of instructions. When you use long-winded RISC instructions, you have more code and more read cycles to fetch your big code from slow RAM.

                            The ideal instruction set is Huffman-encoded CISC, where the often-used instructions are very short and the little-used instructions are longer. This makes the most of the available memory bandwidth.
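
                            As a rough illustration of the Huffman-style encoding described above, the sketch below assigns code lengths to opcodes from their usage frequencies. The frequency table is invented for the example and is not measured data, and the function name is hypothetical. Real instruction encodings are byte-aligned and constrained by decoder design, so this is an idealized picture only.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), {sym: 0}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**left, **right}.items()}
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical relative usage counts, for illustration only.
OPCODE_FREQS = {
    "mov": 40, "add": 25, "jmp": 15, "mul": 10, "aesenc": 6, "cpuid": 4,
}

lengths = huffman_code_lengths(OPCODE_FREQS)
average_bits = sum(OPCODE_FREQS[op] * lengths[op] for op in OPCODE_FREQS) / sum(
    OPCODE_FREQS.values()
)
```

                            With these made-up counts the most common opcode gets the shortest code and the rarest ones get the longest, so `average_bits` comes out lower than the 3 bits per opcode a fixed-width encoding of six opcodes would need.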

                            And AGAIN since the instruction set is just an abstraction and has NOTHING to do with the actual hardware you might as well shoot for performance. Who cares about that dreadful CISC code? Nobody is ever going to look at it. It shoots out of the compiler and into the instruction unit and nobody needs to actually look at it or appreciate its intrinsic beauty.


                            • #15
                              Originally posted by frantaylor
                              You really have to ask if "legacy baggage" has any relevance when it has no practical meaning. IBM mainframes are still emulating the 360 instruction set from the 1960s, and yet they are not slowed down. Heck, your modern IBM mainframe is emulating about a dozen different old IBM architectures, and it does it all without slowing down modern code by even a cycle. IBM mainframes? You mean those systems that have been running Linux in a VM environment long before Intel?
                              It still makes chips bigger, more complex, more expensive, and hotter, and makes them consume more power.


                              Anyway, AES-NI and similar instructions make it easier to plant a backdoor.
