GCC To Begin Implementing MMX Intrinsics With SSE Instructions

  • GCC To Begin Implementing MMX Intrinsics With SSE Instructions

    Phoronix: GCC To Begin Implementing MMX Intrinsics With SSE Instructions

    While current-generation Intel/AMD CPUs still support the MMX SIMD instruction set from two decades ago, a set of GCC compiler patches is pending to begin implementing MMX intrinsics using SSE instructions...

    http://www.phoronix.com/scan.php?pag...nsics-With-SSE
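To make the idea concrete, here is a minimal sketch of how a 64-bit MMX-style operation can be carried out in an SSE register. It is not taken from the GCC patches. The function name `add_4x16_via_sse2` is hypothetical, and the choice of the packed 16-bit add (what `_mm_add_pi16` / PADDW does on an mm register) is just an illustrative assumption. The SSE2 path uses only the low 64 bits of an XMM register, and a scalar fallback is included so the file still builds where SSE2 is not available.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical helper: adds four packed 16-bit lanes with wraparound,
   the same result PADDW produces on a 64-bit MMX register. */
uint64_t add_4x16_via_sse2(uint64_t a, uint64_t b)
{
#if defined(__SSE2__)
    /* Load each 64-bit operand into the low half of an XMM register;
       the upper half is zeroed. */
    __m128i va = _mm_loadl_epi64((const __m128i *)&a);
    __m128i vb = _mm_loadl_epi64((const __m128i *)&b);

    /* 128-bit packed add; only the low four lanes carry data here. */
    __m128i sum = _mm_add_epi16(va, vb);

    /* Store only the low 64 bits back out. */
    uint64_t out;
    _mm_storel_epi64((__m128i *)&out, sum);
    return out;
#else
    /* Scalar fallback for targets without SSE2. */
    uint64_t out = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t x = (uint16_t)(a >> (16 * lane));
        uint16_t y = (uint16_t)(b >> (16 * lane));
        out |= (uint64_t)(uint16_t)(x + y) << (16 * lane);
    }
    return out;
#endif
}
```

Because the operation never touches an mm register, code written this way would not need EMMS before later x87 code.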

  • #2
    So it seems that Intel is taking steps to retire the MMX instruction set from their hardware? Humph.

    Of course we might be able to recompile our Linux / BSD programs and libraries. We might see updated versions of them that will not do any MMX calls at all.

    But what about binary software? One day we will need a lot of emulators if we want to run e.g. some older games that make use of MMX. It was around for a long time (Pentium MMX from 1993?; OMG they had this horrible ad on TV with these guys in clean-room suits suddenly dancing around and now the colour comes to the internet or something) and it is really widespread; even smaller CPUs such as the VIA C3 and Geode GX2 / LX support it.
    Retiring such an important and widely used instruction set sounds like it would create a lot of problems for years.
    Stop TCPA, stupid software patents and corrupt politicians!



    • #3
      Originally posted by Adarion View Post
      So it seems that Intel is taking steps to retire the MMX instruction set from their hardware? Humph.

      Of course we might be able to recompile our Linux / BSD programs and libraries. We might see updated versions of them that will not do any MMX calls at all.

      But what about binary software? One day we will need a lot of emulators if we want to run e.g. some older games that make use of MMX. It was around for a long time (Pentium MMX from 1993?; OMG they had this horrible ad on TV with these guys in clean-room suits suddenly dancing around and now the colour comes to the internet or something) and it is really widespread; even smaller CPUs such as the VIA C3 and Geode GX2 / LX support it.
      Retiring such an important and widely used instruction set sounds like it would create a lot of problems for years.
      In my opinion, it is improbable for MMX or x87 FPU to be retired from CPUs.



      • #4
        There was chatter some time ago that Intel would remove some legacy functionality, e.g. MMX or x87 FPU but also 16-bit real mode for classic BIOS (also scrapping CSM along the way) etc. - see: https://arstechnica.com/gadgets/2017...-bios-by-2020/

        I'd say: Good riddance!



        • #5
          My guess is that they might intend to switch to an implementation which emulates MMX in microcode, which would be slower than before, but fast enough for any old binaries designed for much slower CPUs overall.



          • #6
            Originally posted by atomsymbol View Post

            In my opinion, it is improbable for MMX or x87 FPU to be retired from CPUs.
            Why is that? The instructions will just generate a trap and be software emulated instead. That will likely be plenty fast for older software anyway.

            You know, that was how the x87 instructions were run in the first place, on your 386 without the 387 FPU. It is also why the x87 instructions are such an inefficient bolt-on to the native instructions.
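As a rough illustration of the trap-and-emulate idea, the sketch below shows what a software handler might do once an unsupported MMX instruction has faulted. Everything here is a simplified assumption: `MmxState` and `emulate_mmx` are hypothetical names, only the register-to-register form of PADDW (opcode bytes `0F FD` followed by a ModRM byte) is handled, and a real handler would also have to read the faulting instruction from the trapped context, deal with memory operands and advance the instruction pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software copy of the eight MMX registers mm0..mm7. */
typedef struct {
    uint64_t mm[8];
} MmxState;

/* Packed add of four 16-bit lanes with wraparound (PADDW semantics). */
static uint64_t paddw(uint64_t a, uint64_t b)
{
    uint64_t out = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t x = (uint16_t)(a >> (16 * lane));
        uint16_t y = (uint16_t)(b >> (16 * lane));
        out |= (uint64_t)(uint16_t)(x + y) << (16 * lane);
    }
    return out;
}

/* Emulates one instruction at `code`. Returns the number of bytes
   consumed, or 0 if the instruction is not handled by this sketch. */
size_t emulate_mmx(MmxState *s, const uint8_t *code, size_t len)
{
    if (len < 3 || code[0] != 0x0F || code[1] != 0xFD)
        return 0;                      /* not PADDW */

    uint8_t modrm = code[2];
    if ((modrm >> 6) != 3)
        return 0;                      /* memory operands not modelled */

    unsigned dst = (modrm >> 3) & 7;   /* ModRM.reg: destination mm register */
    unsigned src = modrm & 7;          /* ModRM.rm: source mm register */

    s->mm[dst] = paddw(s->mm[dst], s->mm[src]);
    return 3;
}
```

For example, the bytes `0F FD C1` encode `paddw mm0, mm1`, so feeding them to `emulate_mmx` updates `mm[0]` and reports three bytes consumed.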



            • #7
              It's even simpler: modern Intel CPUs have two ALUs ("ports" in Intel speak) for SSE and AVX but only one for MMX/x87, so you can run two SSE or AVX instructions per cycle but only one MMX/x87 instruction, which means half the throughput.



              • #8
                Seems like a good idea to convert a lot of these ops to traps; I'm surprised more of them aren't traps yet. AAA, AAD, AAM, AAS, DAA, DAS... Nobody generates this code anymore, and many of these instructions occur nowhere in any software packaged for Debian.



                • #9
                  Originally posted by Adarion View Post
                  ...Pentium MMX from 1993?
                  The first Pentium MMX CPUs were released in 1997.



                  • #10
                    Originally posted by Veto View Post
                    Why is that? The instructions will just generate a trap and be software emulated instead. That will likely be plenty fast for older software anyway.
                    I think a good argument is that if it was AMD/Intel's intention to trap and software emulate MMX/387 instructions in new CPUs they would have already done so years ago.

                    One of Intel's traits is that it has been strict about compatibility for 40+ years and, as far as I know, has never removed a major instruction set. (AMD has discontinued some unsuccessful instruction sets: 3DNow! and XOP.)

                    Also, every new generation of Intel desktop x86 CPUs tries to be strictly faster than the previous generation. Removing x87/MMX from future CPUs would make the new CPU slower than the previous generation when running the relatively large x87 code base and the relatively small MMX code base. The only exception to this rule was the Pentium 4, which Intel intended to be strictly faster than the Pentium 3, but failed. (AMD also intended Bulldozer to be strictly faster than Phenom II, but failed.)

                    Originally posted by Veto View Post
                    You know, that was how the x87 instructions were run in the first place, on your 386 without the 387 FPU.
                    Good point. But the silicon budget was much lower back then than it is today. Not having the 387 in silicon made sense from a financial perspective and lowered the total cost of the 386 machine. There is no financial advantage to removing MMX/387 from CPUs today.

                    387/MMX are a piece of IP that isn't evolving much and has low maintenance cost.

                    Originally posted by Veto View Post
                    It is also why the x87 instructions are such an inefficient bolt-on to the native instructions.
                    They are relatively efficient in my opinion. The x87 FPU did not receive 128/256-bit vector registers, which would have let it compete with SSE/AVX in terms of performance. It is true that the x87 stack registers are less efficient than the SSE/AVX non-stack registers, and that it is more complicated to extract instruction-level parallelism (ILP) from them than from SSE/AVX.

                    It was a mistake to alias the newer MMX 64-bit registers (mm0 to mm7) to the already existing x87 floating-point stack registers.
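The cost of that aliasing can be shown with a deliberately simplified model. The type and function names below (`FpuState`, `mmx_write`, `emms` and so on) are hypothetical, and the model ignores the x87 top-of-stack pointer and most other details. It only captures the assumption that an MMX register shares storage with the low 64 bits of an x87 register, that an MMX write marks the x87 tag word as fully valid, and that EMMS marks it empty again.

```c
#include <stdint.h>

/* Simplified model of one 80-bit x87 physical register. */
typedef struct {
    uint64_t significand;   /* bits 0..63: the part an MMX register aliases */
    uint16_t sign_exponent; /* bits 64..79 */
} X87Reg;

/* Simplified FPU state: eight registers plus a tag word with
   2 bits per register (00 = valid, 11 = empty). */
typedef struct {
    X87Reg   r[8];
    uint16_t tag;
} FpuState;

/* Start with every register empty and zeroed. */
void fpu_init(FpuState *f)
{
    for (int i = 0; i < 8; i++) {
        f->r[i].significand = 0;
        f->r[i].sign_exponent = 0;
    }
    f->tag = 0xFFFF;
}

/* Writing MMn lands in the x87 register's significand; in this model
   it also sets the exponent bits to all ones and marks the whole
   tag word valid. */
void mmx_write(FpuState *f, unsigned n, uint64_t v)
{
    f->r[n & 7].significand = v;
    f->r[n & 7].sign_exponent = 0xFFFF;
    f->tag = 0x0000;
}

/* Reading MMn returns the aliased significand bits. */
uint64_t mmx_read(const FpuState *f, unsigned n)
{
    return f->r[n & 7].significand;
}

/* EMMS: mark every x87 register empty so x87 code can push again. */
void emms(FpuState *f)
{
    f->tag = 0xFFFF;
}

/* x87 code can only load into a register whose tag says "empty". */
int x87_slot_is_empty(const FpuState *f, unsigned n)
{
    return ((f->tag >> (2 * (n & 7))) & 3) == 3;
}
```

In this model, after any `mmx_write` no slot is empty until `emms` is called, which is the reason mixed MMX and x87 code has to issue EMMS between the two.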
