GCC To Begin Implementing MMX Intrinsics With SSE Instructions


  • #21
    Originally posted by coder View Post
    There's a lot of legacy code that was originally written for 32-bit, but then recompiled for 64-bit.
    with this change, a new recompile would automatically upgrade that code to SSE, likely improving performance.


    • #22
      Originally posted by hotaru View Post
      with this change, a new recompile would automatically upgrade that code to SSE, likely improving performance.
      Yes, I understand. Thanks.

      It's a nice feature. Back when I read about Skylake's reduced MMX throughput, I wondered whether something like this might already exist.
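As a rough illustration of what such a recompile changes, here is a minimal sketch of a four-lane 16-bit packed add (the operation MMX's `paddw` performs) carried out in the low 64 bits of a 128-bit SSE register. The helper name `add4x16` and the scalar fallback are hypothetical, added only for illustration; this is not the code GCC emits, just the general idea of doing 64-bit MMX-style work with SSE2 instructions.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper: add four packed 16-bit lanes, each wrapping
 * independently on overflow (the semantics of MMX paddw). */
uint64_t add4x16(uint64_t a, uint64_t b)
{
#if defined(__SSE2__)
    /* Load each 64-bit operand into the low half of an XMM register;
     * the upper half is zeroed. */
    __m128i va = _mm_loadl_epi64((const __m128i *)&a);
    __m128i vb = _mm_loadl_epi64((const __m128i *)&b);
    __m128i sum = _mm_add_epi16(va, vb);   /* SSE2 packed 16-bit add */
    uint64_t r;
    _mm_storel_epi64((__m128i *)&r, sum);  /* keep only the low 64 bits */
    return r;
#else
    /* Portable fallback with the same lane-wise wrapping behaviour. */
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t x = (uint16_t)(a >> (16 * i));
        uint16_t y = (uint16_t)(b >> (16 * i));
        r |= (uint64_t)(uint16_t)(x + y) << (16 * i);
    }
    return r;
#endif
}
```

Because the lanes are independent, an overflow in one 16-bit lane does not carry into its neighbour, which is the behaviour legacy MMX code relies on.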


      • #23
        Originally posted by atomsymbol
        I think a good argument is that if it was AMD/Intel's intention to trap and software emulate MMX/387 instructions in new CPUs they would have already done so years ago.
        Strange argument: They can't do it now, because they would then have done it years ago?! Remember that it takes a long time for software to stop using a feature - even when deprecated and crippled. But you can argue that the best point in time would be to have left it out of AMD64 from the start...

        Originally posted by atomsymbol
        There is no financial advantage to removing MMX/387 from CPUs today.
        387/MMX are a piece of IP that isn't evolving much and has low maintenance cost.
        Why are you so sure of that? The x87 is a peculiar 80-bit FPU with a stack-structured 80-bit register file. I would rather believe it is a major kludge to keep around and hinders optimization of the FPU ALUs in a modern CPU. That argument is at least somewhat backed by the fact that x87 support is becoming half-heartedly implemented on modern CPUs.

        Originally posted by atomsymbol
        They are relatively efficient in my opinion. x87 FPU did not receive 128/256-bit vector registers which would make it able to compete with SSE/AVX in terms of performance.
        It doesn't really matter; x87 is largely deprecated by now. As you can read in another comment, x87 will run at half speed of SSE on modern processors, and 128/256-bit vector registers wouldn't even make sense for an 80-bit FPU anyway (it sounds like what you wish for is really SSE...). If you need the extra precision, we would be better off with Float128 support, which would not suffer from truncation and would scale nicely vector-wise.
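To make the "extra precision" point concrete, here is a small sketch that reports the significand width of `double` and `long double` for whatever target it is compiled on. The helper names are hypothetical. On x86 Linux with GCC, `long double` is normally the x87 80-bit format with a 64-digit significand, but other ABIs map it to a 53-digit or 113-digit format, so the result is platform-dependent.

```c
#include <float.h>

/* Hypothetical helpers: significand width of each type, counted in
 * base-FLT_RADIX digits (FLT_RADIX is 2 on mainstream targets). */
int double_mantissa_digits(void)
{
    return DBL_MANT_DIG;
}

int long_double_mantissa_digits(void)
{
    /* Typically 64 where long double is the x87 extended format. */
    return LDBL_MANT_DIG;
}
```

The C standard only guarantees that `long double` is at least as precise as `double`, which is part of why code that depends on 80-bit results is not portable.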


        • #24
          Originally posted by Veto View Post
          Strange argument: They can't do it now, because they would then have done it years ago?! Remember that it takes a long time for software to stop using a feature - even when deprecated and crippled. But you can argue that the best point in time would be to have left it out of AMD64 from the start...
          Edit: fair point. Maybe they were worried about breaking too much code that people would like to recompile for 64-bit, as I did.

          Originally posted by Veto View Post
          As you can read in another comment, x87 will run at half speed of SSE on modern processors
          That was in reference to MMX. Relative to SSE, I think x87 is slower still.
          Last edited by coder; 03 February 2019, 02:21 PM.


          • #25
            Originally posted by rene View Post

            plus many CPUs don't turbo boost as much (clock lower) with AVX in use, …
            No, they don't use turbo in AVX mode, the AVX mode is the opposite of turbo mode, it changes the clock frequency BELOW the base number.


            • #26
              Originally posted by coder View Post
              No, because AMD didn't implement SSE2 in its first AMD64 CPUs. So, they'd have dropped MMX and replaced it with... ?
              No, AMD64 always had SSE2. You are probably thinking of the first Athlons (K7), which did not have SSE2, but they didn't have AMD64 either. The K8 had both.


              • #27
                Originally posted by coder View Post
                No, because AMD didn't implement SSE2 in its first AMD64 CPUs.
                that's obviously not true, because if they didn't support SSE2, they wouldn't be AMD64.

                from what I've been able to find, the earliest AMD64 processors were the Opteron 240, 242, and 244, which do support SSE2: https://en.wikipedia.org/wiki/List_o...mmer"_(130_nm)
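One way to see this from the compiler's side: GCC and Clang predefine `__SSE2__` when targeting x86-64, since SSE2 is part of the AMD64 baseline. A minimal sketch of a compile-time check follows; the function name is hypothetical, and it relies only on the `__x86_64__` and `__SSE2__` macros those two compilers define (other compilers may use different macros).

```c
/* Hypothetical helper: report how this translation unit was built.
 *   1  -> x86-64 target with SSE2 available (the normal case)
 *   0  -> x86-64 target where SSE2 was explicitly disabled via flags
 *  -1  -> not an x86-64 build at all */
int amd64_build_has_sse2(void)
{
#if defined(__x86_64__)
#  if defined(__SSE2__)
    return 1;
#  else
    return 0;
#  endif
#else
    return -1;
#endif
}
```

This only reflects what the compiler assumes about the target, not what a particular CPU reports at run time.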


                • #28
                  Originally posted by hreindl View Post
                  i call bullshit
                  Well, I can't say the AMD CPUs we had were AMD64-capable, as we were still using a 32-bit kernel. But, they certainly made CPUs, after the first Opterons were introduced, that didn't have SSE2. I wish they did.


                  • #29
                    Originally posted by hreindl View Post
                    well, than simply don't say "No, because AMD didn't implement SSE2 in its first AMD64 CPUs" when you haven't your facts straight
                    there are also Intel Celeron CPU's without AMD64 support after they had released the first ones already
                    Take a breath. Now, exhale. Now, go check my earlier post and see that I already edited it, after acknowledging your reply.

                    You won, dude. I mis-remembered. It happens. Now, go eat a cookie or something.


                    • #30
                      Originally posted by carewolf View Post

                      No, they don't use turbo in AVX mode, the AVX mode is the opposite of turbo mode, it changes the clock frequency BELOW the base number.
                      It depends on the overall thermal configuration of the system. By and large, Intel leaves the thermal limits to the system integrators to determine, based on the overall power delivery and cooling capacity of the system. In home-built desktops, the BIOS can usually be configured to allow a higher TDP, and if the heatsink has the thermal capacity, the chip will happily run AVX at full speed. On laptops and servers, the thermal constraints are a lot more severe, so AVX downclocking is a real thing.
