Linus Torvalds Comes Out Against "Completely Broken" x86_64 Feature Levels

  • blackshard
    Senior Member
    • Oct 2009
    • 602

    #31
    Originally posted by avis View Post
    Where is CoC(k)?

    Why is Linus still committing to the kernel? Oh, wait, without him everything will fall apart.
    CoC in this case does not apply at all. It's easy to see that nobody is behaving badly (the conduct part). It's just an opinion on a technical decision, one that I largely share: it makes no sense to introduce segmentation at any level in the kernel and make things even more complex with more ifs or #ifdefs here and there.

    • openminded
      Senior Member
      • Feb 2022
      • 223

      #32
      Originally posted by avis View Post
      Where is CoC(k)?

      Why is Linus still committing to the kernel? Oh, wait, without him everything will fall apart.
      It's rather "fart apart" - according to Linus

      • avis
        Senior Member
        • Dec 2022
        • 2252

        #33
        Originally posted by blackshard View Post

        CoC in this case does not apply at all. It's easy to see that nobody is behaving badly (the conduct part). It's just an opinion on a technical decision, one that I largely share: it makes no sense to introduce segmentation at any level in the kernel and make things even more complex with more ifs or #ifdefs here and there.
        That was sarcasm. There was nothing wrong with his message, but to the snowflakes and the woke it might sound "bad".

        • elbci
          Junior Member
          • Nov 2024
          • 16

          #34
          - ...what did we learn today?

          - Today we learned that if you are obedient to your corporate masters and support their russophobia or CIS-phobia or palestinophobia or whatever, then it's a technical debate and you don't have to be obedient to the CoC.

          - Great kids! And do we call this blatant hypocrisy or democracy and freedom of speech?

          - Me, me teacher, I know me me ask me...

          • coder
            Senior Member
            • Nov 2014
            • 8922

            #35
            Originally posted by patrick1946 View Post
            Actually v4 was useful until Intel could not get their little cores up to that level.
            I think what really changed is that Intel took another look at the real benefits of 512-bit and decided it wasn't justified. You can just stick with 256-bit and have bigger cores simply add more issue ports.

            For instance, I point to ARM Neoverse V1 and V2. V1 implemented SVE @ 2x 256-bit. V2 implemented it @ 4x 128-bit. The benefit of implementing at 128-bit is better hardware utilization for all of that legacy NEON code. Yet I'm sure they wouldn't have done it if it compromised SVE performance/efficiency much at all.
            Last edited by coder; 05 December 2024, 02:38 PM.

            • coder
              Senior Member
              • Nov 2014
              • 8922

              #36
              Originally posted by patrick1946 View Post
              Why should a new version be a superset of the lower-numbered ones? If you bring out a CPU that doesn't support that version, you have to find another one. Mostly it will be an older one, but it can be a newer one too. It is called deprecation.
              x86-64 feature levels were meant as a simplification. Once you add in the complexity of deprecation, it partially defeats the point of not just relying on the CPUID bits, as Linus mentioned.

              I don't entirely reject the idea of ISA feature levels or having v5 be a superset of v3, while leaving v4 as an orphan, but it's just taking us down a road that could ultimately make ISA feature levels somewhat self-defeating.
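
To make the trade-off concrete, here is a minimal sketch of how a cumulative feature level collapses a set of CPUID-style flags into one number. The flag groups follow the x86-64 psABI level definitions as I understand them; the lowercase flag spellings, `LEVEL_FLAGS`, and `feature_level` are hypothetical names for illustration, not a real kernel or libc API.

```python
# Flags each level adds on top of the previous one (assumed from the psABI
# level definitions; spellings are illustrative, not /proc/cpuinfo names).
LEVEL_FLAGS = {
    2: {"cx16", "lahf_sahf", "popcnt", "sse3", "sse4_1", "sse4_2", "ssse3"},
    3: {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "lzcnt", "movbe", "osxsave"},
    4: {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"},
}


def feature_level(flags):
    """Return the highest level whose flags, and all lower levels' flags, are present."""
    flags = set(flags)
    level = 1  # baseline x86-64 is assumed
    for candidate in sorted(LEVEL_FLAGS):
        if not LEVEL_FLAGS[candidate] <= flags:
            break  # levels are cumulative, so stop at the first gap
        level = candidate
    return level


v3_cpu = LEVEL_FLAGS[2] | LEVEL_FLAGS[3]
print(feature_level(v3_cpu))  # 3
```

Because the levels are cumulative, a CPU that has every v4 flag but lacks a single v3 flag still reports level 2 here. The individual flags carry strictly more information than the level number, which is the simplification-versus-precision tension described above.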

              • coder
                Senior Member
                • Nov 2014
                • 8922

                #37
                Originally posted by Gonk View Post
                No, any attempt to support AVX-512 was doomed before Alder Lake's e-cores came along.

                I'd love it if Zen 6 turned out to be a solid green row.
                If you eliminate the Xeon Phi SKUs from that graph (first two rows), it gets a lot simpler. If you eliminate Cannon Lake (which never really shipped in volume) and Cooper Lake (which, again, is fairly niche), it gets simpler still, making almost every Intel CPU a superset of its predecessors.

                BTW, how does Alder Lake support VP2INTERSECT but Sapphire Rapids doesn't? Not to mention Tiger Lake supporting it...

                • coder
                  Senior Member
                  • Nov 2014
                  • 8922

                  #38
                  Originally posted by hardfalcon View Post
                  Even the usefulness of v2 and v3 seems questionable. One example: Even though Skylake CPUs are supposed to support AVX and AVX2, since last summer, the Linux kernel blocks/disables AVX and AVX2 support for userland code on Skylake CPUs to mitigate the GDS/Downfall vulnerability: https://www.phoronix.com/review/downfall
                  No, not all of AVX/AVX2! Just the GATHER instructions! That's just a small % of the total.

                  • vient
                    Junior Member
                    • Jul 2023
                    • 19

                    #39
                    Originally posted by coder View Post
                    BTW, how does Alder Lake support VP2INTERSECT but Sapphire Rapids doesn't? Not to mention Tiger Lake supporting it...
                    VP2INTERSECT on Intel had such abysmal performance that software emulation was actually faster.

                    More info here: http://www.numberworld.org/blogs/202...x512_teardown/
                    1. Intel added AVX512-VP2INTERSECT to Tiger Lake. But it was really slow. (microcoded ~25 cycles/46 uops)
                    2. It was so slow that someone found a better way to implement its functionality without using the instruction itself.
                    3. Intel deprecates the instruction and removes it from all processors after Tiger Lake. (ignoring the fact that early Alder Lake unofficially also had it)
                    4. AMD adds it to Zen5.
                    ​...
                    But how good is AMD's implementation? 1 cycle throughput.
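
For readers unfamiliar with the instruction, here is a plain-Python model of what VP2INTERSECT computes, as I understand its semantics: two masks marking which elements of each vector also occur in the other. The function name `vp2intersect` and the list-of-bools mask representation are illustrative assumptions. This is not the faster emulation from the linked blog, only a reference for the result it has to reproduce.

```python
def vp2intersect(a, b):
    """Model of the pairwise-intersection result for two equal-length vectors.

    Returns (mask_a, mask_b) where mask_a[i] is True if a[i] equals any
    element of b, and mask_b[j] is True if b[j] equals any element of a.
    """
    if len(a) != len(b):
        raise ValueError("both vectors must have the same number of lanes")
    mask_a = [False] * len(a)
    mask_b = [False] * len(b)
    # Compare every lane of a against every lane of b (all-pairs equality).
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if x == y:
                mask_a[i] = True
                mask_b[j] = True
    return mask_a, mask_b


mask_a, mask_b = vp2intersect([1, 2, 3, 4], [4, 5, 1, 6])
print(mask_a)  # [True, False, False, True]
print(mask_b)  # [True, False, True, False]
```

The all-pairs comparison is why a slow microcoded version can lose to a hand-written sequence of shuffles and compares, and why a 1-cycle hardware implementation is notable.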

                    • hwertz
                      Phoronix Member
                      • Apr 2008
                      • 96

                      #40
                      I think what people are missing is Linus' point that the CPU already has CPUID flags, and the kernel already regularly includes regular and fast code paths depending on whether some instruction exists. So there's no reason to build an x86-64-v4 kernel when the kernel can detect whether AVX-512 exists, test whether it actually gets a speedup (sometimes it doesn't), and use it if it does. There's code for encryption, compression, the RAID modules, etc. that does exactly this (tests several implementations at start and uses whichever is fastest).

                      Just to add, this whole thing with AVX-512 is a bit gross IMHO. With ARM NEON (and the equivalent instructions on 64-bit), you just specify the length. You can have a chip with, say, 128-bit vector units and it'll do 512-bit or even 2048-bit. The width is an implementation detail, not 'we have wider vector units, so here are the same instructions with vectors twice as wide.'
                      Last edited by hwertz; 05 December 2024, 03:04 PM.
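
The "test several implementations at start and use whichever is fastest" pattern can be sketched in a few lines. This is only an illustration of the idea in Python, not the kernel's code: `checksum_loop`, `checksum_builtin`, `time_once`, and `pick_fastest` are hypothetical names, and treating the first candidate as the correctness reference is an assumption of this sketch.

```python
import time


def checksum_loop(data):
    """Reference implementation: explicit byte-by-byte loop."""
    total = 0
    for byte in data:
        total += byte
    return total & 0xFFFFFFFF


def checksum_builtin(data):
    """Alternative implementation using the built-in sum()."""
    return sum(data) & 0xFFFFFFFF


def time_once(fn, sample):
    """Time a single call of fn on the sample buffer."""
    start = time.perf_counter()
    fn(sample)
    return time.perf_counter() - start


def pick_fastest(candidates, sample, repeats=5):
    """Benchmark (name, fn) pairs on sample and return the fastest correct one.

    The first candidate is treated as the reference result (an assumption of
    this sketch); candidates that disagree with it are skipped.
    """
    expected = candidates[0][1](sample)
    best = None
    for name, fn in candidates:
        if fn(sample) != expected:
            continue  # never select an implementation that gives a wrong answer
        elapsed = min(time_once(fn, sample) for _ in range(repeats))
        if best is None or elapsed < best[0]:
            best = (elapsed, name, fn)
    return best[1], best[2]


CANDIDATES = [("loop", checksum_loop), ("builtin", checksum_builtin)]
name, checksum = pick_fastest(CANDIDATES, bytes(range(256)) * 64)
print("selected:", name)
```

Which candidate wins depends on the machine it runs on, which is the whole point: the decision is made from a measurement on the actual hardware rather than from a build-time feature level.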
