Linux x86 FPU Code Getting Reworked In Preparation For Intel AMX


  • Linux x86 FPU Code Getting Reworked In Preparation For Intel AMX

    Phoronix: Linux x86 FPU Code Getting Reworked In Preparation For Intel AMX

    It's been one year now that Intel has been posting Linux kernel patches to enable AMX support for upcoming Sapphire Rapids processors. Over the past year their Linux kernel patches for enabling Advanced Matrix Extensions have gone through 11 rounds of review, but that journey isn't over yet...


  • #2
    I know that many enthusiasts have written Intel off but Intel sees where the real money is and it's not the desktop. AMD can have the desktop, Intel is in a battle with Nvidia for the big money markets, AI, ML, Big Data, etc. These AMX extensions show Intel's commitment to attacking those markets as hard as they can.



    • #3
      Originally posted by sophisticles View Post
      I know that many enthusiasts have written Intel off but Intel sees where the real money is and it's not the desktop.
      You clearly haven't even looked at Intel's financial disclosures. In their Q2 2021 quarterly report, their Data Center group had revenues of $6.5B, while the Client Computing group brought in $10.1B.

      Originally posted by sophisticles View Post
      AMD can have the desktop, Intel is in a battle with Nvidia for the big money markets, AI, ML, Big Data, etc.
      If Intel's investors became aware of any such strategy, I'm sure Intel would be looking for a new CEO in very short order. They simply cannot afford to walk away from desktops and laptops.

      Originally posted by sophisticles View Post
      These AMX extensions show Intel's commitment to attacking those markets as hard as they can.
      I don't know about that. It sounds like a big chunk of die space, so it won't be free. And only a small minority of server applications are going to use it. It still won't be competitive with GPU or deep learning ASIC solutions, either. I mean, you just have to look at the relative silicon area to see that. So, they're basically adding a "deep learning tax" on all of their server customers, to cater for a small segment of the market that will largely be embracing GPUs and ASICs, anyhow.

      One can see it either as brave or as Intel just following their old mindset of looking at the CPU as the single weapon with which to attack all problems. However you choose to look at it, it'll be very interesting to see how well it performs, and whether they follow up with further extensions to make it into the sort of general matrix math extension its name would suggest.

      IMO, the bolder move is their decision to scrap Xeon Phi and establish the Xe-based product portfolio as their new general-purpose compute accelerator platform. It'd have been really interesting if they'd put a little Xe block in their server CPUs, instead of tacking on yet another big x86 ISA extension.



      • #4
        Originally posted by coder View Post
        You clearly haven't even looked at Intel's financial disclosures. In their Q2 2021 quarterly report, their Data Center group had revenues of $6.5B, while the Client Computing group brought in $10.1B.
        Well, they do still care about the laptop market. Desktops have very much been a low priority, which is how they ended up being beaten by a tiny company like AMD (in comparison to Intel).

        If Intel's investors became aware of any such strategy, I'm sure Intel would be looking for a new CEO in very short order. They simply cannot afford to walk away from desktops and laptops.
        Well... They actually did just fire their CEO. Intel won't tell you they are ignoring desktop, but it's pretty obvious they don't care much about it beyond trying to maintain their current share. It's not a growth market, and all their attention has been on entering new growth markets recently. It's been a point of contention with some investors, who see them wildly throwing money around in a bunch of different directions rather than focusing on what they're good at, but it could pay off in the long run.
        Last edited by smitty3268; 12 October 2021, 05:45 PM.



        • #5
          Originally posted by coder View Post
          IMO, the bolder move is their decision to scrap Xeon Phi and establish the Xe-based product portfolio as their new general-purpose compute accelerator platform. It'd have been really interesting if they'd put a little Xe block in their server CPUs, instead of tacking on yet another big x86 ISA extension.
          In a way, Intel has been doing this for a while with their Xeon+FPGA hybrid products:

          https://www.eetasia.com/intel-double...scalable-fpga/

          BTW, while I am a big fan of GPU acceleration and ASIC chips, there are some workloads where AVX- can't be beat and Intel has said they are prepared to introduce AVX-1024 if they have to.

          https://www.tomshardware.com/news/cp...-optimizations

          The results they obtained with Amazon-670K, WikiLSHTC-325K, and Text8 datasets are indeed very promising with the optimized SLIDE engine. Intel's Cooper Lake (CPX) processor can outperform Nvidia's Tesla V100 by about 7.8 times with Amazon-670K, by approximately 5.2 times with WikiLSHTC-325K, and by roughly 15.5 times with Text8. In fact, even an optimized Cascade Lake (CLX) processor can be 2.55–11.6 times faster than Nvidia's Tesla V100.
          This is the type of performance that AMD can't counter no matter how many cores they throw at the problem.
          Last edited by sophisticles; 12 October 2021, 11:01 PM.



          • #6
            Originally posted by smitty3268 View Post
            Desktops have very much been a low priority, which is how they ended up being beaten by a tiny company like AMD (in comparison to Intel).
            I've seen no indication of that. Intel has recently faced exactly the same challenges in all product segments, and it stems mostly from problems with their post-14 nm manufacturing technology.

            The reason they rolled out newer micro-architectures in laptops before desktops (speaking specifically of Ice Lake and Tiger Lake) is that their 10 nm and, to a lesser extent, 10 nm SuperFin processes had yield and frequency-scaling problems that kept them from being competitive on the desktop. If you look, you'll see that both laptop and server Ice Lake CPUs do not clock as high as their 14 nm predecessors. Ice Lake also didn't offer a > 4-core laptop CPU, even after they already had >= 6-core Coffee Lake CPUs in older laptops. In fact, they had to port Comet Lake as a high-end laptop option, specifically because they couldn't hit the high core-counts or frequencies on Ice Lake's 10 nm process.

            Tiger Lake was a better story, but perhaps yield problems kept them from rolling it out to desktops sooner. Perhaps by the time they could've ramped up volumes enough, Alder Lake would've already been breathing down its neck. So, it basically got squeezed out of the desktop roadmap, much in the same way as happened with Broadwell.

            As for AMD, their success has been due to:
            • Newly re-invigorated team, fueling the ground-up development of a new micro-architecture (you can read some behind-the-scenes accounts of that here and here)
            • Intel's manufacturing problems
            • TSMC's manufacturing ascendancy
            • AMD's chiplet strategy
            And AMD isn't just beating Intel on the desktop. They're rapidly increasing their server market share, as well. At this point, the main thing holding them back is the infamous "chip shortage". This is also why Intel is guaranteed to sell a large volume of whatever it makes -- because it simply has the lion's share of manufacturing capacity dedicated to CPUs.

            Originally posted by smitty3268 View Post
            Well... They actually did just fire their CEO.
            That was certainly more for execution problems than poor product strategy.

            Originally posted by smitty3268 View Post
            Intel won't tell you they are ignoring desktop, but it's pretty obvious they don't care much about it beyond trying to maintain their current share. It's not a growth market,
            No, it's not at all obvious. They've traditionally launched their new micro-architectures on the desktop, and Alder Lake will usher in a return to that practice. And for about the past 5 years, it has been a growth market. More than that, it's a halo market -- it grabs headlines and attention from users and developers. Furthermore, desktop substantially overlaps with their laptop platform. Desktop & laptop is also where they refine new micro-architectures and manufacturing nodes, before using them for their server products.

            Originally posted by smitty3268 View Post
            all their attention has been on entering new growth markets recently.
            You mean AI? They have to be a player in AI. There's too much money being made in that market for Intel not to play in it. Not having a compelling AI portfolio also jeopardizes their datacenter solution, especially as companies like Nvidia and AMD are increasingly offering full-stack solutions of their own.

            Originally posted by smitty3268 View Post
            It's been a point of contention with some investors, who see them wildly throwing money around in a bunch of different directions rather than focusing on what they're good at,
            Intel is losing server market share, and that's not a situation they can turn around quickly or by throwing money at it. They need to catch up in their manufacturing node and make sure they have a competitive offering, which also includes things like AI and storage. That's the best chance they have at holding AMD and ARM at bay, but ultimately they must start to look beyond x86.



            • #7
              Originally posted by sophisticles View Post
              In a way, Intel has been doing this for a while with their Xeon+FPGA hybrid products:

              https://www.eetasia.com/intel-double...scalable-fpga/
              FPGAs aren't competitive with deep learning ASICs or even Nvidia GPUs. They just don't have the compute density. Newer FPGAs are including more purpose-built blocks to try and address this, but that's not what Intel is integrating into its Xeons.

              Originally posted by sophisticles View Post
              BTW, while I am a big fan of GPU acceleration and ASIC chips, there are some workloads where AVX- can't be beat and Intel has said they are prepared to introduce AVX-1024 if they have to.

              https://www.tomshardware.com/news/cp...-optimizations
              The problem with over-interpreting that is they're banking on GPUs being bad at sparsity. Ampere has both BFloat16 and better sparsity support. Also, it says nothing about inferencing, which is where the vast majority of deep learning compute power goes.

              As for AVX-1024, that's nothing compared with AMX. AMX registers are 8192 bits.
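              As a quick check of that figure: the sketch below assumes the tile geometry from Intel's public AMX description (eight tile registers, each up to 16 rows of 64 bytes), which is not stated in this thread. The constant names are made up for illustration.

```python
# Assumed AMX tile geometry (taken from Intel's public AMX description,
# not from this thread): 8 tile registers, each up to 16 rows x 64 bytes.
TILE_ROWS = 16
TILE_ROW_BYTES = 64
NUM_TILES = 8

# Size of one tile register in bits.
tile_bits = TILE_ROWS * TILE_ROW_BYTES * 8

# Total tile state across all registers, in bytes.
total_tile_bytes = NUM_TILES * TILE_ROWS * TILE_ROW_BYTES

# How many AVX-512 (512-bit) or hypothetical AVX-1024 (1024-bit)
# vector registers one tile would span.
tiles_vs_avx512 = tile_bits // 512
tiles_vs_avx1024 = tile_bits // 1024

print(tile_bits)         # 8192
print(total_tile_bytes)  # 8192
print(tiles_vs_avx512)   # 16
print(tiles_vs_avx1024)  # 8
```

              Under those assumptions, a single tile holds as much data as eight of the hypothetical 1024-bit vector registers.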

              Originally posted by sophisticles View Post
              This is the type of performance that AMD can't counter no matter how many cores they throw at the problem.
              This point I really don't follow. SLIDE is a generic algorithm. No reason it can't be implemented on AMD CPUs. I looked at the paper to see if they cited any unique features of AVX-512 that it required, but they only talked about its width and Cooper Lake's support for BFloat16. Better yet, they specifically quantified the impact of both AVX-512 and BFloat16. The best-case benefit of the former was only 1.22x, while the best-case benefit from the latter was only 1.39x (and on a different dataset, which only got a 1.12x speedup from AVX-512). Combined, that's only a 1.56x benefit, which is well within the realm of a gap AMD can bridge with more/faster cores running AVX2. However, that's not even the challenge AMD is currently facing, since Cooper Lake is a niche within a niche. Epyc Milan is mostly contending with Ice Lake, which does not have BFloat16.
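              The 1.56x figure above can be reproduced by multiplying the two per-feature speedups measured on the same dataset. This is a minimal sketch that assumes the two gains are independent and compound multiplicatively; the helper name `combined_speedup` is hypothetical.

```python
def combined_speedup(*factors):
    """Multiply independent speedup factors (assumes they compound)."""
    result = 1.0
    for factor in factors:
        result *= factor
    return result

# Same dataset: 1.12x from AVX-512 and 1.39x from BFloat16.
same_dataset = combined_speedup(1.12, 1.39)

# Loose upper bound: best-case AVX-512 (1.22x) times best-case BFloat16
# (1.39x), even though those came from different datasets.
upper_bound = combined_speedup(1.22, 1.39)

print(f"{same_dataset:.2f}x")  # 1.56x
print(f"{upper_bound:.2f}x")   # 1.70x
```

              Even the mixed-dataset upper bound stays below 2x, so the argument does not depend on which pairing is used.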

              Furthermore, AMD is rumored to be putting AVX-512 in Zen 4. They've already embraced BFloat16 in their GPUs (MI100 has had it for almost a year, now), so it'd be weird if Zen 4 took AVX-512 but passed up the BFloat16 instructions. So, to the extent that SLIDE truly does give CPUs an unassailable advantage in training performance over all current & future GPUs (and I must say I'm incredulous of that), it's not an advantage Intel does or at least will hold over AMD.



              • #8
                Originally posted by coder View Post
                ...
                I started to type out a long response to all this, but honestly I don't care enough to go through all that. Suffice to say, you've definitely not changed my mind and I think it's very obvious from looking at the company and its decisions from a bunch of different angles, but go ahead and believe whatever you want. I'm quite certain neither of us can convince the other, so there's no point in continuing the discussion anyway.



                • #9
                  Originally posted by smitty3268 View Post
                  It's been a point of contention with some investors, who see them wildly throwing money around in a bunch of different directions rather than focusing on what they're good at,
                  The other thought I have about this is it seems to miss a basic fact about business. In the tech industry, you can't afford to sit still. You have to try and grow. If you're not growing, you're probably shrinking, and that's definitely not what investors want or expect. And there are two basic ways to grow: by increasing market share and by entering new markets. Now, until a couple years ago, Intel had almost 100% of the server market. So, increasing market share wasn't an option. The only way to grow faster than the market was to enter new markets. Hence, their focus on AI, self-driving, 5G, and the rest of it.

                  As for their spending and its impact on their recent execution problems, I would agree that it has indeed been an issue. But not for the reasons you cite. The amount of money they've spent on acquisitions to enter new markets pales in comparison with what they've been spending on dividends and stock buy-backs. Had they put some of that into more R&D on their manufacturing process, perhaps their recent woes could've been mitigated or eliminated. I'm not knowledgeable enough to say, but I've definitely heard rumors they lost some significant expertise through aggressive cost-saving policies (i.e. workforce trimming). On this front, their new CEO has made it clear that more stock buy-backs will not be coming in the foreseeable future.



                  • #10
                    Originally posted by smitty3268 View Post
                    Suffice to say, you've definitely not changed my mind
                    So, you care enough to have an opinion and be noisy about it, but not to inform yourself. That's a winning combination.

                    Seriously, how closely have you followed their manufacturing situation or the problems with Ice Lake? Did you even know they released Comet Lake for laptops? How do you reconcile that, if not as I've suggested & what seems to be a consensus that Ice Lake and its iteration of their 10 nm process simply couldn't scale up to a high-end laptop, much less desktop? Why do you think it's taken them until just a few months ago to bring Ice Lake SP to the server market? Is it because they just didn't care? Or the much more plausible explanation that it's taken them this long simply to get yields up to a profitable level?

                    The answer is staring you in the face, but you refuse to see it. Most of Intel's problems stem from their delays and issues related to their 10 nm manufacturing node. The more you look at it, the more it explains everything they've done & not done, in the past few years. You can even find plenty of old roadmaps they had communicated to their partners & customers about plans they've since simply been unable to follow through on.

                    Next, consider their upcoming DG-2 gaming GPU that they're outsourcing for TSMC to manufacture. Is that the sort of thing a leading semiconductor manufacturer would do, if they didn't have serious production problems of their own?

                    There's even the counterpoint of Rocket Lake, which is an entire backport of Ice Lake's Sunny Cove they did from 10 nm to 14 nm, exclusively for desktop! Not for laptops, not for servers, but just for desktops! And it'll have been shipping for only about 8 months, by the time Alder Lake launches! If that's not a ringing statement that Intel cares about the desktop, I frankly don't know what is!

                    Originally posted by smitty3268 View Post
                    I'm quite certain neither of us can convince the other, so there's no point in continuing the discussion anyway.
                    That's a weak stance that stinks of insecurity. If you have compelling evidence to support your position, let's see it. I obviously spend lots of time following this stuff, so I'm always game for information I haven't yet come across.
