NVIDIA Announces Grace CPU For ARM-Based AI/HPC Processor


  • #41
    Originally posted by schmidtbag View Post
    Ugh enough with this cringy tag line...
    OK...I admit I was wrong. Let me amend.

    The Age of ARM HAS BEEN HERE.

    For PDAs, then Smart Phones and Tablets.
    For IoT, Smart Devices, Watches, Sensors, Networking, etc.

    The Age of ARM is ARRIVING for EVERY OTHER platform not mentioned above: Supercomputers, HPC, AI, Edge, VR, AR, and automotive (both driver-assisted and fully self-driving). Also in Chromebooks, ALL APPLE products, and increasingly in Microsoft products.

    Because the x86 Desktop PC is an increasingly marginalized platform in light of ARM-based Smart Phones, Tablets, and soon every SINGLE Apple personal compute product including Desktops, by 2030 more than 50% of all personal compute products, desktop and laptop, across Windows, Apple, and Google combined will be ARM based.

    Comment


    • #42
      Originally posted by zxy_thf View Post
      Actually this advantage is also not that clear, if we take (potential) vendor lock-in into consideration.
      We may swap Xeon for Epyc and enjoy improved performance per dollar, but when we switch from one ARM vendor to another, there is no guarantee that they share the same extensions and have similar performance behavior.
      Considering Apple's success with its walled garden, I don't believe ARM vendors won't want to lock you into their own product lines.
      That's not possible, because vendor lock-in only works if you have high market share.

      ARM does not have anything like that market share right now.

      Comment


      • #43
        Originally posted by Jumbotron View Post
        The Age of ARM is ARRIVING for EVERY OTHER platform not mentioned above: Supercomputers, HPC, AI, Edge, VR, AR, and automotive (both driver-assisted and fully self-driving). Also in Chromebooks, ALL APPLE products, and increasingly in Microsoft products.

        Because the x86 Desktop PC is an increasingly marginalized platform in light of ARM-based Smart Phones, Tablets, and soon every SINGLE Apple personal compute product including Desktops, by 2030 more than 50% of all personal compute products, desktop and laptop, across Windows, Apple, and Google combined will be ARM based.
        2030 is quite a way out. You have missed what RISC-V is up to.
        https://www.xda-developers.com/android-risc-v-port/

        Yes, the x86 Desktop PC is an increasingly marginalised platform. But we also have to remember that some big companies have problems with both x86 and ARM.
        https://www.chinamoneynetwork.com/20...ab-new-markets
        Export/import technology bans are a problem.

        It is possible that by 2030 the dominant mobile phone on the market is RISC-V. Every market ARM is targeting, RISC-V is targeting too.

        Nvidia is acquiring ARM at a time when quite a battle lies ahead. ARM faces increasing competition from RISC-V because US government bans have made non-US companies worried about whether they will be able to keep getting updates to the ARM licenses they hold. If nothing changes, this forces ARM/Nvidia to pursue more US-based companies going forward, as they lose Chinese and other foreign customers.

        Comment


        • #44
          Originally posted by TemplarGR View Post

          Clock-for-clock performance does not tell the full story. And not clocking them higher may not be just about efficiency, but also about simple stability. You can get great IPC in a processor yet be unable to clock it high because of the design.

          For a clear example, the Pentium 4 (NetBurst) had lower IPC than AMD's Athlon (and the Pentium 3) but could be clocked significantly higher because of its longer pipeline. The Athlon had a shorter pipeline, so it was clocked lower. It wasn't just for "efficiency" reasons (after all, both Intel and AMD, with its FX-9xxx series, have proven they don't care much about efficiency), but also for stability reasons. At some point, you just get errors and instability if you push a design past what it can do.

          ARM is similar in the sense that it is designed for lower clocks and smaller dies. The designs are not meant to be clock champions, and I am not even sure they could reach stable clocks that high. They lack SIMD/FP performance too.
          You really are ignorant. The LPP node offers three sub-nodes: 5 GHz at full power, 4 GHz at 2.5x less power, and 3.2 GHz at a further 2.5x less power. Second, the Apple SoC has been tested running Rise of the Tomb Raider via Rosetta 2; it delivered 70% of the performance of a many-core x86 system with a GTX 1650. That was with the GPU at 7 W and the CPU at another 7 W, with the SoC (including RAM) locked at 16.5 W. When the GPU was unleashed at 10 W, it matched the GTX 1650. Also, the 512-bit fused FP performance per core is enough, and the Fujitsu chip (100% ARM) uses the new instruction set with 2048-bit FP per core. The thing goes like this: ARM will fuse the graphics instruction subset into even the heavy cores, and bye bye everyone.

          Comment


          • #45
            Originally posted by Qaridarium View Post

            That's not possible, because vendor lock-in only works if you have high market share.

            ARM does not have anything like that market share right now.
            Apple's Mac also doesn't have high market share, and if you count the number of devices this also applies to iPhones (~10% IIRC).

            IMO this is very likely to happen if ARM servers take up the custom-design business model, just like game consoles; we all know the PS5 and XSX work very differently even though both were designed by AMD using Zen 2 plus RDNA-like graphics.
            Last edited by zxy_thf; 12 April 2021, 11:22 PM.

            Comment


            • #46
              Originally posted by artivision View Post

              You really are ignorant. The LPP node offers three sub-nodes: 5 GHz at full power, 4 GHz at 2.5x less power, and 3.2 GHz at a further 2.5x less power. Second, the Apple SoC has been tested running Rise of the Tomb Raider via Rosetta 2; it delivered 70% of the performance of a many-core x86 system with a GTX 1650. That was with the GPU at 7 W and the CPU at another 7 W, with the SoC (including RAM) locked at 16.5 W. When the GPU was unleashed at 10 W, it matched the GTX 1650. Also, the 512-bit fused FP performance per core is enough, and the Fujitsu chip (100% ARM) uses the new instruction set with 2048-bit FP per core. The thing goes like this: ARM will fuse the graphics instruction subset into even the heavy cores, and bye bye everyone.
              No, I am not ignorant; I am just not a fanboy whose only knowledge of chip design comes from pop-tech sites.

              1) GHz is not about the node alone; as I said, it is about the design principles. If the chip is very complicated, it can never sustain 100% load at those clocks. The GHz figures you mentioned are best-case scenarios.
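The clock-vs-IPC tradeoff being argued here can be sketched with toy arithmetic: effective single-thread throughput scales with IPC times clock, so a higher clock alone proves nothing. The IPC and clock values below are made up purely for illustration, not measurements of any real chip.

```python
def relative_performance(ipc, clock_ghz):
    """Crude single-thread throughput proxy: instructions retired per second."""
    return ipc * clock_ghz * 1e9

# Hypothetical long-pipeline chip: lower IPC, higher clock.
long_pipeline = relative_performance(ipc=0.8, clock_ghz=3.0)

# Hypothetical short-pipeline chip: higher IPC, lower clock.
short_pipeline = relative_performance(ipc=1.2, clock_ghz=2.2)

# Despite an 800 MHz clock deficit, the higher-IPC design wins here.
print(short_pipeline > long_pipeline)  # True
```

With these made-up numbers the "slower-clocked" chip comes out ahead, which is exactly the Athlon-vs-NetBurst dynamic described above.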

              2) I want to see a link to that benchmark, to see the conditions of the test. Rise of the Tomb Raider is a very lightweight game that can be maxed out on low-end CPUs/GPUs. You need more games to reach a stronger conclusion, especially with a video game that is mostly GPU-bound.

              3) The GTX 1650 is a 12 nm design; the Apple M1 is a 5 nm design. That is a huge difference. You call me ignorant, but I am the only one talking about objective facts here. You are just an ignorant fanboy, and don't you dare call me ignorant again.

              Comment


              • #47
                Originally posted by vegabook View Post

                The Jetson AGX Xavier is 699 USD with 8 fairly modern Denver cores, 512 CUDA cores, plenty of tensor cores, 32 GB of RAM, NVMe SSD capability, and a PCIe expansion slot. It still has a few problems, but it's getting mighty close. You can most definitely use this as a full-performance Linux desktop, and the only thing you'll really be missing is gaming.

                EDIT: The cores are not Denver (as in the TX2); they're its successor, "Carmel".

                I have a Jetson AGX Xavier, and it has its uses.

                Nevertheless, for general-purpose applications it cannot compete with an x86 computer of the same price and the same power consumption.

                The NVIDIA Carmel cores have per-clock performance intermediate between Cortex-A73 and Cortex-A75, so they are more than 4 times slower than a modern AMD Zen 3 or Intel Tiger Lake core.


                Comment


                • #48
                  The Nvidia Grace CPU isn't really about the x86-64 ISA vs. the ARMv8 (or v9) ISA. It's about the business models of the x86 vendors vs. ARM's.

                  The problem is that many GPGPU workloads are starved for bandwidth when feeding data to the GPUs, and furthermore, cache-coherent memory access would improve the programmability of GPU applications. Now, Intel largely controls the PCI-SIG, which develops the PCI standard, and it's in no hurry to develop it in a direction that would help GPGPU. Hence things like NVLink, CAPI, and whatnot. Furthermore, AMD and Intel develop their server CPUs for the general-purpose server market.

                  So here's where ARM comes in. NVIDIA can "just" buy an off-the-shelf Neoverse core from ARM and develop its own CPU around it, adding high-bandwidth memory interfaces and a high-bandwidth, cache-coherent NVLink for connecting to the GPUs.
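A back-of-the-envelope calculation shows why the interconnect matters so much for bandwidth-starved GPGPU workloads. The figures below are rough ballpark numbers (roughly 32 GB/s usable for a PCIe 4.0 x16 slot, roughly 600 GB/s aggregate NVLink on an A100-class GPU), used only for illustration, not exact spec values.

```python
# Rough, illustrative link-bandwidth figures in GB/s; not exact spec values.
PCIE4_X16_GBPS = 32    # ~usable bandwidth of a PCIe 4.0 x16 slot
NVLINK_GBPS = 600      # ~aggregate NVLink bandwidth on an A100-class GPU

def transfer_seconds(dataset_gb, link_gbps):
    """Time to stream a dataset over a link, ignoring latency and overheads."""
    return dataset_gb / link_gbps

dataset_gb = 64  # e.g. a working set larger than the GPU's local memory
print(transfer_seconds(dataset_gb, PCIE4_X16_GBPS))  # 2.0 seconds
print(transfer_seconds(dataset_gb, NVLINK_GBPS))     # ~0.11 seconds
```

An order-of-magnitude gap like this is why a CPU purpose-built around a wide, coherent link to the GPU can matter more than the CPU cores themselves.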

                  Comment


                  • #49
                    Originally posted by Jumbotron View Post
                    The Age of ARM is here. x86 is legacy.
                    It's slated to come online in 2023, and we don't even know the details.

                    The funny thing is that the core CPU isn't really the impressive part of the announcement. I highly doubt this will even be the fastest ARM server CPU of its day. Its main job is really as a partner to the GPU, and scale out its memory. To that end, you could probably swap out the ARM cores with x86, POWER, etc. without much impact on overall performance.

                    Comment


                    • #50
                      Originally posted by Jumbotron View Post
                      Not only will that Neoverse ARM SoC be tied to Nvidia GPUs through NVLink, but the Grace cores will be ARM's Neoverse V1 cores, like those in the Fujitsu Fugaku supercomputer.
                      Check your facts!
                      Check your facts!
                      Check your facts!

                      By now, you should have learned not to make any factual claim without checking it.

                      ARM didn't even announce the V1 cores until Sep. 20, 2020. That's 3 months after Fugaku set its record on the Top500. ARM always announces new cores well in advance of any CPUs launching with them, because ARM is in the business of selling that IP. It doesn't sell chips; it sells IP. There's no reason for them to delay announcing their IP once it's ready for customers to use! In fact, that would make zero sense, since their IP has a very limited window of commercial viability!

                      Fujitsu presented A64FX all the way back in 2018, at Hot Chips:


                      Your timeline and facts are so badly out-of-joint, it's like saying that the bombing of Pearl Harbor started WW1!

                      I get that you're enthusiastic, but that's no excuse to be so sloppy with the facts.

                      Comment
