NVIDIA Announces Grace CPU For ARM-Based AI/HPC Processor


  • #61
    Originally posted by coder View Post
    I think that's not what they meant. I imagine they had in mind that one OS kernel should be managing hybrid ISA CPU cores, which share a global pool of RAM. This would be an interesting project, but I'm not sure we really have anything like it, today.
    I agree! One of the techniques Apple used to extract a performance uplift was to put the M1's RAM on the package, tightly linked to all of its various cores, not just the CPU and GPU. With Intel's upcoming move to start integrating RAM on the wafer itself, and the continued use of HBM by Nvidia and AMD on GPUs, one could see general RAM on the motherboard, linked by CXL or Infinity Architecture, as a kind of memory pool. This pool would, as a matter of course with CXL and Infinity Architecture, be part of a zero-copy, cache-coherent, heterogeneous compute environment.

    I think that's part of what we will see with Nvidia's Grace SoC. Each Grace core could have an NVLink from SiP RAM to each of the Grace CPU's ancillary cores (DSP, NPU, DPU, etc.) and straight to any and all of Nvidia's integrated or discrete and external GPUs.

    In this respect, we are approaching a time where HP's "The Machine" concept will be the prevailing design paradigm.

    Comment


    • #62
      Originally posted by oiaohm View Post
      2030 that is quite a way out. You have missed what Risc-V is up to.
      https://www.xda-developers.com/android-risc-v-port/
      It's true. China is the big wild card, with Russia being a smaller one. I'm sure neither likes ARM's US ownership. They're each building MIPS, RISC-V, and proprietary-ISA CPUs.

      And guess who makes most appliances and personal electronics? China. If China goes big on RISC-V, then they can single-handedly turn the tide against ARM.

      Comment


      • #63
        Originally posted by jabl View Post
        Now, Intel largely controls the PCI SIG which develops the PCI standard, and they're in no hurry to develop it in a direction which would help GPGPU.
        Yes, they did. It's called CXL.

        Now that Intel is building datacenter GPUs and AI accelerators, they're highly motivated to solve those problems.

        Comment


        • #64
          Originally posted by numacross View Post
          Thanks. I could swear I remember something about it sharing the same socket as their Opterons of the same era, but maybe it just shared the same chipset?

          Comment


          • #65
            Originally posted by Jumbotron View Post
            In this respect, we are approaching a time where HP's "The Machine" concept will be the prevailing design paradigm.
            This is the first time, in a while, that I've seen that reference. Does anyone have a link to a clear description of "The Machine"? I'm not looking for marketing BS.

            Comment


            • #66
              Originally posted by TemplarGR View Post

              No, I am not ignorant, I am just not a fanboy whose only knowledge about chip design comes from pop-tech sites.

              1) GHz are not about the nodes alone; they are, as I said, about the design principles. If the chip is very complicated, it can never achieve 100% load at those clocks. Those GHz you mentioned are best-case scenarios.

              2) I want to see a link to that benchmark, to see the conditions of the test. Rise of the Tomb Raider is a very lightweight game that can be maxed out on low-end CPUs/GPUs. You need more games to reach a stronger conclusion, especially with a video game that is mostly GPU-bound.

              3) The GTX 1650 is a 12nm design; the Apple M1 is a 5nm design. That is a huge difference. You call me ignorant, but I am the only one who is talking objective facts here. You are just an ignorant fanboy, and don't you dare call me ignorant again.


              Ice Storm: 10 watts peak load.

              Comment


              • #67
                Originally posted by coder View Post
                This is the first time, in a while, that I've seen that reference. Does anyone have a link to a clear description of "The Machine"? I'm not looking for marketing BS.
                Perhaps this could serve as a start. More to come if I can find it.

                When Hewlett-Packard launched its moonshot effort to create a new computing architecture centered on non-volatile memory last year, called The Machine,

                Comment


                • #68
                  Originally posted by coder View Post
                  This is the first time, in a while, that I've seen that reference. Does anyone have a link to a clear description of "The Machine"? I'm not looking for marketing BS.
                  Ahhh...here we go. A full rundown of HP's "The Machine" with pix, diagrams, etc. of all major components: the SoC, memory pools, data connections (both copper and fiber optic), data and interface planes, and rack sleds. It's pretty much an entire teardown of The Machine. Also, and I had forgotten this, The Machine was based on an undisclosed ARM SoC called the "Workload Processor". And the interconnects inside and outside The Machine were Gen-Z. Yeah...that Gen-Z, which will be working with the CXL coalition on tying up racks of CPUs, GPUs, and external memory pools in the next year or two.

                  Hewlett Packard Enterprise is not just a manufacturer that takes components from Intel and assembles them into systems. The company also has a heritage of

                  Comment


                  • #69
                    Originally posted by coder View Post
                    There are knowledgeable folks who would sincerely disagree on the latter point, as well as whether ARM is really even RISC. Rather than take a position on that, I just want to point out that the advantages of AArch64 include:
                    • simpler ISA -> simpler, more energy-efficient decoder
                    • fixed-sized instruction word -> wider front-end
                    • larger GP register file -> less spilling
                    • relaxed memory-consistency -> greater instruction-reordering flexibility

                    These are undeniable, though you can certainly debate the impact each has on performance.
                    I meant more that CPUs are internally more RISC in nature, as big instructions always get decoded into smaller, very simple RISC-like instructions that go onto the pipeline. Neither x86 nor the newer ARM architectures are RISC by themselves.

                    Comment


                    • #70
                      Originally posted by coder View Post
                      I think that's not what they meant. I imagine they had in mind that one OS kernel should be managing hybrid ISA CPU cores, which share a global pool of RAM. This would be an interesting project, but I'm not sure we really have anything like it, today.
                      That sounds horrible to code for and extremely challenging for a compiler. Honestly, CUDA already behaves more like that.

                      Comment
