Qualcomm Sampling 10nm 48-Core Server SoC


  • #11
    Who cares about X86 if your application is entirely written in HTML, and being run on Linux?

    Comment


    • #12
      x86 is power hungry and has a horrific ISA format, which means a larger code footprint, hungrier decoders and caches, and extra complications with instruction translation.
      And even when all this is solved on a technical level, you still end up with legal and licensing limitations. Intel's only alternative is AMD, and that's it.

      The ARM scene is wide open to new players, and by its nature it doesn't even insist on ARM. Whoever decides to recompile their code for ARM knows there isn't much to stop them from doing it again for something completely different.

      Also, now that applications are using multithreading more and more, single-thread performance is not that essential anymore, which means operating in an area where ARM is much more comfortable: a higher count of more power-efficient cores.

      Also, Samsung, Qualcomm and the like aren't that far behind Intel WRT pure CPU muscle or uncore material.

      If a nice, speedy 32- or 64-core ARM/MIPS/Power chip were available on an xATX board, I wouldn't lose a nanosecond contemplating Zen.


      Comment


      • #13
        I think the Apache Spark benchmark points in the direction where this will be used: very large batch jobs which can run in parallel but don't need high single-thread performance! Running a map-reduce job on hundreds of cores makes sense!
        However, Spark's performance is horrible, as it's a Java application with a lot of overhead. It's a quantum leap away from old Hadoop map-reduce jobs, but it's still slow!
        The memory footprint is horrible as well. So to feed 48 cores you will probably need a lot of RAM and a very, very fast memory bus. Otherwise they need to do some segmenting where each core has its own private memory lanes and the application is aware of which core connects to which memory segment!
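
To put a rough number on the "very fast memory bus" point, here is a minimal sketch of how aggregate bandwidth divides up when every core is busy. The channel count and memory speed are assumptions picked for illustration, not published Centriq 2400 figures, and `bandwidth_per_core` is a hypothetical helper name.

```python
def bandwidth_per_core(total_gb_per_s: float, cores: int) -> float:
    """Even share of the aggregate memory bandwidth per busy core, in GB/s."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_gb_per_s / cores


# Assumed figures for illustration only (not Qualcomm specifications).
CHANNELS = 6                  # assumed number of memory channels
GB_PER_S_PER_CHANNEL = 19.2   # DDR4-2400, 64-bit channel: 2400 MT/s * 8 bytes

if __name__ == "__main__":
    total = CHANNELS * GB_PER_S_PER_CHANNEL
    for cores in (1, 12, 48):
        share = bandwidth_per_core(total, cores)
        print(f"{cores:2d} busy cores -> {share:6.1f} GB/s each")
```

The even split is the simplest possible model; real workloads rarely stream from every core at once, and caches absorb part of the traffic.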

        Comment


        • #14
          Originally posted by L_A_G View Post

          I really wouldn't say that any of that is correct... In terms of software there really isn't anything that beats x86 when it comes to ecosystem, and licensing is really only an issue for companies who want to make their own chips, which is something relatively few companies in the server space actually want to do.
          Best NON-X86 ecosystem, sheesh. Also, apparently Qualcomm wants in...

          Comment


          • #15
            Originally posted by Spacefish View Post
            So to feed 48 cores you will probably need a lot of RAM and a very, very fast memory bus. Otherwise they need to do some segmenting where each core has its own private memory lanes and the application is aware of which core connects to which memory segment!
            It's a bit hard to be specific about this CPU (Centriq 2400) since Qualcomm haven't really released much detail that I can find - but all designs have some kind of tradeoff (pin count, die size, whatever). Maybe Qualcomm have a specific application category in mind... maybe they're just late to the party... hard to tell without more detail.

            What you're talking about gets complicated rapidly... in your theoretical "ideal" CPU where each core has its own memory, you either need (in this case) 48 memory controllers and 48 memory modules, or 48 lots of on-core memory. And then what happens if one core needs more memory than is attached?
            Have a look at the bus topology of a multiprocessor Xeon system or a Knights Landing (Xeon Phi). Things get more complicated from there (crossbar memory etc).

            This is a good read:
            NUMA (Non-Uniform Memory Access): An Overview
            http://queue.acm.org/detail.cfm?id=2513149
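
As a toy illustration of the "what happens if one core needs more memory than is attached?" question, here is a minimal sketch of a local-first placement policy that spills to remote memory, plus the latency cost of doing so. The function names and the latency figures are hypothetical, chosen only to show the shape of the tradeoff, and this is not how any particular OS allocator is implemented.

```python
def place_allocation(request_gb, local_free_gb, remote_free_gb):
    """Split a request between the local node and remote nodes.

    Local-first policy: use local memory until it runs out, then spill.
    Returns (local_gb, remote_gb); raises MemoryError if it cannot fit.
    """
    local = min(request_gb, local_free_gb)
    remote = request_gb - local
    if remote > remote_free_gb:
        raise MemoryError("request exceeds total free memory")
    return local, remote


def average_latency_ns(local_gb, remote_gb, local_ns=90.0, remote_ns=150.0):
    """Capacity-weighted average latency, assuming uniformly spread accesses.

    The default latencies are assumed placeholders, not measurements.
    """
    total = local_gb + remote_gb
    if total == 0:
        return 0.0
    return (local_gb * local_ns + remote_gb * remote_ns) / total


if __name__ == "__main__":
    local, remote = place_allocation(6, local_free_gb=4, remote_free_gb=100)
    print(local, remote, average_latency_ns(local, remote))
```

The more of a working set that spills, the closer the average drifts to the remote latency, which is why NUMA-aware placement matters.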

            Comment


            • #16
              Originally posted by Brane215 View Post
              x86 <snip>
              has horrific ISA format
              <snip>
              Could you elaborate, please?
              I know this used to be an issue, but now ARM also decodes to ucode (gosh, since... Cortex-A9 or so).

              Comment


              • #17
                Originally posted by L_A_G View Post
                Sorry, but I don't really see the point in a 48 core ARM chip.

                The main point of ARM is good performance at low wattage, but with this many cores it's not going to be low wattage, which puts it squarely in the territory of Intel's Xeon and AMD's upcoming Zen-based Opteron chips. Additionally this number of cores really isn't all that useful for anything except for compute workloads, which would put it in the line of fire of Intel's Xeon Phi accelerators along with Nvidia and AMD's GPGPU products. I'd go as far as to call this thing just a flat-out solution in search of a problem.
                Assuming the bus isn't terribly designed, this lets you pay for the DRAM, NIC(s) and accelerators ONCE per 48 cores. In a best-case scenario all 48 cores will be able to interleave their responses, and each will only be responsible for 1/48 of the power budget. The worst case is that only 1 core is active (HOPEFULLY the others are either offlined via hotplug or in a very low-power C-state), occasionally servicing requests while paying for all the other hardware whose cost would otherwise be amortized.
                If you want a specific application, Qualcomm mentioned Hadoop and Spark. To me, that suggests rather low IPC (so, relying on stupidly parallel workloads and the new ARM NEON instructions): http://www.eetimes.com/document.asp?doc_id=1330339
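
The amortization argument above can be sketched in a few lines. The wattages are made-up placeholders, not measurements of any real SoC, and `watts_per_active_core` is a hypothetical helper; the model also assumes idle cores draw nothing, which is the optimistic case.

```python
def watts_per_active_core(active_cores, shared_watts, core_watts):
    """Power attributable to each active core.

    shared_watts: DRAM, NIC(s), accelerators, paid once per SoC.
    core_watts:   power drawn by one busy core.
    Idle cores are assumed to draw nothing (offlined or deep C-state).
    """
    if active_cores < 1:
        raise ValueError("need at least one active core")
    return core_watts + shared_watts / active_cores


if __name__ == "__main__":
    SHARED_WATTS = 48.0  # assumed placeholder
    CORE_WATTS = 2.0     # assumed placeholder
    for n in (1, 8, 48):
        w = watts_per_active_core(n, SHARED_WATTS, CORE_WATTS)
        print(f"{n:2d} active cores -> {w:5.1f} W per core")
```

With one active core, that core carries the whole shared budget; with all 48 busy, each carries 1/48 of it.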

                Comment


                • #18
                  Also....

                  Last edited by liam; 09 December 2016, 01:06 AM.

                  Comment


                  • #19
                    Michael, pls get friendly with the Qualcomm people to get some hardware to benchmark! After all, this is the largest Linux news site - surely you would have some pull!

                    Comment


                    • #20
                      Originally posted by L_A_G View Post
                      Sorry, but I don't really see the point in a 48 core ARM chip.

                      The main point of ARM is good performance at low wattage, but with this many cores it's not going to be low wattage, which puts it squarely in the territory of Intel's Xeon and AMD's upcoming Zen-based Opteron chips. Additionally this number of cores really isn't all that useful for anything except for compute workloads, which would put it in the line of fire of Intel's Xeon Phi accelerators along with Nvidia and AMD's GPGPU products. I'd go as far as to call this thing just a flat-out solution in search of a problem.
                      ARM's deal is the best price/perf at phone-friendly power. If they can manage the best price/perf at server power levels, all the better. Many embarrassingly parallel workloads at large companies like Google or Facebook couldn't care less about node performance. They want the best performance/(total cost of ownership). That includes things like power, cooling, purchase cost, maintenance cost, error rate, etc.

                      If a rack + two 30 A, 208 V, three-phase PDUs + ARM ends up delivering more performance per $, then I can see it being very popular. Intel mostly specializes in maximum performance per core.
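
For anyone who wants to play with the numbers, here is a rough sketch of the rack power budget and a performance-per-dollar comparison. The three-phase formula (sqrt(3) * line-to-line volts * amps) is standard; the 80% continuous-load derating, the unity power factor, and the cost model that counts only purchase price and electricity are my assumptions, and both function names are hypothetical.

```python
import math


def three_phase_watts(volts_line_to_line, amps, derate=0.8, power_factor=1.0):
    """Usable power of one three-phase circuit, in watts.

    derate=0.8 assumes the common 80% continuous-load rule;
    power_factor=1.0 is an idealized assumption.
    """
    return math.sqrt(3) * volts_line_to_line * amps * derate * power_factor


def perf_per_dollar(perf_units, purchase_usd, watts, years, usd_per_kwh):
    """Performance per dollar of a simplified TCO (purchase + electricity).

    Cooling, maintenance and error rates are deliberately left out.
    """
    energy_usd = watts / 1000 * 24 * 365 * years * usd_per_kwh
    return perf_units / (purchase_usd + energy_usd)


if __name__ == "__main__":
    rack_watts = 2 * three_phase_watts(208, 30)
    print(f"usable rack budget: {rack_watts / 1000:.1f} kW")
```

Whichever platform packs the most useful work into that fixed power budget at the lowest total cost wins, regardless of per-core speed.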

                      Comment
