Benchmarking An ARM 96-Core Cavium ThunderX System


  • #11
    There's some great coverage of ThunderX2 on The Next Platform.

    Some nice detail on the architecture and projected performance here:
    https://www.nextplatform.com/2017/11...-server-punch/

    And this has some numbers from Cray (they're offering ThunderX2 supers):
    https://www.nextplatform.com/2017/11...essor-shakeup/

    Edit: if you're interested in the world beyond x86, it's worth following Next Platform. They're the only site I know of that goes into detail on developments in ARM, Power etc. They also do nice coverage of hardware beyond that, like Google's homegrown TPU (Tensor accelerator).

    (not here to spam about that site, just like what they do).
    Last edited by aaronage; 28 February 2018, 01:20 PM.



    • #12
      Not a cheap server, but would be sweet to have



      • #13
        Originally posted by wizard69
        Some of the results just don't make sense. You would think highly parallel things like the Linux kernel would do better even with relatively weak cores. This has me wondering if there are teething or configuration problems.
        I would assume a lack of memory bandwidth, or an inefficient way for those 96 threads to access memory. I wonder what the CPU load was.



        • #14
          Originally posted by aaronage
          If you want to play with ThunderX, Scaleway offer instances up to 64 vCPU/128GB running on ThunderX.
          Nice, thanks for that. The Scaleway offering is a 64 vCPU ARM server with 128GB RAM and a 1TB SSD for 280 Euro/month. By contrast, their x86_64 12-core server has 120GB RAM and a 1TB SSD for 180 Euro/month.

          That seems pretty cost-competitive, depending upon your workload and what 12-core x86_64 CPU they're using. If you're running, say, something that does C-Ray style calculations or something that matches the performance profile of Java JMH then this thing is probably worth the cost. Otherwise, no.

          If I was getting a high performance home machine I'd go AMD Threadripper or Core i9 over this.
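To put rough numbers on the price comparison above, here is a minimal Python sketch. The prices and core counts are the ones quoted in this post. Treating a vCPU and an x86 core as comparable billing units and assuming the workload scales perfectly across cores are simplifying assumptions, and the names `offers`, `eur_per_core`, and `break_even_speedup` are purely illustrative.

```python
# Figures quoted in the post; everything else is an illustrative assumption.
offers = {
    "arm_thunderx": {"cores": 64, "eur_per_month": 280},  # 64 vCPU instance
    "x86_64": {"cores": 12, "eur_per_month": 180},        # 12-core server
}


def eur_per_core(offer):
    """Monthly price divided by the advertised core (or vCPU) count."""
    return offer["eur_per_month"] / offer["cores"]


def break_even_speedup(arm, x86):
    """How much faster one x86 core must be than one ARM vCPU for both
    offers to cost the same per unit of throughput.

    Assumes perfect scaling across cores, which real workloads rarely reach.
    """
    return eur_per_core(x86) / eur_per_core(arm)


if __name__ == "__main__":
    for name, offer in offers.items():
        print(f"{name}: {eur_per_core(offer):.3f} EUR per core per month")
    ratio = break_even_speedup(offers["arm_thunderx"], offers["x86_64"])
    print(f"break-even per-core speedup for x86: {ratio:.2f}x")
```

Under those assumptions the ARM instance works out to 4.375 Euro per vCPU against 15 Euro per x86 core, so it only wins on price if each x86 core is less than about 3.43x faster on your workload and the workload really does use all 64 threads.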



          • #15
            Gigabyte released their ThunderX2 boards in November 2017.

            GIGABYTE's R181 series is a 1U platform with dual-socket ThunderX2 compute node with best-in-class throughput, memory configuration and capacity. GIGABYTE's R281 series is a 2U platform with dual-socket ThunderX2 compute node with best-in-class throughput, memory configuration and capacity.



            • #16
              it's only 28nm....



              • #17
                Has anyone seen benchmarks of this against A-series Opteron?



                • #18
                  Nice, thanks for the benchmarks. I had started playing with a 96-core Cavium ThunderX the other day too, but ran into a roadblock on CentOS 7, my go-to OS, as there isn't an official MariaDB 10.x aarch64 YUM repo available yet. The benchmark numbers here leave a lot to be desired, though. So while I'm waiting on an official MariaDB 10.x aarch64 YUM repo, I've been playing with an AMD EPYC 7401P - a very interesting beast for the price!



                  • #19
                    Originally posted by chuckula
                    It's not disappointing because of its blatant lack of single-thread performance.

                    It's disappointing because Phoronix teed up benchmarks that should literally fall right into the lap of an OMG 96 COAR system with gobs of memory bandwidth and a pretty large power consumption envelope that's at least as high as a dual socket Xeon or Epyc system. And even there it's clearly inferior to a desktop system you could have bought last year, much less a real server.
                    It's a very old ARM semi-custom design based on the A57, which was not efficient and counts as the worst ARM OoO design ever. Bump it up to the A72 reference design and you get 20% more performance at a 33% smaller TDP. That brings us down to 67W for a similar 48-core part. Then cut that in half again by switching to 16~12 nm FinFET and you get 33.5W per SoC, or 67W for a system containing two of them. Guess what: the ARM server now becomes at least 2x more efficient in the worst case, while offering the same or better performance in many workloads such as cloud storage and dynamic I/O requests. 2x less energy = 2x cheaper ownership cost. And that only accounts for matching the lithography and moving to a current, widespread core design. ARM has a unique market advantage right now: it can push its power advantage even further by combining OoO and in-order cores, which simply doesn't exist in the x86 world. Now imagine a 100-core system (48 A72-based cores + 2 A55 cores per socket) that in an active idle state, still fully capable of handling background and RT tasks, uses 1W or less of power. That is an order of magnitude difference.
                    Last edited by Zola; 01 March 2018, 08:57 AM.
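The power arithmetic in post #19 can be written out as a short Python sketch. The 33% TDP reduction, the halving from the process shrink, and the two-socket layout are taken from the post. The 100W starting point is an assumption, back-derived from the 67W figure the post gives, not a measured ThunderX TDP, and `projected_power` is a hypothetical helper name.

```python
def projected_power(baseline_w, tdp_reduction_pct=33, process_divisor=2, sockets=2):
    """Chain the post's two estimated savings and return the three figures
    it quotes: per-SoC power after the core swap, per-SoC power after the
    process shrink, and total power for the whole system.

    baseline_w is an assumed starting TDP per SoC, not a measured value.
    """
    after_core_swap = baseline_w * (100 - tdp_reduction_pct) / 100
    after_shrink = after_core_swap / process_divisor
    system_total = after_shrink * sockets
    return after_core_swap, after_shrink, system_total


if __name__ == "__main__":
    core_swap_w, per_soc_w, system_w = projected_power(100)
    print(f"after A72 swap:     {core_swap_w} W per SoC")
    print(f"after FinFET shrink: {per_soc_w} W per SoC")
    print(f"two-socket system:   {system_w} W")
```

With the assumed 100W baseline this reproduces the post's 67W, 33.5W and 67W figures. The result is only as good as the two percentage estimates it chains together, so treat it as a back-of-the-envelope projection.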



                    • #20
                      What's the power consumption for 96-core ThunderX at idle and under load?

