13-Way IBM POWER9 Talos II vs. Intel Xeon vs. AMD Linux Benchmarks On Debian

  • #11
    Originally posted by willmore
    For a quick run-through of the systems, this article does a good job, but you continue to rely on your readers being very knowledgeable about the limitations of specific benchmarks. Said another way, you make it very easy for your readers to be misled unless they possess a great deal of knowledge about the benchmarks you run.
    That's part of the game: if you want to encode using x264 then on ppc it will be slower because the software is not optimized.
    ## VGA ##
    AMD: X1950XTX, HD3870, HD5870
    Intel: GMA45, HD3000 (Core i5 2500K)

    Comment


    • #12
      It would be helpful to better label the single-threaded benchmarks (compress-zstd*, encode-mp3, encode-flac, phpbench, scikit-learn I believe), since on a system with so many cores, it is a bit of a different comparison than the more throughput-based benchmarks.

      For example, a price/performance ratio for phpbench is a bit strange if you were buying a 144-thread server processor and keeping 143 threads idle... Not quite sure what is going on with compress-zstd, since it appears to be single-threaded on my x86 systems yet is one where POWER9 is somewhat faster, at least relative to the other single-threaded programs and without some other obvious factor (e.g. I see quite a few memory stalls on scikit-learn).
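
To make the phpbench point concrete, here is a rough sketch. Everything in it is an assumption for illustration: the scores, prices, and thread counts are placeholders rather than results from the article, and the function names are hypothetical. It compares the naive score-per-dollar ratio with one that only charges the benchmark for the share of hardware threads it actually used.

```python
# Hypothetical illustration: all scores, prices, and thread counts are
# placeholders, not measurements from the article.

def perf_per_dollar(score, price):
    """Naive metric: benchmark score divided by the full CPU price."""
    return score / price


def perf_per_dollar_threads_used(score, price, threads_total, threads_used=1):
    """Charge the benchmark only for the share of hardware threads it used."""
    effective_price = price * threads_used / threads_total
    return score / effective_price


cpus = {
    # name: (single-thread score, CPU price in USD, hardware threads)
    "many-thread server CPU": (100.0, 3000.0, 144),
    "desktop CPU": (120.0, 300.0, 8),
}

for name, (score, price, threads) in cpus.items():
    naive = perf_per_dollar(score, price)
    adjusted = perf_per_dollar_threads_used(score, price, threads)
    print(f"{name}: naive={naive:.3f}/$, per-thread-used={adjusted:.3f}/$")
```

With these placeholder numbers the naive ratio favors the desktop part by a wide margin, while the per-thread-used ratio favors the server part. Neither is "right"; the point is only that the ranking depends on how the idle threads are accounted for.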

      --mev

      Comment


      • #13
        I'd like Talos to expand the POWER9's NVLinks and provide some real means to use them in the form of NVLink-enabled compute cards (Volta-based).
        Alternatively, provide PCIe compute cards with an NVLink-enabled data-feeding frontend (an FPGA feeding NVLink with data, with released IP).
        That would really make a market impact.

        Right now it's just a very expensive platform with small real world benefits for those who do not care about "free and open".

        Comment


        • #14
          Very good test, thank you for taking the time and effort to pull this together. Most people perusing Phoronix have a decent general knowledge on the variances between CPU architectures and the impacts software has on using them.

          If we can just get ThunderX2 and Qualcomm Centriq in the mix with these numbers, I would say we (collectively as supporters of Phoronix) have been very well served. Kudos to Talos for not hiding in a corner.

          Comment


          • #15
            Originally posted by willmore
            Several of the benchmarks where the POWER9 falls behind are ones where the program in question relies a lot on inline assembly--which isn't available for most non-x86 architectures. So, they fall back to the generic C version. That makes the processor look bad when it's really a benchmarking problem. Comparing handmade assembly vs. C compiler-generated code isn't very interesting, especially when you throw in a change in processor architecture. Notable for this problem is x264.

            Single threaded tests? Really?

            As has been pointed out, the LAME graph is missing--the FLAC chart is duplicated in its place.



            Which you didn't do nor did you report the values necessary to do so.

            Given that you compared the costs of just the CPUs in the performance/$ charts, you're really leaving out any meaningful way to compare these systems. Considering you compared *systems*, but reported results for just *processors*, that whole page can be pretty much ignored. Actually, it *must be* ignored lest one be misled.

            For a quick run-through of the systems, this article does a good job, but you continue to rely on your readers being very knowledgeable about the limitations of specific benchmarks. Said another way, you make it very easy for your readers to be misled unless they possess a great deal of knowledge about the benchmarks you run.
            Well, I for one appreciate the single-thread benches; it's also a useful metric. Even if you tend to run highly parallel tasks on systems like this, it's always good to know the single-core speed if you code latency-sensitive applications.

            Comment


            • #16
              Originally posted by milkylainen
              I'd like Talos to expand the POWER9's NVLinks and provide some real means to use them in the form of NVLink-enabled compute cards (Volta-based).
              Alternatively, provide PCIe compute cards with an NVLink-enabled data-feeding frontend (an FPGA feeding NVLink with data, with released IP).
              That would really make a market impact.
              And where are the NVIDIA drivers for PPC64? =P

              UPDATE: Oh! They actually provide drivers for Power 8 and 9
              Download the Italiano Linux Power 9 Ubuntu 16.04 for Linux POWER Ubuntu 16.04 systems. Released 2017.12.21
              Last edited by puleglot; 25 June 2018, 05:02 PM.

              Comment


              • #17
                Originally posted by puleglot
                And where are the NVIDIA drivers for PPC64? =P

                UPDATE: Oh! They actually provide drivers for Power 8 and 9
                http://www.nvidia.com/download/drive...x/128711/en-us
                Yep! We also have CAPI 2.0, which plays well with Mellanox cards and is basically a PCIe version of NVLink for non-NVIDIA cards. There is even FPGA support from Xilinx, though you still need an x86 box to actually run the FPGA synthesis since Xilinx apparently hasn't clued in on the fact that POWER is really good at FPGA synthesis and PAR....

                Bottom line: if you want an NVIDIA compute card and are OK with the closed NVIDIA ecosystem, Talos II already lets you work with some of the NVIDIA interconnect technologies (the ones that run over the PCIe electrical interface). We've got two slots that can be used for that purpose per mainboard, so development work for e.g. one of the Sierra / Summit supercomputers can be done desk-side including testing!

                Comment


                • #18
                  It looks like zstd had some performance improvements in their 1.3.4 version released at end of March. Can you confirm that x86 and POWER9 systems were tested with the same version of zstd? Just curious what factors would cause this particular single-threaded benchmark to be quite a bit better on POWER9 than the others.

                  --mev

                  Comment


                  • #19
                    Originally posted by austin754
                    It looks like zstd had some performance improvements in their 1.3.4 version released at end of March. Can you confirm that x86 and POWER9 systems were tested with the same version of zstd? Just curious what factors would cause this particular single-threaded benchmark to be quite a bit better on POWER9 than the others.

                    --mev
                    If you click the links through to OpenBenchmarking.org you are able to confirm all of the test details:

                    OpenBenchmarking.org, Phoronix Test Suite, Linux benchmarking, automated benchmarking, benchmarking results, benchmarking repository, open source benchmarking, benchmarking test profiles

                    Michael Larabel
                    https://www.michaellarabel.com/

                    Comment


                    • #20
                      @bridgman:

                      When will we see EPYC+(2) _dual_ systems here, finally?

                      Comment
