Linux vs. BSD CPU Scaling Up To 20 Threads On The Core i9 7900X

  • Linux vs. BSD CPU Scaling Up To 20 Threads On The Core i9 7900X

    Phoronix: Linux vs. BSD CPU Scaling Up To 20 Threads On The Core i9 7900X

    With Intel's recently-launched Core i9 7900X I have carried out some interesting BSD vs. Linux benchmarks when testing out various distributions and comparing each of them at 1, 2, 4, 6, 8, 10, and 20 threads on this $999+ USD processor.

    http://www.phoronix.com/vr.php?view=24978

  • #2
    I'm a lot more curious why (edited) a lot of tests showed almost zero scalability beyond 6 threads.

    Blender should have scaled linearly because rendering uses pretty much independent threads so intercore communication is minimal.

    Is it throttling? OS threads scheduling problems? A lack of RAM/IO bandwidth?
    Last edited by birdie; 07-28-2017, 09:59 AM.

    • #3
      Originally posted by birdie
      I'm a lot more curious why most of the tests showed almost zero scalability beyond 6 threads.

      Blender should have scaled linearly because rendering uses pretty much independent threads so intercore communication is minimal.

      Is it throttling? OS threads scheduling problems? A lack of RAM/IO bandwidth?
      some of it is a bit tricky visualization; if the graph Y-axis is about time, and you have linear (perfect) scaling with core count, you get a 1/x graph not a linear graph.

      • #4
        Originally posted by arjan_intel

        some of it is a bit tricky visualization; if the graph Y-axis is about time, and you have linear (perfect) scaling with core count, you get a 1/x graph not a linear graph.
        This is where some inverted semi-log graphs or other scaling tricks could come into play to show the scaling more clearly.
        Another idea is to use normalized scaling factors in addition to the raw performance numbers so you could clearly visualize the changes as you add additional cores.
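
        A minimal sketch of the normalized-scaling idea, assuming each benchmark reports a wall-clock time per thread count. The function names and the timings below are hypothetical and are not taken from the article:

```python
def scaling_factors(times_by_threads):
    """Speedup of each run relative to the run with the fewest threads."""
    base_threads = min(times_by_threads)
    base_time = times_by_threads[base_threads]
    return {n: base_time / t for n, t in sorted(times_by_threads.items())}


def parallel_efficiency(times_by_threads):
    """Speedup divided by the increase in thread count (1.0 = perfect scaling)."""
    base_threads = min(times_by_threads)
    factors = scaling_factors(times_by_threads)
    return {n: s * base_threads / n for n, s in factors.items()}


# Hypothetical timings in seconds (illustration only, not measured data).
example = {1: 120.0, 2: 60.0, 4: 30.0, 6: 20.0, 8: 20.0}

efficiency = parallel_efficiency(example)
for n, s in scaling_factors(example).items():
    print(f"{n:>2} threads: {s:.2f}x speedup, {efficiency[n]:.0%} efficiency")
```

        Plotted this way, perfect scaling is a straight line for the speedup and a flat line at 100% for the efficiency, so a plateau stands out immediately.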

        • #5
          Originally posted by arjan_intel

          some of it is a bit tricky visualization; if the graph Y-axis is about time, and you have linear (perfect) scaling with core count, you get a 1/x graph not a linear graph.
          The graphs at Phoronix show raw performance, so they should look more or less linear; instead there's a plateau right after six cores.

          • #6
            Originally posted by birdie

            The graphs at Phoronix show raw performance, so they should look more or less linear; instead there's a plateau right after six cores.
            The graphs that show actual performance scores that go up (theoretically) to infinity are mostly OK.

            However, a bunch of those benchmarks are time-based benchmarks that go down to zero. The scaling on those graphs is not particularly clear and may give false impressions about what's actually happening with performance. For example, if you start at 100 seconds and go to 50 seconds from one core to two cores, that's 50 units on a graph but only a scaling factor of 2x. If you show another doubling with more cores that goes from 2 seconds to 1 second that's only 1 unit - 50 times smaller in absolute terms and hard to see on the graph - but that 1 second change is another doubling of performance that's certainly important.
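
            The arithmetic above can be checked with a short sketch. `compare_steps` is a hypothetical helper, and the numbers are the ones from the example in this post, not measured results:

```python
import math


def compare_steps(before, after):
    """Return (absolute change in seconds, number of performance doublings)."""
    absolute = before - after
    doublings = math.log2(before / after)
    return absolute, doublings


# 100 s -> 50 s: a large step on a linear time axis, one doubling of performance.
print(compare_steps(100.0, 50.0))

# 2 s -> 1 s: a tiny step on a linear time axis, but also one doubling.
print(compare_steps(2.0, 1.0))
```

            On a log-scaled time axis both steps would have the same height, which is why a linear time axis hides the later gains.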

            • #7
              Yep, I would also be interested in a comparison with Win 10. Thx, Michael!

              • #8
                First up with Rodinia and its OpenMP LavaMD profile, the DragonFly results were a bit of a surprise. The single-core performance of DragonFly was much slower than the tested Linux distributions and with a few threads did continue running much slower, but by the time of hitting 20 threads, DragonFly was competing with the three tested Linux distributions.
                That probably means no Turbo Boost support in DragonFly.

                UPD: but no, this doesn't explain a 2x performance difference on a single core.
                Last edited by puleglot; 07-28-2017, 11:21 AM.

                • #9
                  I am curious to learn why some of these tests were not executed on the BSDs?

                  • #10
                    I can't believe you labeled this article Linux vs. BSD when most of the tests don't even have BSD in them and the ones that do have various different versions. Pretty much pointless.
                    I know for a fact John the Ripper compiles easily on BSD, and with some minimal effort you could get 99% of the tests working.
                    You also need to configure powerd on BSD for turbo to work.
