A Look At Linux Application Scaling Up To 128 Threads


  • Adarion
    replied
    I wish there were a screenshot of the boot process, when KMS kicks in and you can see all 128 penguins... I wonder if they all fit on a normal screen.


  • Michael
    replied
    Originally posted by jacob
    It would be very interesting to see a benchmark of this baby against a Talos II with 48 cores. Is there any chance you could do that, Michael?
    Unfortunately I have no physical access to any POWER9 hardware yet, nor any persistent remote POWER9 systems, so I would need to request access and wait until I have it again...


  • jacob
    replied
    It would be very interesting to see a benchmark of this baby against a Talos II with 48 cores. Is there any chance you could do that, Michael?


  • squash
    replied
    Originally posted by bridgman

    LOL - I know this isn't exactly what you mean but the idea of the system not thinking it had enough load to bother clocking up until it saw more than 32 threads was amusing...
    Only 31 active threads? So bored! Better downclock to save some power!


  • bridgman
    replied
    Originally posted by Wielkie G
    The specification table shows that each configuration up to 32 threads runs at ~2.7GHz, but the 64 and 128 thread configurations run at 3.1GHz.
    LOL - I know this isn't exactly what you mean but the idea of the system not thinking it had enough load to bother clocking up until it saw more than 32 threads was amusing...


  • estan
    replied
    GraphicsMagick seems to scale with log(threads) instead of the number of threads, unless I am missing something.
    I was also wondering about that; the text does not seem to match what the figure shows at all.


  • jrch2k8
    replied
    Originally posted by varikonniemi
    What magic do Stockfish and vgr do when going from 32 to 64 threads more than doubles performance?
    It may depend on several factors. Threads (or cores) are only part of the problem: if your data is big enough, it can choke your cache pipelines or even the RAM bandwidth available to certain cores on certain NUMA nodes, cause L3 victim cache starvation, etc.

    When you see cases where the speedup is more than double, it usually means you have enough parallelism to remove a bandwidth or cache bottleneck, because the hardware can handle the smaller chunks of data more efficiently.

    Of course there are other factors, related and unrelated to bandwidth or cache, but those usually contribute on a smaller scale.

    There is also the possibility of a runtime algorithm selector in that application. Sometimes you find a way to make an algorithm really neat and fast, only to realize later that it hits a ceiling at some point and stops scaling, even though up to that point it is the fastest implementation you can reach. Then, after months of breaking your head, you realize that the other "slow" algorithm, which you didn't want to use because it is slower below that ceiling, turns out to be a scalability Chuck Norris and ends up being a lot faster once it can scale enough. Hence you end up switching algorithms at runtime depending on the size of the dataset (or any other reasonable parameter) to use the most effective tool for the job.
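A minimal sketch of what such a runtime selector can look like, using sorting as a stand-in workload. Everything here is illustrative: the function names and the `SMALL_INPUT_THRESHOLD` cutoff are hypothetical, and nothing in this thread confirms that Stockfish or any other benchmarked program works this way.

```python
# Hypothetical cutoff; a real program would tune this by measurement.
SMALL_INPUT_THRESHOLD = 32


def insertion_sort(items):
    """Very fast on tiny inputs, but O(n^2), so it hits a ceiling quickly."""
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def merge_sort(items):
    """More overhead on tiny inputs, but O(n log n), so it keeps scaling."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def adaptive_sort(items):
    """Pick the implementation at runtime based on the size of the dataset."""
    if len(items) < SMALL_INPUT_THRESHOLD:
        return insertion_sort(items)
    return merge_sort(items)
```

The same dispatch idea applies if the deciding parameter is the number of available threads rather than the input size.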


  • Wielkie G
    replied
    I might have found out the reason for greater than expected scaling between 32 and 64 threads.

    The specification table shows that each configuration up to 32 threads runs at ~2.7GHz, but the 64 and 128 thread configurations run at 3.1GHz. The lower-thread-count configurations might not have had CPU turbo working, hindering their performance.
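As a rough check of that idea, the arithmetic can be sketched as follows. It assumes, purely for illustration, that performance scales linearly with both thread count and reported clock speed, which real workloads rarely do; the function name is made up, and 2.7GHz is the approximate figure quoted above.

```python
def expected_speedup(threads_before, threads_after, ghz_before, ghz_after):
    """Ideal speedup if performance scaled linearly with threads and clock."""
    return (threads_after / threads_before) * (ghz_after / ghz_before)


# Going from 32 threads at ~2.7GHz to 64 threads at 3.1GHz.
ratio = expected_speedup(32, 64, 2.7, 3.1)
print(f"{ratio:.2f}x")  # prints "2.30x"
```

Under those assumptions, a result of up to roughly 2.3x between the 32 and 64 thread runs would be explained by the clock difference alone.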


  • darkbasic
    replied
    Awesome machine!


  • Michael
    replied
    Originally posted by Mr.Radar
    Did you buy this system or is it on loan from Dell for review purposes? Configuring this server on Dell's online store gets it into five figures very easily just from the CPU and memory options.
    Review sample to be used for future Linux server benchmarking and other interesting performance tests.
