Benchmarking AMD FX vs. Intel Sandy/Ivy Bridge CPUs Following Spectre, Meltdown, L1TF, Zombieload


  • #11
    Originally posted by Eero View Post
    When you say they're at 100% utilization, did you check how much of that was IO-wait? In "top", press "1" to see all cores separately; the "wa" column then shows the IO-wait percentage for each core. In the compilation case, IO-wait would be waiting on disk (in other cases it could be waiting for the network, the GPU, anything other than the CPU).
    I'll do that in the future - very interesting.

    • #12
      Originally posted by Eero View Post
      Kernel code is C, which is fast to compile, so it's mostly disk bound. Firefox is C++/Rust, which is much slower to compile, so it's CPU bound.

      (I'm assuming you've asked both builds to parallelize to the same number of cores.)
      Can you then explain this?



      It doesn't scale linearly with CPU power, but I would mostly attribute that to the single-threaded parts (like linking the final image) rather than IO, especially since the benchmarks were done on an SSD.
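      For what it's worth, Amdahl's law already predicts that kind of sub-linear scaling once a serial step like linking is involved. A minimal sketch (the parallel fractions below are made-up numbers, purely for illustration):
      Code:
      # Amdahl's law: overall speedup = 1 / ((1 - p) + p / n),
      # where p is the parallelizable fraction of the build and n the core count.
      def speedup(p, n):
          return 1.0 / ((1.0 - p) + p / n)

      for p in (0.80, 0.95):      # assumed parallel fractions, not measured values
          for n in (2, 4, 8):     # core counts
              print(f"p={p:.2f}, n={n}: {speedup(p, n):.2f}x")
      Even with 95% of the build parallel, 8 cores only give about a 5.9x speedup, so sub-linear scaling on its own doesn't prove IO is the bottleneck.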

      • #13
        Originally posted by Eero View Post
        Kernel code is C, which is fast to compile, so it's mostly disk bound. Firefox is C++/Rust, which is much slower to compile, so it's CPU bound.

        (I'm assuming you've asked both builds to parallelize to the same number of cores.)
        That's just rubbish. Compilation with any modern compiler stack is mostly CPU bound.
        There are probably exceptions to this rule, like TCC on stupidly simple code times a zillion different files, but the general rule that compilation is CPU bound still applies.
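        If anyone wants to check rather than argue, comparing wall-clock time against the CPU time of the build's child processes settles it. A rough Python sketch (the "make -j8" default is just a placeholder for whatever build command you actually use):
        Code:
        import resource, subprocess, sys, time

        # Run a build command and compare wall-clock time with child CPU time.
        cmd = sys.argv[1:] or ["make", "-j8"]     # placeholder command
        start = time.monotonic()
        subprocess.run(cmd, check=False)
        wall = time.monotonic() - start
        ru = resource.getrusage(resource.RUSAGE_CHILDREN)
        cpu = ru.ru_utime + ru.ru_stime
        print(f"wall {wall:.1f}s, child CPU {cpu:.1f}s, ratio {cpu / wall:.2f}")
        # A ratio near the -j job count means the cores stayed busy (CPU bound);
        # a ratio well below it means the build was waiting on something else.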

        • #14
          IME (20+ years working on C and C++ projects), CPU utilization tends toward 100% for both languages. C++ can compile slower (typically due to templates or some such), but on most systems today the CPU is still the slowest part of the chain, even for C, and especially when using a fast NVMe SSD or a RAM disk. So the compilation speed difference is mostly due to C++ being a more complex language than C, not due to any hardware slowdowns.

          Incidentally, that's one of the reasons I hate that every time a new, faster processor is released, a bunch of people pop up and say "why do we need faster CPUs; anything from the past 5 years is fine for any application?" Some of those people should try compiling a 1-million-line C++ codebase and see if they keep the same attitude.

          • #15
            Originally posted by Wielkie G View Post
            There is no such thing as "virtual core". Each hardware thread in a single core is equal to the other one.
            They are not equal "hardware threads", because they share the resources of a single physical core.

            Linux since 2.6 and Windows since XP have HT-aware schedulers that try to avoid placing two threads on both "virtual cores" of the same physical core; they prefer one thread on each of two separate cores, to minimize the cost of sharing resources. If what you say were true, why bother modifying the scheduler because of HT?
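            You can see the sharing directly in sysfs: each logical CPU lists which siblings share its physical core. A small sketch, assuming a Linux system:
            Code:
            import glob

            # Show which logical CPUs are SMT/HT siblings of the same physical core.
            for path in sorted(glob.glob("/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list")):
                cpu = path.split("/")[5]          # e.g. "cpu3"
                with open(path) as f:
                    siblings = f.read().strip()   # e.g. "3,7": cpu3 and cpu7 share a core
                print(f"{cpu}: siblings {siblings}")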

            • #16
              Originally posted by debianxfce View Post
              Those server-only focused AS kernel developers force you to add the mitigations=off kernel command-line parameter when using a SW development/multimedia/gaming computer. My command line was long and complex enough already.
              Code:
              BOOT_IMAGE=/boot/vmlinuz-5.2.0-rc1+ root=UUID=2d7b41ef-8af2-4992-91f9-de0e70302d43 ro quiet amdgpu.ppfeaturemask=0xfffd7fff mitigations=off
              Code:
              linux   /@/boot/vmlinuz-linux51-tkg-bmq root=UUID=that-ain't-your-business rw rootflags=subvol=@  zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=25 zswap.zpool=z3fold amdgpu.dc=1 amdgpu.ppfeaturemask=0xfffd7fff amdgpu.deep_color=1 amdgpu.exp_hw_support=1 zfs.zfs_arc_max=8589934592 pti=off spectre_v2=off l1tf=off nospec_store_bypass_disable mds=off
              I wish I had a micro command line like yours. Mine is so big and long it just gets in the way.
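              Rather than remembering which flags are in effect, you can also ask the kernel what it thinks is mitigated. A minimal sketch, assuming a kernel new enough to expose the vulnerabilities directory in sysfs:
              Code:
              import os

              # Print the kernel's report of each known CPU vulnerability and its mitigation status.
              vuln_dir = "/sys/devices/system/cpu/vulnerabilities"
              for name in sorted(os.listdir(vuln_dir)):
                  with open(os.path.join(vuln_dir, name)) as f:
                      print(f"{name}: {f.read().strip()}")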

              • #17
                Originally posted by atomsymbol
                Maybe the people who say that we do not need faster CPUs are pointing to the fact that a lot of software can be further optimized to yield major performance gains.
                Most of the people who say that, IMO, are using their systems very lightly (web browsing, some image editing, etc.). It just gets my goat that some people can't see use cases beyond their own viewpoint.

                A sidenote: I was using zapcc for a while, and it was much faster than non-ccached gcc/clang, but unfortunately it appears to contain some bugs that can poison its internal caches, which makes zapcc unsuitable for work.
                I've also tried zapc++, and when I can work around the issues it is great. On one project of 250,000 LOC, compilation went from ~1 minute with g++/clang++ to 11 seconds! When it works, it really works.

                • #18
                  Originally posted by schmidtbag View Post
                  Interesting how after the mitigations, the 8370 ended up being overall faster than the 2400S. Though, had we known about these vulnerabilities around 2012, I still don't think AMD would've endured much less scrutiny.
                  The i5-2500K slays the 2400S in single-threaded performance by 34% and is a much stronger processor overall.

                  • #19
                    Originally posted by Wielkie G View Post
                    ...
                    What actually happens, probably, is that disabling HT allows the CPU to boost to higher clocks.
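                    That's easy enough to check: sample the per-core frequencies during the benchmark with HT on and again with it off. A minimal sketch, assuming a Linux system with the cpufreq sysfs interface:
                    Code:
                    import glob, time
                    from pathlib import Path

                    # Print each core's current frequency (MHz) a few times, one line per sample.
                    paths = sorted(glob.glob("/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq"))
                    for _ in range(5):
                        mhz = [int(Path(p).read_text()) // 1000 for p in paths]   # sysfs reports kHz
                        print(" ".join(f"{m}MHz" for m in mhz))
                        time.sleep(1)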

                    • #20
                      Originally posted by Eero View Post
                      When you say they're at 100% utilization, did you check how much of that was IO-wait? In "top", press "1" to see all cores separately; the "wa" column then shows the IO-wait percentage for each core. In the compilation case, IO-wait would be waiting on disk (in other cases it could be waiting for the network, the GPU, anything other than the CPU).
                      I'm rebuilding Firefox Nightly now using the updates from the past day. CPU on both cores is averaging about 95%. IO-wait per top typically sits at the lower end of a 0.0-3.0% range, spiking as high as 44.5% in rare moments.
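                      If you want to log the same thing over a whole build instead of watching top, the numbers come from /proc/stat. A minimal sketch, assuming Linux:
                      Code:
                      import time

                      def cpu_fields():
                          # First line of /proc/stat: aggregate jiffies since boot, in the order
                          # user nice system idle iowait irq softirq steal ...
                          with open("/proc/stat") as f:
                              return [int(x) for x in f.readline().split()[1:]]

                      # Two samples five seconds apart; report what share of the interval was IO-wait.
                      a = cpu_fields()
                      time.sleep(5)
                      b = cpu_fields()
                      delta = [y - x for x, y in zip(a, b)]
                      print(f"iowait over the last 5s: {100.0 * delta[4] / sum(delta):.1f}%")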
