Benchmarking AMD FX vs. Intel Sandy/Ivy Bridge CPUs Following Spectre, Meltdown, L1TF, Zombieload


  • #11
    Originally posted by Eero View Post
    When you say they're at 100% utilization, did you check how much of that was IO-wait? In "top", press "1" to see all cores separately; the "wa" column then shows the IO-wait percentage for each core. In the compilation case, IO-wait would be waiting on disk (in other cases it could be waiting on the network, the GPU, anything other than the CPU).
    I'll do that in the future - very interesting.
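
    For a quick non-interactive check, sysstat's mpstat (assuming it is installed) reports the same per-core IO-wait that top shows in the "wa" column:
    Code:
    # Sample every core once per second, five times; %iowait is time spent
    # idle while waiting on outstanding I/O.
    mpstat -P ALL 1 5
    # Without sysstat, vmstat's "wa" column gives the system-wide figure:
    vmstat 1 5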



    • #12
      Originally posted by Eero View Post
      Kernel code is C, which is fast to compile, so it's mostly disk bound. Firefox is C++/Rust, which is much slower to compile, so it's CPU bound.

      (I'm assuming you've asked both builds to parallelize to the same number of cores.)
      Can you then explain this?

      It doesn't scale linearly with the CPU power, but I would mostly attribute that to the single-threaded parts (like the final image linking) rather than to IO - especially since the benchmarks were done on an SSD.
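
      That is just Amdahl's law in action: even a small serial fraction caps scaling well below linear. A rough illustration (the 10% serial fraction here is an assumption for the sake of the example, not a measured figure):
      Code:
      # speedup = 1 / (s + (1 - s) / n) for serial fraction s on n cores
      awk 'BEGIN { s = 0.10; n = 8; printf "%.2fx on %d cores (ideal: %dx)\n", 1 / (s + (1 - s) / n), n, n }'
      # prints: 4.71x on 8 cores (ideal: 8x)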



      • #13
        Originally posted by Eero View Post
        Kernel code is C, which is fast to compile, so it's mostly disk bound. Firefox is C++/Rust, which is much slower to compile, so it's CPU bound.

        (I'm assuming you've asked both builds to parallelize to the same number of cores.)
        That's just rubbish. With modern compiler stacks, compilation is almost always CPU bound.
        While there are probably exceptions to this rule, like TCC on trivially simple code times a zillion different files, the general rule that compilation time is CPU bound still applies.
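
        A rough way to check on any Makefile-based build (a sketch, assuming GNU time is installed): if user + sys CPU time approaches elapsed time multiplied by the job count, the build was CPU bound; a large gap points at I/O waits or serialization.
        Code:
        # GNU time's -v prints user, system and wall-clock time for the whole build.
        /usr/bin/time -v make -j"$(nproc)" 2>&1 | grep -E 'Elapsed|User time|System time'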



        • #14
          IME (20+ years working on C and C++ projects), CPU utilization tends to 100% for both types. C++ can compile slower (typically due to templates or some such), but on most systems today the CPU is still the slowest part of the chain, even for C - especially when using a fast NVMe SSD or a RAM disk. So the compilation speed difference is mostly due to C++ being a more complex language than C, not due to any hardware slowdowns.
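
          A RAM disk makes for an easy sanity check, since it takes the disk out of the equation entirely (a sketch; the mount point, size and project path are placeholders):
          Code:
          # Build from a RAM-backed tmpfs; if times barely improve versus the SSD,
          # the build was never disk bound in the first place.
          sudo mkdir -p /mnt/ramdisk
          sudo mount -t tmpfs -o size=8G,mode=777 tmpfs /mnt/ramdisk
          cp -r ~/src/myproject /mnt/ramdisk/
          cd /mnt/ramdisk/myproject && make -j"$(nproc)"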

          Incidentally, that's one of the reasons I hate that every time a new, faster processor is released, a bunch of people pop up and say "why do we need faster CPUs; anything from the past 5 years is fine for any application?" Some of those people should try compiling a 1-million-line C++ codebase and see if they keep the same attitude.



          • #15
            Originally posted by Wielkie G View Post
            There is no such thing as "virtual core". Each hardware thread in a single core is equal to the other one.
            There are no equal "hardware threads": they share the resources of a single physical core, which makes them anything but equal.

            Linux since 2.6 and Windows since XP have HT-aware schedulers that try to avoid placing two threads on the two "virtual cores" of the same physical core; they prefer one thread on each of two separate physical cores, to minimize the cost of the shared resources. If what you say were true, why bother modifying the scheduler for HT at all?
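
            The sharing is easy to see on Linux; these are standard sysfs/util-linux views of which logical CPUs are siblings on one physical core:
            Code:
            # Each line lists the logical CPUs sharing one physical core.
            cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
            # The same mapping as a table:
            lscpu --extended=CPU,CORE,SOCKET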



            • #16
              Originally posted by debianxfce View Post
              Those server only focused AS kernel developers force you to add the mitigations=off kernel command line parameter when using a SW development/multimedia/gaming computer. My command line was long and complex enough already.
              Code:
              BOOT_IMAGE=/boot/vmlinuz-5.2.0-rc1+ root=UUID=2d7b41ef-8af2-4992-91f9-de0e70302d43 ro quiet amdgpu.ppfeaturemask=0xfffd7fff mitigations=off
              Code:
              linux   /@/boot/vmlinuz-linux51-tkg-bmq root=UUID=that-ain't-your-business rw [email protected]  zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=25 zswap.zpool=z3fold amdgpu.dc=1 amdgpu.ppfeaturemask=0xfffd7fff amdgpu.deep_color=1 amdgpu.exp_hw_support=1 zfs.zfs_arc_max=8589934592 pti=off spectre_v2=off l1tf=off nospec_store_bypass_disable mds=off
              I wish I had a micro command line like yours. Mine is so big and long it just gets in the way.
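
              Whatever ends up on the command line, the kernel reports whether the mitigations actually took effect; any post-Meltdown kernel exposes this sysfs directory:
              Code:
              # One line per known vulnerability, e.g. "Mitigation: ..." or "Vulnerable".
              grep . /sys/devices/system/cpu/vulnerabilities/*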



              • #17
                Originally posted by sa666666 View Post
                IME (20+ years working on C and C++ projects), CPU utilization tends to 100% for both types. C++ can compile slower (typically due to templates or some such), but on most systems today the CPU is still the slowest part of the chain, even for C - especially when using a fast NVMe SSD or a RAM disk. So the compilation speed difference is mostly due to C++ being a more complex language than C, not due to any hardware slowdowns.

                Incidentally, that's one of the reasons I hate that every time a new, faster processor is released, a bunch of people pop up and say "why do we need faster CPUs; anything from the past 5 years is fine for any application?" Some of those people should try compiling a 1-million-line C++ codebase and see if they keep the same attitude.
                In general I agree with your viewpoint, but on the other hand projects like Zapcc are showing that the gcc/g++/clang/clang++ compilers repeatedly perform a lot of redundant work. Maybe the people who say that we do not need faster CPUs are pointing to the fact that a lot of software can be further optimized to yield major performance gains.

                A sidenote: I was using zapcc for a while, and it was much faster than non-ccached gcc/clang, but unfortunately it appears to contain some bugs which can poison its internal caches, which makes zapcc unsuitable for work.

                Another point is that when compiling a 1-million-line C++ codebase, most of the lines (around 90%) will in the end never be executed at all, or only occasionally. gcc/clang normally aren't aware of this and needlessly invest compilation time in optimizing such code. Profile-guided C/C++ compilation is still in its infancy.
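
                For reference, basic profile-guided optimization with GCC is a two-pass affair (the flags are standard GCC options; app.c and the training input are placeholders):
                Code:
                # Pass 1: build instrumented and collect a profile from a representative run.
                gcc -O2 -fprofile-generate -o app app.c
                ./app typical-input    # writes .gcda profile data
                # Pass 2: rebuild, letting GCC focus optimization where the profile says it pays.
                gcc -O2 -fprofile-use -o app app.c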



                • #18
                  Originally posted by atomsymbol View Post
                  Maybe the people who say that we do not need faster CPUs are pointing to the fact that a lot of software can be further optimized to yield major performance gains.
                  Most of the people who say that are, IMO, using their systems very lightly (web browsing, some image editing, etc.). It just gets my goat that some people can't see use cases beyond their own viewpoint.

                  A sidenote: I was using zapcc for a while, and it was much faster than non-ccached gcc/clang, but unfortunately it appears to contain some bugs which can poison its internal caches, which makes zapcc unsuitable for work.
                  I've also tried zapc++, and when I can work around the issues it is great. On one project of 250,000 LOC, compilation went from ~1 minute with g++/clang++ to 11 seconds! When it works, it really works.
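
                  For anyone who can't risk zapcc's cache bugs, ccache gets part of the way there by caching whole compilation results (a sketch; assumes ccache is installed and the Makefile respects $(CC)/$(CXX)):
                  Code:
                  # Route compiles through ccache; rebuilds with unchanged sources
                  # become near-instant cache hits.
                  export CC="ccache gcc" CXX="ccache g++"
                  make -j"$(nproc)"
                  ccache -s    # show cache hit/miss statistics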



                  • #19
                    Originally posted by schmidtbag View Post
                    Interesting how after the mitigations, the 8370 ended up being overall faster than the 2400S. Though, had we known about these vulnerabilities around 2012, I still don't think AMD would've endured much less scrutiny.
                    The i5 2500K slays the 2400S in single-thread performance by 34% and is a much stronger overall processor.



                    • #20
                      Originally posted by Wielkie G View Post
                      ...
                      What actually happens, probably, is that disabling HT allows the CPU to boost to higher clocks.
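
                      That is testable at runtime on kernels 4.19 and newer, no reboot needed (a sketch; run the same all-core load before and after and compare the sustained clocks):
                      Code:
                      # Disable SMT on the fly, then sample clocks under load.
                      echo off | sudo tee /sys/devices/system/cpu/smt/control
                      grep 'cpu MHz' /proc/cpuinfo
                      # Re-enable afterwards:
                      echo on | sudo tee /sys/devices/system/cpu/smt/control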
