Interesting that for code compilation there doesn't appear to be much reason to go for the more expensive 3950X over the 3900X: the best improvement is for the Linux kernel compilation, and it's just 12%. The rest are in the 5-7% range. I have two theories as to why this might be the case.
The first is that the limited memory bandwidth starts to bite. That would also explain why the Linux kernel sees the best scalability (which generally seems to be the case): lots of smallish C files that don't take a lot of RAM to compile (unlike, say, large C++ files). But then things don't scale any better when we go from the 3950X to the 24-core Threadripper (which has twice the memory channels). For example, for GCC compilation, going from 12 cores to 16 we see a 7% speedup, while going from 16 cores to 24 (plus the two extra memory channels) we see a 9% speedup (based on these results: https://openbenchmarking.org/result/...AS-COMPILING50). One thing we are not taking into account in this second comparison is the RAM speed, though.
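To put the GCC numbers above on a common footing, here is a rough sketch that expresses each speedup as a fraction of the ideal (linear) gain from the extra cores. The function name is made up for illustration, and it assumes clock speeds and RAM speed are equal across the chips, which they are not exactly.

```python
def scaling_efficiency(cores_before, cores_after, speedup_pct):
    """Fraction of the ideal (linear) gain that was actually realized.

    Assumption: perfect scaling would give a gain proportional to the
    increase in core count; clock and RAM speed differences are ignored.
    """
    ideal_gain_pct = (cores_after / cores_before - 1) * 100
    return speedup_pct / ideal_gain_pct


# GCC compilation figures quoted in the post.
for label, before, after, pct in [
    ("3900X -> 3950X", 12, 16, 7),
    ("3950X -> 24-core Threadripper", 16, 24, 9),
]:
    eff = scaling_efficiency(before, after, pct)
    print(f"{label}: {eff:.0%} of linear scaling")
```

By this measure both steps land at roughly a fifth of linear scaling, so doubling the memory channels does not visibly change the picture, at least with these numbers.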
The second theory is that for compiling real projects, single-core performance also matters a lot because of the serial linking steps. Again, the Linux kernel is the outlier here since there is only one linking step (I believe this benchmark does not build the modules, but even if it did, those are also quite parallelizable). Compare this to GCC, which goes through quite a few linking "bottlenecks". And in this regard (single-core turbo), the three processors are essentially the same.
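One way to sanity-check the serial-step theory is to plug the numbers into Amdahl's law and see what serial fraction they would imply. This is only a back-of-the-envelope sketch: the function names are hypothetical, and it assumes that a "7% speedup" means the build-time ratio is 1.07, that everything that fails to scale is serial work, and that per-core speed is identical on both chips.

```python
def amdahl_time(parallel_fraction, cores):
    """Relative build time under Amdahl's law (1.0 = single-core time)."""
    return (1 - parallel_fraction) + parallel_fraction / cores


def implied_parallel_fraction(cores_before, cores_after, speedup):
    """Solve amdahl_time(p, before) / amdahl_time(p, after) == speedup for p.

    Assumption: `speedup` is a time ratio, e.g. 1.07 for "7% faster".
    """
    s = speedup
    return (s - 1) / ((s - 1) + 1 / cores_before - s / cores_after)


# 3900X (12 cores) -> 3950X (16 cores), figures quoted in the post.
for project, speedup in [("GCC", 1.07), ("Linux kernel", 1.12)]:
    p = implied_parallel_fraction(12, 16, speedup)
    print(f"{project}: ~{p:.0%} parallel, ~{1 - p:.0%} serial")
```

Under these assumptions the kernel build comes out at about 90% parallel and GCC at about 81%, which is at least consistent with GCC spending more time in serial steps such as linking. Memory bandwidth limits would show up as "serial" in this model too, so it cannot separate the two theories on its own.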
Thoughts?