Intel Nehalem vs. Ice Lake Benchmarks - Including Clock + Power + Thermal Metrics

  • coder
    replied
    Originally posted by CochainComplex View Post
    Am I blind, or has your post disappeared? If so, why? In terms of citation style and presentation of arguments, it was the most professional comment in this thread.
    I updated it with some more details & toned down my rebuke, somewhat. As before, it got caught in the spam filter, due to the links. I knew that would happen, but decided to edit it anyway. Hopefully, it was worthwhile. At least I learned a couple things, in the process.

    Anyway, it's back, in case you'd like to see my handiwork: https://www.phoronix.com/forums/foru...34#post1142134

    IMO, the spam filter really needs some work. It disincentivizes people from including supporting links, which is really counter-productive.
    Last edited by coder; 11-29-2019, 09:37 PM.


  • CochainComplex
    replied
    Originally posted by coder View Post
    Yeah, although for the purpose of that post, the Real World Tech article was actually a better resource. In fact, I later noticed that he even cited it at the end of his Core 2/Nehalem section.

    There were 8 more parameters that I found for Nehalem, but not Ice Lake. I'm sure part of that is due to Ice Lake's newness, but I think Intel is less forthcoming with details than it used to be. It also doesn't help that Ice Lake is mostly targeted at the mid-performance laptop market, so not very interesting for gamers & therefore attracting less attention.
    Am I blind, or has your post disappeared? If so, why? In terms of citation style and presentation of arguments, it was the most professional comment in this thread.
    Last edited by CochainComplex; 11-29-2019, 03:39 AM.


  • coder
    replied
    Originally posted by CochainComplex View Post
    coder, you are citing Agner Fog, that's great. IMHO his work is not known enough.
    Yeah, although for the purpose of that post, the Real World Tech article was actually a better resource. In fact, I later noticed that he even cited it at the end of his Core 2/Nehalem section.

    There were 8 more parameters that I found for Nehalem, but not Ice Lake. I'm sure part of that is due to Ice Lake's newness, but I think Intel is less forthcoming with details than it used to be. It also doesn't help that Ice Lake is mostly targeted at the mid-performance laptop market, so not very interesting for gamers & therefore attracting less attention.


  • CochainComplex
    replied
    coder, you are citing Agner Fog, that's great. IMHO his work is not known enough.


  • coder
    replied
    Originally posted by kylew77 View Post
    What was most interesting to me is that the only thing that really changed was clock speed and new instructions.
    WTF? No.

    Let's start by looking at some more numbers.

    Parameter                    Nehalem   Ice Lake   Ice Lake / Nehalem
    L2 TLB (entries)             512       2048       400%
    Load Buffer (entries)        48        72         150%
    Store Buffer (entries)       32        128        400%
    μOp Cache (entries)          -         2.25k      -
    Instruction Decoders         4         5          125%
    Execution Ports              6         10         167%
    Reorder Buffer (entries)     128       352        275%
    L1 DCache (kB)               32        48         150%
    L1 DCache (associativity)    8         12         150%
    L2 Cache (kB)                256       512        200%
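
    For anyone who wants to check the percentage column, here is a minimal sketch that recomputes it from the Nehalem and Ice Lake figures in the table. The names `PARAMS`, `ratio_percent`, and `RATIOS` are just illustrative, and the μOp cache row is left out because Nehalem has no value to compare against.

```python
# Figures copied from the table above as (Nehalem, Ice Lake) pairs.
# The μOp cache row is omitted: Nehalem has no entry to divide by.
PARAMS = {
    "L2 TLB (entries)": (512, 2048),
    "Load Buffer (entries)": (48, 72),
    "Store Buffer (entries)": (32, 128),
    "Instruction Decoders": (4, 5),
    "Execution Ports": (6, 10),
    "Reorder Buffer (entries)": (128, 352),
    "L1 DCache (kB)": (32, 48),
    "L1 DCache (associativity)": (8, 12),
    "L2 Cache (kB)": (256, 512),
}


def ratio_percent(old: int, new: int) -> int:
    """Return `new` as a percentage of `old`, rounded to the nearest integer."""
    return round(100 * new / old)


# Ice Lake expressed as a percentage of Nehalem, per parameter.
RATIOS = {name: ratio_percent(old, new) for name, (old, new) in PARAMS.items()}

if __name__ == "__main__":
    for name, pct in RATIOS.items():
        print(f"{name:<28}{pct:>5}%")
```

    Note that these are ratios rather than increases: 400% means the Ice Lake value is four times the Nehalem one.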


    I've yet to find the size of their shadow register files (or actually a number of detailed μArch parameters of Ice Lake), but I'm sure that scaled up, as well.

    Of course, the numbers don't tell the whole story. A lot of sophistication has been added to various aspects of the μArch, including things like μOp fusion, branch prediction, etc.

    As for "new instructions" - that scarcely hints at the nature and extent of the functional differences between these cores. Under that rubric sits (listed by cpuid flag, with major extensions in bold):
    And that's just up to Skylake - the newest generation I have. Ice Lake will also have AVX-512, specifically: F, CD, VL, DQ, BW, IFMA, VBMI, VBMI2, VPOPCNTDQ, BITALG, VNNI, VPCLMULQDQ, GFNI, and VAES.
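
    If you want to see which of these your own CPU reports, the following is a rough sketch, assuming a Linux box where `/proc/cpuinfo` has a `flags` line. The helper names (`parse_flags`, `missing_avx512`) are made up for illustration, and `AVX512_FLAGS` only covers the F, CD, VL, DQ and BW subsets, using the lowercase spellings Linux prints.

```python
from __future__ import annotations

from pathlib import Path

# Illustrative subset only: AVX-512 F, CD, VL, DQ and BW as spelled by Linux.
AVX512_FLAGS = {"avx512f", "avx512cd", "avx512vl", "avx512dq", "avx512bw"}


def parse_flags(cpuinfo_text: str) -> set[str]:
    """Return the flags of the first logical CPU listed in cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found (e.g. not x86 Linux)


def missing_avx512(flags: set[str]) -> set[str]:
    """Return the AVX-512 flags from AVX512_FLAGS that are absent from `flags`."""
    return AVX512_FLAGS - flags


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # assumption: Linux procfs is available
    if path.exists():
        flags = parse_flags(path.read_text())
        print("missing:", sorted(missing_avx512(flags)) or "none")
    else:
        print("/proc/cpuinfo not found; this sketch only works on Linux")
```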

    Of particular note, AVX/AVX2 widens the SIMD units and registers from 128 bits to 256. AVX-512 obviously doubles this, again.
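
    To put those widths in perspective, here is a small sketch of how many elements fit in one register at each width. The dictionary and function names are my own, and "SSE" is simply my label for the 128-bit starting point.

```python
# Register widths in bits for each SIMD generation mentioned above.
SIMD_WIDTH_BITS = {"SSE": 128, "AVX/AVX2": 256, "AVX-512": 512}


def lanes(register_bits: int, element_bits: int) -> int:
    """Return how many elements of `element_bits` fit in one SIMD register."""
    if element_bits <= 0 or register_bits % element_bits:
        raise ValueError("element size must evenly divide the register width")
    return register_bits // element_bits


if __name__ == "__main__":
    for name, width in SIMD_WIDTH_BITS.items():
        print(f"{name:<9} {lanes(width, 32):>2} x float32, {lanes(width, 64)} x float64")
```

    So a single instruction goes from handling 4 single-precision floats at 128 bits to 8 at 256 bits and 16 at 512 bits, at least in terms of register capacity.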

    Originally posted by kylew77 View Post
    Only 2MB+ cache in 10 years and still 4 cores.
    The comparison is misleading. The Nehalem CPU he tested was a high-end part without integrated graphics, while the Ice Lake CPU is a mid/low-end SoC with an iGPU and more (see below).

    Because 10 nm still can't deliver comparable clock speeds, Intel is keeping the performance segment at 14 nm, for the time being. That's why Ice Lake only goes up to quad-core, but you can already get 6-core mobile chips in the Comet Lake series.

    You're also missing the fact that Ice Lake chips have up to 64 EU GPUs, whereas the previous limit for the mainstream was 24 EUs. In contrast, Nehalem never had an on-die GPU, but the Clarkdale chips had a dual-core CPU with a 12 EU GPU on a separate die. Don't forget that it was also a much more primitive GPU, with no media acceleration (as QuickSync video acceleration was only introduced in Sandy Bridge) and supporting only D3D 10.1 and OpenGL 2.1.

    Ice Lake also has a dedicated neural processor, which they call the GNA (Gaussian & Neural Accelerator). It has plenty more, as well: much more sophisticated clock gating and power management, plus Thunderbolt integration.

    To summarize, here's a list of all the additional blocks Ice Lake has that you won't find in the Nehalem used for these benchmarks:
    • Gen11 iGPU w/ media encode/decode acceleration
    • GNA neural accelerator
    • Image Processor (4th gen)
    • Thunderbolt 3
    • Integrated PCH with:
      • Wi-Fi 6 (Gig+)
      • Audio DSP
      • 6x USB 3.1
      • PCIe 3.0 x16
      • 3x SATA-3
      • eMMC 5.1


    Originally posted by kylew77 View Post
    Really shows how Moore's law is slowing down. I'm ready for 16 core laptop processors at least considering we have 64 core server parts.
    45 nm vs. 10 nm is nominally a 20x density improvement, since (45/10)² ≈ 20. Sure, Moore's law would predict about 32x over 10 years (five doublings), but you seriously overstate your case.
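
    For the record, here's the arithmetic behind those two figures as a quick sketch. It assumes that node names can be treated as linear feature sizes (which is only nominally true) and that Moore's law means a doubling every two years. Both function names are made up for illustration.

```python
def nominal_density_gain(old_nm: float, new_nm: float) -> float:
    """Area density gain if node names were true linear feature sizes."""
    return (old_nm / new_nm) ** 2


def moore_gain(years: float, doubling_period_years: float = 2.0) -> float:
    """Density gain predicted by a doubling every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    print(f"45 nm -> 10 nm: {nominal_density_gain(45, 10):.2f}x")
    print(f"Moore's law over 10 years: {moore_gain(10):.0f}x")
```

    That gives 20.25x for the node shrink against 32x for five doublings, so the shortfall is real but nowhere near as dramatic as "nothing changed".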

    As for where the 20x transistor budget went, consider the following:
    • 10 nm wafers are certainly more expensive than the old 45 nm wafers used by Nehalem, so you can't assume constant area.
    • Deeper, wider, more sophisticated cores mean you don't get linear scaling of core count.
    • 512-bit AVX registers & arithmetic take a huge amount of area.
    • Many specialized processing blocks (see above).
    • This is a lower-end chip - Intel's roadmap shows 26-core Ice Lake server chips, arriving early next year.

    That said, if the 10 nm manufacturing process were performing as Intel originally hoped, you'd probably be seeing 8 and 10 core Ice Lake chips for higher-end laptops.

    References:
    1. https://www.anandtech.com/show/14514...and-sunny-cove
    2. https://www.anandtech.com/show/2594 (Nehalem - Everything You Need to Know about Intel's New Architecture)
    3. https://www.anandtech.com/show/2663 (Nehalem: The Unwritten Chapters)
    4. https://www.anandtech.com/show/2671 (Nehalem Part 3: The Cache Debate, LGA-1156 and the 32nm Future)
    5. https://www.anandtech.com/show/2901 (The Clarkdale Review: Intel's Core i5 661, i3 540 & i3 530)
    6. https://en.wikichip.org/wiki/intel/m...e_lake_(client)
    7. https://en.wikichip.org/wiki/intel/m...ehalem_(client)
    8. https://en.wikipedia.org/wiki/Ice_Lake_(microprocessor)
    9. https://en.wikipedia.org/wiki/Nehale...roarchitecture)
    10. https://en.wikipedia.org/wiki/List_o...1st_Generation)
    11. https://www.realworldtech.com/nehalem/
    12. https://www.agner.org/optimize/microarchitecture.pdf
    Last edited by coder; 11-28-2019, 09:41 PM.


  • randomizer
    replied
    As someone who is still running one of these old chips (stock i7 920) it's nice to see a rare article showing how they stack up against current generation CPUs. I only recall one other article in recent years. Thanks Michael.


  • kylew77
    replied
    What was most interesting to me is that the only thing that really changed was clock speed and new instructions. Only 2MB+ cache in 10 years and still 4 cores. Really shows how Moore's law is slowing down. I'm ready for 16 core laptop processors at least considering we have 64 core server parts.


  • CochainComplex
    replied
    Originally posted by nuetzel View Post
    I have a 'real' Nehalem (Lynnfield). X3470 4c/8t 2,93 GHz/3,6 turbo
    It is not that bad.
    Maybe because the performance increase over the last decade was not so substantial, due to the lack of competition. Luckily, the game is changing thanks to the Zen arch and chiplet design. Otherwise we would still have a 4-core base with an average 5% performance increase with each "new" platform. If Nehalem is performing well, it is a bad sign for current-gen CPUs....


  • nuetzel
    replied
    I have a 'real' Nehalem (Lynnfield). X3470 4c/8t 2,93 GHz/3,6 turbo
    It is not that bad.


  • coder
    replied
    Originally posted by stormcrow View Post
    If you're doing anything other than ML algorithms, apparently so.
    Deep learning really belongs on GPUs and purpose-built chips. I'd bet even the iGPU in that Ice Lake chip out-performs its AVX-512 path.
