How A Raspberry Pi 4 Performs Against Intel's Latest Celeron, Pentium CPUs


  • schmidtbag
    replied
    Originally posted by vladpetric View Post
    The OS and benchmarks are pretty much the same for all these processors, and they work just fine. That in itself is a huge accomplishment (a collective accomplishment, if you wish). Your analogy is faulty, because they can really perform all the same tasks.
    Huh? That's not even slightly true. Whether you're comparing ARM to x86 or an RPi to Apple's ARM, there can be significant differences. There are also basic tasks where, regardless of what architecture you have, you'll get similar results. The same can be said of my analogy: for example, an economy car and a sports car will still get you from point A to point B in the same amount of time if you're obeying speed limits.
    In this day and age IPC matters a lot more than you think. Clock speeds are limited upwards by power consumption, and the differences between instruction sets are smaller than you'd think (and yes, that includes the venerable AMD64 instruction set).
    IPC is incredibly important, but not for the target devices that these Broadcom CPUs are meant for. If IPC were so universally important, we wouldn't be having this discussion.
    As far as instruction sets are concerned: the usefulness of additional instructions is limited by the compiler and Amdahl's law. The bigger problem with performance is still regular generic code, and not handcoded assembly.
    I completely agree.


  • vladpetric
    replied
    Originally posted by hotaru View Post

    no, that doesn't "even things out". performance is generally much better with 64-bit code. the increase in pointer size doesn't make anywhere near as much difference as doubling the number of registers.
    You're totally wasting your time trying to explain fundamental trade-offs to some of the geniuses here.
    Last edited by vladpetric; 08 August 2020, 01:09 PM.


  • vladpetric
    replied
    Originally posted by schmidtbag View Post
    They're doing poorly because you're judging it for something it wasn't meant to do. As the saying goes, "judge a fish by its ability to climb a tree and it will think it's stupid". ARM isn't built to compete with desktop performance.

    Yes, I basically said that myself but in fewer words.
    IPC can matter a lot. It isn't the only thing that matters.

    I don't know enough about that to make a worthwhile comment on it, which is why I didn't comment on it in the first place. What I do know is it clearly isn't a necessity to make ARM usable on a day to day basis, of course, assuming you're ok with the level of performance (which plenty of people are). I feel like if it was such an obvious thing to add, they would have done so. After all, the NEON instructions for example aren't exactly a simple addition.

    Poorly compared to what? You have to make comparisons, and the context of the article included Intel. ARM is a hell of a lot better compared to most, if not all other RISC architectures (POWER is faster but it is a lot more power hungry too). Apple's CPU might be better, but it's not going to be cheap; this Broadcom chip offers some fantastic performance[-per-watt] for the price. Any CPU can be made better if you just cram more instructions in it, but then it becomes expensive and inefficient for more basic tasks. Like I said multiple times already: you're expecting this CPU to be something it's not. It does what it was built to do very well.

    Me too. Though, kinda the point of these CPUs is they don't have a lot of instructions. My server uses A53 cores (not Broadcom) and this is all it shows for features in cpuinfo:
    half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32
    Sure isn't much to look at, eh? But it's fast enough for what I need it to do.
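For anyone who wants to check a feature list like the one above programmatically, here is a minimal sketch. The helper name `parse_cpu_features` is made up for illustration; the only assumption about the input is that it looks like `/proc/cpuinfo` text, where ARM kernels label the line `Features` and x86 kernels label it `flags`. The sample string is the list quoted above.

```python
def parse_cpu_features(cpuinfo_text):
    """Return the set of feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # ARM kernels call the line "Features"; x86 kernels call it "flags".
        if sep and key.strip().lower() in ("features", "flags"):
            return set(value.split())
    return set()  # no feature line found


# Sample line built from the feature list quoted in the post.
sample = (
    "Features\t: half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 "
    "idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32"
)
features = parse_cpu_features(sample)
print(len(features), "neon" in features, "sve" in features)
```

On a real Linux box the same helper could be fed `open("/proc/cpuinfo").read()`.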
    The OS and benchmarks are pretty much the same for all these processors, and they work just fine. That in itself is a huge accomplishment (a collective accomplishment, if you wish). Your analogy is faulty, because they can really perform all the same tasks.

    In this day and age IPC matters a lot more than you think. Clock speeds are limited upwards by power consumption, and the differences between instruction sets are smaller than you'd think (and yes, that includes the venerable AMD64 instruction set).

    Problem is - it's not a spec (anyone can quote a spec number), and understanding why it can fluctuate a lot requires understanding of the microarchitecture. And it's easy to get 2 benchmarks that behave very differently on exactly the same processor, with the same config (a database can easily have 10 times lower IPC than a media benchmark).
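To make the "10 times lower IPC" point concrete, here is a small sketch of the arithmetic. The instruction and cycle counts are invented for illustration (they are not measurements), and the 1.5 GHz clock is an assumed value; on Linux, counters of this kind can be collected with `perf stat`.

```python
def ipc(instructions, cycles):
    """Instructions retired per clock cycle."""
    return instructions / cycles


def runtime_seconds(instructions, ipc_value, clock_hz):
    """Execution time implied by an instruction count, an IPC and a clock."""
    return instructions / (ipc_value * clock_hz)


CLOCK_HZ = 1.5e9  # assumed clock, same core for both workloads

# Hypothetical counter readings: same instruction count, very different cycles.
workloads = {
    "media": {"instructions": 2.0e9, "cycles": 1.0e9},
    "database": {"instructions": 2.0e9, "cycles": 1.0e10},
}

for name, c in workloads.items():
    value = ipc(c["instructions"], c["cycles"])
    secs = runtime_seconds(c["instructions"], value, CLOCK_HZ)
    print(f"{name}: IPC={value:.2f}, time={secs:.2f}s")
```

Same core, same clock, same instruction count, yet the runtime differs by the IPC ratio, which is why a single quoted IPC figure says little without the workload.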

    As far as instruction sets are concerned: the usefulness of additional instructions is limited by the compiler and Amdahl's law. The bigger problem with performance is still regular generic code, and not handcoded assembly.

    Please, read the following. The first one is a paper by one of AMD's leading researchers; while its main message is something else, a secondary message is that load store speculation is totally worth it:

    https://pdfs.semanticscholar.org/fae...982.1596905037

    And then textbooks:

    https://www.amazon.com/Computer-Arch...dp_ob_title_bk

    https://www.amazon.com/Modern-Proces.../dp/1478607831

    Then we can talk about microarchitecture.
    Last edited by vladpetric; 08 August 2020, 01:12 PM.


  • hotaru
    replied
    Originally posted by Raka555 View Post

    And the fact that 64-bit pointers "waste" 1/2 the cache they are loaded in evens things out. So more or less the same performance, dependent on workload.
    no, that doesn't "even things out". performance is generally much better with 64-bit code. the increase in pointer size doesn't make anywhere near as much difference as doubling the number of registers.
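The cache side of this argument can at least be put in numbers. The sketch below covers only the pointer-size cost for a pointer-heavy structure; it says nothing about the register-count benefit, which needs real benchmarks. The node layout (one 4-byte payload plus two pointers, padded to pointer alignment) and the 64-byte cache line are assumptions chosen for illustration.

```python
CACHE_LINE_BYTES = 64  # assumed cache line size


def node_size(pointer_bytes, payload_bytes=4, pointers=2):
    """Size of a hypothetical list node, padded to pointer alignment."""
    raw = payload_bytes + pointers * pointer_bytes
    align = pointer_bytes
    return -(-raw // align) * align  # round up to a multiple of align


for bits in (32, 64):
    size = node_size(bits // 8)
    per_line = CACHE_LINE_BYTES // size
    print(f"{bits}-bit: node={size} bytes, whole nodes per cache line={per_line}")
```

This is close to a worst case: data made mostly of integers, floats or bytes barely grows when pointers double in size, which is why the effect depends so heavily on the workload.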


  • vladpetric
    replied
    Originally posted by wizard69 View Post

    If Apple's processor is as good as I think it can be, it will force changes in the industry. If nothing else, that is something good that Apple is doing. By the way, Apple's move here isn't really about ARM, even though I suspect they will have industry-leading performance. Rather, I see special function units being the big performance driver in their chips, with the Neural Engine getting a big boost in the next round of chips. Once the hardware is in place I expect a huge move towards AI/ML techniques in their software.

    So what many will be seeing as great performance in Apple's new machines will not always be because of Apple's ARM processors.
    The problems with special function units:

    1. They're more-or-less set in stone - once you introduce them, you need to keep supporting them (yes, you can extend them, but generally you need to keep the instructions you added and maintain compatibility)
    2. It's really difficult to keep them busy. A compiler won't be able to figure out that your code can be mapped on a neural unit, so most often you need to either code stuff by hand (assembly, intrinsics) or rely on a library that makes use of them.
    3. The gains are limited by Amdahl's law.
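
Point 3 can be put in numbers with a short sketch of Amdahl's law. The 30% accelerated fraction and the 10x unit speedup are made-up values for illustration, not figures for any real chip.

```python
def amdahl_speedup(fraction, unit_speedup):
    """Overall speedup when `fraction` of the runtime is sped up by `unit_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / unit_speedup)


# Hypothetical: 30% of the runtime maps onto a special function unit.
for unit_speedup in (2, 10, 100, 1e9):
    overall = amdahl_speedup(0.3, unit_speedup)
    print(f"unit {unit_speedup:g}x faster -> program {overall:.3f}x faster")
```

However fast the unit gets, the program as a whole cannot exceed 1 / (1 - 0.3), roughly 1.43x, because the other 70% still runs as ordinary code.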

    All these arguments were made in the original CISC vs RISC papers, FWIW.


  • vladpetric
    replied
    Originally posted by ldesnogu View Post
    A76 and up have changed the direction ARM took. They now seriously target performance... at last! And this explains why companies such as Samsung and Qualcomm decided to switch to ARM designs rather than making their own ARM CPUs. But yeah, Apple is still miles ahead (at least 1 to 1.5 years in advance).
    Hoping that you're right (honestly!). Any benchmarks though?


  • vladpetric
    replied
    Originally posted by Slartifartblast View Post

    Let's be realistic here, we are paying toy prices. It's built for a price and not for sheer performance. You want the fastest ARM then you are more than welcome to pay Apple prices and good luck breaking out of their garden.
    Yes, you are right about that.


  • vladpetric
    replied
    Originally posted by carewolf View Post

    That is just a marketing lie. They still had the reference designs and components. They have built up their new overall architecture from scratch, but they have all the pre-built components that they bought from ARM. So a new architecture "from scratch", but not a new CPU design from scratch. Or about as new as a new architecture from AMD or Intel.
    ARM reference designs are a bit like a Cessna, whereas Apple designed a corporate jet. Yeah, they both fly ... but one is far more powerful than the other.


  • Raka555
    replied
    Originally posted by hotaru View Post

    no, it won't. the fact that 32-bit ARM has half as many registers as 64-bit will still hold it back quite a bit in a lot of workloads.
    And the fact that 64-bit pointers "waste" 1/2 the cache they are loaded in evens things out. So more or less the same performance, dependent on workload.


  • hotaru
    replied
    Originally posted by Raka555 View Post
    If you compile a 32-bit distro to use the Cortex-A53 as the minimum, then it will be just as fast or even faster than the 64-bit OS.
    no, it won't. the fact that 32-bit ARM has half as many registers as 64-bit will still hold it back quite a bit in a lot of workloads.
