How A Raspberry Pi 4 Performs Against Intel's Latest Celeron, Pentium CPUs


  • #71
    Originally posted by TheOne View Post
    What I would like to see is an Odroid N2+ comparison; that should be more interesting, since it is a more powerful SBC than the RPi 4.

    Did a run on my N2 (nearly broken: USB hub dead and the SD card doesn't work anymore), but the results should be fine.


    It would be interesting to check whether the big wins for the Intel chips come from most of the programs having at least SSE2 support, compared to the still-sparse NEON optimizations for ARM CPUs.

    For FLAC, it looks like aarch64/NEON support is in the works but not yet merged: https://github.com/xiph/flac/pull/183



    • #72
      Originally posted by Raka555 View Post
      If you compile a 32-bit distro to use the Cortex-A53 as minimum, then it will be just as fast or even faster than the 64-bit OS.
      no, it won't. the fact that 32-bit ARM has half as many registers as 64-bit will still hold it back quite a bit in a lot of workloads.



      • #73
        Originally posted by hotaru View Post

        no, it won't. the fact that 32-bit ARM has half as many registers as 64-bit will still hold it back quite a bit in a lot of workloads.
        And the fact that 64-bit pointers "waste" half the cache they are loaded into evens things out. So more or less the same performance, depending on the workload.
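
        The cache-footprint side of this argument is easy to quantify. A minimal sketch (my own illustration, not from the thread; the 64-byte cache line is a typical size, and the variable names are mine):

        ```python
        import ctypes

        # Pointer size on this (64-bit) interpreter vs. a 32-bit ARM build:
        ptr64 = ctypes.sizeof(ctypes.c_void_p)  # 8 bytes on a 64-bit build
        ptr32 = 4                               # what a 32-bit ARM build would use
        line = 64                               # typical cache-line size in bytes

        print(line // ptr32, "pointers fit in one line on 32-bit")  # 16
        print(line // ptr64, "pointers fit in one line on 64-bit")  # 8
        ```

        So a pointer-heavy structure really does occupy twice the cache per element on a 64-bit build; whether that cancels out the extra registers depends on how pointer-chasing the workload is.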



        • #74
          Originally posted by carewolf View Post

          That is just a marketing lie. They still had the reference designs and components. They have built up their new overall architecture from scratch, but they have all the pre-built components that they bought from ARM. So a new architecture "from scratch", but not a new CPU design from scratch. Or about as new as a new architecture from AMD or Intel.
          ARM reference designs are a bit like a Cessna, while Apple designed a corporate jet. Yeah, they both fly... but one is far more powerful than the other.



          • #75
            Originally posted by Slartifartblast View Post

            Let's be realistic here, we are paying toy prices. It's built for a price and not for sheer performance. You want the fastest ARM then you are more than welcome to pay Apple prices and good luck breaking out of their garden.
            Yes, you are right about that.



            • #76
              Originally posted by ldesnogu View Post
              The A76 and up have changed the direction ARM took. They now seriously target performance... at last! And this explains why companies such as Samsung and Qualcomm decided to switch to ARM designs rather than making their own ARM CPUs. But yeah, Apple is still miles ahead (at least 1 to 1.5 years in advance).
              Hoping that you're right (honestly!). Any benchmarks though?



              • #77
                Originally posted by wizard69 View Post

                If Apple's processor is as good as I think it can be, it will force changes in the industry. If nothing else, that is something good that Apple is doing. By the way, Apple's move here isn't really about ARM, even though I suspect they will have industry-leading performance. Rather, I see special function units being the big performance driver in their chips, with the Neural Engine getting a big boost in the next round of chips. Once the hardware is in place I expect a huge move towards AI/ML techniques in their software.

                So what many will be seeing as great performance in Apple's new machines will not always be because of Apple's ARM processors.
                The problems with special function units:

                1. They're more-or-less set in stone - once you introduce them, you're committed to supporting them (yes, you can extend them, but generally you need to keep your existing instructions and maintain compatibility).
                2. It's really difficult to keep them busy. A compiler won't be able to figure out that your code can be mapped onto a neural unit, so most often you need to either code things by hand (assembly, intrinsics) or rely on a library that makes use of them.
                3. The gains are limited by Amdahl's law.

                All these arguments were made in the original CISC vs RISC papers, FWIW.
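
                To put a number on point 3, here is a quick sketch of Amdahl's law (the function name and example numbers are my own): if a special function unit accelerates a fraction p of the runtime by a factor s, the overall speedup is 1 / ((1 - p) + p / s).

                ```python
                def amdahl_speedup(p, s):
                    """Overall speedup when a fraction p of runtime is accelerated by factor s."""
                    return 1.0 / ((1.0 - p) + p / s)

                # A neural unit that makes 50% of the workload 10x faster:
                print(amdahl_speedup(0.5, 10))   # ~1.82x overall
                # Even an "infinitely" fast unit on that same 50%:
                print(amdahl_speedup(0.5, 1e9))  # caps out near 2x
                ```

                The serial fraction dominates: no matter how fast the unit is, the code it can't touch sets the ceiling.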



                • #78
                  Originally posted by Raka555 View Post

                  And the fact that 64-bit pointers "waste" half the cache they are loaded into evens things out. So more or less the same performance, depending on the workload.
                  no, that doesn't "even things out". performance is generally much better with 64-bit code. the increase in pointer size doesn't make anywhere near as much difference as doubling the number of registers.



                  • #79
                    Originally posted by schmidtbag View Post
                    They're doing poorly because you're judging it for something it wasn't meant to do. As the saying goes, "judge a fish by its ability to climb a tree and it will think it's stupid". ARM isn't built to compete with desktop performance.

                    Yes, I basically said that myself but in fewer words.
                    IPC can matter a lot. It isn't the only thing that matters.

                    I don't know enough about that to make a worthwhile comment on it, which is why I didn't comment on it in the first place. What I do know is it clearly isn't a necessity to make ARM usable on a day to day basis, of course, assuming you're ok with the level of performance (which plenty of people are). I feel like if it was such an obvious thing to add, they would have done so. After all, the NEON instructions for example aren't exactly a simple addition.

                    Poorly compared to what? You have to make comparisons, and the context of the article included Intel. ARM is a hell of a lot better compared to most, if not all other RISC architectures (POWER is faster but it is a lot more power hungry too). Apple's CPU might be better, but it's not going to be cheap; this Broadcom chip offers some fantastic performance[-per-watt] for the price. Any CPU can be made better if you just cram more instructions in it, but then it becomes expensive and inefficient for more basic tasks. Like I said multiple times already: you're expecting this CPU to be something it's not. It does what it was built to do very well.

                    Me too. Though, kinda the point of these CPUs is they don't have a lot of instructions. My server uses A53 cores (not Broadcom) and this is all it shows for features in cpuinfo:
                    half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt lpae evtstrm aes pmull sha1 sha2 crc32
                    Sure isn't much to look at, eh? But it's fast enough for what I need it to do.
                    The OS and benchmarks are pretty much the same for all these processors, and they work just fine. That in itself is a huge accomplishment (a collective accomplishment, if you wish). Your analogy is faulty, because they can really perform all the same tasks.

                    In this day and age IPC matters a lot more than you think. Clock speeds are capped by power consumption, and the differences between instruction sets are smaller than you'd expect (and yes, that includes the venerable AMD64 instruction set).

                    The problem is that IPC is not a spec (anyone can quote a spec number), and understanding why it can fluctuate a lot requires understanding the microarchitecture. It's easy to get two benchmarks that behave very differently on exactly the same processor with the same config (a database can easily have 10 times lower IPC than a media benchmark).
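
                    As an illustration (the counter values are made up for the example, not measurements), IPC is just retired instructions over cycles, and two workloads on the same core can easily land an order of magnitude apart:

                    ```python
                    def ipc(instructions, cycles):
                        """Instructions per cycle from two hardware-counter readings."""
                        return instructions / cycles

                    # Hypothetical counter readings from the same core, same config:
                    media_ipc = ipc(4.0e9, 1.0e9)   # streaming kernel, caches mostly hit
                    db_ipc    = ipc(4.0e9, 10.0e9)  # pointer-chasing database, cache misses stall the core

                    print(media_ipc, db_ipc)        # 4.0 vs 0.4: a 10x gap on identical hardware
                    ```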

                    As far as instruction sets are concerned: the usefulness of additional instructions is limited by the compiler and Amdahl's law. The bigger problem with performance is still regular generic code, and not handcoded assembly.

                    Please read the following. The first is a paper by one of AMD's leading researchers; while its main message is something else, a secondary message is that load-store speculation is totally worth it:

                    https://pdfs.semanticscholar.org/fae...982.1596905037

                    And then textbooks:

                    https://www.amazon.com/Computer-Arch...dp_ob_title_bk

                    https://www.amazon.com/Modern-Proces.../dp/1478607831

                    Then we can talk about microarchitecture.
                    Last edited by vladpetric; 08 August 2020, 01:12 PM.



                    • #80
                      Originally posted by hotaru View Post

                      no, that doesn't "even things out". performance is generally much better with 64-bit code. the increase in pointer size doesn't make anywhere near as much difference as doubling the number of registers.
                      You're totally wasting your time trying to explain fundamental trade-offs to some of the geniuses here.
                      Last edited by vladpetric; 08 August 2020, 01:09 PM.
