Apple Confirms Their Future Desktops + Laptops Will Use In-House CPUs


  • Originally posted by vladpetric View Post
    You're making a lot of points here, just addressing this one right now: it is pretty rare to saturate the memory bus, even with an RPi4. The reason is that the cache subsystem tends to work really well ... Let's consider a "bad" workload with L1 + L2 cache hit rates of ~95% (which is pretty low FWIW). That means that 1 in 20 loads goes to memory. Further assuming that 33% of all instructions are loads, then 1 in 60 instructions goes to memory. There is a significant cycle penalty with that miss, but 1 in 60 instructions missing won't saturate the memory bus, even on the RPi4.
    This is where you are completely wrong. The A72 is closer to an AMD Zen core design than an Intel one, and this is important.

    Take the ThunderX2, a 32-core A72 chip. Yes, an 8 by 4 core setup. How do you think those 8 groups of CPUs sync with each other? Through the memory controller interface of the A72 core. So we are not talking about a cache miss problem here. The reason RPi4 performance is really tanked is partly the same reason AMD Zen chips don't like low RAM speed. It is also partly because, even though an RPi4 only has 4 A72 cores, it is still stalling: it goes out to the memory bus asking whether any other processor is working on this data and then waits for the timeout. The slower the RAM, the longer this timeout and the bigger this very regular stall is. So you are not looking at a 1 in 20 cache miss but at a 1 in 5 sync stall or worse, on top of your general cache misses.

    Yes, with AMD the speed of your memory controller affects the Zen Data Fabric frequency and causes an IPC drop, and you see the same with the ARM A72 even when it is only 4 cores by itself. Yes, some of the core-to-core sync within the 4-core group also goes out to the memory controller bus on the A72. This is one of the changes in the A73. All this sync traffic on the memory bus kind of does a number on your raw memory bandwidth.

    The A72 cores in the RPi 4 are basically server chips in an embedded hardware configuration, very unhappy about it and showing their displeasure with low performance. Yes, that is still way better performance than the A53 of the RPi 3, but technically still way slower than it should be. The RPi 4 is not a good item to benchmark if you want any idea of how a Cortex-A72 in an ideal setup should behave. Something like an RPi4 should use an A73 or newer, which has the fix so that 4 cores by themselves don't end up running out to the memory controller. But using an A73 or newer means using a more expensive nm production process.

    SPEC CPU 2006/2017 results are in fact harmed on AMD Zen with low-speed RAM and on the A72 in the RPi 4 with low-speed RAM, for exactly the same reason. The RPi4 case is a lot worse because the RAM is slower by a lot more.

    Comment


    • Originally posted by oiaohm View Post

      This is where you are completely wrong. The A72 is closer to an AMD Zen core design than an Intel one, and this is important.

      Take the ThunderX2, a 32-core A72 chip. Yes, an 8 by 4 core setup. How do you think those 8 groups of CPUs sync with each other? Through the memory controller interface of the A72 core. So we are not talking about a cache miss problem here. The reason RPi4 performance is really tanked is partly the same reason AMD Zen chips don't like low RAM speed. It is also partly because, even though an RPi4 only has 4 A72 cores, it is still stalling: it goes out to the memory bus asking whether any other processor is working on this data and then waits for the timeout. The slower the RAM, the longer this timeout and the bigger this very regular stall is. So you are not looking at a 1 in 20 cache miss but at a 1 in 5 sync stall or worse, on top of your general cache misses.

      Yes, with AMD the speed of your memory controller affects the Zen Data Fabric frequency and causes an IPC drop, and you see the same with the ARM A72 even when it is only 4 cores by itself. Yes, some of the core-to-core sync within the 4-core group also goes out to the memory controller bus on the A72. This is one of the changes in the A73. All this sync traffic on the memory bus kind of does a number on your raw memory bandwidth.

      The A72 cores in the RPi 4 are basically server chips in an embedded hardware configuration, very unhappy about it and showing their displeasure with low performance. Yes, that is still way better performance than the A53 of the RPi 3, but technically still way slower than it should be. The RPi 4 is not a good item to benchmark if you want any idea of how a Cortex-A72 in an ideal setup should behave. Something like an RPi4 should use an A73 or newer, which has the fix so that 4 cores by themselves don't end up running out to the memory controller. But using an A73 or newer means using a more expensive nm production process.

      SPEC CPU 2006/2017 results are in fact harmed on AMD Zen with low-speed RAM and on the A72 in the RPi 4 with low-speed RAM, for exactly the same reason. The RPi4 case is a lot worse because the RAM is slower by a lot more.
      What the heck are you talking about?

      First, the Cortex-A72 (which the rpi4 Broadcom chip is based on) has a unified L2 cache for the four cores.

      https://developer.arm.com/ip-product...x-a/cortex-a72

      Why the heck would you go to memory if you have a shared L2?

      Now even if you didn't have a shared L2, all modern multiprocessing systems implement cache-to-cache transfers through an interconnect, because performance is crap otherwise. Why? Because DRAMs are really slow.

      For Intel it's the QPI, for ThunderX2 it is the Coherent Processor Interconnect. This is how the 8 groups of cores sync with each other. These are not the memory controllers, even though they are also connected to them. But again, you don't need such a thing with RPi4 because you have a shared last level cache.


      Second, you started to make a claim about memory bandwidth and memory bandwidth saturating. Now you're pivoting into latency. I never said that additional memory latency doesn't hurt. I only said that the lower rpi4 memory bandwidth isn't a big deal.

      I'm sorry, but I don't think you know what you're talking about.
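
A back-of-the-envelope sketch of the bandwidth estimate quoted earlier in the thread. The 95% hit rate and the 33% load fraction come from that estimate. The clock speed, sustained IPC, core count and 64-byte line size are illustrative assumptions only, not measurements of the RPi4, and the function names are made up for this sketch.

```python
# Rough DRAM traffic estimate from a cache hit rate and a load fraction.
# From the thread: 95% hit rate, 33% of instructions are loads.
# Assumed for illustration only: clock, IPC, core count, cache line size.

def memory_accesses_per_instruction(load_fraction: float, hit_rate: float) -> float:
    """Fraction of instructions whose load misses the caches and goes to DRAM."""
    return load_fraction * (1.0 - hit_rate)


def dram_demand_gb_per_s(load_fraction: float, hit_rate: float, clock_hz: float,
                         ipc: float, cores: int, line_bytes: int = 64) -> float:
    """Read traffic in GB/s if every miss fetches exactly one cache line."""
    misses_per_instruction = memory_accesses_per_instruction(load_fraction, hit_rate)
    instructions_per_second = ipc * clock_hz * cores
    return misses_per_instruction * instructions_per_second * line_bytes / 1e9


per_instruction = memory_accesses_per_instruction(load_fraction=1 / 3, hit_rate=0.95)
print(f"about 1 in {1 / per_instruction:.0f} instructions goes to memory")

# Hypothetical machine: 4 cores at 1.5 GHz, each sustaining 1 instruction per cycle.
demand = dram_demand_gb_per_s(1 / 3, 0.95, clock_hz=1.5e9, ipc=1.0, cores=4)
print(f"estimated DRAM read demand: {demand:.1f} GB/s")
```

Note that the sustained IPC is an input here. In practice the miss penalty itself lowers the IPC, which in turn lowers the traffic, so an estimate with a fixed IPC of 1 leans high.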

      Comment


      • Originally posted by Michael_S View Post
        I don't own any Apple products, but to be fair to Apple they have the least evil planned obsolescence life cycle for smartphones. All the Android vendors, and Microsoft when they were in the smartphone market, were terrible about supporting devices past two years. This is one area where Apple does right by consumers.
        This type of person is called the useful idiot. Apple is the most malicious company out there. Software support is irrelevant as they offer zero support for the hardware itself. Got a cracked screen? Might as well just buy a whole new phone. Battery's dead? Get a new phone.

        Comment


        • Originally posted by curfew View Post
          Battery's dead? Get a new phone.
          What? They have a battery replacement program.

          Comment


          • Originally posted by curfew View Post
            This type of person is called the useful idiot. Apple is the most malicious company out there. Software support is irrelevant as they offer zero support for the hardware itself. Got a cracked screen? Might as well just buy a whole new phone. Battery's dead? Get a new phone.
            I had been trying to get by with mid range Android phones and the best case options available for them for my family. LGs, Motorolas, Xiaomis, OnePlus (back when OnePlus was cheaper), ZTE. We've gone through 10 phones in 4 years. That's how bad the quality is. Many of those 10 were replaced under warranty, and then failed again soon after the warranty expired.

            We've tried to repair some of the phones ourselves, but in every case we failed. We've managed to replace screens and charging points on tablets many times. But for phones, we break something in the dismantling process or else put it all back together and then a week later the phone splits because the new battery overheated or something.

            Is all that better than Apple? Really?

            I've given up, now my kids all have lightly used Samsung Galaxy S-somethings and my wife and I have Google Pixel phones - not because I want to spoil my kids or even myself, but because I'm desperately hoping these top end devices won't break if you look at them funny.

            Meanwhile, my friends and coworkers with iPhones usually - not always, but usually - have a good reliability experience. There are luxury buyers that get something new every year or two, but most people in my social circle upgrade every four or five years. I've never had that option before now, my Android device was inevitably dead long before that.

            Comment


            • Originally posted by Vistaus View Post

              You're right. Windows ARM doesn't exist, so Boot Camp will go away. Oh wait...
              1. What are the odds Apple will see Windows without users' existing library of applications as valuable enough to offer Boot Camp?
              2. What are the odds Microsoft will alter their secure boot requirements for ARM Windows devices when they want the same kind of vendor lock-in Apple does?
              3. Boot Camp in its current form is possible because of how standardized x86 platforms are. Look at how much trouble it can be to get Linux to run on recent x86 macs with active attempts on the Linux side to support them... and that's with a platform actively trying to support x86 Windows, which is targeted at the standard PC platform.
              4. Not all ARM is made alike. That's why it's so much more work to do things like providing custom ROMs for Android phones. You underestimate how much effort goes into intentionally making x86-based systems implement a common platform.

              Comment


              • Originally posted by vladpetric View Post
                First, the Cortex-A72 (which the rpi4 Broadcom chip is based on) has a unified L2 cache for the four cores.
                Except that is not the problem. AMD Zen cores have the same problem. On AMD it is the Infinity Fabric; on a Cortex-A72 it is AMBA 5 CHI or AMBA 4 ACE with multiple clusters. But when you only have 1 cluster the AMBA is still there and is in fact not removable. The reason why will become clear later.

                Originally posted by vladpetric View Post
                Why the heck would you go to memory if you have a shared L2?
                CPU core sync actions are not moving memory. For core-to-core sync actions you don't go to L2, you go to the AMBA. The speed of the AMBA becomes important in the same way the Infinity Fabric speed does on an AMD Zen.

                Originally posted by vladpetric View Post
                For Intel it's the QPI, for ThunderX2 it is the Coherent Processor Interconnect. This is how the 8 groups of cores sync with each other. These are not the memory controllers, even though they are also connected to them. But again, you don't need such a thing with RPi4 because you have a shared last level cache.
                This is where you are horribly wrong. The point where you are wrong is the claim that you don't need such a thing as a Coherent Processor Interconnect on the RPi4. In reality the Coherent Processor Interconnect is a mandatory part of the design.
                ARM® AMBA® 5 CHI Memory Controllers work in concert with AMBA 5 CHI interconnects to provide controls to optimise data flows between many processors and the DDR memory.  In this blog I am going to tell you a bit more about the work ARM...

                The diagram here for an A57 is about the clearest I can find. The reality is that if you remove the Coherent Processor Interconnect, the AMBA, completely from an A72 processor, you have no means to connect a memory controller at all. The speed the A72 AMBA operates at is set by what is connected. So on a Raspberry Pi with a slow memory controller the AMBA is clocked a lot slower, and this has knock-on effects on performance. Fitting a faster memory controller or another cluster of CPU cores would in fact help a lot, as either one would end up with the AMBA at a higher clock speed and shorter timeouts.

                This is horrible ARM design: the Coherent Processor Interconnect is how you connect to your memory controller, so a single-cluster A72 chip like the one in a Raspberry Pi 4 still has to have the Coherent Processor Interconnect. Now the last thing you want is the 4 cores of something like the Raspberry Pi deciding to use the Coherent Processor Interconnect when they don't have to.

                How a 4-core A72 like the RPi 4 goes wrong is like this: core 1 wants to do a sync of some form, so it calls out to the Coherent Processor Interconnect (which has to be there) and ends up waiting for its timeout even though there are no other clusters to talk to at all. How long that timeout is happens to be set by the speed of the memory controller, because nothing else is connected to the Coherent Processor Interconnect to set it to a higher speed and shorter timeout. The A73 design contains a nice little change for single clusters: when core 1 on an A73 wants to do a sync of some form and attempts to call out to the Coherent Processor Interconnect, in a single cluster there is no timeout wait. Instead it gets a straight response that there is nothing else, and the operation continues.

                An 8-core design, being either 2 clusters of A72 or 4 A72 with 4 A53, behaves better performance-wise than a single-cluster 4-core A72. With a really high-speed memory controller a single-cluster 4-core A72 will not have a major performance hit, because the timeout is so short. Heck, it is possible a 4-core A72 would behave better with 2 cores per cluster than with 4 cores in a single cluster, given how badly this screws with performance.

                By the way, the reason this is so bad is that the A72 is an out-of-order processor, so while it is waiting on the AMBA it is wasting operations checking for the time to be up before continuing the operation stalled by it. Basically a CPU-level spin lock problem.

                The A72 has 2 nice little design bugs.
                1) It does not handle the single-cluster case correctly, so the timeout is set by the memory controller speed.
                2) There are times when the A72 calls out to the Coherent Processor Interconnect when there was no need to.

                Also, the A72 calls out to the AMBA for actions where it never needs to.

                Originally posted by vladpetric View Post
                I'm sorry, but I don't think you know what you're talking about.
                All the way along you have not understood this problem. Like it or not, the RPi 4 really should not be used as the basis for judging A72 core performance, unless you want the worst case for an A72 core.

                There are tons of maker boards on the market, but the Raspberry Pi 4 remains a top pick. Compare the RockPro64 vs. Raspberry Pi 4 and find out which maker board is best for your needs.


                This puts 2 A72 cores with 4 A53 (RockPro64) head to head with the Raspberry Pi 4's 4 A72 cores, and the Raspberry Pi loses. In fact you will find the Raspberry Pi 4 still loses when you have 2 of the A53 disabled. This is the fun of the A72 not liking being a single cluster when you don't have a high-speed memory controller connected. So there are benchmarks out there of the Raspberry Pi 4 against other A72 systems, and you see the Raspberry Pi 4 underperforming, in some cases to the point that 2 A72 cores on the other solution match or beat the 4 A72 cores of the Raspberry Pi in a multi-threaded workload. So yes, getting half the performance you should out of the A72 cores that are there does happen with the Raspberry Pi. Little design choices, huge performance hit. This is why the Raspberry Pi 4 is not suitable for working out how fast ARM reference designs are.

                In most cases you want to avoid having a Coherent Processor Interconnect by putting everything in 1 cluster. The Cortex-A72 is the exception to this rule: you never want a single cluster with a slow memory controller.
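
One way to compare the two positions in this argument is a simple additive stall model, where cycles per instruction (CPI) is the base CPI plus, for each stall source, its rate times its penalty. This is only a sketch: every rate and penalty below is a hypothetical placeholder, since nobody in the thread gave measured values, and the model ignores any overlap an out-of-order core can achieve.

```python
# Additive stall model: CPI = base CPI + sum(event rate * penalty in cycles).
# Every number below is a hypothetical placeholder, not a measurement of any chip.

def effective_cpi(base_cpi: float, stall_sources: dict) -> float:
    """stall_sources maps a name to (events per instruction, penalty in cycles)."""
    return base_cpi + sum(rate * penalty for rate, penalty in stall_sources.values())


# Position 1: only cache misses that reach DRAM cost extra cycles.
cache_misses_only = {"dram_miss": (1 / 60, 150)}

# Position 2: sync operations additionally wait on an interconnect timeout.
with_sync_timeouts = {
    "dram_miss": (1 / 60, 150),
    "sync_timeout": (1 / 500, 300),  # assumed rate and timeout length
}

cpi_misses = effective_cpi(1.0, cache_misses_only)
cpi_sync = effective_cpi(1.0, with_sync_timeouts)
print(f"misses only:            CPI {cpi_misses:.2f}")
print(f"misses + sync timeouts: CPI {cpi_sync:.2f}")
print(f"slowdown attributed to sync timeouts: {cpi_sync / cpi_misses:.2f}x")
```

Plugging in measured event rates and penalties (for example from hardware performance counters) would show which term actually dominates on a given board.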

                Comment


                • The thing is, Intel has been relatively stagnant with x86 for around 8-10 years or so, with around 5 of them being on Skylake... with very few performance gains. AMD has set the benchmark too low (setting its target to beat an underperforming Intel), thus it can't save x86 from obsolescence.

                  The result is

                  a) x86 lost phones, lost tablets
                  b) x86 started losing supercomputer contracts - and now most supercomputers at the top don't have x86
                  c) x86 started losing cloud market share, with Amazon and the like creating their own ARM chips that are more cost efficient and give better perf/$.
                  d) x86 is now starting to lose laptops and desktops (apple switching over)

                  In a few years x86 will have marginalized itself to desktops and ...gaming consoles. Effectively having tiny economies of scale to compete with ARM. And even the desktop use will mostly be due to inertia from Windows running x86 software, otherwise ARM would parade in the desktop space too.

                  Comment


                  • Originally posted by Paradigm Shifter View Post
                    I've often wondered why Apple hasn't fallen foul of the monopoly commissions of various countries more often. The lock-in to their ecosystem is ridiculous. Not that others aren't trying to do the same thing, it's just that Apple has been doing it longer and better.

                    I'm curious, but only for what this might mean for memory, storage and GPU acceleration for ARM in general. I had a brief stint a while ago where I was thinking about switching away from Android and getting an iPad Pro, but the utterly horrific price (when essential accessories are factored in) made me baulk. In the end I gave up.

                    Off topic I know, but anyone know any 12" Android tablets which actually run an up-to-date Android?
                    I think your only real choice is Samsung abandonware (in that you'll get 1 update and then be abandoned). When you factor that in, the price of an iPad Pro isn't bad considering you'll get a much longer useful life from it. When Samsung abandoned my Note10 (which I absolutely loved, as I used the pen all the time) it pissed me off enough to just bite the bullet and get an iPad Pro. The iPad Pro I've had since the first gen still gets all the iOS updates and is still pretty current in terms of capabilities. After 2-3 years with the Note10 I wasn't getting any Android updates and it was basically legacy at that point, as I started to run into situations where I couldn't update apps or install new ones. IMO, the iPad will have at least double the useful life, if not more, and stay current the whole time, and I do have to say, the iOS apps are usually better. For an Android tablet I'd spend ~$600-650 (Galaxy Tab S6) compared to ~$920 ($800 + $120 Pencil) for an equivalent iPad Pro. To me, it's a no-brainer buying on value.
                    Last edited by sheldonl; 24 June 2020, 10:34 AM.
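
The value argument above comes down to cost per year of useful life. A minimal sketch of that arithmetic: the prices are taken from the post ($600-650 is represented by its midpoint), while the lifetimes of 2.5 and 5 years are assumptions standing in for "2-3 years" and "at least double".

```python
# Cost per year of useful life. Prices come from the post above;
# the lifetimes are assumptions chosen to match "2-3 years" and "at least double".

def cost_per_year(price_usd: float, useful_years: float) -> float:
    """Purchase price spread evenly over the years the device stays usable."""
    return price_usd / useful_years


android_tablet = cost_per_year(price_usd=625, useful_years=2.5)
ipad_pro = cost_per_year(price_usd=920, useful_years=5.0)

print(f"Android tablet: ${android_tablet:.0f} per year")
print(f"iPad Pro:       ${ipad_pro:.0f} per year")
```

With these assumed lifetimes the more expensive device is cheaper per year. The conclusion flips if the Android tablet stays usable for about 3.4 years or more (625 / 184).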

                    Comment


                    • Originally posted by Michael_S View Post

                      I had been trying to get by with mid range Android phones and the best case options available for them for my family. LGs, Motorolas, Xiaomis, OnePlus (back when OnePlus was cheaper), ZTE. We've gone through 10 phones in 4 years. That's how bad the quality is. Many of those 10 were replaced under warranty, and then failed again soon after the warranty expired.

                      We've tried to repair some of the phones ourselves, but in every case we failed. We've managed to replace screens and charging points on tablets many times. But for phones, we break something in the dismantling process or else put it all back together and then a week later the phone splits because the new battery overheated or something.

                      Is all that better than Apple? Really?

                      I've given up, now my kids all have lightly used Samsung Galaxy S-somethings and my wife and I have Google Pixel phones - not because I want to spoil my kids or even myself, but because I'm desperately hoping these top end devices won't break if you look at them funny.

                      Meanwhile, my friends and coworkers with iPhones usually - not always, but usually - have a good reliability experience. There are luxury buyers that get something new every year or two, but most people in my social circle upgrade every four or five years. I've never had that option before now, my Android device was inevitably dead long before that.
                      I have a Samsung Galaxy S3 that still kind of works; the battery is quite flat and the power button became defective but it's flawless otherwise.
                      I also own a Samsung Galaxy S6 which has problems with its sensor buttons but it's flawless otherwise.
                      So I can kind of agree phones tend to break on some place or another.

                      The lesson I learned?

                      a) Buy a phone I can repair myself
                      b) Buy a phone that has some kind of longtime support
                      c) Buy a phone from a company that does not churn out phones like crazy because they quickly become abandonware and also fail quickly - by design

                      So I came to the conclusion a Fairphone 3 (link) is the right phone for me and, boy, how I love this thing!
                      Rugged, minimalistic, easily repairable (link to iFixit), Dual-SIM, MicroSD, removable battery... there is nothing I miss and it has everything I could hope for.
                      Last edited by reba; 24 June 2020, 12:47 PM. Reason: fixed even more typos

                      Comment
