Apple Confirms Their Future Desktops + Laptops Will Use In-House CPUs


  • Originally posted by uid0 View Post

    Care to elaborate? Well, you did elaborate in the post (sort of), but it seems like the message is that RAM throughput is not so important, because of caches. I do not follow.

    Caches are meant to reduce the effects of memory latency, not a lack of throughput. And if you have multiple cores doing streaming data processing (for example), a slow memory bus can easily have a hard time feeding those wide, data-hungry AVX units (or even the ALUs/FPUs) in each core. Networking is another example: if your system's RAM is too slow to move data, no amount of cache will let you reach 100 Gbps.
    A request satisfied in the L1 or L2 cache does not end up on the memory bus at all, and the vast majority of requests should be satisfied there. If requests never reach the memory bus, effective throughput is also improved. And yes, caches do that.

    Right, but NEON is only 128 bits wide, not 512 like AVX-512. And yes, you need good bandwidth for high-speed DMA. If your benchmark is I/O intensive, that could be a problem.

    BTW, the 100 Gbps in the context of the Rpi4 made me chuckle... Actually, I don't think you can saturate 100 Gbps networking from the CPU if you have to do any reasonable processing of your data. If all you do is lift data out of RAM (so the data is already in RAM) and send it on the network, then sure. But if you have to, say, encrypt your data before sending it out, the CPU is going to be the bottleneck as well.
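A quick back-of-the-envelope calculation makes the encryption point concrete. This is a sketch with assumed round numbers, not measured figures; the per-core AES-GCM throughput below is a hypothetical value for illustration, and real numbers vary a lot by CPU and cipher:

```python
# Rough estimate: how many cores does encrypting a saturated
# 100 Gbps link take, at an assumed per-core cipher throughput?

def cores_needed(link_gbps: float, per_core_gbytes_s: float) -> float:
    """Cores required to encrypt link_gbps of traffic, given a
    per-core encryption throughput in GB/s."""
    link_gbytes_s = link_gbps / 8.0  # gigabits/s -> gigabytes/s
    return link_gbytes_s / per_core_gbytes_s

# 100 Gbps link, assuming ~2 GB/s of AES-GCM per core with hardware AES:
print(cores_needed(100, 2.0))  # 6.25 cores spent on encryption alone
```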
    Last edited by vladpetric; 06-26-2020, 10:39 AM.



    • " future laptops AND desktop computers will be using in-house silicon......." doesn't sound much like they have a plan B if it doesn't work out; they are betting the PC side of their business.

      It sounds very risky, and there is a lot at stake.

      They assume a lot about buyers' indifference to CPU power levels, and about being pushed deeper into a near-closed ecosystem.



      • Originally posted by starshipeleven View Post
        ...migrate what is possible to a "mobile-like" laptop and dump the rest in the river.
        But it's desktops too? They are migrating mobile-like power to the desktop.



        • Originally posted by msroadkill612 View Post
          But it's desktops too? They are migrating mobile-like power to the desktop.
          What desktops? Their "all-in-one" PCs (the iMac lines) are using laptop CPUs, and Apple's ARM CPU cores have been comparable to Intel mobile CPU cores (i.e. the ones used in laptops) for a while.
          Yeah, they might be a little worse on paper than the higher-end laptop CPUs, but the current Apple PCs with higher-end laptop Intel CPUs were throttling like mad, so the performance was not really optimal anyway.

          The only thing they can't easily replace is the Cheese Grater Mac Pro, but they have 4-5 years before they really need to do that.



          • I am in a similarly fortunate position, but I too feel for those who are not, and $15/hour is something billions of people can only dream of.

            I love seeing the entry price of this life-changing resource get lower, and it's a reason I greatly admire AMD and scorn the more predatory moat-building weasels like Intel and Apple.

            Doing business with the devil comes back and bites you in the ass - what we in the West (wrongly, AFAICT) call karma.



            • Originally posted by Michael_S View Post

              In my experience, Android passed into 'acceptable performance' territory with version 6 or so and 3GB of RAM, and that still holds. More memory is nicer, but the cheap phones we had worked fine before the hardware failed, and some of them 'only' had 3GB of RAM.
              Nowadays, 3GB is starting to be on the edge of the minimum RAM needed for a fluent experience. The RAM isn't used only for the system plus currently open applications, but also for disk caching. With less than 500-1000MB of free memory, there's not enough space available for disk caching, and it becomes visible: switching to or reopening applications becomes much slower.

              However, the last one I purchased (on sale) has 6GB of RAM - a Xiaomi Mi 9 SE. It should last at least 3 years, if I don't break it. And, I believe, in ~4 years even 6GB of RAM will be on the edge of the minimum needed for a fluent experience.



              • Originally posted by starshipeleven View Post
                What desktops? Their "all-in-one" PCs (the iMac lines) are using laptop CPUs, and Apple's ARM CPU cores have been comparable to Intel mobile CPU cores (i.e. the ones used in laptops) for a while.
                Yeah, they might be a little worse on paper than the higher-end laptop CPUs, but the current Apple PCs with higher-end laptop Intel CPUs were throttling like mad, so the performance was not really optimal anyway.

                The only thing they can't easily replace is the Cheese Grater Mac Pro, but they have 4-5 years before they really need to do that.
                Oh, well, that's all right then. It's almost as powerful as the previous models.

                I can't see Lisa being keen on that as a strategy.

                "Intel giveth, Microsoft taketh away" is the old Moore's-law adage, and it still applies in a general sense - there will always be killer apps that appear when new resources become mainstream. It's irrational for users to upgrade to a step backwards.



                • Originally posted by vladpetric View Post

                  http://infocenter.arm.com/help/index...846798627.html

                  This is part of the documentation for A72's cache system. It says:

                  <<
                  The L1 data caches support the MESI protocol. The L2 memory system contains a Snoop Tag array that is a duplicate copy of each of the L1 data cache directories. The Snoop Tag array reduces the amount of snoop traffic between the L2 memory system and the L1 memory system. Any line that resides in the Snoop Tag array in the Modified/Exclusive state belongs to the L1 memory system. Any access that hits against a line in this state must be serviced by the L1 memory system and passed to the L2 memory system. If the line is invalid or in the shared state in the Snoop Tag array, then the L2 cache can supply the data.
                  >> (emphasis added)

                  I totally get that there are significant design trade-offs when you design a highly configurable micro-architecture. But according to this page, the A72 is perfectly capable of doing coherency in the L2 cache. Are you reading this page differently?
                  That's the problem: you are reading it the way you want to. How does that snoop traffic travel? The answer is the MESI protocol.

                  https://en.wikipedia.org/wiki/MESI_p...grama_MESI.GIF

                  Notice that all the transitions in red there go out to a BUS. In ARM's case that bus is AMBA; in AMD's case it is the Infinity Fabric. Intel is different: they have a bus just for the caches inside the core cluster, so in Intel's case all the connections from L1 to L2 are L1/L2 circuitry. Yes, you save some silicon by not having a dedicated cache bus as Intel does, but you also pick up some interesting behaviours, such as the memory controller in particular setups being able to hurt your complete system performance by triggering a bus speed change.

                  The fact is that the MESI protocol does not do coherency using L2 alone. The important question, when you see a chip design saying it uses the MESI protocol, is what bus it is in fact using. The snippet you found describes an optimisation in L2 to reduce request latency, not exactly how coherency works. How it works is covered by just two words, "MESI protocol", which leave out the important detail of which bus is used for MESI.

                  Please note

                  The L1 data caches support the MESI protocol.

                  This is a sentence by itself, and you missed its importance as well. L1 can run the MESI protocol against system memory without an L2 present, so the complete sentence that follows, talking about L2, describes an optional part. The MESI protocol does not mandate the existence of an L2; it is just how the CPU core talks to the other parts of the cluster. It does mandate the use of a bus, and that bus can be generic to the system: in ARM and AMD it is a generic bus, while Intel has a dedicated bus that is part of L1/L2.

                  You read that L2 has features to assist with coherency, and skipped over how coherency is in fact being performed. The answer to how coherency is performed is the MESI protocol, and you did not understand what that meant.
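For reference, the generic MESI transitions being argued about can be sketched as a tiny state machine. This is my own illustration of the textbook protocol, not ARM- or Intel-specific; the transitions flagged as bus traffic in the comments are exactly the ones whose transport (AMBA, Infinity Fabric, or a dedicated cache bus) is in question:

```python
# Generic MESI state machine for one cache line, seen from one cache.

M, E, S, I = "Modified", "Exclusive", "Shared", "Invalid"

def local_read(state, others_hold_copy):
    if state == I:  # miss: a bus read goes out; Shared if another cache answers
        return S if others_hold_copy else E
    return state    # M/E/S hits are serviced locally, no bus traffic

def local_write(state):
    # Writing from Shared or Invalid first sends an invalidate on the
    # bus; writing from Exclusive or Modified is silent. Either way
    # the line ends up Modified in this cache.
    return M

def remote_read(state):
    # Another cache read the line: a Modified copy is written back or
    # forwarded, and both Modified and Exclusive degrade to Shared.
    return S if state in (M, E) else state

def remote_write(state):
    # Another cache is writing the line: our copy becomes Invalid.
    return I
```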



                  • Originally posted by oiaohm View Post

                    That's the problem: you are reading it the way you want to. How does that snoop traffic travel? The answer is the MESI protocol.

                    https://en.wikipedia.org/wiki/MESI_p...grama_MESI.GIF

                    Notice that all the transitions in red there go out to a BUS. In ARM's case that bus is AMBA; in AMD's case it is the Infinity Fabric. Intel is different: they have a bus just for the caches inside the core cluster, so in Intel's case all the connections from L1 to L2 are L1/L2 circuitry. Yes, you save some silicon by not having a dedicated cache bus as Intel does, but you also pick up some interesting behaviours, such as the memory controller in particular setups being able to hurt your complete system performance by triggering a bus speed change.

                    The fact is that the MESI protocol does not do coherency using L2 alone. The important question, when you see a chip design saying it uses the MESI protocol, is what bus it is in fact using. The snippet you found describes an optimisation in L2 to reduce request latency, not exactly how coherency works. How it works is covered by just two words, "MESI protocol", which leave out the important detail of which bus is used for MESI.

                    Please note

                    The L1 data caches support the MESI protocol.

                    This is a sentence by itself, and you missed its importance as well. L1 can run the MESI protocol against system memory without an L2 present, so the complete sentence that follows, talking about L2, describes an optional part. The MESI protocol does not mandate the existence of an L2; it is just how the CPU core talks to the other parts of the cluster. It does mandate the use of a bus, and that bus can be generic to the system: in ARM and AMD it is a generic bus, while Intel has a dedicated bus that is part of L1/L2.

                    You read that L2 has features to assist with coherency, and skipped over how coherency is in fact being performed. The answer to how coherency is performed is the MESI protocol, and you did not understand what that meant.
                    The generic diagram from wikipedia for the MESI protocol is just a generic diagram. Of course, MESI protocol and variants operate over a bus/interconnect - that goes without saying. I think that the fundamental disagreement between us is which bus handles that. I'm saying that the L1/L2 interconnect in a single cluster should be able to handle that traffic, and it doesn't need to get out to the AMBA (I think it'd be a really stupid design to do so; maybe that is indeed the case, but I'd like some proof here).

                    So, to clarify: are you telling me that an ARM A72-based SoC like the BCM2711 in the Raspberry Pi 4, which is a single cluster of 4 cores with a shared L2 cache, always goes out to the AMBA when handling a core-to-core transfer?

                    Let's take the most straightforward example. You have a single cache line (the unit of cache coherency). In it, there's a mutex and some data.

                    Initially, it resides in core A's L1 cache. Core B (again, in the same cluster, with its own L1 cache, and sharing the L2 with core A) first acquires the mutex, then does some work on that cache line (and, for simplicity, on no other cache line), then releases the mutex.

                    The question is: what happens here with the cache line and the coherency mechanism? I can answer this question with my eyes closed for a modern Intel/AMD design. You accused me earlier of applying Intel thinking to the rpi4... fine, so be it. Tell me how it works on the BCM2711 (a single cluster of 4 cores with shared L2, again).
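For what it's worth, here is how I would hand-trace that line under textbook MESI rules. This is my own sketch, not a statement about the BCM2711 specifically; whether these coherency messages stay on the intra-cluster L1/L2 circuitry or go out to AMBA is exactly the open question:

```python
# Hand-traced MESI states for the single cache line holding the
# mutex + data, with cores A and B in one cluster.

trace = [
    # (event,                         A's copy,   B's copy)
    ("initial: line in A's L1",       "Modified", "Invalid"),
    ("B reads the mutex",             "Shared",   "Shared"),    # A supplies the data
    ("B acquires the mutex (write)",  "Invalid",  "Modified"),  # invalidate sent to A
    ("B updates the data",            "Invalid",  "Modified"),  # silent, already Modified
    ("B releases the mutex (write)",  "Invalid",  "Modified"),  # silent
]

for event, a_state, b_state in trace:
    print(f"{event:32s} A={a_state:9s} B={b_state}")
```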
                    Last edited by vladpetric; 06-28-2020, 01:33 PM.



                    • Originally posted by sheldonl View Post

                      I think your only real choice is Samsung abandonware (in that you'll get 1 update and then be abandoned). When you factor that in, the price of an iPad Pro isn't bad, considering you'll get a much longer useful life from it. When Samsung abandoned my Note10 (which I absolutely loved, as I used the pen all the time), it pissed me off enough to just bite the bullet and get an iPad Pro. The iPad Pro I've had since the first gen still gets all the updates to iOS and is still pretty current in terms of capabilities. After 2-3 years with the Note10, I didn't get any Android advancements and was basically legacy at that point, as I started to run into situations where I couldn't update apps or install new apps. IMO, the iPad will have at least double the useful life, if not more, and stay current the whole time; and I do have to say, the iOS apps are usually better. For an Android tablet I'd spend ~$600-650 (Galaxy Tab S6) compared to ~$920 ($800 + $120 pencil) for an equivalent iPad Pro. To me, it's a no-brainer buying on value.
                      Thanks for your input.
