Apple M1 ARM Performance With A 2020 Mac Mini


  • Originally posted by PerformanceExpert View Post

    No, AArch64 is actually 10% more dense. A common myth is that x86 is dense. It isn't when you keep adding prefixes and more SIMD instructions, and x86_64 instructions are now on average more than 4 bytes! If you want a dense variable length encoding, look at Thumb-2.

    In terms of latency the 3-cycle 128/192KB L1 at 3.2GHz is equivalent to 4.7 cycles at 5GHz. Zen does 4, Icelake does 5 cycles with tiny 32KB caches. So the latency is comparable despite much larger caches and that means it's more advanced.
    Damn, SIMD, it seems a great idea at first... On the plus side, the AVX instructions can do a lot even if they are long.

    Do you have a citation for the AArch64 claim? Sometimes it takes a while for information to trickle out to the mainstream.

    And the cache is more advanced because it's on an advanced node, not because they found some sort of new technique. There are of course details around paging and address structure, but what I take away is that TSMC 5nm is a fine process indeed.

    Compare to Samsung's 5nm chips. 128k and 3/4 cycle. https://www.anandtech.com/show/15826...roarchitecture

    Even so, that's 4x the cache area, and the process only gives you 2x density. The signal speed isn't increasing, so there is probably one more thing going on with Apple's u-arch, or they don't mind burning power to run the cache at the very edge of what's possible under core boost.
    Last edited by WorBlux; 22 November 2020, 01:24 PM.

    Comment


    • Originally posted by geearf View Post

      Yes I understood that point, that's pretty much the one I was making in my original post.

      As for the explanation, I got everything but "ISA". I have a vague understanding of the word, but if you don't mind, I'd appreciate if you could explain what you meant by "original ISA".

      Thanks!
      ISA is Instruction Set Architecture, i.e. the definition of what e.g. Arm or x86 instructions do and how they are encoded. What I meant is that micro-ops are basically a different encoding of instructions but with the same meaning. So micro-ops are very similar to the original ISA, as in, micro-ops on a CISC are CISC and micro-ops on a RISC are RISC. In reality you wouldn't call micro-ops RISC or CISC since they are not documented or usable from software.

      Comment


      • Originally posted by WorBlux View Post

        Damn, SIMD, it seems a great idea at first... On the plus side, the AVX instructions can do a lot even if they are long.

        Do you have a citation for the AArch64 claim? Sometimes it takes a while for information to trickle out to the mainstream.

        And the cache is more advanced because it's on an advanced node, not because they found some sort of new technique. There are of course details around paging and address structure, but what I take away is that TSMC 5nm is a fine process indeed.

        Compare to Samsung's 5nm chips. 128k and 3/4 cycle. https://www.anandtech.com/show/15826...roarchitecture

        Even so, that's 4x the cache area, and the process only gives you 2x density. The signal speed isn't increasing, so there is probably one more thing going on with Apple's u-arch, or they don't mind burning power to run the cache at the very edge of what's possible under core boost.
        Unfortunately there are few quality papers on code density since it's not exactly interesting, so you mostly find marketing-oriented papers that try to show one specific ISA in the best light (most recently RISC-V). Here is a link from an old discussion on the subject - a while back I counted all of SPECINT using the same GCC, and that showed AArch64 being 10.5% smaller overall with -O2 and 11.5% with -O3. Since this is an average over ~20 MBytes of code, it is as accurate as you can get. You can easily do similar experiments.
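
A minimal sketch of the kind of experiment described above, assuming the same sources have been built for both targets and the Berkeley-format output of the binutils `size` tool has been captured for each build. The sample strings and file names below are invented for illustration and are not the SPECINT figures quoted in the post; the helper names are made up too.

```python
def total_text_bytes(size_output: str) -> int:
    """Sum the 'text' column of Berkeley-format `size` output."""
    total = 0
    for line in size_output.strip().splitlines():
        fields = line.split()
        # Skip blank lines and the header row ("text data bss dec hex filename").
        if not fields or not fields[0].isdigit():
            continue
        total += int(fields[0])
    return total


def percent_smaller(candidate_bytes: int, baseline_bytes: int) -> float:
    """How much smaller the candidate is than the baseline, in percent."""
    return 100.0 * (baseline_bytes - candidate_bytes) / baseline_bytes


# Hypothetical sample data, NOT real measurements.
X86_64_SAMPLE = """
   text    data     bss     dec     hex filename
 100000    2000     500  102500   19064 foo.o
 200000    4000    1000  205000   320c8 bar.o
"""
AARCH64_SAMPLE = """
   text    data     bss     dec     hex filename
  90000    2000     500   92500   16954 foo.o
 180000    4000    1000  185000   2d2a8 bar.o
"""

x86_total = total_text_bytes(X86_64_SAMPLE)
a64_total = total_text_bytes(AARCH64_SAMPLE)
print(f"AArch64 text is {percent_smaller(a64_total, x86_total):.1f}% smaller")
```

Summing the text column over many object files before dividing, rather than averaging per-file percentages, keeps a handful of tiny files from skewing the result.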

        That Samsung design was cancelled, and Arm Cortex cores have 4-cycle L1 latency at 3.3GHz. TSMC 7nm to 5nm scaling is 1.8x for logic, but only 1.3x for SRAM. Hence doing huge caches in 5nm while also increasing frequency is impressive.
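
The cycle-count comparison in this exchange can be reproduced with a few lines of arithmetic. This is only a rough sketch: the function names are made up, the cycle counts and clocks are the ones quoted in the thread, and real load-to-use latency depends on more than cycles divided by frequency.

```python
def latency_ns(cycles: float, freq_ghz: float) -> float:
    """Wall-clock latency in nanoseconds for a cycle count at a given clock."""
    return cycles / freq_ghz


def equivalent_cycles(cycles: float, freq_ghz: float, target_ghz: float) -> float:
    """Cycles the same wall-clock latency would take at a different clock."""
    return latency_ns(cycles, freq_ghz) * target_ghz


# Figures as quoted in the thread.
m1_ns = latency_ns(3, 3.2)          # 3-cycle L1 at 3.2 GHz
cortex_ns = latency_ns(4, 3.3)      # 4-cycle L1 at 3.3 GHz
m1_at_5ghz = equivalent_cycles(3, 3.2, 5.0)

print(f"M1 L1: {m1_ns:.3f} ns, Cortex L1: {cortex_ns:.3f} ns")
print(f"M1 L1 latency expressed in 5 GHz cycles: {m1_at_5ghz:.1f}")
```

Comparing in nanoseconds rather than cycles is what makes designs at different clocks comparable at all.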

        Comment


        • Originally posted by deppman View Post
          I have direct proof through Geekbench 5 scores. I'd give you a link, but my last post just got blocked apparently due to the inclusion of links. So PM me if you want to know more.
          As you see, people post lots of links and have no problems. Nice try, but you still give no proof.
          Also a single synthetic benchmark result does not really tell anything.

          Originally posted by deppman View Post
          I used to run server farms with over a hundred nodes so I understand the impact of power and thermal constraints. EDIT: I had originally estimated the SOC could approach 70W peak power, but I was directed to the analysis by Anandtech which shows my estimate is too high. I now estimate that peak combined CPU + GPU power consumption probably would not exceed 39W and Anandtech showed a peak draw of 31W. Thanks to PerformanceExpert for the guidance.
          The AnandTech figures are there. I posted the link to the review *after* reading it and, as the AnandTech reviewers estimated, the TDP of the SoC is around 20-24W, and power measurements confirm this. Your estimate was totally wrong, and there is no need for estimation anyway since the power figures are already known.

          Comment


          • Originally posted by scottishduck View Post

            You’re clearly operating on solid ground when you’re picking a single outlier among reviews as the only true one and deciding there’s a vast conspiracy to falsely make Apple products look better than they are.
            Way to be part of the herd. How many of these reviewers (a) were given the device at no cost, and (b) signed an NDA, and (c) submitted to editorial review by Apple? If you think the number is zero, it's time you grow up and learn how the world really works.

            Name one item the reviewer brought up in the review that you can disprove. Go ahead, I'll wait. The review isn't even negative, just honest. I think his workload is far more representative of customers. How many people want to only watch videos on battery for 14 hours straight? Yet that has been *the* battery test in more than one review.

            There will be more honest reviews in the next few days as unpaid reviewers become more prevalent. Let's see how this pans out.

            Comment


            • Originally posted by blackshard View Post
              As you see, people post lots of links and have no problems. Nice try, but you still give no proof.
              Also a single synthetic benchmark result does not really tell anything.
              The post has now been approved. The link and proof are there. Will you apologize for calling me a liar?

              Of course, you could have found these results yourself in just 2 minutes as I did. Why didn't you?

              Originally posted by blackshard View Post
              The AnandTech figures are there. I posted the link to the review *after* reading it and, as the AnandTech reviewers estimated, the TDP of the SoC is around 20-24W, and power measurements confirm this. Your estimate was totally wrong, and there is no need for estimation anyway since the power figures are already known.
              I corrected my likely mistake and gave you credit. You're welcome.

              Also, you are wrong about there being no need for estimates. The power figures that are "already known" are those provided only by Apple through Ars and Anandtech. And like many "facts" being presented, they are likely wrong. Go look at the Forbes review, where the author got only 25% of the official Apple-claimed battery life (4.5 hours, or 13W *average* power draw) and found that Rosetta 2 crashes frequently, lags, and runs significantly slower than native x86 apps. Also, Steam barely runs, and the reason why reviewers keep showing Shadow of the Tomb Raider is that it's about the only AAA title that works on the M1. Don't believe me? Take 2 minutes and look it up.
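
For reference, the conversion between runtime and average power used above is just battery capacity divided by hours. A quick sketch, where the 58.2 Wh capacity is an assumption (the figure listed for the 13-inch M1 MacBook Pro; the post does not state which capacity it used) and the function names are made up:

```python
def average_power_w(capacity_wh: float, runtime_h: float) -> float:
    """Average whole-system draw implied by draining a battery in runtime_h."""
    return capacity_wh / runtime_h


def implied_claimed_runtime_h(observed_h: float, fraction_of_claim: float) -> float:
    """Claimed runtime implied by an observed runtime and its share of the claim."""
    return observed_h / fraction_of_claim


ASSUMED_CAPACITY_WH = 58.2   # assumption, not from the post
OBSERVED_HOURS = 4.5         # from the post
FRACTION_OF_CLAIM = 0.25     # "25% of the official ... battery life"

avg_w = average_power_w(ASSUMED_CAPACITY_WH, OBSERVED_HOURS)
claimed_h = implied_claimed_runtime_h(OBSERVED_HOURS, FRACTION_OF_CLAIM)

print(f"Implied average draw: {avg_w:.1f} W")
print(f"Implied claimed runtime: {claimed_h:.0f} h")
```

Note that a figure derived this way covers the whole machine (display, SSD, radios), so it is not directly comparable to SoC-only power measurements.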

              Comment


              • Originally posted by PerformanceExpert View Post

                ISA is Instruction Set Architecture, i.e. the definition of what e.g. Arm or x86 instructions do and how they are encoded. What I meant is that micro-ops are basically a different encoding of instructions but with the same meaning. So micro-ops are very similar to the original ISA, as in, micro-ops on a CISC are CISC and micro-ops on a RISC are RISC. In reality you wouldn't call micro-ops RISC or CISC since they are not documented or usable from software.
                Got it, thanks again!

                Comment


                • Originally posted by deppman View Post

                  The power figures that are "already known" are those provided only by Apple through Ars and Anandtech.
                  Are you okay?....

                  Comment


                  • From the Fast Company review of the M1 MacBook. Some real world benchmarks.


                    " Of course if you already have a Mac, the performance difference that matters is the one between your old machine and the new one you might buy. In this household, the old Macs include a 2016 MacBook Pro with an Intel Core i7 chip and a 2018 MacBook Air with a Core i5. Both have 16 GB of RAM, double the quantity in the new Air I’ve been trying.

                    Using iMovie to save a 74-second 4K video took more than five minutes on the 2018 MacBook Air, and sent its cooling fan into a tizzy.

                    It took two and a half minutes on the MacBook Pro.

                    And on the new Air—which, with its efficient M1 chip, doesn’t need a fan—it took just 49 seconds. "
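
To put the quoted export times on one scale, here is a small sketch that turns them into speedup factors. The 2018 Air figure is a lower bound, since the review only says "more than five minutes"; the names are made up for illustration.

```python
def speedup(old_seconds: float, new_seconds: float) -> float:
    """How many times faster the new run is than the old one."""
    return old_seconds / new_seconds


AIR_2018_S = 5 * 60   # "more than five minutes", so a lower bound
MBP_2016_S = 150      # "two and a half minutes"
AIR_M1_S = 49         # "just 49 seconds"

vs_mbp_2016 = speedup(MBP_2016_S, AIR_M1_S)
vs_air_2018 = speedup(AIR_2018_S, AIR_M1_S)

print(f"M1 Air vs 2016 MacBook Pro: {vs_mbp_2016:.1f}x")
print(f"M1 Air vs 2018 MacBook Air: at least {vs_air_2018:.1f}x")
```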



                    Whole review down below.

                    https://www.fastcompany.com/90576013...e-silicon-2020
                    Last edited by Jumbotron; 22 November 2020, 09:56 PM.

                    Comment


                    • Originally posted by deppman View Post

                      The M1 WILL draw far greater than "10-20W of power" for this performance as CPU alone use exceeds 20W. Expect GPU heavy loads to run around 40W, and 60W with CPU use. These still are great numbers, just not the "walk on water" numbers Apple would have its fans believe.
                      Nope.



                      Power virus load on CPU and GPU is 32 Watts. The GPU maxes out at 10 Watts.

                      Comment
