Updated Ubuntu 24.10 Install Image Released For Snapdragon X1 Elite Laptops

  • ikoz
    Junior Member
    • Oct 2024
    • 9

    #21
    Originally posted by coder View Post
    Also, not totally open. I think every general-purpose commercial RISC-V implementation currently has proprietary firmware blobs, at this point.
    JH7110 is fully supported in the mainline kernel and is open hardware as well. The only blob (at release date) is the DDR4 PHY firmware in U-Boot, which is only run at boot to bring up the RAM; I don't know if it has been open sourced since then.

    Comment

    • Dukenukemx
      Senior Member
      • Nov 2010
      • 1392

      #22
      Originally posted by coder View Post
      Apple M3 vs. Lunar Lake shows that ARM cores can perform better on the same process node.
      Perform in what way? Almost every reviewer is using Geekbench and Cinebench, and those don't represent real-world performance. When Michael benchmarked the M4 Mac Mini, it was almost always the slowest, with a few exceptions. When it did win, it was because the M4 has more cores. Keep in mind this was done with Ubuntu 24.04 on the 6.10 kernel. If he had used CachyOS, the gap would only widen.
      Not in anything I care about.
      Most people do, though. As much as I love Linux, I still need to run Windows applications. That is, until the day they become Linux applications. Looking at you, Adobe and Fusion 360.
      This is phoronix. Why the fuck do you think I give a shit about proprietary software? I don't play games and sure wouldn't buy a thin & light laptop for gaming, even if I did.
      Are you Phoronix? You seem to be talking for yourself and for other people, whichever promotes your argument. Also, why not a thin and light laptop for gaming? You can certainly do it, just not with ARM.
      Originally posted by coder View Post
      Typical PC Master Race copium. No, look at Apple M3 vs. Lunar Lake. Exact same process node. Lunar Lake is the most power-efficient x86 and the M3 still beats it.
      AMD uses TSMC 4nm and it beats Apple's M3. We know that Intel isn't exactly doing well right now, but Intel is hardly the only representative of PCMR. That being said, I'd still rather have a Lunar Lake-based laptop over anything Apple. Then again, I'd rather have AMD's Strix Point chips over Lunar Lake.
      This is also a lie. Lunar Lake cores are much bigger than the M3's. Lunar Lake has significantly more total cache, as well.
      Lunar Lake's die size is 100 mm² while the base M3 is 146 mm². Apple's M3 is significantly larger.
      The spot pricing on AWS Graviton 4 instances suggest that TCO of ARM is still lower than any x86 options.
      You'd think so because it's worse.
      Wishing for a thing and saying it doesn't make it so.

      Apple has been losing sales year over year since the introduction of the M2, and the trend has continued with the M3. Who's to say with the M4, because this time Apple wasn't going cheap with their products: the M4 doesn't halve SSD performance like the M2 did, it doesn't have fewer cores like the M3 did, and the M4 Pro even matches the M4 Max in CPU performance. The M4 also doesn't nerf the memory bandwidth like Apple did with the M3 and M3 Pro. Also, the base RAM is now 16GB, which brings Apple products up to the 2019 era of computers. Qualcomm Snapdragon X laptop sales don't exist. Let's not forget that ARM/SoftBank themselves have filed for bankruptcies and are now after Qualcomm to make them pay, because they are probably ready for another bankruptcy. If ARM for desktop doesn't end up like PowerPC, I'd be shocked.

      Let's also not forget that we here at Phoronix use Linux, at least I hope we do. You're not going to run Linux on M3-based products, let alone M4-based ones, anytime soon. The people behind Asahi are too busy cosplaying as cute cat girls instead of getting Linux working on Apple products. Meanwhile, the chads that AMD and Intel hired are busy getting Linux working on their hardware, day one.

      Comment

      • coder
        Senior Member
        • Nov 2014
        • 8920

        #23
        Originally posted by Dukenukemx View Post
        Perform in what way? Almost every reviewer is using Geekbench and Cinebench and that doesn't represent real world performance.
        Cinebench certainly represents real world performance, since it's based on a production renderer. Geekbench indeed has weird MT scaling, so I just look at the ST numbers.

        That's not a good comparison, since it included only one other laptop SoC and that was AMD's top-end Strix Point model that has more cores and used about double the power.

        If you strictly compare either on the basis of single-threaded tests or vs. something like Lunar Lake, that has the same number of cores & threads, Apple comes out well ahead. The other wildcard is the amount of optimization, like hand-coded AVX2 or AVX-512.

        Originally posted by Dukenukemx View Post
        ​​Are you Phoronix? You seem to be talking for yourself and other people,
        Right at the top of the front page, it says:

        Latest Linux Hardware Reviews, Open-Source News & Benchmarks

        Linux and Open-Source. So, it's a bad idea to assume someone on here gives a shit about proprietary software, because the site caters to the FOSS community. I'm not saying nobody does, but enough of us don't that you can't just assume someone does.

        Originally posted by Dukenukemx View Post
        ​​​We know that Intel isn't exactly doing well right now, but Intel is hardly the only representative of PCMR.
        Intel's Lunar Lake is the most efficient x86, at least in the ballpark of 4P + 4E cores, and if you pick a mid-range part like the 256V (which Michael has, and mysteriously omitted from that M4 comparison).

        Originally posted by Dukenukemx View Post
        ​​​​That being said, I'd still rather have a Lunar Lake based laptop over anything Apple.
        Same. That's because I don't like Apple as a company. However, I think their hardware is a good example of what's possible.

        Originally posted by Dukenukemx View Post
        ​​​​​Then again, I'd rather have AMD's Strix Point chips over Lunar Lake.
        Depends on what for. Lunar Lake is held back by 8 cores / 8 threads. So, for a software development machine, I'd probably also prefer a HX 370. However, if I'm mainly using it for video calls, productivity apps, web, and remote access, then I'd go with Lunar Lake for sure.

        Originally posted by Dukenukemx View Post
        ​​​​​​Lunar Lake's die size is 100mm² while the base M3 is​ 146 mm². Apple's M3 is significantly larger.
        Where the fuck did you get that number? Even just the compute tile of Lunar Lake is 140 mm²! The I/O die adds another 80 mm², for a total of 219.7 mm².

        That's also a misleading comparison, because things like iGPUs and NPUs dominate the dies (the M3's, at least), and we're just talking about CPU cores here, not GPUs. If you compare the CPU cores, the M3's P-core is 2.49 mm², while Lunar Lake's is 4.53 mm². That puts the x86 core at 81.9% bigger!
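
        For anyone who wants to check that ratio, here's the quick arithmetic, using the per-core figures quoted above taken as given:

```python
# Quick sanity check of the core-area comparison quoted above.
m3_pcore_mm2 = 2.49    # Apple M3 P-core area, as quoted
lnl_pcore_mm2 = 4.53   # Lunar Lake P-core area, as quoted

ratio = lnl_pcore_mm2 / m3_pcore_mm2
print(f"Lunar Lake P-core is {(ratio - 1) * 100:.1f}% bigger")  # -> 81.9% bigger
```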

        Originally posted by Dukenukemx View Post
        ​​​​​​<random Apple bashing and ARM FUD>
        I don't care about Apple, as a company, or even ARM, for that matter. I'm just interested in Apple as an example of what's technically possible and I'm interested in ARM because it has the maturity to be a practical alternative to x86. Even if there are a few benchmarks Michael can cherry-pick that run better on x86, it doesn't negate the fact that most software and tools have excellent ARM support, following from the decades of work that's gone into the mobile and then server software ecosystems on ARM.
        Last edited by coder; 29 November 2024, 05:35 PM.

        Comment

        • bernstein
          Junior Member
          • Feb 2011
          • 16

          #24
          Originally posted by coder View Post
          Apple M3 vs. Lunar Lake shows that ARM cores can perform better on the same process node.
          Originally posted by coder View Post
          Typical PC Master Race copium. No, look at Apple M3 vs. Lunar Lake. Exact same process node. Lunar Lake is the most power-efficient x86 and the M3 still beats it.
          No, it shows that Apple either had a bigger engineering budget or made more out of the budget it had. It also shows that Qualcomm had a much smaller budget (or did worse engineering) than Apple.
          Originally posted by coder View Post
          ​This is also a lie. Lunar Lake cores are much bigger than the M3's. Lunar Lake has significantly more total cache, as well.​
          TL;DR: Apple's efficiency advantage boils down to a) vertical integration of software & hardware and b) a node advantage.
          Originally posted by coder View Post
          ​​The spot pricing on AWS Graviton 4 instances suggest that TCO of ARM is still lower than any x86 options.​
          AWS is heavily subsidizing ARM because they want more competition in the server CPU space (beyond the Intel/AMD duopoly). Doing your own ARM SoC design is way more expensive than just buying CPUs, given the relatively few Gravitons they build at TSMC.
          Originally posted by coder View Post
          > For the foreseeable future, x86 will remain the default in servers.
          Wishing for a thing and saying it doesn't make it so.
          Certainly not, but industry shifts like that take at least a decade. Likely far longer.
          Originally posted by coder View Post
          > x86 is still miles ahead in software support,
          Not in anything I care about.
          Hardware independence is probably the one you care most about. Currently, for every ARM SBC the image has to be built separately; there is no system in place to handle the different device trees. That's why it's comparatively cumbersome to support ARM SBCs compared to x86. It's not too much of a problem for current hardware, but it's a huge problem when the hardware is ten years old, because unless it was VERY popular (like the Pi 3), no one will keep building images for it.
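
          As a rough illustration (a made-up sketch, not any distro's actual build tooling), an ARM image builder basically has to carry a per-board table just to pick the right device tree blob, while an x86 image can rely on ACPI and boot anywhere:

```python
# Hypothetical sketch of the per-board bookkeeping an ARM SBC image builder needs.
# Board names and DTB paths are illustrative, not taken from any real build system.
BOARD_DTBS = {
    "raspberrypi,3-model-b": "broadcom/bcm2837-rpi-3-b.dtb",
    "pine64,rockpro64":      "rockchip/rk3399-rockpro64.dtb",
    "starfive,visionfive-2": "starfive/jh7110-starfive-visionfive-2-v1.3b.dtb",
}

def dtb_for(board: str) -> str:
    """Return the device tree blob a boot image must ship for this board."""
    if board not in BOARD_DTBS:
        # There is no generic fallback, unlike x86 where ACPI describes the hardware.
        raise SystemExit(f"no image support for board {board!r}")
    return BOARD_DTBS[board]

print(dtb_for("starfive,visionfive-2"))
```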

          Comment

          • coder
            Senior Member
            • Nov 2014
            • 8920

            #25
            Originally posted by bernstein View Post
            No, it shows that Apple either had a bigger engineering budget or made more out of the budget it had. It also shows that Qualcomm had a much smaller budget (or did worse engineering) than Apple.
            Lunar Lake is Intel, not Qualcomm. Apple has deep pockets, but I seriously doubt their CPU core design team is bigger than Intel's. But, it's clear from this error that you don't even know what you're talking about, so I'll just leave it at that.

            Originally posted by bernstein View Post
            ​​​AWS is heavily subsidizing ARM because they want more competition in the server cpu space
            You can believe this if you want, but I don't think it makes economic sense for Amazon to take big losses in the hope of gaining slightly better pricing in the future. Most of Amazon's server fleet is now ARM-based Graviton CPUs, which isn't something they'd do without deriving an economic benefit from it.

            Comment

            • Dukenukemx
              Senior Member
              • Nov 2010
              • 1392

              #26
              Originally posted by coder View Post
              Cinebench certainly represents real world performance, since it's based on a production renderer. Geekbench indeed has weird MT scaling, so I just look at the ST numbers.
              They're both synthetic tests, and that's easy to manipulate. Hardly anyone benchmarking Apple products runs real world applications. Strix Point is just better against Apple's M3.
              That's not a good comparison, since it included only one other laptop SoC and that was AMD's top-end Strix Point model that has more cores and used about double the power.
              AMD's Strix Point was sometimes more power-efficient, specifically in Kvazaar, and there were others where it was close. There were other tests where it used 3x more power too, but again, it's on 4nm. Even Intel's Lunar Lake was on an older 3nm process compared to the M4. I'm not sure if AMD's new Strix Halo is going to use 3nm, but it will be their new top-end mobile chip.
              If you strictly compare either on the basis of single-threaded tests or vs. something like Lunar Lake, that has the same number of cores & threads, Apple comes out well ahead. The other wildcard is the amount of optimization, like hand-coded AVX2 or AVX-512.
              We know Lunar Lake isn't exactly a performer, especially when it comes to multi-threaded performance. Who at Intel thought it was a good idea to remove Hyper-Threading?
              Right at the top of the front page, it says:

              Latest Linux Hardware Reviews, Open-Source News & Benchmarks

              Linux and Open-Source. So, a bad idea to assume someone on here gives a shit about proprietary software, because the site caters to the FOSS community. I'm not saying nobody does, but enough of us don't that you can't just assume someone does.
              You do know that Apple doesn't cater to the FOSS community either? If you know Apple, you know that their dealings with open source are terrible. Apple doesn't exactly open source macOS, because otherwise we'd have distros based on it. The industry is moving towards open source, and here we are with Apple, who's behind the times. As for closed software, it's not like the industry is going to start open sourcing it. It would be great, but if you do any professional work, you need closed-source software, like Adobe's.
              Intel's Lunar Lake is the most efficient x86, at least in the ballpark of 4P + 4E cores and if you pick a mid-range, like the 256V (which Michael has and mysteriously omitted from that M4 comparison).
              Yes, but Intel is still a mess. Look at their Arrow Lake chips, how bad the performance is, and how little power savings there are compared to the 14900K. Much like AMD, they are finding hidden performance that might uplift something like the 285K.
              Same. That's because I don't like Apple as a company. However, I think their hardware is a good example of what's possible.
              I think the industry needs to benchmark their hardware better to really see what is possible. I'm not convinced that Apple should be looked up to when it comes to good CPU performance. Cinebench is still a bad benchmark, but things like DaVinci Resolve and games are good examples of real use cases.

              Depends on what for. Lunar Lake is held back by 8 cores / 8 threads. So, for a software development machine, I'd probably also prefer a HX 370. However, if I'm mainly using it for video calls, productivity apps, web, and remote access, then I'd go with Lunar Lake for sure.
              I don't like the idea of limiting what I can do with my hardware. The only reason to go with Lunar Lake is better power efficiency, and I don't often find myself that far away from an outlet when using a laptop.
              Where the fuck did you get that number? Even just the compute tile of Lunar Lake is 140 mm^2! The I/O die adds another 80 mm^2, for a total of 219.7 mm^2.
              Yeah, you're right, it's not 100 mm². I don't think it's 219.7 either; this is all from the Lunar Lake wiki, and adding the two tile figures below gives roughly 186 mm² (quick sum after the list). As for the M3, it might be closer to 150 mm². I initially went with Google AI search results and they weren't accurate.
              • TSMC N3B tile: 140 mm²
              • TSMC N6 tile: 46 mm²
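
              Quick sum of those two tiles (the compute/platform labels are my reading of which tile is which, not something those two wiki figures spell out):

```python
# Sum of the two Lunar Lake tile areas listed above. Packaging and filler
# silicon are excluded, which may be where the larger ~220 mm² figure comes from.
compute_tile_mm2 = 140    # TSMC N3B tile, as listed
platform_tile_mm2 = 46    # TSMC N6 tile, as listed
print(compute_tile_mm2 + platform_tile_mm2)   # -> 186
```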
              That's also a misleading comparison, because of things like iGPUs and NPUs dominating the dies (or the M3's at least) and we're just talking about CPU cores here, not GPU. If you compare the CPU cores, the M3's P-core is 2.49 mm^2, while Lunar Lake's is 4.53 mm^2. That puts the x86 core at 81.9% bigger!
              Intel's x86 core is 82% bigger, but not AMD's. AMD is on 4nm, and even their Zen 5c cores aren't entirely cut down either. AMD is even putting V-Cache in their X3D chips, which increases the size massively but doesn't show any performance increase in Cinebench. It does show up in games and certain applications. AMD also supports AVX-512 in all their cores, while Apple's M4s aren't even using SVE, let alone SVE2. But hey, at least Apple took the time to include SME, which Geekbench happily tested and which gave about 400 points towards Apple's M4 scores. Sometimes comparing cores isn't exactly useful when different cores have different use cases. AMD and Intel will pump in a lot of cache because people do play games on these CPUs and cache really benefits games. AVX-512 also benefits games... more specifically, emulators.
              I don't care about Apple, as a company, or even ARM, for that matter. I'm just interested in Apple as an example of what's technically possible and I'm interested in ARM because it has the maturity to be a practical alternative to x86. Even if there are a few benchmarks Michael can cherry-pick that run better on x86, it doesn't negate the fact that most software and tools have excellent ARM support, following from the decades of work that's gone into the mobile and then server software ecosystems on ARM.
              I don't think Michael is cherry-picking anything. I think he just ran his usual tests and that's it. The problem with x86 is Windows, as it really sucks for performance. Even though he tested Ubuntu 24.04 with a 6.10 kernel, it's going to perform a lot faster than Windows, especially Windows 11. This is why his tests were shocking: not many people expected that difference, when reviewers running Cinebench and Geekbench on Windows 11 laptops were getting vastly different results. Keep in mind that distros like CachyOS, which cater to V3/V4 plus optimized kernels, may widen that gap further. Windows has been a sh*tshow for AMD and Intel recently; they both keep finding hidden performance from stupid things like an admin account and a patch that was delayed. The CPU performance is still faster on Linux. Like I said, DaVinci Resolve was 15% faster on Linux... from 5 years ago. That could be the difference between Apple being faster or AMD/Intel being faster. People really need to do more tests.
              Last edited by Dukenukemx; 30 November 2024, 09:49 AM.

              Comment

              • dfyt
                Phoronix Member
                • Oct 2014
                • 106

                #27
                I was gifted a MacBook Pro M3 Max two weeks ago. When it comes to performance, what people ignore is the horrendous throttling on it. I was running an AV1 encode and it sits at 118C. After about 20 minutes, the speed drops to about 1/3. On the Apple threads they always defend it, saying the Max beats the 9950X. Benchmarks rarely factor in the throttling of real workloads. I so, so, so wish I could run Linux. It's like a glorified paperweight, what with the lack of a numpad, Home and Delete keys, and Type-A USB. My gosh, it sucks.
                Last edited by dfyt; 30 November 2024, 01:12 PM.

                Comment

                • coder
                  Senior Member
                  • Nov 2014
                  • 8920

                  #28
                  Originally posted by Dukenukemx View Post
                  They're both synthetic tests, and that's easy to manipulate.
                  How? There are 3rd party reviewers running these benchmarks, so how is Apple going to manipulate them?

                  Originally posted by Dukenukemx View Post
                  ​AMD's Strix Point were sometimes more power efficient.
                  This is a mismatched comparison, because obviously the thing with more cores & threads is going to be more efficient at scale. If you're comparing two products against each other to understand how things like ISA, manufacturing, and microarchitectural differences affect performance, then you need to use workloads that don't unfairly favor one vs. the other on the basis of thread or core count.

                  This is why Lunar Lake is a perfect point of comparison. When comparing anything else vs. the M3, if the goal is really to support detailed analysis, then the next best option is to use single-threaded benchmarks.

                  Originally posted by Dukenukemx View Post
                  ​​You do know that Apple doesn't cater to the FOSS community either?
                  Yes. I already said I'm not a fan of them, as a company. I never have and never will buy their products, even just to run Linux on them. I don't want any part of that whole ecosystem.

                  Originally posted by Dukenukemx View Post
                  ​​​Yes but, Intel is still a mess. Look at their Arrow Lake chips and how bad the performance is and how little power savings there is compared to 14900K.
                  This is a more interesting subject, for me. Where initial benchmarks showed the weakest results on Arrow Lake, I think it had a lot to do with P-core vs. E-core scheduling. Because it reuses the problematic I/O tile of Meteor Lake, things like memory latency seem to be an issue for it. For these reasons, I plan to skip Arrow Lake.

                  But, this discussion isn't really about Arrow Lake, anyhow.

                  Originally posted by Dukenukemx View Post
                  ​​​​Intel's x86 is 82% bigger, but not AMD.
                  But Zen 5's single-threaded performance generally lags both the M3's and Arrow Lake's.

                  With their P-cores, AMD is more concerned about perf/area and perf/W than Intel. Because Intel has E-cores, they lean harder into just making their P-cores fast, yet they still struggle against Apple.

                  Originally posted by Dukenukemx View Post
                  ​​​​​AMD is even putting in V-Cache in their X3D chips that increases the size massively, but also doesn't show any performance increases in Cinebench. It does show up for games and certain applications.
                  Yeah? That just underscores that you need a diversity of benchmarks to fully characterize a CPU. It doesn't tell us that Cinebench is worthless. If one benchmark could tell us everything about a CPU, then Michael wouldn't need thousands of them in PTS.

                  Originally posted by Dukenukemx View Post
                  ​​​​​​AMD also supports AVX-512 in all their cores, while Apple's M4's aren't even using SVE let alone SVE2.
                  That just makes the M4's vector/FP performance that much more impressive!

                  Originally posted by Dukenukemx View Post
                  ​​​​​​​I don't think Michael is cherry picking anything.
                  Of-fucking-course he is! He knows which tests tend to favor which types of CPUs and he knows how to play to his audience and/or sponsors. He bought this Mac on his own dime, which means he's under no obligation or pressure to show it in a good light. He provides virtually no transparency into his benchmark selection and it varies quite a lot, from one article to the next!

                  Originally posted by Dukenukemx View Post
                  ​​​​​​​​I think he just ran his usual tests and that's it.
                  This just shows you haven't been paying attention.

                  Comment

                  • Dukenukemx
                    Senior Member
                    • Nov 2010
                    • 1392

                    #29
                    Originally posted by coder View Post
                    How? There are 3rd party reviewers running these benchmarks, so how is Apple going to manipulate them?
                    It's not the reviewers so much as companies catering their hardware and software to maximize them. Some reviewers do try to skew results, like one guy who tested some games on the M4 Max vs. a laptop with an RTX 4090 mobile, except that laptop had an Ultra 9 185H, which is not a CPU you'd want to pair with an RTX 4090 mobile. Synthetic tests have historically had manufacturers cater to them as much as possible to make themselves look good. It's much easier to spot this with Qualcomm's Snapdragon X chips, which do perform well in 3DMark but are horrible at 3D outside of it. This is why the PCMR frowns upon synthetic tests: they are easy to, if not cheat at, then at least optimize specifically for.
                    This is a mismatched comparison, because obviously the thing with more cores & threads is going to be more efficient at scale. If you're comparing two products against each other to understand how things like ISA, manufacturing, and microarchitectural differences affect performance, then you need to use workloads that don't unfairly favor one vs. the other on the basis of thread or core count.
                    This is why doing lots of tests will give you a better idea of the performance than Geekbench and Cinebench will. Let's be honest here: the tests done by Michael are more realistic than anything you could get from Geekbench and Cinebench.
                    This is why Lunar Lake is perfect point of comparison. When comparing anything else vs. M3, if the goal is really to support detailed analysis, then the next best option is to use single-threaded benchmarks.
                    The best single-threaded benchmarks are games, as games heavily depend on good single-threaded performance. How many games do you see perform better on M4s, if at all? We'll get back to that later.
                    This is a more interesting subject, for me. Where initial benchmarks showed the weakest results on Arrow Lake, I think it had a lot to do with P-core vs. E-core scheduling. Because it reuses the problematic I/O tile of Meteor Lake, things like memory latency seem to be an issue for it. For these reasons, I plan to skip Arrow Lake.

                    But, this discussion isn't really about Arrow Lake, anyhow.
                    You generally don't see these problems on Linux. This has been the case for both AMD's Zen 5 and now Intel's Arrow Lake. Arrow Lake is particularly bad because Intel didn't save that much power, but at least AMD was able to get a good 30% power savings from Zen 5 compared to Zen 4.

                    But Zen 5's single-threaded performance generally lags both the M3's and Arrow Lake's.
                    Because Cinebench says so? There's a reason why I mentioned that AMD's V-Cache has no benefit in Cinebench. Here we see that the 9800X3D is slower than the 9700X in Cinebench. It makes no sense, since the level 3 cache is meant to boost single-threaded performance. Cinebench is a math-heavy application that can run code in any order, which means branch prediction has not much use here. AMD's 3D V-Cache is meant to boost performance with code that has in-order traversal, you know, if/else, while loops, etc. You put any M4 against AMD's 9800X3D in a game and there's no chance it'll come close in performance. Games love single-threaded performance and Apple's M4s have it in spades, so why is Apple terrible when it comes to gaming? Include something like AVX-512, which no game uses as far as I know but RPCS3 and some other emulators do, and again the M4 can't match the performance. The reason for this is the lack of SVE2. You can pick and choose your battles and will certainly win, but that's the problem with Cinebench. You run a variety of real-world tests and then you can determine the benefits of the hardware.

                    With their P-cores, AMD is more concerned about perf/area and perf/W than Intel. Because Intel has E-cores, they lean harder into just making their P-cores fast, yet they still struggle against Apple.
                    I don't think Intel even knows what they want. For the most part Intel is copying Apple and hoping to beat them at their own game. AMD hasn't gone that route and even the ARM CPU manufacturers are starting to lean away from efficiency cores. I think AMD has the right idea.
                    Yeah? That just underscores that you need a diversity of benchmarks to fully characterize a CPU. It doesn't tell us that Cinebench is worthless. If one benchmark could tell us everything about a CPU, then Michael wouldn't need thousands of them in PTS.
                    Cinebench is worthless because it doesn't take advantage of modern CPU designs like AMD's Zen5. Since tech reviewers are lazy, especially Apple reviewers, they tend to just run it and analyze it for 30 minutes while declaring a winner.
                    That just makes the M4's vector/FP performance that much more impressive!
                    Except when AVX-512 is used; then Apple's vector performance looks terrible. The problem with AVX-512 is that the applications that could benefit from it haven't been updated to do so. Look at FFmpeg, where they gained a 94x speedup after implementing AVX-512. Or just read the blog from the RPCS3 developer where he shows the benefits. This is why distros like CachyOS are using a V4 repository to boost performance, because V4 tries to make use of AVX-512. Like I said, AVX-512 is just hardly used, and this includes games. I don't think there's a single game that uses it.

                    (Chart, from left to right: SSE2, SSE4.1, AVX2/FMA, and Ice Lake-tier AVX-512.)

                    Of-fucking-course he is! He knows which tests tend to favor which types of CPUs and he knows how to play to his audience and/or sponsors. He bought this Mac on his own dime, which means he's under no obligation or pressure to show it in a good light. He provides virtually no transparency into his benchmark selection and it varies quite a lot, from one article to the next!
                    Variety is how you do benchmarks. Would you have preferred that he was sponsored? I personally avoid benchmarks that involve sponsorships. What would you recommend then? Please don't say Cinebench.
                    This just shows you haven't been paying attention.
                    Do me a favor and tell me what I should be paying attention to. Michael even showed power consumption, which heavily favored Apple.
                    Last edited by Dukenukemx; 01 December 2024, 02:36 AM.

                    Comment

                    • coder
                      Senior Member
                      • Nov 2014
                      • 8920

                      #30
                      Originally posted by Dukenukemx View Post
                      It's not the reviewers so much as companies catering their hardware and software to maximize them.
                      I promise you that Apple doesn't give a shit about Cinebench and definitely isn't optimizing for it.

                      Originally posted by Dukenukemx View Post
                      ​It's much easier to spot this with Qualcomm's Snapdragon X chips where they do perform well in 3D Mark, but are horrible at 3D outside of it. This is why the PCMR frowns upon synthetic tests because they are easy to if not cheat then optimize specifically for those tests.
                      But 3DMark uses its own rendering engine, right? Cinebench is not a purpose-built benchmark, but rather a wrapper around the production renderer in Maxon's Cinema4D. I'll bet if 3DMark simply used Unreal Engine or maybe Unity3D, it would correlate better with actual game performance.

                      Originally posted by Dukenukemx View Post
                      This is why doing lots of tests will give you a better idea of the performance than Geekbench and Cinebench will. Let's be honest here: the tests done by Michael are more realistic than anything you could get from Geekbench and Cinebench.
                      It's not that I don't want more tests, but Michael simply isn't running the single-threaded benchmarks that would tell us how the individual cores compare, so we have to live with what we've got. In his M4 mini review, we got only one single-threaded benchmark where it did indeed spank the x86 crew, but someone pointed out that it was a compression benchmark and we can't rule out the possibility that the M4 simply won by virtue of fitting more of the tables in its L2 cache.

                      However, I've found some SPECint2017 rate-1 benchmarks that include a nice diversity of CPUs:
                      Probably the first thing you're going to point out is how the Zen 5 desktop CPUs beat M3 Pro. That's a laptop CPU, however. If you compare it to the HX 370, Zen 5 ain't looking so good.

                      He also went to the trouble of computing perf/MHz, which is a rough estimate of IPC. I think this is highly enlightening:
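
                      As a rough sketch of how such a perf/MHz figure is derived (made-up scores and clocks, not the real SPEC data):

```python
# Rough sketch of the perf/MHz metric mentioned above, used as a crude IPC proxy.
# The scores and clock speeds below are made-up placeholders, not real results.
results = {
    # name: (SPECint2017 rate-1 score, peak single-core clock in MHz)
    "laptop_chip_a":  (9.5, 4050),
    "desktop_chip_b": (10.0, 5700),
}

for name, (score, mhz) in results.items():
    # Higher points-per-GHz suggests more work done per clock (roughly, IPC).
    print(f"{name}: {score / mhz * 1000:.2f} points per GHz")
```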

                      Originally posted by Dukenukemx View Post
                      ​​​The best single threaded benchmarks are games as games heavily depend on good single-threaded performance.
                      No, just because games correlate well with single-thread performance doesn't make them single-threaded benchmarks. All games are multi-threaded, which is the first complication they pose, since that makes them susceptible to sub-optimal scheduling and frequency-scaling issues. The next issue is that they're doing lots of I/O and synchronization with the GPU. None of these are factors you want to deal with, when you're trying to tease out subtle differences between CPU microarchitectures.

                      Originally posted by Dukenukemx View Post
                      Because Cinebench says so? There's a reason why I mentioned that AMD's V-Cache has no benefit in Cinebench. Here we see that the 9800X3D is slower than the 9700X in Cinebench. It makes no sense, since the level 3 cache is meant to boost single-threaded performance.
                      Your understanding is too simplistic. Michael ran plenty of benchmarks on X3D CPUs, and they show a wide diversity of impacts: some of them love the extra L3 cache and others are unaffected by it. The principal factor is how much of the working set can fit in the L3 cache with/without the extra cache die. If the added cache die doesn't make a meaningful difference (either because most of the working set fits in the smaller L3 capacity, or because the working set is so huge that the extra L3 cache hardly makes a dent), then the benchmark is simply going to prefer the CPU with the higher clock speed.
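
                      To spell that reasoning out, here's a toy model (the capacities and the three-way rule are purely illustrative, not a real performance model):

```python
# Toy model of when stacked V-Cache helps, per the working-set argument above.
# Capacities are illustrative round numbers, not a claim about any specific SKU.
L3_BASE_MB = 32       # L3 without the extra cache die
L3_STACKED_MB = 96    # L3 with the extra cache die

def vcache_effect(working_set_mb: float) -> str:
    if working_set_mb <= L3_BASE_MB:
        return "little benefit: it already fit in the base L3"
    if working_set_mb <= L3_STACKED_MB:
        return "big benefit: the extra L3 now holds the working set"
    return "little benefit: still spills to DRAM, so higher clocks win instead"

for ws in (8, 64, 512):
    print(f"{ws:>4} MB working set -> {vcache_effect(ws)}")
```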

                      Originally posted by Dukenukemx View Post
                      ​​​even the ARM CPU manufacturers are starting to lean away from efficiency cores.
                      You mean because they're moving away from including A5xx cores? What happened is basically an inflationary situation. ARM added the X tier of cores, which sit above the A7xx tier. SoC vendors became so hungry for performance that they eagerly embraced them and they became the new P-cores, while the A7xx tier became the new E-core. A5xx is becoming less relevant, because ARM is heavily optimizing them for area and energy efficiency, but that makes them so slow that they become a scheduling hazard for general-purpose workloads. Basically the A5xx cores have inherited the role formerly played by the A3xx tier.

                      Originally posted by Dukenukemx View Post
                      ​​​Cinebench is worthless because it doesn't take advantage of modern CPU designs like AMD's Zen5. Since tech reviewers are lazy, especially Apple reviewers, they tend to just run it and analyze it for 30 minutes while declaring a winner.
                      It's not worthless, because it measures an actual application, which is Cinema4D. It correlates well with the performance of other renderers. Finally, most other programs people use don't employ AVX-512, either.

                      Furthermore, Maxon releases new versions of Cinema4D every couple years. Perhaps the next release will make better use of AVX-512 & AVX10.

                      Originally posted by Dukenukemx View Post
                      Look at FFmpeg, where they gained a 94x speedup after implementing AVX-512.
                      That turned out to be either fraudulent or at least a gross misunderstanding, due to comparing against a baseline with compiler optimizations completely disabled. The code they were tweeting about wasn't even FFmpeg, which is why I say it could've just been a misunderstanding.

                      More to the point, if you just look at the speedup gained by the other hand-coded versions, you can see that most of the benefits are gained simply by going to SSSE3. The actual improvements between AVX2 and AVX-512 were: -3.2%, 15.2%, 96.6%, and 40.2%. Except for the first one, which was actually a regression, those aren't small improvements. However, these were micro-benchmarks that measured basically a single loop. The overall benefit to AV1 decoding performance would be much smaller.
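
                      Here's a rough way to see why a big micro-benchmark win shrinks at the application level, using Amdahl's law with a made-up guess for how much of total decode time that one loop represents:

```python
# Amdahl's-law style estimate: overall speedup when only part of the runtime is
# accelerated. The 10% fraction is invented for illustration, not measured.
def overall_speedup(fraction: float, local_speedup: float) -> float:
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

# Even a 96.6% faster kernel (1.966x) barely moves the needle if it is only
# 10% of the total runtime:
print(f"{overall_speedup(0.10, 1.966):.2f}x overall")   # ~1.05x
```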

                      I'm not saying it's not a good thing. SVE shares many of its key features, like per-lane predication. And yes, neither Snapdragon X nor Apple have SVE, yet. It's a good bet this will change, so if we're talking about ISA differences, I don't put AVX-512 in the "win" column for x86. Plenty of CPUs, like Amazon's Graviton 4, Nvidia's Grace, Google's Axion, not to mention the last couple generations of phone SoCs have cores supporting SVE2.

                      Originally posted by Dukenukemx View Post
                      This is why distros like CachyOS are using a V4 repository to boost performance, because V4 tries to make use of AVX-512. Like I said, AVX-512 is just hardly used and this includes games. I don't think there's a single game that uses it.
                      To get the most benefit from it, code either needs to be written with asm/intrinsics, or it needs to at least be written in a way that's easy for the compiler to vectorize. As most code doesn't meet that standard, its overall benefits aren't large. In specific niches, it's a big win, but I don't have a CPU with AVX-512 and I'm not running out to buy one anytime soon.

                      Originally posted by Dukenukemx View Post
                      Variety is how you do benchmarks. Would you have preferred that he was sponsored? I personally avoid benchmarks that involve sponsorships. What would you recommend then? Please don't say Cinebench.
                      A couple or three years ago, Michael mentioned that it would take over a month to run all of PTS on a high-end CPU, so he has very many benchmarks to choose from. With so many possible benchmarks to run, he does weird things like include a dozen OpenVINO test cases, which really favor AVX-512. Back in the Zen 4 era, I recomputed the geomean in one of his reviews after excluding them, and found that they were significantly skewing it.
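
                      For anyone curious what recomputing the geomean involves, here's a sketch with made-up relative scores (not the actual numbers from that review):

```python
from statistics import geometric_mean

# Made-up relative scores, only to show how excluding a skewed subset of tests
# changes the geometric mean. NOT the actual Zen 4 review data.
scores = {
    "compile":   1.05,
    "compress":  0.98,
    "render":    1.02,
    "openvino1": 1.60,   # AVX-512-heavy outliers
    "openvino2": 1.55,
}

all_tests = list(scores.values())
trimmed = [v for k, v in scores.items() if not k.startswith("openvino")]

print(f"geomean with OpenVINO:    {geometric_mean(all_tests):.3f}")
print(f"geomean without OpenVINO: {geometric_mean(trimmed):.3f}")
```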

                      Now, we can only guess at his methodology for picking which benchmarks to run and how many cases of each, because there's rarely any transparency there. As for sponsorship, most of the hardware used for his benchmarks is donated by the manufacturers. While he sometimes buys laptops, mini-PCs, or GPUs, he cannot possibly afford to buy big server CPUs or systems on his own dime.

                      He's also not transparent about who donates to the site, or how much. So, we can't even say there are no financial ties either directly to the companies or their employees.

                      With all that said, I still appreciate Phoronix. I just mention it because you have to be circumspect and thoughtful about what it's showing.

                      Originally posted by Dukenukemx View Post
                      Do me a favor and tell me what I should be paying attention to?
                      Which benchmarks he runs in which articles. Depending on the focus of the article, sometimes the selection is obvious and logical. Other times, it seems quite a bit more arbitrary and like it could be tilting the scales.

                      Comment
