SiFive HiFive Unmatched Hands-On, Initial RISC-V Performance Benchmarks


  • #61
    Originally posted by brucehoult
    There is NO ONE who claims the FU740 or HiFive Unmatched is a "product". It's a low volume test / demo / prototyping SoC and board.
    Right, just reiterating after some "700 USD for THIS?" comments on this thread. I guess the number of SoCs produced is in the hundreds or thousands, but I'd be surprised if it went much beyond that.



    • #62
      Originally posted by brucehoult

      There are legitimately different ways to design CPU cores, often colloquially known as "speed demon" vs "brainiac". You can have very little processing (gate delays) in each pipe stage and more pipe stages and a high clock rate, or you can have more processing in each pipe stage, fewer stages, and a lower clock rate.
      There are "speed demons" and there are "brainiacs", but in this particular case I guess the U74 doesn't show signs of being targeted at being a "speed demon". Its pipeline is 8 stages long, which is "not very long". However, depending on how complex the individual pipeline stages are, designs might still yield higher or lower frequency results.

      If I read Figure 9 of https://sifive.cdn.prismic.io/sifive...anual_21G2.pdf correctly, the dual-issue capability of the U74 is not completely symmetric: branches, multiplies, divides and floating-point operations execute only on "Pipeline B" (I might be reading too much into that figure). One avenue for more performance might be to relax these issue restrictions in future designs, just as ARM did when going from the A7 to the A53.



      • #63
        Originally posted by SavageX

        There are "speed demons" and there are "brainiacs", but in this particular case I guess the U74 doesn't show signs of being targeted at being a "speed demon". Its pipeline is 8 stages long, which is "not very long". However, depending on how complex the individual pipeline stages are, designs might still yield higher or lower frequency results.
        SiFive claim 2.3GHz on a 7nm process. That's not even half what Zen 3 achieves on 7nm, and well below what Cortex-A53 gets on an old 28nm process. And given the performance results, neither is it a "brainiac" design. So the speed demon/brainiac thing is irrelevant here.

        If I read Figure 9 of https://sifive.cdn.prismic.io/sifive...anual_21G2.pdf correctly, the dual-issue capability of the U74 is not completely symmetric: branches, multiplies, divides and floating-point operations execute only on "Pipeline B" (I might be reading too much into that figure). One avenue for more performance might be to relax these issue restrictions in future designs, just as ARM did when going from the A7 to the A53.
        Branches, multiplies and divisions usually don't appear next to each other. Dual-issue is almost never symmetric, and it doesn't have to be. What matters is that typical code can often dual-issue.

        However, dual-issue is rarely the bottleneck once you look beyond Dhrystone or CoreMark. Branch prediction, caches, TLB handling, prefetchers etc. matter a lot more, particularly on in-order cores.



        • #64
          Originally posted by PerformanceExpert
          SiFive claim 2.3GHz on a 7nm process. That's not even half what Zen 3 achieves on 7nm, and well below what Cortex-A53 gets on an old 28nm process.
          Wow, that's very exciting! Please share which 28nm SoC has A53 running at well above 2.3 GHz -- I'd love to buy one!



          • #65
            Originally posted by brucehoult

            Wow, that's very exciting! Please share which 28nm SoC has A53 running at well above 2.3 GHz -- I'd love to buy one!
            There were lots of cheap Chinese phones and tablets with octa core Cortex-A53. IIRC the fastest A53 on an old process did 2.6 GHz. That's probably about 3GHz on 7nm.



            • #66
              Originally posted by PerformanceExpert
              There were lots of cheap Chinese phones and tablets with octa core Cortex-A53. IIRC the fastest A53 on an old process did 2.6 GHz. That's probably about 3GHz on 7nm.
              Name one, please. I've never heard of an A53 at anything like that clock speed.



              • #67
                Originally posted by brucehoult

                Name one, please. I've never heard of an A53 at anything like that clock speed.
                Here are some you can still buy: https://www.deviceranks.com/en/platf...o-p25-mt6757cd (there seem to be 2 bins of that SoC, some are 2.6GHz, others are 2.5GHz)



                • #68
                  Originally posted by PerformanceExpert
                  However, dual-issue is rarely the bottleneck once you look beyond Dhrystone or CoreMark. Branch prediction, caches, TLB handling, prefetchers etc. matter a lot more, particularly on in-order cores.
                  You're very likely completely right.

                  I tried to come up with a C code example with a considerable number of back-to-back loads/stores and/or branches, but even then there is usually some address calculation (incrementing pointers and/or offsets and such) that can make good use of the dual-issue capability. Integer divisions are usually too rare to make a real difference.

                  So yeah, it should mostly come down to caches, branch-prediction, TLBs and such.
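
As a rough illustration of the kind of loop described above, here is a minimal C sketch. The function name `add_constant` and its signature are hypothetical and do not come from the thread. The comments only mark which operations are independent of each other at the source level; the instructions a compiler actually emits, and whether a given core pairs them, depend on the compiler, its flags and the microarchitecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): dst[i] = src[i] + k for i < n.
 * Unsigned arithmetic is used so that wrap-around is well defined. */
void add_constant(uint32_t *dst, const uint32_t *src, size_t n, uint32_t k)
{
    assert(dst != NULL && src != NULL);

    const uint32_t *end = src + n;
    while (src != end) {        /* compare + branch, once per iteration */
        uint32_t v = *src;      /* load */
        *dst = v + k;           /* add depends on the load; store depends on the add */
        src++;                  /* pointer increment, independent of the loaded value */
        dst++;                  /* pointer increment, independent of the add */
    }
}
```

Even in a loop this dominated by memory accesses, each iteration carries two pointer updates that do not depend on the loaded data, which is the sort of work that could in principle issue alongside the load, store or branch.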



                  • #69
                    Originally posted by rene

                    but I have plenty https://www.youtube.com/watch?v=tFNKXSZGyEo of https://www.youtube.com/watch?v=RXXBFcQ0I-M AMD https://www.youtube.com/watch?v=pC4JCPb4v1Q CPUs! ;-) (and obviously more)

                    It's a bit ironic to start with "not aware, where did you ask", just to follow up with "doubt there is an easy path to funding this sort of work" though :-/ If it helps, I was eyeing a TR 5990x next, maybe AMD has one to spare?

                    PS: This is exactly why I started the YT channel, to have means to finance independent OpenSource work without begging for money for each change and month.
                    If you are interested in getting paid to work on AMD hardware, consider applying to work at AMD. We are hiring.



                    • #70
                      Originally posted by agd5f

                      If you are interested in getting paid to work on AMD hardware, consider applying to work at AMD. We are hiring.
                      Well, having some contribution bounty program is cheaper than hiring. Also, filling out application paperwork likely takes longer than finishing the patch; besides, I have no plan to change my life like this. If I had wanted to work in a big corporation, I would have been working at IBM, Siemens, SAP, Volkswagen, Bosch or Apple for a decade or two already... IMHO there is a huge imbalance between companies profiting from open source and what they give back to individuals in the community who regularly contribute useful work or changes. Instead of hiring a handful of people, IMHO we should also work towards normalizing such a program, similar to what infosec has long established with security bounties. Why don't we simply have this for non-security code work? But if AMD insists, they can make an offer; my email, LinkedIn and such are open ;-)
                      PS: I also finally tracked down and patched why enabling x2apic usually (or always) disables P-states on AMD Zen* this weekend (tested on x570 and x470 w/ 5950x and 3950x, but it probably simply affects most BIOS' ACPI tables ;-): https://www.youtube.com/watch?v=BTI78pG2V80
                      PPS: Did I mention I have a Patreon? https://patreon.com/renerebe
                      Last edited by rene; 27 September 2021, 06:37 PM.

