StarFive VisionFive 2 Quad-Core RISC-V Performance Benchmarks

  • #71
    Originally posted by Michael

    It's the kernel image that is the default for the VisionFive 2 default OS image... If a newer version of the StarFive patched kernel was in better shape, I am sure StarFive would have switched to it as their default kernel by now.
    I have no idea why several people have posted almost identical posts saying that the benchmark results are strongly affected by the kernel revision. Do they not know that most BSPs are based on older kernels because they haven't been mainlined? The kernel being old doesn't mean it doesn't have support for the chips in question. These aren't *stock* kernels. They're highly patched kernels designed to support the chips in question. They're often a kernel that was new-ish when the development started, and the only thing they've seen from newer kernels is security patches--and not always those.

    For many SBCs and the SoCs they're based on, if they have any mainline support, it's often limited and leaves out important bits like display controllers, etc. And, while newer kernels are super shiny and cool, they aren't necessarily faster in any meaningful way. They're often slower. They may be better for some workloads, as they may have new features which can allow the system (kernel+user apps) to be faster/better/stronger/more, but that doesn't mean they are generally beneficial for computational user space apps--which almost by definition interact very little with the kernel. If the goal is to get as much work done as possible, it's best not to waste any CPU time in the kernel, is it not?
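For anyone who wants to check the "very little kernel time" point on their own board, here is a rough sketch that splits a workload's CPU time into user and system (kernel) seconds. It assumes a Unix-like system where Python's `resource` module is available; `sum_of_squares` is a made-up stand-in for a compute-bound benchmark, not one of the tests from the article.

```python
import resource


def cpu_time_split(work):
    """Run work() and return (user_seconds, system_seconds) it consumed."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    work()
    after = resource.getrusage(resource.RUSAGE_SELF)
    return (after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


def sum_of_squares(n=200_000):
    """Hypothetical compute-bound workload: pure arithmetic, no I/O."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def kernel_share(user, system):
    """Fraction of CPU time spent in the kernel (0.0 if nothing was measured)."""
    busy = user + system
    return system / busy if busy > 0 else 0.0


if __name__ == "__main__":
    user, system = cpu_time_split(sum_of_squares)
    print(f"user={user:.3f}s system={system:.3f}s "
          f"kernel share={kernel_share(user, system):.1%}")
```

If the kernel share of a benchmark comes out as a small fraction, a different kernel revision has little room to change its score. I/O-heavy or syscall-heavy tests are a different story, and this sketch says nothing about them.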

    What's going to determine performance for most of these apps is the compiler version and whether the app has any optimizations (assembly or otherwise) for the CPU in question. For x86, ASM optimizations are almost a given whenever there's an instruction that will really help (and the compiler can't be trusted to figure it out on its own). For ARM, it's getting pretty likely they'll also have hand-tuned optimizations. But for RISC-V? Nope, not yet. For one, there are too few of the chips in use for most apps to *care*. If ~0% of your user base uses a particular CPU, there's just not going to be much call for people to take the time to optimize for it. Also, given how new RISC-V is, it's hard to target any specific instruction set/extension or implementation for optimizations as it's such a moving target.

    Worse than that, as you can see, most of the cores implemented so far that us little people can get our grubby little hands on aren't super high performance in nature. So, expecting them to be super fast is a silly expectation. All of this will take time to change. It'll also take people who buy these early chips/boards to test software on them and to write those optimizations.

    IIRC, this CPU doesn't support some of the vector extensions (even the preliminary ones), but the GCC mainline people have said that they're not going to support the pre-ratification 0.7.1 version of the vector extension *period*. So, all the cores out there with the 0.7.1 extension will have to use out-of-tree compilers maintained by vendors or other groups. So, don't expect some generic distribution to have a compiler that supports them. Distributions may also be pretty old (as Debian is) and use a compiler that doesn't support any of the other extensions (even ones that are common now).
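To see which extensions a core advertises, one option is to look at the ISA string (on Linux it usually shows up on the `isa` line of `/proc/cpuinfo`). Below is a minimal sketch of parsing such a string. It assumes the common layout of `rv32`/`rv64`, then single-letter extensions, then underscore-separated multi-letter ones, with no version numbers embedded. The example strings are illustrative, not captured from a VisionFive 2, and vendor kernels may format the line differently.

```python
def parse_isa(isa):
    """Split a RISC-V ISA string such as 'rv64imafdc_zicsr' into extension names."""
    isa = isa.strip().lower()
    if not isa.startswith(("rv32", "rv64")):
        raise ValueError(f"not a RISC-V ISA string: {isa!r}")
    base, *multi = isa[4:].split("_")
    exts = set(base)                    # single-letter extensions: i, m, a, f, d, c, v
    exts.update(m for m in multi if m)  # multi-letter extensions: zicsr, zba, ...
    return exts


def has_vector(isa):
    """True only if the full single-letter 'v' extension is listed."""
    return "v" in parse_isa(isa)


if __name__ == "__main__":
    for example in ("rv64imafdc", "rv64imafdcv_zicsr_zifencei"):
        print(example, "-> vector:", has_vector(example))
```

Note that this only reports whether `v` is listed. It makes no attempt to tell a draft 0.7.1 implementation from the ratified extension, which is exactly the distinction that matters for compiler support.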

    So, if you find the performance of this board too low to interest you, move on, this board isn't aimed at you anyway. If you find it low and love coding challenges, ponder getting one and helping out. If you maintain an app and want to look into adding some RISC-V support, ponder getting one as well. If you like bleeding edge stuff and/or just want to support the designers/vendors/etc., then ponder getting one and/or donating money or the board to people who are interested in coding for them. There are plenty of groups around who would welcome support.

    If you just want to grumble and complain, then move along and do it somewhere else. You're being neither insightful nor helpful.
    Last edited by willmore; 18 August 2023, 03:55 PM.

    • #72
      Originally posted by willmore

      I have no idea why I'm rambling but here's my opinion
      You have clearly not read the thread, though; there's a post by user ayumu on page 4 that explains everything you need to know.

      • #73
        Originally posted by citral

        You have clearly not read the thread, though; there's a post by user ayumu on page 4 that explains everything you need to know.
        "You're not just wrong, you're stupid." Reread their post and then reread mine (or read it for the first time for as much as you seem to have understood it).
        Last edited by willmore; 20 August 2023, 11:42 AM.

        • #74
          Originally posted by coder
          Why do the naysayers always seem to exaggerate? If your argument isn't strong enough to sell without exaggerations, then maybe it's just not that strong?

          The Pi 4 launched just over 4 years ago. The process node it used is just over 10 years old.
          BCM2711 was announced in 2019, however that doesn't make it a modern chip. Cortex-A72 was announced early 2015 and available in phones early 2016, so it is 8 years old. It runs at a low frequency and has a tiny cache. Basically it's old IP on a cheap process to create a low-cost chip.

          Orange Pi 5 shows how much gain you get by using a less ancient process and a microarchitecture that is 5 years old. Comparing with the P550 board that was promised this summer (only 1.5 weeks left!) would be very interesting - as these results show, it has a huge gap to overcome.

          • #75
            Originally posted by PerformanceExpert
            BCM2711 was announced in 2019, however that doesn't make it a modern chip. Cortex-A72 was announced early 2015 and available in phones early 2016, so it is 8 years old. It runs at a low frequency and has a tiny cache. Basically it's old IP on a cheap process to create a low-cost chip.
            Yes. I agree with these facts, though brucehoult also has valid points (below) that the A72's design objectives differ markedly from the U74's. Merely comparing the vintage of the designs and process nodes overlooks this important aspect.

            Also, hello! I haven't crossed posts with you, lately. I hope you're well.
            Last edited by coder; 18 August 2023, 10:59 PM.

            • #76
              Originally posted by brucehoult

              I don't know why.

              When mine arrived in early February I downloaded the "Image 55" os snapshot from the manufacturer's site, copied it to a µSD card, inserted it in the board, attached a full HD monitor and kb and mouse and ethernet and a Pi 4 power supply and it booted right up into the login screen. I didn't touch any switches, I didn't update the on-board boot flash. It simply worked.

              Sure, if you want to stay on the bleeding edge of driver and kernel development then -- like any SBC -- there is more fiddling. But not to simply make it work out of the box.
              Yeah, I didn't know why either.

              Ultimately, I tracked a lot of it back to it really not liking my KVM. Not sure why (no other system, whether SBC or otherwise, has given me issues), but with a direct connection to the monitor/keyboard and a clean install it was much better behaved.

              • #77
                Originally posted by PerformanceExpert

                BCM2711 was announced in 2019, however that doesn't make it a modern chip. Cortex-A72 was announced early 2015 and available in phones early 2016, so it is 8 years old. It runs at a low frequency and has a tiny cache. Basically it's old IP on a cheap process to create a low-cost chip.
                Why do you act as if the year of release is relevant? It is not.

                Microarchitecture is relevant: 2-wide decode, 2-wide execute, in-order for the U74 in the VisionFive 2; 3-wide decode, 8-wide execute, Out of Order for the A72.

                Process node is relevant: 28nm for both

                The A72 was designed as a fast but power-hungry "performance" core to pair with the A53 "efficiency" core.

                The A53 (announced in 2012) has a very similar "efficient performance" design to the U74 (announced in 2018), and they would be appropriately compared with each other.

                Arm's A55 (announced in 2017), similarly. It's even more similar to the U74, as both incorporate split early/late ALUs which enable dual-issue to occur more frequently than in earlier dual-issue designs.

                Arm's A520 (announced in 2023) is also very similar.

                Yes! Arm has announced new cores comparable to the U74 not only in 2012 and 2017, but also in 2023! (and the A510 in 2021, but it took a detour -- apparently not as successful as hoped -- into 3-wide in-order).

                The year is irrelevant. The 2-wide in-order efficiency core is a very important technology that gives the most energy-efficient performance available.

                The RISC-V world has cores in the market similar to A72 and A76. People making chips have been able to license them for several years. They simply haven't worked through the 3 1/2 to 4 1/2 years that it takes to go from announcement of the core to a low cost SBC -- A72 to Pi4 and U74 to VisionFive 2 were coincidentally both 4 years and 4 months; February '15 to June '19 and October '18 to February '23.
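The two announcement-to-SBC gaps quoted above are easy to check. A small sketch, using only the month/year pairs given in the post (the helper names are made up for illustration):

```python
def months_between(start, end):
    """Whole months from a (year, month) start to a (year, month) end."""
    (y0, m0), (y1, m1) = start, end
    return (y1 - y0) * 12 + (m1 - m0)


def as_years_months(months):
    """Convert a month count into a (years, months) pair."""
    return divmod(months, 12)


# Dates as stated in the post: core announcement -> low-cost SBC availability.
gaps = {
    "A72 -> Pi 4": months_between((2015, 2), (2019, 6)),
    "U74 -> VisionFive 2": months_between((2018, 10), (2023, 2)),
}

if __name__ == "__main__":
    for name, months in gaps.items():
        years, rem = as_years_months(months)
        print(f"{name}: {years} years, {rem} months")
    # A72 -> Pi 4: 4 years, 4 months
    # U74 -> VisionFive 2: 4 years, 4 months
```

Both come out at 52 months, which matches the "4 years and 4 months" figure.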

                • #78
                  Originally posted by brucehoult
                  The A72 was designed as a fast but power-hungry "performance" core to pair with the A53 "efficiency" core.
                  In fact, it almost seems as if ARM overshot the mark on the A72, because they reverted to 2-wide in the A73.

                  • #79
                    Originally posted by brucehoult
                    Why do you act as if the year of release is relevant? It is not.
                    Age is absolutely relevant because there are many newer/faster/better CPU generations available. CPUs use technology available at the time of their design. Process technology has improved quite dramatically since the first FinFET generation, enabling more complex designs, larger caches and lower power. Microarchitecture is heavily influenced by learning from previous generations as well as technology breakthroughs (examples are TAGE branch prediction, prefetching, OoO implementation techniques etc). Memory bandwidth has been increasing rapidly in recent years. And so on.

                    Newer designs benefit from all these improvements. The difference between the cores in Pi 4 and Pi 5 is 3 years or 3 CPU generations - look at the huge speedup that gives!

                    Microarchitecture is relevant: 2-wide decode, 2-wide execute, in-order for the U74 in the VisionFive 2; 3-wide decode, 8-wide execute, Out of Order for the A72.
                    Microarchitecture is relevant - however age matters as well. Nobody should be surprised that a modern RISC microcontroller outperforms, say, a 486. An in-order Cortex-A510/520 will be close to (or even beat) A72 because the latter is old, fairly limited in terms of OoO depth and doesn't have any of the advancements in the last 8 years.

                    Yes! Arm has announced new cores comparable to the U74 not only in 2012 and 2017, but also in 2023! (and the A510 in 2021, but it took a detour -- apparently not as successful as hoped -- into 3-wide in-order).

                    The year is irrelevant. The 2-wide in-order efficiency core is a very important technology that gives the most energy-efficient performance available.
                    Again nope. 2-way in-order isn't something super special. In-order cores are primarily area efficient and fine for the low end where you don't need high performance. If you want best area efficiency and lowest power, 1-way is the way to go. But guess why there are mid cores in modern phones? Efficiency-optimized OoO cores work out better overall.

                    Also there is about 1.7x performance difference between Cortex-A53 and Cortex-A510/520 (note the latter are 3-way). None of them are comparable with U74 given they are faster, support vector instructions and many other extensions that give significant performance gains.

                    The RISC-V world has cores in the market similar to A72 and A76. People making chips have been able to license them for several years. They simply haven't worked through the 3 1/2 to 4 1/2 years that it takes to go from announcement of the core to a low cost SBC -- A72 to Pi4 and U74 to VisionFive 2 were coincidentally both 4 years and 4 months; February '15 to June '19 and October '18 to February '23.
                    We don't need to wait for cheap boards to finally test these claims. I'm hoping the P550 devboard will still turn up despite Intel cancelling some RISC-V programs. Similarly it would be interesting to see benchmarks on the first official RISC-V vector hardware.

                    • #80
                      Originally posted by coder
                      Why do the naysayers always seem to exaggerate? If your argument isn't strong enough to sell without exaggerations, then maybe it's just not that strong?

                      The Pi 4 launched just over 4 years ago. The process node it used is just over 10 years old.
                      If your complaint is that I'm out by a year or two on a 10-plus year range, I'd say that's the argument that's "just not that strong". The further back in time you go, the fewer bits of precision people tend to have for it.

                      Happily though it's irrelevant anyway, since https://en.wikichip.org/wiki/28_nm_lithography_process will tell you that commercial 28nm production did indeed start in 2011, making this one of the rare occasions that I actually remembered a date correctly.

                      > Its performance is Skylake-caliber, but show us a board that's only slightly more expensive.

                      Not sure I'm understanding this. Do you mean the Alder-N cores are ~= to the "real" Skylake cores (which I agree with), or that this R5 chip is?

                      https://www.aliexpress.us/item/3256804950133483.html (from Google just now) is an N95 NUC for $127 - that's with 8GB of RAM, a 128GB SSD, and a case with a heatsink and fan. Given that the bare board in this article was $115 for an 8GB SoC with no storage, I'd say that easily puts the Alder-N board itself at barely more expensive, and quite possibly not even at all so.

                      > Also, they come rated as low as 6 W, but the quad cores like to turbo up to about 25 W or more.

                      Again, not sure what the "they" is here - I'm guessing from the "25W" part that you mean the Intel cores, but nobody was talking about power consumption until you brought it into the topic just now. Not that it's not a point worth raising, but given the earlier objections it looks more like moving the goalposts.

                      > I think you're in for a few surprises.

                      Possibly, but unless you're actually going to try to address the market share/TAM and development time/cost points I think I'll stick with the "10+ years to be competitive" expectation.

                      Back when we were building stuff around ARM chips in the early days, we called them "washing machine CPUs" - because that's what they were, and that's where R5 is now. These aren't SPARC/Alpha chips trouncing 486s: they're mostly replacing microcontrollers, and to move up from there they need to go through what's now ARM territory, occupied by dirt cheap cores that have years of familiarity, tooling, and other ecosystem advantages; including massive production runs where the licensing cost - which seems to be the only real benefit R5's proponents have to offer - isn't even a rounding error any more.

                      That's going to be a steep uphill struggle, and "It's gonna be awesome bro, trust me" with nothing behind it is as convincing to me as it would be to you, i.e. not at all.
                      Fabbing 20 cores into a 16W package on a grandfather node would be interesting, if they were on par with the Pi4 rather than 2x or more slower like this. Simming that a single core could even get close to a Zen4 core if they were on the same process would be interesting. The previous generation of this SoC had two NN/Tensor cores, and while whether you think those really add any value or not is highly subjective, it was at least unusual. This though, as something with no novel design elements that gets absolutely stomped by a (to use the corrected number) 8-year-old ARM? Not even a little bit, and it doesn't show any trace of "surprising" potential just waiting to happen if only someone would pump a few billion dollars into it either.

                      Doesn't mean interesting things can't be done *with* it, but that's a very different proposition from it being interesting in and of itself.
                      Last edited by arQon; 20 August 2023, 03:58 AM.
