SiFive Unleashes New 7-Series RISC-V Cores With Better Performance


  • #21
    Originally posted by coder
    Sure, but forgoing OoO isn't the only way to mitigate them.

    IMO, the obvious reason it's in-order is that it wasn't feasible for them to do OoO, either due to schedule or market constraints. It just bugs me to see that spun as the reason. Don't know if it was Michael's doing, or if that was their official line. Either way - not good.
    Well, for now it is the only reliable mitigation, the performance is good enough for a wide variety of applications, and to their benefit it takes less time to design and validate. Seems like a winning combo for consumers either way.

    As for communicating this as a strategic defense against side-channel attacks: that sort of dates back to the confusion when Spectre was first being widely publicized. A bunch of people asked vendors of in-order RISC-V cores whether their cores were vulnerable to such attacks, and the answer was no; the press then took that to the public. Not long after, a number of RISC-V vendors rather publicly started collaborating on mitigating these attacks. The combination of the two has created a meme that vulnerable RISC-V cores should not be deployed, and indeed they have not been. If viable mitigations for OoO cores exist, they will very likely be included in commercially available RISC-V OoO cores as they come out.

    So yes, it's probably an exaggeration, but it isn't untrue, nor is it misleading. And FWIW, these cores are on the upper end of in-order performance, especially at the sorts of frequencies they manage to run at. If you are embedding a core and need this sort of performance profile, and want SMP, and want Linux/FreeBSD/FreeRTOS, and also don't want to be susceptible to the majority of these recently popular side-channel attacks, then these cores could be a very enticing option for you on the merits.
    Last edited by microcode; 03 November 2018, 05:00 PM.

    Comment


    • #22
      Originally posted by ldesnogu
      This is much slower than many ARM CPUs for single-thread tasks, I bet it's around Cortex-A53. And I'm not even sure anyone made a chip out of it.
      As for SMT: per thread, yes, if you divide one heavy thread by 4, obviously each one will have less performance than a full core.
      But it's up to you whether to use SMT, because on MIPS64 you can choose at boot time whether you want SMT with 2 or 4 threads...

      On performance... no way a Cortex-A53 can compete with a MIPS64... even a MIPS32 outsmokes a Cortex-A53 easily.

      On Dhrystone 2 tests:

      The Cortex-A53 gets ~2.3 DMIPS/MHz (which is very poor, even lower than the Cortex-A9).

      To give you an example (Baikal-T1, MIPS32 at 1.2 GHz):
      Tech journalist Igor Oskolkov of 3DNews.ru publication openly tested the first publicly available version of the evaluation board or, as…


      It gets ~3.69 DMIPS/MHz.

      The same goes for CoreMark tests,
      with 5.4 CoreMark/MHz/core.

      They even smoked Intel CPUs in these tests.

      Maybe, and it's a maybe, a Cortex-A57 could be equivalent to a MIPS32 at 1.2 GHz, maybe...
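
For anyone who wants to check the arithmetic, here is a minimal Python sketch that uses only the two Dhrystone figures quoted above (~2.3 DMIPS/MHz for the Cortex-A53 and ~3.69 DMIPS/MHz for the Baikal-T1). The function and constant names are made up for illustration, and nothing here is a new measurement.

```python
def per_clock_ratio(a_dmips_per_mhz, b_dmips_per_mhz):
    """How many times more Dhrystone work core A does per clock than core B."""
    return a_dmips_per_mhz / b_dmips_per_mhz


def total_dmips(dmips_per_mhz, clock_mhz):
    """Absolute Dhrystone throughput of one core at a given clock."""
    return dmips_per_mhz * clock_mhz


# Figures as quoted in the post above (approximate, not re-measured).
BAIKAL_T1_DMIPS_PER_MHZ = 3.69   # MIPS32 core in the Baikal-T1
CORTEX_A53_DMIPS_PER_MHZ = 2.3

if __name__ == "__main__":
    ratio = per_clock_ratio(BAIKAL_T1_DMIPS_PER_MHZ, CORTEX_A53_DMIPS_PER_MHZ)
    print(f"per-clock ratio: {ratio:.2f}x")
    # 1200 MHz is the Baikal-T1 clock quoted above.
    print(f"Baikal-T1 at 1200 MHz: {total_dmips(BAIKAL_T1_DMIPS_PER_MHZ, 1200):.0f} DMIPS/core")
```

A per-clock ratio only turns into delivered performance once clock speeds are compared too, which is why the second helper multiplies by frequency.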

      Comment


      • #23
        If you compare this to Zen or Intel's architectures... it's about 20 years behind. An UltraSPARC II was doing 4-wide instruction issue in 1998, and it was even at a disadvantage there due to its wonky register file... contemporary competitive designs could have done even better.

        Up to 6 integer + 4 FPU/SIMD per cycle with an average of 8 retired per clock from a 192-op queue.


        Last edited by cb88; 03 November 2018, 06:30 PM.

        Comment


        • #24
          Originally posted by microcode
          Seems like a winning combo for consumers either way.
          Lol. Show me a consumer with all in-order cores who feels like they've "won". You're still spinning, which was my core complaint.

          I still doubt they would've gone OoO, even without this excuse. It's not as harsh a criticism as it sounds. I understand the need to set achievable goals and learn to walk before you try to run.

          Comment


          • #25
            Originally posted by cb88
            If you compare this to Zen or Intel's architectures... it's about 20 years behind. An UltraSPARC II was doing 4-wide instruction issue in 1998, and it was even at a disadvantage there due to its wonky register file... contemporary competitive designs could have done even better.

            Up to 6 integer + 4 FPU/SIMD per cycle with an average of 8 retired per clock from a 192-op queue.


            Anyone can do four-wide issue, making it useful is the trick. I'll agree that the designs are not ambitious, but all things considered it's a product which is nearly identical to products ARM considers state-of-the-art in their respective classes (and in less area). Competing directly with Zen is not the ambition of every core designer.

            Comment


            • #26
              Originally posted by microcode

              Anyone can do four-wide issue, making it useful is the trick. I'll agree that the designs are not ambitious, but all things considered it's a product which is nearly identical to products ARM considers state-of-the-art in their respective classes (and in less area). Competing directly with Zen is not the ambition of every core designer.
              I agree. My point was that people shouldn't expect to hook up a GPU, HDD and some USB ports to this and have it look like a workstation when it's an overgrown microcontroller. Even if it did, it would at best look more like a NAS or other small embedded system, perhaps on par with some of the RPis, but otherwise not impressive or that useful performance-wise.

              The Kendryte K210 is pretty cool though... I wouldn't mind having one of those to play with, and that's one of RISC-V's strong points: it's easy to slap all sorts of extensions and accelerators onto it, as it's an open design.
              Last edited by cb88; 04 November 2018, 01:24 AM.

              Comment


              • #27
                Originally posted by tuxd3v

                As for SMT: per thread, yes, if you divide one heavy thread by 4, obviously each one will have less performance than a full core.
                But it's up to you whether to use SMT, because on MIPS64 you can choose at boot time whether you want SMT with 2 or 4 threads...

                On performance... no way a Cortex-A53 can compete with a MIPS64... even a MIPS32 outsmokes a Cortex-A53 easily.

                On Dhrystone 2 tests:

                The Cortex-A53 gets ~2.3 DMIPS/MHz (which is very poor, even lower than the Cortex-A9).

                To give you an example (Baikal-T1, MIPS32 at 1.2 GHz):
                Tech journalist Igor Oskolkov of 3DNews.ru publication openly tested the first publicly available version of the evaluation board or, as…


                It gets ~3.69 DMIPS/MHz.

                The same goes for CoreMark tests,
                with 5.4 CoreMark/MHz/core.

                They even smoked Intel CPUs in these tests.

                Maybe, and it's a maybe, a Cortex-A57 could be equivalent to a MIPS32 at 1.2 GHz, maybe...
                MIPS64 is just the instruction set name. This doesn't mean a lot as far as speed is concerned. Just because some MIPS chips are fast doesn't mean the I6500 is fast, and it certainly isn't fast for single-thread tasks.

                Comment


                • #28
                  Originally posted by ldesnogu
                  MIPS64 is just the instruction set name. This doesn't mean a lot as far as speed is concerned. Just because some MIPS chips are fast doesn't mean the I6500 is fast, and it certainly isn't fast for single-thread tasks.
                  Yes, MIPS64 is the instruction set.

                  But you presented the Cortex-A53 as more performant than a MIPS processor, and I proved otherwise.

                  My statement was that a Baikal-T1 will destroy a Cortex-A53 in performance, and I showed why.
                  It's a lot more performant, per core, than the Cortex-A53.

                  So a thread on the Baikal-T1 should have around 150%, maybe more, of the performance of a Cortex-A53.

                  Each MIPS P5600 multiprocessor core (MIPS32 Release 5, Warrior P-class, 32-bit)
                  is a lot more performant than a Cortex-A53 core.

                  This was proved with the Baikal-T1! At least around 150% of the performance of a Cortex-A53. Also, the Baikal-T1 has 1 MB of shared L2 per 2 cores.
                  It would not be outrageous to say that the Baikal-T1 could achieve 180% of the performance of a Cortex-A53.

                  And the Baikal-T1 is the minimum implementation, because you can choose at least 5-6 cores, from 1.2 GHz to 1.5 GHz.

                  No way a Cortex-A53 can compete with it!
                  The only thing ARM can compete with it on is graphics (because PowerVR is what we know in Linux...).

                  But for headless systems,
                  the Baikal-T1 completely destroys the Cortex-A53; because of that, I stated that maybe a Cortex-A57 can compete with it.

                  My data was based on scientific proof, and not on my opinion only.


                  Now for the MIPS P6600 multiprocessor core (MIPS64 Release 6, Warrior P-class, 64-bit):

                  I don't have an example for that.

                  Anything could be speculated about it,
                  but given the fact that the MIPS32 on the Baikal-T1 is a damn good processor (the best core performance on the market... they even smoked Intel in the tests...),

                  I could speculate that if Baikal implemented it with the same precision... it would be a beast, in the same manner as the Baikal-T1 is for 32 bits.
                  But this is now speculation only...

                  Comment


                  • #29
                    Originally posted by microcode

                    Anyone can do four-wide issue, making it useful is the trick. I'll agree that the designs are not ambitious, but all things considered it's a product which is nearly identical to products ARM considers state-of-the-art in their respective classes (and in less area). Competing directly with Zen is not the ambition of every core designer.
                    Agreed.
                    If you look at the Elbrus3 from 1986,
                    you will find that it was capable of executing 25 instructions per clock cycle.
                    I don't consider ARM state-of-the-art, but when MIPS was down, without financing, ARM grew a lot into what we know today... not on merit, but because MIPS didn't have the cash for R&D.

                    And the kids nowadays don't know the full story: MIPS was always faster than ARM, but ARM won on perf/watt.
                    Now, with the P5600 implementation (Baikal-T1), it was proved that MIPS could also be a competitor in perf/watt, for similar power envelopes.

                    On the VLIW side,
                    SPARC continues to be a good processor. Oracle databases run great on SPARC, as they have been optimising for it!
                    Oracle is trying to convince lots of its database clients to migrate to SPARC
                    with very attractive proposals... and I see a lot of them migrating, which is good for the SPARC team too (but here I am talking about corporate companies dealing with cash business, like banks and so on... a lot of these places were previously running (the great unbreakable) Informix on Power... other places Oracle on amd64).

                    The same thing goes for the Elbrus8SM, or for the next Elbrus16S: if code is optimised for them, they are very performant...

                    Comment


                    • #30
                      Originally posted by tuxd3v

                      Yes, MIPS64 is the instruction set.

                      But you presented the Cortex-A53 as more performant than a MIPS processor, and I proved otherwise.

                      My statement was that a Baikal-T1 will destroy a Cortex-A53 in performance, and I showed why.
                      It's a lot more performant, per core, than the Cortex-A53.

                      So a thread on the Baikal-T1 should have around 150%, maybe more, of the performance of a Cortex-A53.

                      Each MIPS P5600 multiprocessor core (MIPS32 Release 5, Warrior P-class, 32-bit)
                      is a lot more performant than a Cortex-A53 core.

                      This was proved with the Baikal-T1! At least around 150% of the performance of a Cortex-A53. Also, the Baikal-T1 has 1 MB of shared L2 per 2 cores.
                      It would not be outrageous to say that the Baikal-T1 could achieve 180% of the performance of a Cortex-A53.

                      And the Baikal-T1 is the minimum implementation, because you can choose at least 5-6 cores, from 1.2 GHz to 1.5 GHz.

                      No way a Cortex-A53 can compete with it!
                      The only thing ARM can compete with it on is graphics (because PowerVR is what we know in Linux...).

                      But for headless systems,
                      the Baikal-T1 completely destroys the Cortex-A53; because of that, I stated that maybe a Cortex-A57 can compete with it.

                      My data was based on scientific proof, and not on my opinion only.


                      Now for the MIPS P6600 multiprocessor core (MIPS64 Release 6, Warrior P-class, 64-bit):

                      I don't have an example for that.

                      Anything could be speculated about it,
                      but given the fact that the MIPS32 on the Baikal-T1 is a damn good processor (the best core performance on the market... they even smoked Intel in the tests...),

                      I could speculate that if Baikal implemented it with the same precision... it would be a beast, in the same manner as the Baikal-T1 is for 32 bits.
                      But this is now speculation only...
                      On actual posted CoreMark results, the Baikal-T1 has CoreMark/MHz performance below the old 32-bit Cortex-A15, and far below Intel cores. (Even an ancient laptop Sandy Bridge, the i5-2520M, gets 8.507 CM/MHz/core, while the P5600 gets 5.15. The TI AM5728, a Cortex-A15 at 1.5 GHz, gets 5.265 CM/MHz.) That A15 is now several generations old, and has been replaced by the A57, A72, A73, A75, and finally the A76. Each of these generations was a major leap.

                      Additionally, CoreMark seems to have a pattern of favoring lower-clocked cores, I suspect because memory latency is effectively lower, and in my experience CM isn't taken all that seriously for cores beyond MCUs and low-end apps processors. (That said, this is pure anecdote; make of it what you will.)

                      Regardless, none of this applies to the I6500, which is a dual-issue in-order processor, as opposed to the wider, out-of-order P5600, which seems to have been designed to compete with the A15. The I6500 may very well have higher single-thread performance than the A53, but it is likely far, far behind current high-end ARM A-series cores.
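
As a rough sanity check on the per-clock comparison, the sketch below normalizes the three CoreMark/MHz figures quoted in this post against the P5600. It is only illustrative Python: the dictionary and the helper name `relative_to` are made up here, and the numbers are copied from the post rather than re-measured.

```python
# CoreMark/MHz/core figures exactly as quoted in the post (not re-measured).
SCORES = {
    "Intel i5-2520M (Sandy Bridge)": 8.507,
    "TI AM5728 (Cortex-A15 @ 1.5 GHz)": 5.265,
    "Baikal-T1 (MIPS P5600)": 5.15,
}


def relative_to(baseline, scores):
    """Return each per-clock score divided by the baseline core's score."""
    base = scores[baseline]
    return {name: value / base for name, value in scores.items()}


if __name__ == "__main__":
    ratios = relative_to("Baikal-T1 (MIPS P5600)", SCORES)
    # Print the highest per-clock score first.
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {ratio:.2f}x the P5600 per clock")
```

These are per-MHz ratios only; they ignore clock speed, core count and memory system, so they say nothing by themselves about whole-chip performance.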

                      Also, SPARC isn't VLIW like you seem to think.

                      Comment
