RISC-V Summit 2021 - High Performance Processors, Other Interesting Talks

  • #21
    Originally posted by tuxd3v View Post
    I don't know if the SCR7 is fully out-of-order; I have the idea it is not.

    Who told you about operating systems? This code runs bare-metal on the cores, fixed at one frequency, otherwise the benchmarks have no purpose..
    I'm just looking at what Syntacore are saying in that link. They explicitly mention 2-way decode/dispatch and out-of-order issue of up to 4 micro-ops and in-order completion. That's exactly like a standard out-of-order design. If it isn't then they are deliberately confusing people.

    You cannot run SPEC on bare metal, it only runs on Linux.

    Comment


    • #22
      Originally posted by brucehoult View Post

      When something is designed is irrelevant. A microarchitecture with certain characteristics will have the same performance no matter when it is designed. So A53 from 2012 has similar performance per clock to PowerPC 603 or Pentium MMX from the mid 90s.
      It is very relevant. The fact is that newer designs become better and faster over time. What is more impressive, beating a core designed in 2012 or one in 2021? Do you think it's reasonable if AMD or Intel compared their latest CPU with a 10 year old CPU from their competitor, ignoring all their newer, faster designs?!?

      It's also important to understand that there isn't a single performance number for a particular core. Early A53 designs started out fairly slow; later ones used better processes, higher frequency, more L1/L2 cache and faster DRAM. It is much easier to beat an old and slow A53 implementation like the Pi 3 than it is to beat a newer one. And it's much harder to beat a modern A55, let alone the upcoming A510.

      So yes, age matters a lot.

      Comment


      • #23
        There is a reason why we consider microarchitecture separately from implementation technology. It is precisely so that we can compare an A53 from 2012 fairly against an A53 from 2021 with a different process, MHz, amount of cache etc etc.

        There is also a reason why people still make simple and relatively slow core designs and they are still interesting -- and it's not only because they don't know how to make faster ones. It's because not everything is about speed. Small size and sipping energy is important in many applications.

        That's *why* ARM still puts effort into A55 and A510 when they already have much much faster cores.

        Comment


        • #24
          Originally posted by brucehoult View Post
          There is a reason why we consider microarchitecture separately from implementation technology. It is precisely so that we can compare an A53 from 2012 fairly against an A53 from 2021 with a different process, MHz, amount of cache etc etc.

          There is also a reason why people still make simple and relatively slow core designs and they are still interesting -- and it's not only because they don't know how to make faster ones. It's because not everything is about speed. Small size and sipping energy is important in many applications.

          That's *why* ARM still puts effort into A55 and A510 when they already have much much faster cores.
          That's exactly my point: the microarchitecture has not changed since 2012, while technology has moved on significantly since then. If you designed a small and efficient microarchitecture today, you would not end up with a Cortex-A53 clone. It's not that Cortex-A53 was a bad design at the time, but it is far from state of the art now. That's why Cortex-A55 replaced the A53 years ago (shame nobody told Broadcom...), and why the A510 will replace the A55 in the next few years. Hence matching the performance of an A53 is not a great achievement, just as matching the original Pentium isn't either.

          In general, fairly comparing two microarchitectures with very different memory systems is next to impossible. Devboards can usually change frequency, but making DDR4 memory behave like DDR3, or matching cache size, associativity and replacement policies, is impossible. So you're just guessing whether performance differences are due to the memory system or the microarchitecture (or both). Add in OS updates and compiler changes and you'll understand why benchmarking is so hard and why very few people are able to do it properly.
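          A common way to partially control for clock differences is to normalize scores per MHz (as CoreMark/MHz figures do), though this does nothing to equalize the memory system. A minimal sketch in Python; the scores and frequencies below are invented for illustration, not measured results:

```python
# Normalize raw benchmark scores to per-MHz figures so cores at
# different clock speeds can be compared on roughly IPC-like terms.
# NOTE: the numbers below are made up purely for illustration.
boards = {
    "core_a": {"score": 15000, "mhz": 1200},
    "core_b": {"score": 22000, "mhz": 1800},
}

def per_mhz(score: float, mhz: float) -> float:
    """Benchmark score divided by clock frequency in MHz."""
    return score / mhz

for name, b in boards.items():
    print(f"{name}: {per_mhz(b['score'], b['mhz']):.2f} points/MHz")
```

          Even per-MHz figures remain a rough proxy: memory latency measured in cycles grows with clock frequency, so higher-clocked parts are still penalized on memory-bound code.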

          Comment


          • #25
            Originally posted by Developer12 View Post
            However, "not getting sued for using it" isn't a particularly unique property. SPARC, MIPS, POWER, OpenRISC, and many others also fall into this category.
            Most of those are fairly recent developments, which we can't say would've happened without RISC-V. Of major ISAs in wide use, I think only SPARC was free before RISC-V started gaining traction.

            Originally posted by Developer12 View Post
            The only reason I think companies are hopping on the RISC-V bandwagon is because it's "new" and so they've been infected with the hype bug while also salivating at the prospect of not paying ARM fees. For them, openness isn't a requirement.
            That's hardly fair, and either dishonest or an incredibly lazy analysis. Aside from zero license fees, RISC-V is a buffet-style ISA where you can pick and choose the parts you want. You can also add your own to the mix. And some of its extensions are indeed technically superior to what's found in older ISAs.
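            That "buffet-style" modularity shows up directly in RISC-V ISA naming strings such as rv64imafdc. A rough parser sketch, assuming only an rv32/rv64 base and single-letter extensions (real ISA strings also carry multi-letter Z*/X* extensions and version numbers, ignored here):

```python
# Split a RISC-V ISA string like "rv64imafdc" into its base ISA and
# single-letter extensions. "g" is shorthand for "imafd" (plus Zicsr/
# Zifencei in current specs; ignored here for simplicity).
def parse_isa(isa: str) -> tuple[str, list[str]]:
    isa = isa.lower()
    if not isa.startswith("rv"):
        raise ValueError("not a RISC-V ISA string")
    base = isa[:4]              # e.g. "rv32" or "rv64"
    exts = list(isa[4:].replace("g", "imafd"))
    return base, exts

base, exts = parse_isa("rv64gc")
print(base, exts)  # rv64 ['i', 'm', 'a', 'f', 'd', 'c']
```

            An implementer picks exactly the extensions their market needs (and can add custom ones in the reserved X space), which is what makes the ISA a pick-and-choose buffet rather than a fixed menu.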

            Comment


            • #26
              Originally posted by PerformanceExpert View Post
              I'm just looking at what Syntacore are saying in that link. They explicitly mention 2-way decode/dispatch and out-of-order issue of up to 4 micro-ops and in-order completion. That's exactly like a standard out-of-order design. If it isn't then they are deliberately confusing people.

              You cannot run SPEC on bare metal, it only runs on Linux.
              To be honest, now I don't know anymore..
              Initially I thought it was out-of-order execution with an in-order pipeline (so that on a branch it needs to wait for the condition to resolve before it can branch, stalling the pipeline), but maybe it is out-of-order with an out-of-order pipeline, meaning deep speculation. These are just guesses, because it isn't written anywhere I searched.

              Anyway, even if it is out-of-order with an in-order pipeline, they should at least have made it 3-issue.
              I believe they are not taking advantage of the characteristics the CPU has, but a lot of information is missing.

              They tested against the Cortex-A53 because it is dual-issue; it doesn't make sense to compare with a Cortex-A72, which is 3-issue (out-of-order, with an out-of-order pipeline).
              Now, we don't know what operating system they used. My boards run operating systems built by me, and on 64-bit CPUs they run a 64-bit OS, but here I don't know.
              Debian/Devuan have 64-bit OSes for the RPi 3; you can't say they ran a 32-bit OS on it, because you don't know (me neither).

              Also, the other benchmarks can be run bare-metal, and almost certainly were, because that is the method you should use, with the CPU fixed at a certain frequency.
              They should have gone with a 3-issue in-order design for the SCR5, though.
              From what I see, they just announced the SCR6, which competes with the Cortex-M7 and delivers 5 CoreMark/MHz (so their next top-of-the-line will probably come with a lot of optimizations).

              Anyway, they will release their next iteration in Q1 2022, and it will be quad-issue; from what I understand, they will skip a 3-issue one.
              SiFive released a 3-issue and a quad-issue design (though I don't know if either beats Alibaba's 7.1 CoreMark/MHz core; SiFive at least say the quad-issue one is superior to the Cortex-A77).

              Comment


              • #27
                Originally posted by coder View Post
                Most of those are fairly recent developments, which we can't say would've happened without RISC-V. Of major ISAs in wide use, I think only SPARC was free before RISC-V started gaining traction.
                There are indeed some countries and agencies that went with SPARC.
                The LEON5 is a SPARC V8; now they are developing NOEL-V (an RV64, also an open-source design, with the fast IP closed), currently at ~4.4 CoreMark/MHz in the dual-issue configuration.

                Russia developed a lot for SPARC; they have some SPARC V9 designs of their own (nowadays they are focused on the Elbrus VLIW).
                And probably a lot of other places have or had the same situation.

                In my opinion RISC-V has a big advantage: since its ISA is open, software developed for one processor can run on another, and vice versa, which means a unified way to run software worldwide. That is only possible, of course, because RISC-V gained so much traction that everybody is jumping in.
                Last edited by tuxd3v; 10 December 2021, 08:01 PM.

                Comment


                • #28
                  Originally posted by tuxd3v View Post
                  Russia developed a lot for SPARC, they have some SPARC v9 of there own( nowadays they are focused on Elbrus VLIW )..
                  I got the impression ELBRUS was developed for some special-purpose applications, like realtime signal processing. That's the sort of thing you usually do with VLIW, anyway.

                  Don't they have some recent/current MIPS-based designs?

                  Comment


                  • #29
                    Originally posted by coder View Post
                    I got the impression ELBRUS was developed for some special-purpose applications, like realtime signal processing. That's the sort of thing you usually do with VLIW, anyway.
                    Elbrus had the first superscalar processor in the 1960s, which reached scale production in the mid-70s. I believe it was the first superscalar in the world, at least that I know of.
                    It went through many changes in architecture, ISA, etc.
                    The first versions were used by defence systems, radars and such.

                    But currently it is a little-endian VLIW, usually called e2k; I believe it is at e2k version 5 at present.

                    Well, that's how we in the West see VLIW: for signal processing.
                    They understood it differently. Intel tried to hire the entire team in the 90s (after the fall of the Soviet Union, the famine, etc.), but they only managed to convince the architecture team, not the compiler team. That's why Itanium never achieved the desired performance, though it still managed ~80% of x86 performance under emulation.

                    They believe it is possible to have a compiler that knows enough to parallelize the code at compile time, in such a way that VLIW will shine.
                    There are different schools of thought (in the West today almost everybody believes such a compiler cannot be created, because we saw the fallout of Intel's Itanium), but they still hold that belief, and it is starting to gain traction (they have been improving their compiler a lot). From the benchmarks I have seen against Intel Xeon, Elbrus today already beats the Xeon on some parameters, without doubt, in datacenter workloads. I am talking about servers with the Elbrus-8SV (the V version is an improved design of the Elbrus-8S) at 1.5 GHz; I believe 23 instructions per cycle (they also have designs that do 25, and I also read about e2k v6 maybe supporting 45 instructions per cycle; I don't know if that is true or not).

                    Basically the idea I have is that it can be used as a desktop, but it is financed by the Russian state to be in the datacenter.
                    There are already several hardware variations: desktops and servers of 1, 2 or 4 SMP nodes supporting 8 channels of up to 16 TB of RAM.

                    Since it is not mass-produced, using it as a desktop will cost you both arms, and maybe both legs; that's why I think desktop is not where it will really go.
                    But there are people pairing it with AMD graphics cards and playing games on Linux very well at high FPS, or even running Windows on it (since it also does x86 binary translation, at around 80% performance).

                    They have GNU/Linux ported to Elbrus, and I believe they are starting to port FreeBSD; at least I read that on the FreeBSD forums. The number of hardware variations and the number of companies working around it is starting to be interesting; it is on the brink of mass-scale adoption.

                    The next version to be produced in volume is the Elbrus-16S, which is a quantum leap ahead, working at 2 GHz with 1.5 TFLOPS. This is the big limitation they currently have: AMD and Intel are around 2.3 TFLOPS, and it grows with each iteration. They are catching up in floating-point operations per second.

                    Long story short:
                    Elbrus will be destined for the datacenter, and maybe the military/defence market in the future; it will also have a share of desktops in everything related to the Russian state.
                    The home market for Elbrus will be small due to its price.

                    Originally posted by coder View Post
                    Don't they have some recent/current MIPS-based designs?
                    MCST also develops MIPS-based designs and SPARC V9. I believe they will focus more and more on Elbrus and drop the other two archs (SPARC V9 will probably be the last to go, since it is currently used in big defence systems and such).

                    But they also have Baikal Electronics producing MIPS32r5 and ARM CPUs for light workstations.

                    I think in the future there will be Elbrus in the datacenter and the Russian state, plus a small mixed home market, and RISC-V in the mass home market. That's how I see it (with Syntacore and CloudBear designing CPUs/SoCs for the home market).

                    Comment


                    • #30
                      Originally posted by tuxd3v View Post
                      MCST also develops MIPS-based designs and SPARC V9. I believe they will focus more and more on Elbrus and drop the other two archs (SPARC V9 will probably be the last to go, since it is currently used in big defence systems and such).
                      Note that MCST is literally the "Moscow Centre for SPARC Technologies", which shows its origins, if not the current focus.

                      When I worked at Samsung in Moscow doing compiler and related stuff (2014-2018) a huge number of my colleagues were ex-MCST.

                      Comment
