Libre-SOC Still Persevering To Be A Hybrid CPU/GPU That's 100% Open-Source


  • #21
    Over the years, good science has been rejected by conferences and publications for not meeting arbitrary standards. It is actually good that the outliers and even the flakes get a chance to put their ideas out there.
    That's true.

    So how long did they have to do this presentation? You can liken this to an Apple event: they only gloss over their processors, but any rational evaluation of those A-series processors shows that they perform well.
    The difference is that I can't perform a rational evaluation of the LibreSoC core, because it doesn't exist. I think there's a difference between trying to convince me to buy a product that already exists, and trying to convince me to invest in a product that may exist in the future. The latter sets a higher bar.

    RISC-V is hardly useful to the "general community" so by that measure the complete project is a failure.
    I'm arguing that proposals to a community-driven project should be quantitatively proven to be useful to the members of that community.

    I also argue that RISC-V has proven to be useful to the general community, given the level of industry involvement, the number of RISC-V chips shipped, and the increasing probability that the average consumer already owns a device containing a RISC-V based chip.

    every single time i did not receive a response.
    Perhaps you did not receive a response because you failed to prove that your proposal would be of value to the community. The first step is to convince people that it's useful. It also seems like this is a problem only you have claimed to perceive.

    does that sound reasonable?
    No. I can agree with your goals, but not your methodology.

    if every entrepreneurial team set a goal that they only knew in advance how to achieve, do you think humanity would be where it is today?
    The difference is that the path to achieving your goal of "build an SoC" is well defined; you just aren't following it. The fact that the LibreSoC project can't even produce performance numbers for a simple core speaks volumes about their competence as a whole. Are you even aware that core RTL design is only a small fraction of the process of building an SoC?



    • #22
      There is no reason to believe that SoC development has to follow any specific model of development.
      There is a reason, though: complex projects tend to follow similar development patterns because those patterns have been proven to work.
      It's like an architect deciding to build a skyscraper's floors layer by layer, instead of planning the whole structure beforehand, because he thinks it would be more efficient and "agile" that way. While such an approach could theoretically produce a usable building, there's a reason no buildings are constructed this way.



      • #23
        Wow, this project is still going? I guess it's time to go back to their GitHub (https://git.libre-soc.org/). It looks like slightly less of a scam than last time: there is code there. Not very much, and they haven't deleted the riscv folders, or added a Power folder, but still there is actual code this time...

        Python code. That surprises me, because if I go to www.opencores.com to look at a random hardware project, they are all written in Verilog so they can run on FPGAs.

        ...but what do I know about hardware?



        • #24
          Originally posted by Min1123 View Post
          lkcl, I've been following this (as a GPU) for a while, as I've been following the EOMA68 for a while. You once mentioned using multiple lower-speed memory devices to reduce the issues associated with trace length in modern DDRx RAM implementations. Are you still considering this?
          let me think back, that was some time ago. it might have been when i was considering using HyperRAM (JEDEC xSPI). there, because each HyperRAM bus would be 8-bit 300 MHz DDR, the theory goes you just put down 6, 8, 10 or so of them and you'd get the equivalent bandwidth of DDR3 without all the complexities of DDR3 DRAM "training".
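
          as a very rough back-of-the-envelope (the clock, bus width and the DDR3 comparison point here are illustrative assumptions, not measured figures):

          # rough bandwidth estimate for the "many HyperRAM buses" idea above
          BUS_WIDTH_BYTES = 1   # each HyperRAM bus is 8 bits wide
          CLOCK_MHZ = 300       # assumed 300 MHz clock
          DDR_FACTOR = 2        # data transferred on both clock edges

          per_bus_mb_s = BUS_WIDTH_BYTES * CLOCK_MHZ * DDR_FACTOR   # 600 MB/s per bus

          for n_buses in (6, 8, 10):
              print(f"{n_buses} buses: ~{per_bus_mb_s * n_buses / 1000:.1f} GB/s")

          # for comparison, a single 16-bit DDR3-1600 channel peaks at roughly
          # 1600 MT/s * 2 bytes = 3.2 GB/s, so 6-10 HyperRAM buses land in the
          # same ballpark (3.6-6.0 GB/s) without the DDR3 "training" complexity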

          however, the pricing on HyperRAM is so high compared to the capacity that, well, unfortunately, it's not a realistic proposition *unless* you also get your own 1GByte HyperRAM ICs fabbed up (!)

          Also, I didn't see this mentioned, but I was kind of hoping that the SoC here would be multi-core, with one or more cores dedicated to rendering as a GPU and the others being the CPU, potentially with an MMU that would allow memory bank swapping so that the CPU could load up the GPU RAM and swap, or vice versa, to support GPGPU-style operations with a speedup.
          yeah, funnily enough this came up a couple of times when talking with people at XDC2020, the idea being to use a big.little architecture - still SMP (not NUMA) - and have tiny L1 caches for the little cores, on the basis that most Shader programs fit well below 8k.

          then on each little core you could additionally just massively increase the back-end SIMD width (or, in the case of the OoO microarchitecture we're doing just add many more 64-bit SIMD ALUs), and the job's done.

          the other idea was to do SIMT, which is basically where you share one decoder and one I-Cache but have multiple "ALUs", multiple register files and multiple D-Caches, and you "broadcast" the exact same instruction to multiple execution units. i don't really like that because it can't possibly be used for general-purpose execution. the "big.little" architecture, whilst performance would suck if you accidentally allocated a general-purpose program to a little core with only an 8k L1 cache, at least wouldn't break the standard programming model that is expected of CPUs today, and it's a nice architectural compromise.
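
          to give a flavour of what that SIMT "broadcast" looks like in nmigen terms, here's a toy sketch (not from the actual codebase - the class and signal names are made up):

          from nmigen import Elaboratable, Module, Signal

          class SimtBroadcastToy(Elaboratable):
              """toy SIMT illustration: one shared decoded op, several identical
              64-bit lanes, each with its own operands (names entirely made up)"""
              def __init__(self, lanes=4, width=64):
                  self.op = Signal(2)                             # shared decoded opcode
                  self.a = [Signal(width) for _ in range(lanes)]  # per-lane operand A
                  self.b = [Signal(width) for _ in range(lanes)]  # per-lane operand B
                  self.o = [Signal(width) for _ in range(lanes)]  # per-lane result

              def elaborate(self, platform):
                  m = Module()
                  # the *same* op drives every lane: one decode, many ALUs
                  for a, b, o in zip(self.a, self.b, self.o):
                      with m.Switch(self.op):
                          with m.Case(0):
                              m.d.comb += o.eq(a + b)
                          with m.Case(1):
                              m.d.comb += o.eq(a ^ b)
                          with m.Default():
                              m.d.comb += o.eq(a)
                  return m

          in the big.little version, each of those lanes would instead be a full (if small) core with its own decoder and tiny I-Cache, which is why it keeps the normal CPU programming model.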



          • #25
            Originally posted by OneTimeShot View Post
            Wow, this project is still going? I guess it's time to go back to their GitHub (https://git.libre-soc.org/). It looks like slightly less of a scam than last time: there is code there.
            it's 80,000 lines of HDL and unit tests! that's enormous!

            to give you some idea: minerva, which is an rv32 core, is "only" 4,000 lines of python nmigen code, and it took 2 developers *2 years* full-time to write.

            that's a productivity rate of only 2-3 lines of code *per day*!

            by contrast, the average software engineer is quoted at around 150 lines of code per day.

            developing processors is *hard*
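
            quick sanity-check on that figure (the calendar-day basis is an assumption):

            # sanity-check of the "2-3 lines per day" figure above
            # (using calendar days; the exact working pattern is an assumption)
            lines      = 4000        # minerva rv32 core, nmigen
            developers = 2
            years      = 2

            per_dev_day = lines / (developers * years * 365)
            print(round(per_dev_day, 1))   # -> 2.7 lines per developer-day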

            Not very much, and they haven't deleted the riscv folders, or added a Power folder, but still there is actual code this time...
            yeah, given that the whole project is OpenPOWER, we didn't feel a need to separate out different ISAs. and it's partly micro-coded anyway, which is (sort-of) architecturally independent... well, PowerISA has some weirdness, like the CRs and the XER SO, CA/CA32 and OV/OV32 bits... you get the idea.

            the actual OpenPOWER decoder is here: https://git.libre-soc.org/?p=soc.git...ecoder;hb=HEAD

            Python code.
            yeah, i'm a software engineer. we did several months of evaluation - it's outlined in the two talks, and much of the OpenPOWER talk is dedicated to explaining it.

            That surprises me, because if I go to www.opencores.com to look at a random hardware project, they are all written in Verilog so they can run on FPGAs.

            ...but what do I know about hardware?


            errr, i wanted to be able to ping you with the youtube video as officially published by the OpenPOWER Foundation, but they've not done the "processing and uploading" yet. i'm guessing the talks will appear here? https://www.youtube.com/channel/UCNV...s0_Sg/featured

            i wanted to refer you to the talk i gave because i explained the choice of nmigen in one of the slides, and then went over an example of how advantageous it has been, for the PowerISA decoder.

            there, basically, because we are using python to *generate* an Abstract Syntax Tree (rather than, as in MyHDL, "verilog with a python syntax"), you can use 20-year-established CSV file reading modules - which everyone who's ever used python knows how to do in a few lines - then create a class that can take a "column subset", to say "hey, i know you read all these CSV values, but actually i only want the Unit column, the Function column, and maybe the Register RA column, thank you".

            in that way, as i outline in the talk, you can use the *exact same code* to create *subset* HDL decoders. and this turns out to be essential, because we're farming out the instruction to *twelve* different Function Units: if we decoded everything in one single decoder and fanned it out from a central location, that would be a massive TWO HUNDRED wires just for the instruction information - to every single Function Unit!

            think about that for a minute.

            if it was VHDL or Verilog, you'd have to write *TWELVE* separate decoders - all looking nearly identical - because neither Verilog nor VHDL is capable of Object-Orientated conceptualisation.

            start to make sense why we picked nmigen?
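
            to make it concrete, here's a minimal sketch of the idea (the file name, column names and the DecoderSubset class are made up for illustration - the real decoder is at the soc.git link above):

            import csv

            def read_isa_csv(path):
                """read one of the ISA spreadsheet CSV files into a list of dicts"""
                with open(path, newline="") as f:
                    return list(csv.DictReader(f))

            class DecoderSubset:
                """build a decode table that only keeps the columns one
                Function Unit cares about - the same rows drive every FU,
                so twelve near-identical decoders collapse into one class"""
                def __init__(self, rows, columns):
                    self.table = {row["opcode"]: {c: row[c] for c in columns}
                                  for row in rows}

                def decode(self, opcode):
                    return self.table.get(opcode)

            rows = read_isa_csv("major.csv")   # hypothetical file name
            alu_dec  = DecoderSubset(rows, ["unit", "function", "in1"])
            ldst_dec = DecoderSubset(rows, ["unit", "ldst_len"])   # different subset, same code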



            • #26
              Wake me up when there is anything remotely usable. If the ISA and the ALU count are still up for discussion, we must be years away from any demoable result.



              • #27
                Originally posted by TJSER View Post
                The difference is that I can't perform a rational evaluation of the LibreSoC core, because it doesn't exist.
                https://www.youtube.com/channel/UCNV...s0_Sg/featured

                tjser, i appreciate your efforts and challenging questions, and your willingness to raise them publicly. i just... i'm really sorry, there are too many assumptions and incorrect assertions that you've made which, if i were to begin to correct them all, would make both of us look foolish and belligerent. as i have a deadline of the end of October (under 6 weeks) to get the GDS-II files done for the 180nm tape-out, can i leave it with you to do some careful and considered research, and perhaps the next time we encounter each other on a phoronix article it could be a productive and engaging conversation that we both would love and enjoy? sorry to have to write this publicly.



                • #28
                  Originally posted by GruenSein View Post
                  Wake me up when there is anything remotely usable. If the ISA and the ALU count are still up for discussion, we must be years away from any demoable result.
                  gruensein, we're doing "incremental development", going step by step, and one of the first steps, clearly, has to be to actually get standard scalar PowerISA under our belt first. that's been done successfully, "on par" with microwatt, and i got it running on a Versa ECP5 FPGA a couple of weeks ago:

                  https://www.youtube.com/channel/UCNV...s0_Sg/featured

                  however, that's only a 45k-LUT FPGA, which is going to be nowhere near big enough to fit an SMP system into, nor a dual SIMD IEEE754 FP32 pipeline. realistically we'll need at least a 200k-LUT FPGA, and even then it's going to be tight.

                  we'll get there - and there'll be announcements about each milestone that we reach, when we do. there will be plenty more articles, i'm sure.



                  • #29
                    as i have a deadline of the end of October (under 6 weeks) to get the GDS-II files done for the 180nm tape-out,
                    This again demonstrates how unlikely this project is to succeed. The tapeout deadline is within 6 weeks, and you are still writing critical core RTL? If your RTL/infrastructure isn't sufficiently developed at this point to provide quantitative estimates of performance/power/area, then there is practically no way you can tape out a chip that ends up matching expectations.

                    Running a soft-core on an FPGA is trivial, especially when they are reusing most of the LiteX infrastructure.

                    There is such a massive gap between this project's claimed goals (application-class, low-power, VPU/GPU) and what has been accomplished so far (reusing existing projects like Microwatt and LiteX) that I do not see this project ever meeting any of its claimed goals.



                    • #30
                      Originally posted by lkcl View Post

                      it's 80,000 lines of HDL and unit tests! that's enormous!
                      Yeah - I think I found it in soc.git/src/soc/... I can confirm that there is approximately that quantity of Python that autogenerates stuff, a fair proportion of which is original code. Some of the file names match modules that would appear in a CPU.

                      I assume that it will do something, but it is hard to tell what. I couldn't find anything related to DisplayPort or HDMI, or a memory management system, or parallel processing, or instruction queues, or pipeline timers, or task schedulers, or cache retrieval, or vertex processing, or texture compression, or fanout, or Z buffer, or depth processing, or anything else I'd expect to see if I went to a GPU project.

                      Does it connect to a monitor and display a triangle?

