Think Silicon Shows Off First RISC-V 3D GPU

  • #11
    It kind of looks like the hardware IP is not free, but do we at least have free software drivers?
    I could only find:


    DELIVERABLES, SOFTWARE & INTEGRATION*
    NEOX™ SDK, System Verilog RTL, Integration Tests, LLVM C/C++ compiler, GCC C/C++ compiler. Custom instructions for Computer Graphics, Compute and AI, and user defined extensions. Evaluation on Xilinx SoC FPGA platform and SW Cycle Accurate Simulator. Supported OS: Linux, RTOS, Wear OS.
    * Listed items represent a super-set and are subject to change without further notice.
    Which isn't very reassuring freedom-wise.
    Or not very reassuring in any sense, really. Can someone post a correct, current list and drop the asterisk? With that asterisk in place, they could have added a warp drive and a linear-time NP solver and still claimed to be honest marketing droids.

    • #12
      So... they glued a bunch of small RISC-V CPU cores together and called it a GPU? Yeah, Intel tried that one too.

      • #13
        Originally posted by uid313:
        But is it even possible to design a GPU with competitive performance on the RISC-V architecture?
        Good question.

        Modern GPUs all combine in-order cores with wide SIMD and heavy SMT, so at a superficial level it seems there's no reason you couldn't do the same with RISC-V cores. However, a closer look shows a few more distinguishing characteristics:
        • local SRAM
        • hardware engines: TMUs, ROPs, sometimes Tessellation, and more recently raytracing
        • hardware thread scheduling
        • implementation-specific ISA
        • relaxed cache semantics
        An incomplete list, to be sure, but it gets at some of the key differences between GPU and CPU cores. This implementation seems to tick the first two boxes (at least, to the extent it probably matters for their intended applications) and possibly the third (if only at cluster granularity). The point about an implementation-specific ISA is that it lets GPUs change their instruction set and encoding to suit the needs of a given implementation, reducing the amount of logic needed at the front end of the execution pipeline. Using a standard ISA should add some overhead here.
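
        To make the "wide SIMD + heavy SMT" point a bit more concrete, here's a minimal C sketch of that execution model: a couple of "warps" stepped round-robin, each applying one predicated operation across a vector of lanes under an execution mask. The names and sizes here (LANES, WARPS, the masking scheme) are invented for illustration and say nothing about how NEOX actually schedules its threads.

        ```c
        /* Toy model of the "wide SIMD + heavy SMT" execution style described
         * above.  Purely illustrative; nothing here reflects the NEOX design. */
        #include <stdint.h>
        #include <stdio.h>

        #define LANES 8   /* SIMD width of one "warp" (made-up figure)     */
        #define WARPS 2   /* hardware threads interleaved to hide latency  */

        typedef struct {
            float   reg[LANES];   /* one vector register, one element per lane */
            uint8_t mask;         /* per-lane execution mask (divergence)      */
        } warp_t;

        /* One "instruction": y = (x > 0) ? 2*x : -x, done with lane masking
         * instead of a real branch, the way a GPU core would predicate it.   */
        static void exec_step(warp_t *w)
        {
            for (int lane = 0; lane < LANES; lane++) {
                if (!(w->mask & (1u << lane)))
                    continue;                     /* lane switched off */
                if (w->reg[lane] > 0.0f)
                    w->reg[lane] *= 2.0f;         /* "then" side */
                else
                    w->reg[lane] = -w->reg[lane]; /* "else" side */
            }
        }

        int main(void)
        {
            warp_t warps[WARPS];
            for (int w = 0; w < WARPS; w++) {
                warps[w].mask = 0xFF;             /* all lanes active */
                for (int lane = 0; lane < LANES; lane++)
                    warps[w].reg[lane] = (float)(lane - 3 + w);
            }

            /* Round-robin stepping across warps, standing in for the hardware
             * thread scheduler a real GPU core uses to hide latency. */
            for (int w = 0; w < WARPS; w++)
                exec_step(&warps[w]);

            for (int w = 0; w < WARPS; w++) {
                printf("warp %d:", w);
                for (int lane = 0; lane < LANES; lane++)
                    printf(" %5.1f", warps[w].reg[lane]);
                printf("\n");
            }
            return 0;
        }
        ```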

        Another example of where GPU cores start to resemble DSPs more than CPUs is in how they handle register hazards. For Xe, Intel switched from a CPU-like scoreboarding model, where the hardware automatically blocks a read of a register with a pending write, to one where the compiler must shoulder the burden of ensuring such scheduling constraints are met.
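
        A toy model of that contrast, with the latencies and the tiny instruction sequences invented (this is not a model of Intel's actual mechanism): each register carries a "ready at cycle N" tag, the CPU-style mode stalls a dependent read until the tag expires, and the compiler-scheduled mode has no such check at all, so correctness rests entirely on the compiler having placed enough independent work between the write and the read.

        ```c
        /* Toy scoreboard model for the hazard-handling contrast above.
         * Latencies, register count, and the tiny "programs" are invented. */
        #include <stdio.h>

        #define NREGS 8

        typedef struct { int dst, src, latency; } insn_t;

        /* Issue one instruction per cycle.  With hw_interlock set, reading a
         * register whose result isn't ready yet stalls the pipeline
         * (CPU-style scoreboarding).  Without it nothing stalls, and
         * correctness depends on the compiler's schedule. */
        static long run(const insn_t *prog, int n, int hw_interlock)
        {
            long ready[NREGS] = {0};  /* cycle at which each register is valid */
            long cycle = 0;

            for (int i = 0; i < n; i++) {
                if (hw_interlock && cycle < ready[prog[i].src])
                    cycle = ready[prog[i].src];   /* scoreboard stall */
                ready[prog[i].dst] = cycle + prog[i].latency;
                cycle++;
            }
            return cycle;
        }

        int main(void)
        {
            /* r1 = load (3-cycle latency), followed immediately by a use of r1 */
            const insn_t naive[] = { {1, 0, 3}, {2, 1, 1}, {3, 0, 1}, {4, 0, 1} };
            /* same work, but the "compiler" hoisted the independent ops between */
            const insn_t sched[] = { {1, 0, 3}, {3, 0, 1}, {4, 0, 1}, {2, 1, 1} };

            printf("naive order + hw scoreboard : %ld cycles\n", run(naive, 4, 1));
            printf("compiler-scheduled, no check: %ld cycles\n", run(sched, 4, 0));
            return 0;
        }
        ```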


        I've even heard from someone working at a major GPU vendor that the mere notion of instruction traps is a major imposition that they side-stepped. It requires you to construct the notion of a consistent state, which doesn't naturally exist. If you implement a standard CPU core that should support standard debugging tools, then you can't really get around such requirements.

        In summary, I think GPUs using a standard CPU ISA will never take the crown in perf/area or perf/W. However, it's certainly possible to be well within the same order of magnitude. At that point, other factors could drive adoption.
        Last edited by coder; 22 June 2022, 03:54 AM.

        • #14
          Originally posted by Developer12:
          So... they glued a bunch of small RISC-V CPU cores together and called it a GPU? Yeah, Intel tried that one too.
          If you only read the headline, I wouldn't blame you for assuming that. From what little I've seen, I think their implementation isn't nearly so naive.

          It's worth noting that this is coming from a group with some depth & history in low-powered display processors and tiny GPUs. I think they know well enough what they're doing to have built a viable solution.

          BTW, RISC-V scales down better than x86. They should carry a lot less baggage than what Intel had to deal with.
          Last edited by coder; 22 June 2022, 03:53 AM.

          • #15
            Originally posted by phoron:
            do we at least have free software drivers?
            Good question. It'd be interesting to port something like LavaPipe to this thing.

            Even if the drivers aren't open source, the benefit for customers would be that they can at least use standard programming languages and tools to maintain the GPU firmware (the source for which is surely available, at minimum under contract). That cuts way down on the learning curve for customers who want IP they can modify to suit their particular needs.
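
            Since the cores execute the standard RISC-V ISA and the deliverables list stock GCC and LLVM toolchains, that maintenance could look like ordinary embedded C. Below is a purely hypothetical sketch (every address, register name, and field is invented for illustration, not anything from Think Silicon's SDK); the point is only that it's plain C you could build with an off-the-shelf RISC-V cross compiler.

            ```c
            /* Hypothetical firmware routine for a RISC-V GPU core.  Every MMIO
             * address, register name, and field is invented for illustration;
             * the real NEOX firmware interface is only available with the
             * vendor SDK.  The point is that this is ordinary C, buildable
             * with a stock RISC-V GCC or Clang cross toolchain. */
            #include <stdint.h>

            #define MMIO32(addr)    (*(volatile uint32_t *)(uintptr_t)(addr))
            #define CMD_QUEUE_HEAD  0x40001000u   /* invented addresses */
            #define CMD_QUEUE_TAIL  0x40001004u
            #define CMD_QUEUE_DATA  0x40001008u
            #define IRQ_ACK         0x4000100Cu

            /* Drain pending commands from a host-visible ring buffer, then
             * acknowledge the interrupt: the kind of glue a licensee might
             * adapt to its own SoC. */
            void gpu_service_command_queue(void)
            {
                uint32_t head = MMIO32(CMD_QUEUE_HEAD);
                uint32_t tail = MMIO32(CMD_QUEUE_TAIL);

                while (head != tail) {
                    uint32_t cmd = MMIO32(CMD_QUEUE_DATA);
                    (void)cmd;                    /* real dispatch would go here */
                    head = (head + 1) & 0xFFu;    /* invented ring size of 256   */
                }

                MMIO32(CMD_QUEUE_HEAD) = head;
                MMIO32(IRQ_ACK) = 1u;             /* tell the host we're done */
            }
            ```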

            • #16
              Originally posted by uid313:
              409 GFLOPS, meanwhile an old GeForce GTX 1080 has 8873 GFLOPS and the GeForce RTX 3090 is at 35580 GFLOPS.
              That's almost twice the PlayStation 3 GPU (230 GFLOPS), or a bit less than twice the iPhone 7 GPU (260 GFLOPS). It's also comparable to some older Intel integrated GPUs, which are perfectly capable of lighter work or even some older, lighter games. Were you expecting a gaming-class GPU capable of replacing the newest Nvidia or AMD cards? Clearly it's not a choice for games or heavy workloads, but it's easily capable of many other tasks.
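
              For what it's worth, here are the ratios behind those comparisons, taking the GFLOPS figures quoted in this thread at face value (just a small C check):

              ```c
              /* Ratios between the GFLOPS figures quoted in this thread,
               * taken at face value. */
              #include <stdio.h>

              int main(void)
              {
                  const double neox    = 409.0;    /* GFLOPS, per the article */
                  const double ps3     = 230.0;
                  const double iphone7 = 260.0;
                  const double gtx1080 = 8873.0;
                  const double rtx3090 = 35580.0;

                  printf("NEOX vs PS3 GPU  : %.2fx\n", neox / ps3);      /* ~1.78 */
                  printf("NEOX vs iPhone 7 : %.2fx\n", neox / iphone7);  /* ~1.57 */
                  printf("GTX 1080 vs NEOX : %.1fx\n", gtx1080 / neox);  /* ~21.7 */
                  printf("RTX 3090 vs NEOX : %.1fx\n", rtx3090 / neox);  /* ~87.0 */
                  return 0;
              }
              ```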

              • #17
                Originally posted by -MacNuke-:

                And the Raspberry Pi GPU has 32 GFLOPS. So what is your argument?
                He just wants a Threadripper-level RISC-V processor.

                • #18
                  Originally posted by coder:
                  If you only read the headline, I wouldn't blame you for assuming that. From what little I've seen, I think their implementation isn't nearly so naive.

                  It's worth noting that this is coming from a group with some depth & history in low-powered display processors and tiny GPUs. I think they know well enough what they're doing to have built a viable solution.
                  Gee, maybe you shouldn't assume what I read after only reading the title yourself.

                  Read their post. It's literally a bunch of lightly-modified RISC-V cores in a mesh network-on-chip.

                  • #19
                    Something the article failed to pick up: they're waaaaaaay behind LibreSoC. All they have is a design, a simulator, and an FPGA prototype on their TODO list.

                    Meanwhile LibreSoC has actual test chips on real silicon, going onto dev boards. https://twitter.com/lkcl/status/1539231319166689281

                    • #20
                      Originally posted by coder:
                      Yeah, but it's also a complete joke.
                      Yet it drives a huge, complete ecosystem. And now we're "crying" about something that can be over 10 times faster?
