Announcement

**Developer12** · 02 December 2021, 11:54 PM

Originally posted by coder View Post

Synthesizing this soft accelerator for a FPGA does. So, you're stuck with a proprietary toolchain, either way.

But performance on a FPGA is bad enough that it's basically pointless. I could get better throughput from simply running the OpenCL code on a modern CPU!

Clearly you're not very good at digital logic design, or you've been fed total bullshit.

It's trivially easy to make a design on an FPGA that far outstrips a CPU, even a small one. All you need to do is design a pipelined core for MD5 or SHA256 or whatever, and tile it across the FPGA making dozens of parallel compute engines.
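A rough way to see why tiling pipelined cores pays off is to count clock cycles. The sketch below is a back-of-the-envelope model, not a description of any real FPGA design: it assumes every core accepts one new input per cycle, and the stage count, clock rate, and core counts are made-up placeholder values chosen only to show how the scaling behaves.

```python
from math import ceil


def cycles_for_batch(items: int, stages: int, cores: int) -> int:
    """Cycles needed to push `items` inputs through `cores` identical pipelines.

    A single pipeline with `stages` stages that accepts one input per cycle
    finishes n inputs in stages + n - 1 cycles (fill latency, then one result
    per cycle). With several pipelines, the busiest one sets the total.
    """
    if items <= 0:
        return 0
    per_core = ceil(items / cores)  # share of the work on the busiest core
    return stages + per_core - 1


def throughput_per_second(items: int, stages: int, cores: int, clock_hz: float) -> float:
    """Average results per second for the whole batch."""
    return items * clock_hz / cycles_for_batch(items, stages, cores)


# Hypothetical figures (assumptions, not measurements of any device).
STAGES = 64          # pipeline depth of one hash core
CLOCK_HZ = 200e6     # fabric clock
BATCH = 1_000_000    # number of inputs to hash

for cores in (1, 8, 32):
    rate = throughput_per_second(BATCH, STAGES, cores, CLOCK_HZ)
    print(f"{cores:>2} cores: {rate / 1e6:.1f} M results/s")
```

For large batches the fill latency disappears into the noise and throughput grows almost linearly with the number of cores, which is the whole argument for tiling.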

OpenCL is larger and more complicated, but not terribly different in principle. Take all the parallel math (e.g. matrix multiplications), break it apart, and tile multipliers and adders across the chip to do as many operations in parallel and in a pipelined manner as possible.
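As a software illustration of "break it apart and tile", here is a minimal blocked matrix multiply. It is only a sketch: the function name `tiled_matmul` and the `tile` parameter are hypothetical, and plain Python runs the tiles one after another. The point is that each output tile depends only on its own rows and columns, so in hardware each one could in principle be handed to its own bank of multipliers and adders.

```python
def tiled_matmul(a, b, tile):
    """Multiply matrices a (n x k) and b (k x m) block by block."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    # Each (i0, j0) pair names one independent output tile.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            # Accumulate partial products along the shared dimension.
            for k0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]  # one multiply-add
                        c[i][j] = acc
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(tiled_matmul(a, b, tile=1))
```

The tile size changes only how the work is grouped, not the result, which is why a synthesis tool (or a designer) is free to pick whatever grouping fits the available multipliers.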

It won't be nearly as fast as a GPU, because the GPU has both more units than even the largest FPGA could fit (extra circuitry is required for reconfigurability) and faster digital logic gates (LUT units are slower). But it will be easily faster than the fastest CPU.

For real-world examples, see Microsoft. They decided that, rather than make a single fixed piece of hardware like Google's TPU, they'd invest in FPGAs. The theory was that AI algorithms would evolve and they could update their designs accordingly. Meanwhile, Google's TPU might become incompatible.

**coder** · 03 December 2021, 12:33 AM

Originally posted by Developer12 View Post

Clearly you're not very good at digital logic design, or you've been fed total bullshit.

It's trivially easy to make a design on an FPGA that far outstrips a CPU, even a small one. All you need to do is design a pipelined core for MD5 or SHA256 or whatever, and tile it across the FPGA making dozens of parallel compute engines.

Um... try rereading my messages. I think we're saying the same thing. I said "performance on a FPGA", meaning the performance of this design, running on a FPGA. Not performance of a FPGA, in general.

If you read the article and my previous messages, I don't see why you'd think I was saying what you seem to think I was saying. Maybe check your internet outrage, before firing off an angry reply.

Originally posted by Developer12 View Post

OpenCL is larger and more complicated, but not terribly different in principle. Take all the parallel math (e.g. matrix multiplications), break it apart, and tile multipliers and adders across the chip to do as many operations in parallel and in a pipelined manner as possible.

That's what I'm saying! Just use Intel's OpenCL toolchain to compile and run your kernel directly for their FPGA!

Originally posted by Developer12 View Post

It won't be nearly as fast as a GPU,

Depends on what the kernel does, but generally true.

Originally posted by Developer12 View Post

But it will be easily faster than the fastest CPU.

Look at the quoted performance numbers, in the article. Building a generic accelerator on top of a FPGA is a waste of perfectly good gates!

**smitty3268** · 03 December 2021, 01:18 AM

Originally posted by coder View Post

Look at the quoted performance numbers, in the article. Building a generic accelerator on top of a FPGA is a waste of perfectly good gates!

Not that I follow this project, but I strongly suspect this is just a proof of concept. It's easier to get it working on actual hardware in an FPGA rather than spending the money for actual dedicated silicon, but surely that's the goal of the project. At least if anyone takes it seriously. I also think this is probably more about the research than about coming up with a product that they'll sell to anyone.

**coder** · 03 December 2021, 01:55 AM

Originally posted by smitty3268 View Post

spending the money for actual dedicated silicon, but surely that's the goal of the project.

According to WorBlux (post #9), this is more of a research project. As such, it's not likely they'll have funds to get it fabbed on a process modern enough to offer significant benefit over these FPGAs. That's the bitch about hardware, and basically why I went into software.

**Developer12** · 03 December 2021, 01:49 PM

Originally posted by coder View Post

Um... try rereading my messages. I think we're saying the same thing. I said "performance on a FPGA", meaning the performance of this design, running on a FPGA. Not performance of a FPGA, in general.

If you read the article and my previous messages, I don't see why you'd think I was saying what you seem to think I was saying. Maybe check your internet outrage, before firing off an angry reply.

That's what I'm saying! Just use Intel's OpenCL toolchain to compile and run your kernel directly for their FPGA!

Depends on what the kernel does, but generally true.

Look at the quoted performance numbers, in the article. Building a generic accelerator on top of a FPGA is a waste of perfectly good gates!

Try reading my messages. The problem with using the Intel toolchain to recompile the design is that now you have to perform that task for EVERY application you want to run. I've used Intel (Altera)'s toolchain and it fucking sucks. It's massive, slow, and very flaky about its dependencies. We hired a whole other company (CMC) whose expertise was in making sure the software would work in our environment.

Implementing a GPU, on the other hand, means that they can use a traditional graphics stack with an OpenCL compiler. Why an FPGA? Because it's much cheaper than going to a fab and spending millions of dollars for a limited run of chips. Lots of products I could name (from fingerprint door locks to the HTC Vive) contain FPGAs for this very reason.

I don't know why the fuck you would even assert that a CPU could be faster than an FPGA implementation. Even a fixed GPU implementation has many, many more parallel execution engines.

**coder** · 04 December 2021, 06:25 AM

Originally posted by Developer12 View Post

Why an FPGA? Because it's much cheaper than going to a fab and spending millions of dollars for a limited run of chips. Lots of products I could name (from fingerprint door locks to the HTC Vive) contain FPGAs for this very reason.

You think I'm arguing against FPGAs, but I'm not. I'm just saying that building a generic compute accelerator on top of them is not a winning proposition, as you can easily see if you compare their quoted specs to those of a modern server or workstation CPU. Honestly, just a fast desktop CPU would leave it in the dust.

Originally posted by Developer12 View Post

I don't know why the fuck you would even assert that a CPU could be faster than an FPGA implementation. Even a fixed GPU implementation has many, many more parallel execution engines.

Again, what you consistently seem to miss is that I'm talking about this design. Go back to the article, look at their quoted specs, and tell me a comparably priced CPU wouldn't run circles around it.

**Developer12** · 05 December 2021, 03:25 PM

Originally posted by coder View Post

You think I'm arguing against FPGAs, but I'm not. I'm just saying that building a generic compute accelerator on top of them is not a winning proposition, as you can easily see if you compare their quoted specs to those of a modern server or workstation CPU. Honestly, just a fast desktop CPU would leave it in the dust.

Again, what you consistently seem to miss is that I'm talking about this design. Go back to the article, look at their quoted specs, and tell me a comparably priced CPU wouldn't run circles around it.

Fuck the design. It's a pain in the ass redesigning, resynthesising, and so on. For a lot of deployments, it's not acceptable to ship all the tools.

Open-Source FPGA-Based RISC-V GPGPU That Supports OpenCL 1.2
