**coder** · 11 January 2024, 05:11 AM

Originally posted by S.Pam

I am curious whether we will be able to run normal CPU programs on GPUs when they are compiled through something like this. Some apps would definitely benefit from fast memory and dedicated compute units.

GPUs really depend on good SIMD occupancy for competitive performance. If you're executing mostly scalar code on them, performance won't be competitive. And such wide SIMD has very limited applicability to conventional programs.
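
To put rough numbers on the occupancy point, here is a minimal sketch of a toy model, not a measurement of any real GPU. It assumes a 32-lane SIMD group and counts what fraction of lane-slots do useful work; `lane_utilization` and the three example workloads are hypothetical names made up for illustration.

```python
def lane_utilization(steps):
    """Fraction of lane-slots doing useful work.

    steps: one active mask (list of bools) per issued instruction,
    all of the same width. Toy model only; real hardware is messier.
    """
    if not steps:
        raise ValueError("need at least one step")
    width = len(steps[0])
    active = sum(sum(1 for lane in mask if lane) for mask in steps)
    return active / (len(steps) * width)


WIDTH = 32  # assumed SIMD group width

# Fully data-parallel code: every lane is busy on every instruction.
full = [[True] * WIDTH for _ in range(4)]

# Mostly scalar code: only one lane has real work to do.
scalar = [[True] + [False] * (WIDTH - 1) for _ in range(4)]

# A divergent if/else: both sides run one after the other,
# each with half of the lanes masked off.
divergent = [
    [i % 2 == 0 for i in range(WIDTH)],
    [i % 2 == 1 for i in range(WIDTH)],
]

print(lane_utilization(full), lane_utilization(scalar), lane_utilization(divergent))
```

Under these assumptions, scalar code leaves 31 of 32 lanes idle, which is the gap a conventional program would have to overcome before the GPU's extra bandwidth starts to matter.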

Originally posted by S.Pam

I'd imagine that high-compression-ratio algorithms could benefit, since GPUs have massively fast memory, which is needed to parse lots of data. I.e., 7-Zip with a large block size, compared to the small block sizes of zlib and bzip2.

CPUs have TB/s aggregate bandwidth to their L3 caches. If you use compression levels where the tables can fit in cache, there should be no reason to resort to a GPU.
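
As a rough illustration of why the window or dictionary size matters here, the sketch below compresses data whose only redundancy sits 64 KiB apart, using Python's standard `zlib` and `lzma` modules. The data is synthetic and the variable names are made up for the example. DEFLATE's window is at most 32 KiB, so zlib cannot see the repeats, while LZMA's default preset uses a dictionary of several MiB and can.

```python
import lzma
import random
import zlib

# Synthetic data: one 64 KiB block of pseudo-random bytes, repeated 4 times.
# The only redundancy is at a distance of 64 KiB.
rng = random.Random(0)
chunk = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
data = chunk * 4

zlib_out = zlib.compress(data, 9)  # DEFLATE: window of at most 32 KiB
lzma_out = lzma.compress(data)     # LZMA2, default preset: multi-MiB dictionary

print(len(data), len(zlib_out), len(lzma_out))

# Both round-trip correctly; only the compressed sizes differ.
assert zlib.decompress(zlib_out) == data
assert lzma.decompress(lzma_out) == data
```

This says nothing about CPU versus GPU throughput. It only shows that the larger the match window, the larger the working set the compressor has to search, which is what decides whether the tables still fit in cache.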

Originally posted by S.Pam

Also, apparently something like it already exists, albeit as a CUDA-only thing: https://developer.nvidia.com/nvcomp

Don't go comparing an H100 or A100 to a desktop CPU. Big datacenter GPUs should be compared to big datacenter CPUs. Let's see how much aggregate compression/decompression throughput you get with a 96-core Genoa or 128-core Bergamo.

**coder** · 11 January 2024, 05:13 AM

Originally posted by dev_null

And it's already time for GPUs to be mounted to the system case and to have a slot for the motherboard.

No, because you often have multiple GPUs per CPU.

As linuxgeex said, the CPU is central not because it has the most horsepower, but because you need it to take charge of all the various peripherals and tie everything together.

**dev_null** · 11 January 2024, 09:57 AM

I mostly mean size here. The GPU is heavy and big, and the rest may be much smaller.

**coder** · 11 January 2024, 12:52 PM

Originally posted by dev_null

I mostly mean size here. The GPU is heavy and big, and the rest may be much smaller.

I understand how it can seem that way, but a graphics card mostly has just voltage regulators, the GPU, and some memory chips. Beyond that, there's just the cooling solution.

So, to compare, you should look at how big the motherboard's VRM is, consider how much space its RAM is using, and don't forget to add in its heatsink + fan. Once you do that, the GPU seems a lot less special.

One thing that bugs me about GPUs is when I have to discard old ones, which we sometimes do at my job. I have admired their heatsinks, shroud, and fans that are now e-waste. It's too bad that stuff isn't made to a standard, so it could be swapped over to another GPU.

**S.Pam** · 15 January 2024, 12:07 PM

Originally posted by coder

GPUs really depend on good SIMD occupancy for competitive performance. If you're executing mostly scalar code on them, performance won't be competitive. And such wide SIMD has very limited applicability to conventional programs.

CPUs have TB/s aggregate bandwidth to their L3 caches. If you use compression levels where the tables can fit in cache, there should be no reason to resort to a GPU.

Don't go comparing an H100 or A100 to a desktop CPU. Big datacenter GPUs should be compared to big datacenter CPUs. Let's see how much aggregate compression/decompression throughput you get with a 96-core Genoa or 128-core Bergamo.

It's when the dataset does not fit in L3 that this could be interesting.

A long time ago, I saw someone implement a common zip-like algorithm on a GPU and get a 2-3x performance gain. Sadly, I couldn't find the reference and was left with nvcomp instead, which isn't quite the same.

But you are right about massive parallelism being the key on GPUs.

Even so, a task running on a GPU could be shielded from CPU load, or run with little or no impact on it.

Vcc Announced As The Vulkan Clang Compiler