
Radeon ROCm Updates Documentation Reinforcing Focus On Headless, Non-GUI Workloads


  • Originally posted by Qaridarium
    Intel already supports WebGPU... as a standard...
    I don't know what that means, because no such standard exists. Right now, there is only a working group.

    Here is their page: https://www.w3.org/community/gpu/

    They have not released anything.

    Please keep the discussion grounded in what's real and exists -- not in your hopes and dreams.

    Originally posted by Qaridarium
    in reality Khronos failed multiple times to make a "standard" like OpenCL 2.0...
    What are you even talking about?


    Originally posted by Qaridarium
    if you research any standard made by Khronos then you will discover that they failed multiple times.
    Whether it comes from Khronos or some other body, being a standard doesn't by itself guarantee success. Nobody would argue otherwise!

    However, when something isn't a standard, I'm very unlikely to get alternate implementations from competitors, and that's a requirement for me.

    Originally posted by Qaridarium
    OpenGL for example is a complete failure compared to DirectX... most game developers choose DirectX over OpenGL..
    this means OpenGL as a standard failed to be the standard...
    I can't believe I even have to say this, but you can't judge a standard as a failure just because some similar implementation is more widely used. To judge something as a failure, you have to consider its objectives.

    OpenGL is not only for games. There's a lot of professional software that uses OpenGL, partly because OpenGL is more portable and partly because it has greater accuracy guarantees.

    OpenGL is also extremely widely supported. DirectX exists only on Windows and through emulation.



    • Originally posted by coder
      Last I checked, SYCL is a Khronos standard and HIP is not. Do you really want to play this game?
      Despite what Intel says, oneAPI != SYCL. It's similar to SYCL, but contains a number of Intel-specific changes.



      • Originally posted by Qaridarium
        I am an end-user, and from my point of view everything that is not exactly this:
        Thanks for sharing your perspective and priorities.

        Originally posted by Qaridarium
        this is old-age thinking and today clearly wrong.. there are CPUs with the same power as GPUs...
        like the supercomputer "Fugaku, which is powered by Fujitsu's 48-core Arm-based A64FX system-on-chip, consists of nearly 7.3 million processor cores and reached 415.5 petaflops of performance in the High Performance Linpack benchmark,"
        A new high-performance computing cluster using Fujitsu's Arm-based processors jolts to the top of the world's top 500 supercomputers, according to Top500's latest list.

        this A64FX-based system does not use GPUs and has similar power to a system with GPUs.
        so your argument that CPU support doesn't count is outdated.
        Not really. That's a specialized CPU that no one can buy. It's sort of like the ARM equivalent of the Xeon Phi, and probably also not commercially viable. So, why did they build it? I think the main reason is that Japan wanted to minimize dependence on the IP of others. Also, some people claim they wanted to tackle workloads that are more friendly to CPUs than GPUs.

        However, if we ignore the fact that you can't buy them (and what they'd cost, if you could) and just compare their raw compute stats with GPUs, they have about 3.38 TFLOPS per CPU, compared with 9.70 for Nvidia's A100 or 11.5 for AMD's MI100. And that's just basic fp64-compute, without even beginning to talk about Tensor or Matrix ops, where the A100 and MI100 would truly kick it to the curb.

        So, even a purpose-built HPC CPU cannot truly compete with GPUs. But Xeon Phi pretty much already proved this point.
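
        For anyone who wants to sanity-check those figures, here's a rough back-of-the-envelope sketch in plain C. It just multiplies out the commonly published core count, SVE width, and clock for the A64FX; the GPU numbers are the vendors' quoted peak FP64 rates from above, not anything I've measured.

        /* Peak FP64 estimate: cores x FP64 flops per core per cycle x clock.
         * For A64FX, 2x 512-bit SVE pipes with FMA gives 2 * 8 lanes * 2 flops
         * = 32 FP64 flops per core per cycle. Assumed spec-sheet values. */
        #include <stdio.h>

        static double peak_tflops(double cores, double flops_per_cycle, double ghz)
        {
            return cores * flops_per_cycle * ghz / 1000.0;
        }

        int main(void)
        {
            printf("A64FX (48 cores, 2.2 GHz): ~%.2f TFLOPS FP64\n",
                   peak_tflops(48, 32, 2.2));                      /* ~3.38 */
            printf("A100: ~9.7 TFLOPS FP64, MI100: ~11.5 TFLOPS FP64 (vendor peak)\n");
            return 0;
        }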

        Originally posted by Qaridarium
        we have different opinion what a "standard" is.
        I already explained what I care about and why. I care that I can choose from multiple competing implementations and use them interchangeably. Don't let terminology distract you from that point.

        Originally posted by Qaridarium
        believe it or not, but compute-based SPIR-V and WebGPU will be the norm and the standard in the near future.

        and OpenCL is a waste of time...
        Those are your beliefs and expectations. You might be comfortable basing decisions on them, but I'm not.

        Also, I can write and run OpenCL today, because it exists and there are multiple mature (and not so mature) implementations available to me. That's critical, because I'm working on projects where I need these technologies. Even if you're right, that doesn't help me today.
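
        To be concrete about "exists today", here's a minimal sketch in plain C against the standard OpenCL host API. It just enumerates whatever platforms and devices the installed runtimes expose (ROCm, Intel's runtime, PoCL, etc.); the file name and build line are only examples.

        /* List every OpenCL platform and device the installed runtimes expose.
         * Example build: gcc list_cl.c -lOpenCL */
        #define CL_TARGET_OPENCL_VERSION 120
        #include <stdio.h>
        #include <CL/cl.h>

        int main(void)
        {
            cl_platform_id platforms[8];
            cl_uint nplat = 0;

            if (clGetPlatformIDs(8, platforms, &nplat) != CL_SUCCESS || nplat == 0) {
                fprintf(stderr, "No OpenCL platforms found\n");
                return 1;
            }
            if (nplat > 8) nplat = 8;

            for (cl_uint p = 0; p < nplat; ++p) {
                char pname[256] = {0};
                clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, sizeof(pname), pname, NULL);
                printf("Platform %u: %s\n", p, pname);

                cl_device_id devices[8];
                cl_uint ndev = 0;
                if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 8, devices, &ndev) != CL_SUCCESS)
                    continue;
                if (ndev > 8) ndev = 8;

                for (cl_uint d = 0; d < ndev; ++d) {
                    char dname[256] = {0};
                    clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(dname), dname, NULL);
                    printf("  Device %u: %s\n", d, dname);
                }
            }
            return 0;
        }

        The same source runs unchanged against several independent implementations, which is exactly the property I keep coming back to.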

        Originally posted by Qaridarium
        you can read on wikipedia about WebGPU:
        "WebGPU uses its own shading language called WGSL that is trivially translatable to SPIR-V.[14] This choice is a compromise among three proposals: textual WebMetal by Apple, textual WebHLSL / WSL by Apple Safari, and binary SPIR-V by Mozilla."

        means the high-level part is the same as Apple Metal; in WebGPU it is called WGSL
        the low-level part is the same as Vulkan, meaning SPIR-V

        you can target the low-level part directly.

        this means WebGPU is not something "new", it is just a hybrid of Vulkan and Metal
        Thanks for explaining your claim.

        Something to keep in mind is that APIs like OpenGL or Vulkan are incredibly complex. The actual shading language is just one part of it, and surprisingly not a big point of differentiation between OpenGL and Vulkan.

        Whether the API is high-level or low-level depends on how much it abstracts things like scheduling and resource management. This is where OpenGL and Vulkan differ most, and it's a lot of what makes Vulkan more difficult to use.

        Originally posted by Qaridarium
        sure you can use it, just use WGSL (Metal-based) or target the low-level part, SPIR-V, directly.
        you can use it right now....
        No... I don't think so. Even if I could download all the pieces I'd need and manage to get it all compiled and working:
        1. There'd be lots of bugs. Standards come first, then conformance tests. Without conformance tests, implementations tend to be very buggy.
        2. There would be incompatible changes, forcing developers to update their code.
        3. Performance would be poor. Optimization usually comes last, after the standard is finalized and completely implemented and there are a decent number of tests in place.
        4. The only documentation would be the standards draft itself. Not being final, it might contain errors, and those documents tend to be written for implementers rather than users. If you've ever tried reading the standards document for a programming language, you know it's about the most difficult way to learn the language!

        See what I mean about details? I'd waste literally all of my time struggling with early software, changing APIs, poor performance, and rough/nonexistent documentation. That would be a complete distraction from the real aims of my project.

        In fact, I'm not a very bleeding-edge guy at all. I like using stuff that just works. That way, I can just focus on my goals and how best to use the technology to support them.

        Originally posted by Qaridarium
        this means OpenGL is only a norm and failed to be a standard.
        No, it didn't fail at its goals. It started out as a professional graphics API and still dominates in that market. It wasn't created for gaming, so judging it by the standard of its use in games is missing the point.



        • Originally posted by Qaridarium
          for example a Radeon VII has 3.4 TFLOPS FP64....

          to have the same Radeon VII performance inside of an ARM chip is really amazing.

          many people have a Radeon VII with 3.4 TFLOPS FP64
          compared to this the 3.38 TFLOPS FP64 is very good.
          As we've established, I'm not interested in theoretical products. I'm also not interested in expensive server CPUs, including the expensive servers you'd need to put them in.

          However, I think the Radeon VII is a good point of comparison. Let's set aside that the silicon is really capable of double that and focus just on the compute specs. At $700, it provided that 3.4 TFLOPS of fp64, yet the top-spec EPYC 7xx3-series (which will presumably have a Threadripper counterpart) costs about 11x the price and offers just 2.5 TFLOPS. And you get 16 GB of RAM with the GPU!

          Also, the EPYC's memory bandwidth is only about 205 GB/s, compared with the Radeon VII's 1 TB/s. And we know a 64-core Zen3 Threadripper would have only half that, due to having only 4-channel memory.
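
          A quick sketch of where those bandwidth numbers come from, in plain C. It assumes DDR4-3200 with a 64-bit (8-byte) bus per channel, and the Radeon VII's published 4096-bit HBM2 interface at 2.0 Gbps per pin.

          /* Theoretical memory bandwidth = channels x transfer rate x bus width.
           * Assumed figures, not measurements. */
          #include <stdio.h>

          int main(void)
          {
              double ddr4_channel = 3.2e9 * 8;   /* DDR4-3200: bytes/s per channel */
              printf("8-ch DDR4-3200:  %.1f GB/s\n", 8 * ddr4_channel / 1e9);       /* ~204.8 */
              printf("4-ch DDR4-3200:  %.1f GB/s\n", 4 * ddr4_channel / 1e9);       /* ~102.4 */
              printf("Radeon VII HBM2: %.0f GB/s\n", (4096.0 / 8) * 2.0e9 / 1e9);   /* ~1024  */
              return 0;
          }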

          Next, consider that GPUs have directly addressable memories on-chip that offer far higher bandwidth than that!

          So, it's really no contest. GPUs are still on a completely different level than CPUs. Sure, CPUs will advance, but so will GPUs.

          When I'm going for raw compute, CPUs just aren't interesting to me. I don't consider them a viable alternative to GPUs.



          • Originally posted by Qaridarium
            read my post, I already said that Intel/AMD CPUs are expensive... but compared to this, ARM CPUs tend to be cheaper.
            and that's the point: if you talk about this: "I'm also not interested in expensive server CPUs, including the expensive servers you'd need to put them in." it always means Intel/AMD CPUs...
            Okay, Ampere Altra is like $4k. About the same price as a Threadripper 3990X.

            Originally posted by Qaridarium
            Threadripper Pro also has 8 channels
            Yeah, I thought of that after I posted.

            Originally posted by Qaridarium
            you really suffer from AMD/Intel duopoly syndrome... OpenPOWER/RISC-V and ARM are much cheaper than Intel for the same performance.
            Except we're not talking about x86 as the target -- it's GPUs. And they still have the best performance/$.

            Originally posted by Qaridarium
            Fujitsu's A64FX Processor is a revolutionary processor designed for massively parallel computing and is used in the Fugaku supercomputer.

            Buy the HPE Apollo 80 server system, which offers the opportunity to build HPC clusters with the latest Arm processors. Explore HPE Apollo 80 server system price & quickspecs.


            you can buy an HPE Apollo 80 system, according to their website
            it has 8 A64FX ARM CPUs inside..
            Awesome! I hadn't noticed that.

            Originally posted by Qaridarium
            looks like the AMD/Intel duopoly will soon be over.
            Yeah, when CPUs get launched using ARM's N2 and V1 cores.

            Those Fujitsu CPUs are optimized for special-purpose vector workloads that often run better on GPUs. But we don't need to go through that all over again. Their single-thread performance is not going to compare well with Ampere Altra, except when using SVE.



            • Originally posted by Qaridarium
              and read this:

              <my notification about problem running yesterday's 4.1 release on upstream kernels>

              the Radeon VII is not really supported by ROCm...
              That's a bit dramatic, isn't it? "Found an issue" is hardly the same as "not really supported".



              • Originally posted by Qaridarium
                yes, it is maybe a little "dramatic", but I would feel better if you at AMD start to fix the problems with your newly hired developers.
                to test your releases of ROCm on PRO cards but not on the similar non-pro cards is AMD's fault...
                you do not even need developers for this, some experienced users without developer skills can do the testing as well.
                We did test Radeon VII - it works fine.

                The problem only presents when you combine ROCm userspace AND Radeon VII AND an upstream kernel (or the 4.0 DKMS package). What we failed to do was document that properly to avoid wasting people's time, and I'm trying to get that fixed now.
                Last edited by bridgman; 25 March 2021, 04:50 PM.



                • Originally posted by Qaridarium
                  I think they are cheaper, but only if you price in the motherboard.
                  the 3990X motherboards are expensive, like 700€...
                  and the ARM high-performance chips are all firmly soldered.
                  This is all wrong. Just read the first 2 pages of this review:



                  • Originally posted by Qaridarium
                    ok, you are right, the Ampere Altra has a socket, but you know in the ARM space this is very rare...
                    99% of all ARM chips are soldered to the motherboard without a socket.
                    Server processors probably have sockets for a lot of reasons. Server boards are expensive, they usually support multiple CPUs (which are expensive), and if any CPU or the motherboard breaks, you don't want to throw out the other parts. Also, if only a CPU dies, it takes far less time to swap it out than to replace the whole motherboard. And datacenters need to keep maintenance costs down.

                    What ARM server processors can you point to that are soldered?



                    • Originally posted by Qaridarium
                      I was not pointing at server CPUs only...
                      That's your mistake, then. The context was ARM CPUs that could meaningfully compete with GPUs. So, any other type of ARM CPU would be irrelevant to the discussion. Either you're wrong on the facts or you're wrong to talk about non-server ARM CPUs.

                      Originally posted by Qaridarium
                      I can also say for sure that we will see more and more soldered CPUs, even in the server market
                      I don't care about your predictions. If you can't cite supporting examples, I don't want to hear them.

                      I gave counter-arguments why not to solder them. You completely ignored those points.

                      Originally posted by Qaridarium
                      if you have more and more pins, soldering the CPU is easier than building a socket with 10,000 pins...
                      No trend continues infinitely. At some point, the number of pins will reach a peak. And even soldering the pins doesn't address costs involved with connecting them to stuff and routing all those wires. So, I don't believe we can assume that 10k pin packages will definitely happen.

