Announcement

**Hadrian** · 12 March 2018, 06:59 AM

Great news. Thanks to David Airlie and helpers for their good work, and also many thanks to Michael Larabel for reporting all this good news to us Linux users with older AMD graphics cards.

**SteamPunker** · 12 March 2018, 07:52 AM

Great news indeed!

But if the Radeon HD 5000/6000 series GPUs fully support OpenGL 4.3 with the exception of FP64 in hardware, doesn't that mean that in theory, a Vulkan driver could be written for this generation of hardware as well? The Vulkan specifications do not explicitly require hardware FP64 capability, do they? This might be useful for supporting this hardware in the long term. Wasn't support for Compute Shaders and OpenGL ES 3.1 enough to support Vulkan as well?

But regardless of that, impressive work, David Airlie!

**Adarion** · 12 March 2018, 08:28 AM

OMG. Please get this asap to the masses in a Mesa release!

That should be the last item a lot of chips need before the driver can officially advertise OpenGL 4.x support.

**schmidtbag** · 12 March 2018, 08:53 AM

I still don't understand whether the CPU is doing the work or the GPU. If the CPU is doing it, I imagine the performance would be so bad that you'd be better off doing strictly CPU calculations (assuming you're doing OpenCL; I'm not aware of any OpenGL programs that use FP64), because you'd be wasting a lot of time communicating over the PCIe bus. If the GPU itself is emulating FP64, I'm not exactly sure how that's achievable, but I could definitely see the benefit in that.

Does Airlie intend to improve upon OpenCL for R600? Though I certainly appreciate his efforts, I don't quite understand what his plans are. I have this old FirePro card (based on the HD 6670) that I use for BOINC, and pretty much the only workunit within my interest that it can handle without failure is SETI. It'd be great if I didn't have to blacklist it from other projects, but I'm not sure if it can be used with open-source drivers.

**pal666** · 12 March 2018, 09:27 AM

Originally posted by schmidtbag

If the GPU itself is emulating FP64, I'm not exactly sure how that's achievable

just as infinite precision floating point is achievable or floating point on cpus without fpu is achievable

**droste** · 12 March 2018, 09:53 AM

Originally posted by schmidtbag

I still don't understand whether the CPU is doing the work or the GPU. If the CPU is doing it, I imagine the performance would be so bad that you'd be better off doing strictly CPU calculations (assuming you're doing OpenCL; I'm not aware of any OpenGL programs that use FP64), because you'd be wasting a lot of time communicating over the PCIe bus. If the GPU itself is emulating FP64, I'm not exactly sure how that's achievable, but I could definitely see the benefit in that.

Does Airlie intend to improve upon OpenCL for R600? Though I certainly appreciate his efforts, I don't quite understand what his plans are. I have this old FirePro card (based on the HD 6670) that I use for BOINC, and pretty much the only workunit within my interest that it can handle without failure is SETI. It'd be great if I didn't have to blacklist it from other projects, but I'm not sure if it can be used with open-source drivers.

The work is done on the GPU. It's achieved by splitting one 64-bit instruction into multiple 32-bit instructions and combining their results. That's why it's slower: you need to execute 3, 4, 5 or more instructions instead of a single one to get the same result. How many instructions depends on what you want to do, but it's at least 3 (two 32-bit operations plus the combination of their results). So the best case is 3 instructions instead of a single one, and many cases require more than 3.
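To make the "split, operate, combine" idea concrete, here is a minimal sketch in C. It is only an illustration and not Mesa's actual soft FP64 code: the `u32x2` type and all function names are hypothetical, and it assumes an IEEE 754 `double` with the sign bit at the top of the high 32-bit word. It covers only the cheapest cases (sign manipulation, and a 64-bit integer add built from 32-bit adds plus a carry); a full FP64 add or multiply also has to align exponents, normalize and round, which takes many more 32-bit instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type: one 64-bit value held as two 32-bit words. */
typedef struct { uint32_t lo, hi; } u32x2;

/* Reinterpret a double as two 32-bit words (assumes an 8-byte IEEE 754 double). */
static u32x2 split_double(double d) {
    uint64_t bits;
    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&bits, &d, sizeof bits);
    u32x2 r = { (uint32_t)bits, (uint32_t)(bits >> 32) };
    return r;
}

/* Reassemble the two 32-bit words into a double. */
static double join_double(u32x2 v) {
    uint64_t bits = ((uint64_t)v.hi << 32) | v.lo;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* fabs: clear the sign bit, which lives in the high word. */
static u32x2 soft_fabs(u32x2 v) { v.hi &= 0x7FFFFFFFu; return v; }

/* fneg: flip the sign bit in the high word. */
static u32x2 soft_fneg(u32x2 v) { v.hi ^= 0x80000000u; return v; }

/* 64-bit integer add using only 32-bit operations:
 * add the low words, detect the carry, then add the high words plus carry. */
static u32x2 add64(u32x2 a, u32x2 b) {
    u32x2 r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (r.lo < a.lo) ? 1u : 0u; /* unsigned wrap-around means a carry */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

For example, `join_double(soft_fneg(split_double(1.5)))` gives `-1.5` using a single 32-bit XOR, while `add64` already needs a low add, a carry check and a high add, which matches the "at least 3" estimate above.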

It's pretty much like it's described in the article. It's only done because it is needed for claiming OpenGL >= 4.0 support. Nobody uses it in games, so performance doesn't matter, but a lot of games want OpenGL >= 4.0. With this enabled you jump from 3.3 straight to 4.4 support for these cards.

**schmidtbag** · 12 March 2018, 10:07 AM

Originally posted by pal666

just as infinite precision floating point is achievable or floating point on cpus without fpu is achievable

GPUs are structured very differently, so I'm not sure it's that simple. For example, using FP16 on a GPU that doesn't have the hardware to process it will not run better than FP32. There are "half-precision" GPUs out there that are significantly faster with FP16 than with FP32. Meanwhile, CPUs are a lot more dynamic, where they will actually run faster by lowering the precision.
I'm not entirely sure how GPUs are programmed at lower levels, but when you develop a program to be multi-threaded on a CPU, each thread operates independently and inherently does not share memory with other threads (whether they're related or not). You can make them share memory, but it costs performance (I'm assuming this is because the threads have to spend more processing time talking to each other). What I'm getting at is that if a GPU's individual cores are designed to be FP32, each "thread" (if that's the right term) cannot allocate any more memory to be FP64.

**schmidtbag** · 12 March 2018, 10:14 AM

Originally posted by droste

The work is done on the GPU. It's achieved by splitting one 64-bit instruction into multiple 32-bit instructions and combining their results. That's why it's slower: you need to execute 3, 4, 5 or more instructions instead of a single one to get the same result. How many instructions depends on what you want to do, but it's at least 3 (two 32-bit operations plus the combination of their results). So the best case is 3 instructions instead of a single one, and many cases require more than 3.

I had a feeling that they simply combined multiple instructions, but I was confused about how that'd work since each core runs in parallel. But if it takes multiple executions, that makes more sense. Anyway, thanks for the clarification.

It's pretty much like it's described in the article.

I don't see where in the article it answers my question, which is why I asked it. I'm aware of why FP64 is needed, but the article didn't get into detail about how it was done.

**nanonyme** · 12 March 2018, 10:28 AM

Sounds like the time is coming when people can bump the minimum expected GL version from 3.3 to 4.3.

David Airlie Moves Toward Upstreaming Soft FP64 Support In Mesa