David Airlie Moves Toward Upstreaming Soft FP64 Support In Mesa

  • #11
    Originally posted by schmidtbag View Post
    I had a feeling that they just simply combined multiple instructions, but I was confused how that'd work since each core runs in parallel. But, if it takes multiple executions, that makes more sense. Anyway thanks for the clarification.
    Those multiple instructions are combined within a shader, so each core runs a single floating-point operation that is translated into multiple integer operations.

    They can be parallelized across multiple data (one core per operation), but not on a single datum (multiple cores working on one operation).



    • #12
      Originally posted by nanonyme View Post
      Sounds like the time is coming when people can bump up minimum expected GL version from 3.3 to 4.3.
      https://mesamatrix.net/ says everything is done for 4.4 on r600.

      EDIT: I'm not sure what (if anything) is needed to finish KHR_robustness for 4.5.



      • #13
        Originally posted by DanL View Post

        https://mesamatrix.net/ says everything is done for 4.4 on r600.
        Some r600 cards actually have fp64 in hardware, some don't. So the cards with hardware support have advertised OpenGL 4.4 for quite some time now.



        • #14
          Originally posted by schmidtbag View Post
          I don't see where in the article it answers my question, which is why I asked it. I'm aware of why FP64 is needed, but the article didn't get into detail about how it was done.
          Yeah, sorry, that was not really clear on my part. This part of my answer was for your second thought:
          "Though I certainly appreciate his efforts, I don't quite understand what his plans are."



          • #15
            Originally posted by schmidtbag View Post
            I had a feeling that they just simply combined multiple instructions, but I was confused how that'd work since each core runs in parallel. But, if it takes multiple executions, that makes more sense. Anyway thanks for the clarification.

            I don't see where in the article it answers my question, which is why I asked it. I'm aware of why FP64 is needed, but the article didn't get into detail about how it was done.
            Dave posted the patches. Check out this one, as an example: https://lists.freedesktop.org/archiv...ch/188614.html

            That's an implementation of the multiplication of two 64-bit numbers. If you scroll all the way to the bottom, there's about 150 lines of GLSL code that implement it via 32-bit uint support - pulling out the exponents, signs, etc. into uint variables and then doing bitwise operations, multiplications, etc. on them. It really is exactly the same thing you'd see on a CPU where you needed to generate 32-bit integer assembly code to emulate a 64-bit operation. This fp64 emulation relies heavily on integer support and manual bitwise operations, so it requires GL 3 as a minimum on the GPU.

            I imagine it's extremely slow compared to any kind of native hardware support, but it's still going to be much faster than doing the calculations on the CPU and stalling the entire pipeline.
            Last edited by smitty3268; 13 March 2018, 01:37 AM.



            • #16
              Originally posted by schmidtbag View Post
              GPUs are structured very differently so I'm not sure it's that simple. For example, using FP16 on a GPU that doesn't have the hardware to process that will not run better than FP32.
              same shit with cpus. nothing different here.
              Originally posted by schmidtbag View Post
              There are "half-precision" GPUs out there that are significantly faster with FP16 vs FP32. Meanwhile, CPUs are a lot more dynamic, where they will actually run faster by lowering the precision.
              only in your fantasy world for "non-half-precision" cpus
              Originally posted by schmidtbag View Post
              I'm not entirely sure how GPUs are programmed at lower levels,
              but you still feel the urge to argue
              Originally posted by schmidtbag View Post
              but when you develop a program to be multi-threaded on a CPU, each thread operates independently, and inherently does not share memory with other threads (whether they're related or not). You can make them share memory, but it costs performance (I'm assuming this is because each thread has to spend more processing time talking to each other).
              this has nothing to do with arithmetics
              Originally posted by schmidtbag View Post
              What I'm getting at is if a GPU's individual cores are designed to be FP32, each "thread" (if that's the right term) cannot allocate any more memory to be FP64.
              who is your dealer?



              • #17
                Originally posted by pal666 View Post
                same shit with cpus. nothing different here.
                It seems you still don't understand the point:
                If the hardware isn't there, you can't just magically make things work the way you want, at least not with preferable results. As explained before, a 32-bit processor doing half-precision math on hardware that doesn't support it won't yield a performance improvement. As explained by others in this thread, doing double-precision costs you a minimum of 3 instructions per operation.
                Not a hard concept to grasp.
                but you still feel the urge to argue
                Only you would have the bark and none of the bite to say something like that. As usual, you contribute nothing useful to the conversation. Note how many others who replied to me are getting upvotes (some even by myself), because they're not being petty trolls - they understand the situation and are explaining it, politely. You, meanwhile, don't seem to understand the difference between 32 bit and 64 bit processors.
                this has nothing to do with arithmetics
                First of all, yes, actually, floating point calculations have pretty much everything to do with arithmetics. Second, what does that have to do with the part you quoted? I was talking about how memory sharing works between 2 cores running in parallel. So yes, that has nothing to do with arithmetics, but then your comment might as well have been "the sky is blue".
                Last edited by schmidtbag; 14 March 2018, 08:53 AM.
