LLVM Working On "HIPSPV" So AMD HIP Code Can Turn Into SPIR-V And Run On OpenCL
Originally posted by StillStuckOnSI:
Intel are the only GPU maker actually pushing SPIR-V and OpenCL these days, but they don't have the hardware to back it up.
There's also Ponte Vecchio, though I'm not sure when it's supposed to ship (i.e., outside of the special HPC deployments already in progress).
Originally posted by coder:
I'll bet that was AMD's original goal. However, the Oracle/Google lawsuit then resulted in a ruling that Oracle could indeed copyright the Java API (overturning decades of precedent), and AMD probably then shifted to building a translator instead of simply replicating the CUDA API.
Google took the risky path, and easily took half of the entire mobile OS usage share.
And even after the lawsuit Google still exists.
I wish AMD just took the risk, because only 5% of people would be happy with translation tools, but 95% would be satisfied with a compatibility layer.
Originally posted by tildearrow:
Screw this and the unfair, greedy monopoly NVIDIA has.
Google took the risky path, and easily took half of the entire mobile OS usage share.
And even after the lawsuit Google still exists.
Remember that AMD wasn't in good financial shape at the time HIP was first introduced. They weren't & still aren't Google.
Lastly, what Google did was considered safe at the time they did it -- not risky! Remember the whole "decades of precedent" part?
Originally posted by tildearrow:
I wish AMD just took the risk, because only 5% of people would be happy with translation tools, but 95% would be satisfied with a compatibility layer.
I wish AMD just stuck to OpenCL and maybe focused on providing translation tools & optimized runtime libraries for it.
Originally posted by coder:
Vulkan compute shaders only have precision requirements equivalent to GLSL, which is (in some cases much) lower than OpenCL's.
So, even if you had the tooling to run OpenCL SPIR-V compute kernels on any Vulkan device, the results (often?) would be unsatisfactory if not downright unusable.
Also, the arithmetic operations needed for basically all path-tracing operations are accurate enough under Vulkan's guarantees that I doubt there will be issues -- the core operations needed are add, sub, mul, div, fma, sqrt, and rsqrt, and all of those have very good accuracy requirements in the Vulkan spec. The less accurate operations are the trigonometric and exponential/logarithmic ones, which are rarely used in path tracing (probably only for procedural textures, assuming all transformation matrices are pre-computed by the CPU) and could use a custom, more accurate implementation if necessary, rather than Vulkan's built-in operations.
Last edited by programmerjake; 19 December 2021, 10:48 PM.
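To make the "core operations" point concrete, here is a minimal Python sketch (hypothetical -- not taken from Cycles or any real renderer) of a ray-sphere intersection. Note that it uses only add, sub, mul, and sqrt, exactly the operations with tight accuracy bounds in the Vulkan spec:

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Nearest hit distance t along a unit-length ray, or None on a miss.
    Uses only add/sub/mul/sqrt -- the ops Vulkan bounds tightly."""
    ox, oy, oz = (origin[i] - center[i] for i in range(3))
    dx, dy, dz = direction
    b = ox * dx + oy * dy + oz * dz           # dot(origin - center, dir)
    c = ox * ox + oy * oy + oz * oz - radius * radius
    disc = b * b - c                          # discriminant (a == 1 for unit dir)
    if disc < 0.0:
        return None                           # ray misses the sphere
    t = -b - math.sqrt(disc)                  # nearer of the two roots
    return t if t > 0.0 else None

# Unit ray along +z from the origin toward a sphere at z=5, radius 1:
print(ray_sphere_hit((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # -> 4.0
```

No trig, exp, or log appears anywhere in the hot path; the same holds for ray-triangle tests and BVH traversal.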
Originally posted by programmerjake:
Most desktop GPUs implement higher precision than required by the Vulkan spec; I'm sure it would be relatively easy to create a high-precision Vulkan extension that provides additional guarantees, if needed.
Originally posted by programmerjake:
Also, the arithmetic operations needed for basically all path-tracing operations are accurate enough under Vulkan's guarantees that I doubt there will be issues -- the core operations needed are add, sub, mul, div, fma, sqrt, and rsqrt -- all of those have very good accuracy requirements in the Vulkan spec.
Here's what the specs actually say:
- https://www.khronos.org/registry/vul...sion-operation
- https://www.khronos.org/registry/Ope...-error-as-ulps
Some of the more glaring examples are exp()/exp2() and atan()/atan2()/asin()/acos(). In the latter case, Vulkan allows up to 4096 ULP vs. OpenCL allowing only 6. That's about 683x as much error tolerance! The former is data-dependent, but I think worst-case is 173 vs. 3 ULP or about 58x as much.
So, you can perhaps now appreciate that these aren't simply hand-wavy differences, where one could blindly take CUDA/HIP code and run it under Vulkan with full faith in the accuracy of the results. Hardware that's designed specifically for graphics (and maybe also deep learning) probably doesn't have abundant precision, as that would be wasteful of die area and power.
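The ULP gap is easy to quantify empirically. Here's a small Python sketch (double precision, finite same-sign inputs assumed; the 5-term Taylor series is a deliberately crude stand-in for a low-accuracy exp(), not any real GPU's implementation):

```python
import math
import struct

def ulp_distance(a: float, b: float) -> int:
    """Count of representable doubles between a and b.
    Assumes finite inputs of the same sign."""
    ia = struct.unpack("<q", struct.pack("<d", a))[0]
    ib = struct.unpack("<q", struct.pack("<d", b))[0]
    return abs(ia - ib)

# Adjacent doubles are exactly 1 ULP apart:
print(ulp_distance(1.0, math.nextafter(1.0, 2.0)))  # -> 1

# A truncated 5-term Taylor series for exp() already blows past
# a 4096-ULP budget at x = 0.5:
taylor_exp = lambda x: 1.0 + x + x * x / 2 + x**3 / 6 + x**4 / 24
print(ulp_distance(taylor_exp(0.5), math.exp(0.5)) > 4096)  # -> True
```

The same bit-reinterpretation trick is how conformance suites check an implementation against the spec's ULP tables.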
Originally posted by programmerjake:
The less accurate operations are the trigonometric and exponential/logarithmic operations, which are rarely used in path tracing (probably only for procedural textures, assuming all transformation matrices are pre-computed by the CPU) and could use a custom, more accurate implementation if necessary, rather than using Vulkan's built-in operations.
(The forum doesn't display quotes inside a quote, so I manually adjusted it.)
Originally posted by programmerjake:
Most desktop GPUs implement higher precision than required by the Vulkan spec; I'm sure it would be relatively easy to create a high-precision Vulkan extension that provides additional guarantees, if needed.
Originally posted by coder:
That's a pretty big backtrack. Your original post seemed aimed at running these kernels on virtually all Vulkan-capable GPUs. If you're going to require an extension, how is that really better than simply requiring OpenCL SPIR-V support?
Originally posted by programmerjake:
Also, the arithmetic operations needed for basically all path-tracing operations are accurate enough under Vulkan's guarantees that I doubt there will be issues -- the core operations needed are add, sub, mul, div, fma, sqrt, and rsqrt -- all of those have very good accuracy requirements in the Vulkan spec.
Originally posted by coder:
Depends on what it's for. But we're talking about compute workloads, not graphics, because that's what CUDA & HIP are for.
Originally posted by coder:
Here's what the specs actually say:
- https://www.khronos.org/registry/vul...sion-operation
- https://www.khronos.org/registry/Ope...-error-as-ulps
Some of the more glaring examples are exp()/exp2() and atan()/atan2()/asin()/acos(). In the latter case, Vulkan allows up to 4096 ULP vs. OpenCL allowing only 6. That's about 683x as much error tolerance! The former is data-dependent, but I think worst-case is 173 vs. 3 ULP or about 58x as much.
Originally posted by coder:
So, you can perhaps now appreciate that these aren't simply hand-wavy differences, where one could blindly take CUDA/HIP code and run it under Vulkan with full faith in the accuracy of the results.
Originally posted by coder:
Hardware that's designed specifically for graphics (and maybe also deep learning) probably doesn't have abundant precision, as that would be wasteful of die area and power.
Originally posted by programmerjake:
The less accurate operations are the trigonometric and exponential/logarithmic operations, which are rarely used in path tracing (probably only for procedural textures, assuming all transformation matrices are pre-computed by the CPU) and could use a custom, more accurate implementation if necessary, rather than using Vulkan's built-in operations.
Originally posted by coder:
Or, instead of writing a custom GPU math library (probably with abysmal performance compared to vendor-native implementations), maybe just stick to running GPU compute kernels via GPU compute APIs?
Last edited by programmerjake; 21 December 2021, 12:56 AM.
Originally posted by programmerjake:
Cycles is much more like a graphics workload in its accuracy requirements: last-bit precision is usually not necessary. Vulkan actually provides identical precision to OpenCL for add, sub, mul, div, and rsqrt -- most of the operations I mentioned. Also, I did reread the spec when I wrote that list of operations you quoted above... I'm quite familiar with Vulkan's accuracy requirements: I've been working on building a GPU for the last 3 years as part of the Libre-SOC project and have been the primary source of Vulkan expertise for the project.
Yup, Vulkan has pretty atrocious accuracy requirements for the trig/log/exp functions, which is exactly why I stated that they are less accurate and suggested using custom implementations if/when that causes problems. Those custom implementations, on the GPUs with awful accuracy, are likely very similar to the actual implementation that the vendor uses for OpenCL anyway, so would have similar performance.
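For a flavor of what such a custom implementation looks like, here is a hypothetical Python sketch of a range-reduced software exp2 built from add/mul plus exponent scaling -- the same structure a shader-side replacement for a low-accuracy built-in would use (a real GPU version would use a short minimax polynomial instead of a long Taylor series):

```python
import math

def exp2_taylor(x: float) -> float:
    """Software exp2 from add/mul plus ldexp (exact exponent scaling).
    Range-reduce to f in [0, 1), then 2**f = e**(f*ln2) by Taylor series."""
    n = math.floor(x)                  # integer part -> exact power of two
    t = (x - n) * math.log(2.0)        # t in [0, ln2), so the series converges fast
    term, total = 1.0, 1.0
    for k in range(1, 20):             # 20 terms: truncation error below double ULP
        term *= t / k
        total += term
    return math.ldexp(total, int(n))   # total * 2**n, exact scaling

print(exp2_taylor(3.7), 2.0 ** 3.7)   # the two values agree to ~1e-14
```

Even this naive version lands within a few ULP of the correctly rounded result, versus the hundreds of ULP Vulkan permits for the built-in.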
Yup, I wasn't advocating for blind translation, but for the more reasonable approach of adapting to Vulkan's quirks, one of which is the accuracy. Where it will likely actually matter for Cycles, Vulkan already has very strong requirements, so the mere fact that a GPU can implement Vulkan, and not just OpenGL ES 2 (which has very poor requirements, by contrast), means that its floating-point arithmetic hardware has good enough accuracy for the operations Cycles likely needs for path tracing. (This comes from my experience having written several ray tracers and path tracers myself.)
All information that we can make public about plans for other GPUs is public, that’s all I can say about that. Vulkan has limitations in how you can write kernels, in practice you can’t currently use pointers for example. But also, GPU vendors will recommend certain platforms for writing production renderers, provide support around that, and various renderers will use it. Choosing a different platform means you will hit more bugs and limitations, have slower or no access to certain features, ar...
Vulkan compute isn't ready for Cycles. It currently has too many limitations/driver bugs/lack of support. If it was easy, then Otoy would have released Octane with the backend that they finished (but barely worked because of driver bugs on anything outside of Nvidia).
See here.
And here.
Originally posted by Boland:
See here.
All information that we can make public about plans for other GPUs is public, that’s all I can say about that. Vulkan has limitations in how you can write kernels, in practice you can’t currently use pointers for example. But also, GPU vendors will recommend certain platforms for writing production renderers, provide support around that, and various renderers will use it. Choosing a different platform means you will hit more bugs and limitations, have slower or no access to certain features, ar...
Vulkan compute isn't ready for Cycles. It currently has too many limitations/driver bugs/lack of support. If it was easy, then Otoy would have released Octane with the backend that they finished (but barely worked because of driver bugs on anything outside of Nvidia).
See here.
And here.
For the part about Vulkan's limitations around pointers: that's largely addressed by physical storage buffer pointers via VK_KHR_buffer_device_address, together with VK_KHR_vulkan_memory_model -- both were promoted to core in Vulkan 1.2.