Vulkan 1.3.260 Released With AMDX_shader_enqueue, KHR_maintenance5

Written by Michael Larabel in Vulkan on 28 July 2023 at 11:24 AM EDT. 2 Comments

Vulkan 1.3.260 is out today with a handful of specification clarifications/fixes as well as two new extensions.

The new extensions with this weekly update to the Vulkan API spec are VK_AMDX_shader_enqueue and VK_KHR_maintenance5.

VK_KHR_maintenance5 is the latest maintenance extension for Vulkan, bundling a variety of minor features and small additions into a single extension. VK_KHR_maintenance5 saw contributions from AMD, Intel, Imagination, Valve, and other organizations.

Among the changes with VK_KHR_maintenance5 are a few new formats, the deprecation of shader modules, stronger guarantees for propagation of device-lost return values, new flags and properties, and other small tweaks.

VK_AMDX_shader_enqueue is an experimental/provisional specification drafted by AMD engineers to let developers enqueue compute shader workgroups from other compute shaders.

The ability of VK_AMDX_shader_enqueue to enqueue compute workgroups from a shader is intended to help address the needs of modern game engines. The extension's documentation explains:

Applications are increasingly using more complex renderers, often incorporating multiple compute passes that classify, sort, or otherwise preprocess input data. These passes may be used to determine how future work is performed on the GPU; but triggering that future GPU work requires either a round trip to the host, or going through buffer memory and using indirect commands. Host round trips necessarily include more system bandwidth and latency as command buffers need to be built and transmitted back to the GPU. Indirect commands work well in many cases, but they have little flexibility when it comes to determining what is actually dispatched; they must be enqueued ahead of time, synchronized with heavy API barriers, and execute with a single pre-recorded pipeline.

Whilst latency can be hidden and indirect commands can work in many cases where additional latency and bandwidth is not acceptable, recent engine developments such as Unreal 5’s Nanite technology explicitly require the flexibility of shader selection and low latency. A desirable solution should be able to have the flexibility required for these systems, while keeping the execution loop firmly on the GPU.

Rather than going the NVIDIA device generated commands (DGC) route or extending indirect commands functionality, AMD is pursuing this shader enqueue approach that is explicit and performant for modern game engine needs.
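To make the idea more concrete, here is a rough CPU-side toy model in Python of work that enqueues further work and picks the follow-up "shader" per item, without returning to a host loop in between. It is only an analogy under my own assumptions: the names run_graph, classify, small, and large are hypothetical, and nothing here is the actual Vulkan or VK_AMDX_shader_enqueue API.

```python
from collections import deque


def run_graph(nodes, entry, payloads):
    """Drain a queue of (node_name, payload) pairs until no work remains.

    Toy stand-in for a GPU-side execution loop; all names are hypothetical.
    """
    queue = deque((entry, p) for p in payloads)
    outputs = []

    def enqueue(node_name, payload):
        # A running node calls this to schedule follow-up work.
        queue.append((node_name, payload))

    while queue:
        node_name, payload = queue.popleft()
        nodes[node_name](payload, enqueue, outputs)
    return outputs


def classify(value, enqueue, outputs):
    # Pick the follow-up node per item, decided at "run time".
    enqueue("small" if value < 10 else "large", value)


def small(value, enqueue, outputs):
    # Leaf node: records a result and enqueues nothing further.
    outputs.append(("small", value))


def large(value, enqueue, outputs):
    # Split a large item in two and send both halves back for classification.
    half = value // 2
    enqueue("classify", half)
    enqueue("classify", value - half)


NODES = {"classify": classify, "small": small, "large": large}

if __name__ == "__main__":
    print(run_graph(NODES, "classify", [3, 25]))
```

In this model the decision about what runs next is made by the work itself, which is the flexibility the quoted text says a single pre-recorded pipeline behind an indirect command cannot offer.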

More details on these new extensions and the other changes in Vulkan 1.3.260 can be found via GitHub.
