AMD Revives Linux Work On DRM CGroup Controller For Limiting GPU Resources
At the start of 2018 there was early work on cgroups support for DRM drivers. That work, done by Intel developers, used cgroups to allow restricting GPU priority. AMD is now looking to build more extensive DRM cgroup controller support for monitoring and restricting GPU resources.
With this new effort, for which AMD is seeking comments from upstream developers, the company intends to use control groups (cgroups) to limit GPU resources for GPU compute and workstation use.
Kenny Ho of AMD explained yesterday, "With the increased importance of machine learning, data science and other cloud-based applications, GPUs are already in production use in data centers today. Existing GPU resource management is very [coarse] grain, however, as sysadmins are only able to distribute workload on a per-GPU basis. An alternative is to use GPU virtualization (with or without SRIOV) but it generally acts on the entire GPU instead of the specific resources in a GPU. With a drm cgroup controller, we can enable alternate, fine-grain, sub-GPU resource management (in addition to what may be available via GPU virtualization.)"
AMD is starting its DRM cgroup push by adding basic accounting and statistics code. The initial proposal, along with some basic AMDGPU DRM patches, can be found via this mailing list post.