Intel Posts The "Last Part" To Their AMX Bring-Up For Linux
Written by Michael Larabel in Intel on 22 October 2021 at 09:07 AM EDT.
INTEL --
While for many years we have been accustomed to seeing Intel land their new hardware feature enablement work in the Linux kernel and related components well ahead of products shipping, occasionally there are lapses due to various internal and external timings. The launch of Sapphire Rapids is quickly approaching and one of its major additions is Advanced Matrix Extensions (AMX), whose Linux support is still in the works.

Since June of 2020, Intel has been posting AMX patches for the Linux kernel, the open-source toolchains, and related components. On the kernel side that heavy lifting is still ongoing, and no released Linux kernel yet has AMX support in place.

Various AMX patches for the Linux kernel have been posted a number of times along with related improvements to the kernel code. Sent out on Thursday was the fourth and "last part" of the kernel-side AMX bring-up for Linux.

Intel Linux engineer Chang S. Bae explained:
This is the last part of the effort to support AMX. This series follows the KVM part.

With AMX the FPU register state buffer which is part of task_struct::thread::fpu is not going to be extended unconditionally for all tasks on an AMX enabled system as that would waste minimum 8K per task.

AMX provides a mechanism to trap on first use. That trap will be utilized to allocate a larger register state buffer when the task (process) has permissions to use it. The default buffer task_struct will only carry states up to AVX512.

The cost of XFD switching only matters for an AMX-enabled system. With the cleanup of the KVM FPU handling, host-side XFD/AMX is completely independent of guest-side XFD/AMX.

The per-task feature and size information helps to support dynamic features organically compared to the old versions.

Each task has a unique sigframe length with dynamic features. sigaltstack() has a new size checker to support a per-task sigframe size.

This version also fixes the syscall implementation and the XFD state switching on hot paths.
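
The trap-on-first-use scheme described in the quoted cover letter can be illustrated with a small toy model. This is only a sketch of the idea, not kernel code: the Task class, the PermissionDenied exception, and the 2688-byte default buffer size are assumptions made for illustration. The 8K figure itself follows from the AMX tile geometry of 8 tile registers with 16 rows of 64 bytes each.

```python
from dataclasses import dataclass

# AMX tile geometry: 8 tile registers, each 16 rows of 64 bytes.
TILE_COUNT = 8
TILE_ROWS = 16
TILE_ROW_BYTES = 64
TILE_DATA_BYTES = TILE_COUNT * TILE_ROWS * TILE_ROW_BYTES  # the "8K" per task

# Illustrative placeholder for a buffer holding state up to AVX-512.
# The real size is enumerated by the CPU and is not given in the article.
DEFAULT_BUFFER_BYTES = 2688


class PermissionDenied(Exception):
    """Hypothetical stand-in for what a task without permission would hit."""


@dataclass
class Task:
    amx_permitted: bool = False
    buffer_bytes: int = DEFAULT_BUFFER_BYTES
    trap_armed: bool = True  # models XFD being armed for this task

    def use_amx(self) -> None:
        """Model the first AMX instruction trapping, later ones running freely."""
        if not self.trap_armed:
            return  # buffer was already enlarged on an earlier use
        if not self.amx_permitted:
            raise PermissionDenied("task has no permission to use AMX")
        # First use with permission: grow the buffer, then disarm the trap.
        self.buffer_bytes += TILE_DATA_BYTES
        self.trap_armed = False


def memory_saved(task_count: int, amx_users: int) -> int:
    """Bytes not allocated because only actual AMX users get the larger buffer."""
    return (task_count - amx_users) * TILE_DATA_BYTES
```

In this model a system with 1000 tasks of which 10 touch AMX avoids allocating tile state for the other 990, which is the saving the lazy allocation is after.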

Part 4 amounts to 23 patches with around one thousand lines of new code, covering the sigaltstack changes, new system calls for controlling dynamic XSTATE components, XFD state and switching support, and enabling AMX with XFD #NM handling.
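
For a sense of what asking for permission looks like from user space, here is a hedged sketch. It assumes the arch_prctl() options as documented for mainline kernels (ARCH_GET_XCOMP_PERM, ARCH_REQ_XCOMP_PERM) and the XTILEDATA feature number 18. None of these values are stated in the article, and the interface could differ in the patch revision discussed here. The function names are made up for this example.

```python
import ctypes
import platform
import sys

# Assumed values, taken from mainline kernel documentation, not the article.
SYS_ARCH_PRCTL = 158          # arch_prctl syscall number on x86-64
ARCH_GET_XCOMP_PERM = 0x1022  # read the bitmask of permitted features
ARCH_REQ_XCOMP_PERM = 0x1023  # request permission for one feature number
XFEATURE_XTILEDATA = 18       # AMX tile data state component


def tile_data_permitted(mask: int) -> bool:
    """Return True if the XTILEDATA bit is set in a permission bitmask."""
    return bool(mask & (1 << XFEATURE_XTILEDATA))


def request_amx_permission() -> bool:
    """Ask the kernel for AMX tile data permission; False if unavailable."""
    is_x86_64 = platform.machine() in ("x86_64", "AMD64")
    is_64bit = ctypes.sizeof(ctypes.c_void_p) == 8
    if sys.platform != "linux" or not is_x86_64 or not is_64bit:
        return False
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    rc = libc.syscall(
        ctypes.c_long(SYS_ARCH_PRCTL),
        ctypes.c_long(ARCH_REQ_XCOMP_PERM),
        ctypes.c_long(XFEATURE_XTILEDATA),
    )
    if rc != 0:
        return False  # older kernel, no AMX hardware, or request refused
    mask = ctypes.c_uint64(0)
    rc = libc.syscall(
        ctypes.c_long(SYS_ARCH_PRCTL),
        ctypes.c_long(ARCH_GET_XCOMP_PERM),
        ctypes.byref(mask),
    )
    return rc == 0 and tile_data_permitted(mask.value)
```

On a kernel or CPU without AMX support the request simply fails and the helper returns False, so it is safe to call as a capability probe before touching any tile instructions.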

The Linux 5.16 merge window kicks off in a week or two, so it remains to be seen whether all these AMX patches will be ready for that kernel version. At least the x86 FPU clean-up patches and some of the other work look like they will land for Linux 5.16, but given the timing it would be somewhat surprising if this final batch were reviewed and deemed ready in time. That would push full kernel readiness for Advanced Matrix Extensions to roughly Linux 5.17, which is also close to where Ubuntu 22.04 LTS will likely cut its kernel for that next enterprise release.