Linux 5.12 Adds Instruction Latency Reporting To Perf
An exciting new capability for perf in Linux 5.12 is the ability to collect instruction latency metrics as part of performance reports, though it relies on hardware capabilities that for now are only found in next-generation Intel Xeon "Sapphire Rapids" processors.
Linux 5.12 adds support for instruction latency metrics as part of perf report collections. The instruction latency metrics, paired with the memory latency data, can help developers understand expensive instructions and the time being spent in the different CPU stages. It will be fun when this ability is more widespread across processors, and interesting to see whether it ends up being used to help generate more accurate cost tables for compiler targets, among other use-cases.
Instruction latency support for perf is exciting, but it doesn't magically work for all current CPUs - this feature is being enabled for Intel Xeon "Sapphire Rapids", which is due late this year or in 2022.
Earlier this week was the perf events pull for the Linux 5.12 merge window that adds the CPU-PMU support for Intel Xeon Sapphire Rapids. That includes alterations due to the event encoding having changed, a new "Precise Distribution" (PDist) facility, and changes for handling the instruction latency.
The perf tools changes for Linux 5.12 were sent in on Friday and include the instruction latency support as part of the perf report sub-command. In addition to enabling instruction latency reporting with Sapphire Rapids, these updates also enable other new features like perf stat handling L2 topdown events with those forthcoming Xeons.
Outside of the Sapphire Rapids / instruction latency work, there is also a new perf daemon command for long-running sessions to control the enablement of events without restarting the session. The perf tools also now support collecting events for BPF programs in perf stat, tracing KVM with Intel PT, Intel PT PSB synchronization packet events, and a variety of other additions as outlined in this week's pull requests.