Linux 5.19 Perf Changes Bring Three Notable AMD Features
The platform PMU changes on the Intel side add support for Alder Lake N and Raptor Lake P processors. Other Alder Lake and Raptor Lake models were already covered by the Intel perf code, while Alder Lake N and Raptor Lake P were left out until now. Only new model IDs are needed, since the PMU on these new chips is otherwise no different.
The platform PMU changes on the AMD side are much more exciting with Linux 5.19:
AMD Zen 4 IBS extensions have been merged. These are new Instruction-Based Sampling (IBS) capabilities arriving with the Zen 4 processors launching later this year. The Linux patches make it quite clear that the new IBS support is coming with Zen 4 and not just some vague upcoming/future processor family. With the Zen 4 processors there are various Instruction-Based Sampling changes:
The DataSrc extension provides additional data source details for tagged load/store operations. Support for these new bits is added to the perf report/script raw dump.
IBS L3 miss filtering works by tagging an instruction on IBS counter overflow and generating an NMI if the tagged instruction causes an L3 miss. Samples without an L3 miss are discarded and the counter is reset with a random value (between 1 and 15 for the fetch PMU and between 1 and 127 for the op PMU). This helps reduce sampling overhead when the user is interested only in such samples. One use case for such filtered samples is feeding data to a page-migration daemon in tiered memory systems.
L3 miss filtering is exposed in the IBS driver via the new PMU attribute "l3missonly".
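For readers who want to see how the "l3missonly" attribute might be used, here is a minimal Python sketch. It assumes the IBS PMUs appear under the usual perf sysfs directory as "ibs_fetch" and "ibs_op" and that a supporting kernel lists "l3missonly" among their format attributes. The function names and the toy filter model are illustrative only and are not taken from the patches.

```python
import random
from pathlib import Path

# Assumption: standard sysfs location for perf PMU devices.
SYSFS_PMU_ROOT = Path("/sys/bus/event_source/devices")

# Upper bounds for the random counter reset value, as quoted in the patch text.
RESET_MAX = {"ibs_fetch": 15, "ibs_op": 127}


def l3missonly_supported(pmu, root=SYSFS_PMU_ROOT):
    """Return True if the PMU advertises an 'l3missonly' format attribute."""
    return (root / pmu / "format" / "l3missonly").is_file()


def ibs_event_spec(pmu, l3missonly):
    """Build a perf event string, e.g. 'ibs_op/l3missonly=1/'."""
    return f"{pmu}/l3missonly=1/" if l3missonly else f"{pmu}//"


def filter_tagged(pmu, tagged_l3_miss, rng):
    """Toy model of the filter: keep tagged instructions that missed L3,
    discard the rest and record the random value the counter is reset with."""
    kept = 0
    resets = []
    for missed in tagged_l3_miss:
        if missed:
            kept += 1  # a sample would be delivered for this instruction
        else:
            resets.append(rng.randint(1, RESET_MAX[pmu]))
    return kept, resets
```

On a system with the new support, the string returned by `ibs_event_spec("ibs_op", True)` could be passed to `perf record -e`; on older kernels or hardware `l3missonly_supported` would simply return False.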
AMD PerfMonV2 is also supported with Linux 5.19. Various kernel changes are needed for the updated AMD Performance Monitoring capabilities coming with new AMD CPUs. Zen 4 isn't mentioned explicitly here, but it's assumed to be the case. AMD Performance Monitoring V2 has new "global" registers that allow enabling/disabling multiple performance counters at the same time. With AMD Performance Monitoring up to this point, the performance counter controls all had to be set individually, whereas now they can be set in one go using the global registers where present. AMD Performance Monitoring V2 also allows the number of core PMCs to be detected systematically rather than being statically set on a per-family basis. The patches had been out for review and were buttoned up in time for Linux 5.19.
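To illustrate the difference between per-counter control and the new global registers, here is a toy Python model. It is only a sketch: the `CounterBank` class and its methods are hypothetical and do not correspond to real MSR names or kernel code. It simply counts how many "register writes" each approach needs. The counter count is passed in as a parameter, standing in for a value detected from the hardware rather than looked up per family.

```python
class CounterBank:
    """Toy model of a bank of core performance counters (hypothetical)."""

    def __init__(self, num_counters):
        self.num_counters = num_counters  # assumed to be detected, not hard-coded
        self.enabled = 0                  # bit i set => counter i is counting
        self.register_writes = 0          # how many control writes were issued

    def _check(self, idx):
        if not 0 <= idx < self.num_counters:
            raise ValueError(f"counter {idx} out of range")

    def enable_individually(self, counters):
        """Older style: one control write per counter."""
        for idx in counters:
            self._check(idx)
            self.enabled |= 1 << idx
            self.register_writes += 1

    def enable_globally(self, counters):
        """PerfMonV2 style: build a bitmask and apply it with a single write."""
        mask = 0
        for idx in counters:
            self._check(idx)
            mask |= 1 << idx
        self.enabled |= mask
        self.register_writes += 1
        return mask
```

Enabling counters 0, 2 and 5 takes three writes with `enable_individually` but one write with `enable_globally`, which is the convenience the global registers are described as providing.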
For existing AMD Zen 3 processors, there is new Branch Sampling "BRS" functionality now supported with Linux 5.19. Last year Google posted AMD BRS patches for integrating the new hardware Branch Sampling capabilities into Linux's perf subsystem. AMD BRS with Zen 3 allows for collecting details on branches taken during code execution. Google engineers have been working on this AMD BRS support with an apparent focus on feeding the data into AutoFDO-style compiler optimizations on AMD processors. That is, compilers can leverage the collected hardware data to make more informed/accurate optimization decisions based on that profiling.
After the AMD Branch Sampling code spent the past several months under review and revision, it's good to see that AMD BRS is ready and was successfully merged as part of the perf event changes for Linux 5.19. AMD BRS is also expected to be present in upcoming Zen 4 processors, but at least this feature is finally here for Zen 3 customers.
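As a rough picture of how sampled branch data could feed profile-guided optimization, the Python sketch below counts how often each (source, target) branch pair shows up across samples. The record layout and the addresses are made up for illustration. This is not the format produced by perf or consumed by AutoFDO, just the general idea of finding hot branch edges.

```python
from collections import Counter


def hot_edges(branch_samples, top=3):
    """Count (source, target) address pairs across sampled branch records
    and return the most frequent ones."""
    counts = Counter()
    for sample in branch_samples:
        for src, dst in sample:
            counts[(src, dst)] += 1
    return counts.most_common(top)


# Hypothetical samples: each is a list of (from_address, to_address) pairs.
samples = [
    [(0x10, 0x40), (0x48, 0x10)],
    [(0x10, 0x40)],
    [(0x10, 0x40), (0x60, 0x80)],
]
hottest = hot_edges(samples, top=1)
```

A compiler given this kind of summary could, for example, lay out the code for the hottest edge so that it falls through rather than jumps.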
These AMD changes and the other perf events patches can be found via this pull in Linux 5.19.