AMD Zen 4 IBS Extensions Under Review For Linux
Upcoming AMD Zen 4 processors are bringing improvements to their Instruction-Based Sampling (IBS) capabilities that can be utilized by Linux's wonderful perf utility and subsystem.
The initial patch series was posted at the end of April, and a revised series was sent out this morning with the new IBS extensions for AMD Zen 4. The series is also notable as the first Linux kernel patch series to explicitly reference "Zen4" rather than just calling it a future/upcoming architecture. All the other recent Zen 4 patch series have used generic/vague terminology, even though we all know they are for Zen 4 given AMD's Linux upstreaming cadence and the history of their Linux support timing.
Zen 4 will improve instruction-based sampling by adding a data source extension as well as a new L3 cache miss filtering capability. These new Zen 4 IBS features are summed up as:
DataSrc extension provides additional data source details for tagged load/store operations. Add support for these new bits in perf report/script raw-dump.
IBS L3 miss filtering works by tagging an instruction on IBS counter overflow and generating an NMI if the tagged instruction causes an L3 miss. Samples without an L3 miss are discarded and the counter is reset with a random value (between 1 and 15 for the fetch pmu and between 1 and 127 for the op pmu). This helps reduce sampling overhead when the user is interested only in such samples. One use case for such filtered samples is feeding data to a page-migration daemon in tiered memory systems.
Add support for L3 miss filtering in IBS driver via new pmu attribute "l3missonly".
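For those wanting to try the filtering once a kernel with these patches is running, here is a minimal Python sketch of how a script might check for the new attribute before using it. It assumes the attribute shows up under the usual perf sysfs format directory for the existing `ibs_fetch` and `ibs_op` PMUs, and that it is selected with the standard `pmu/attr=value/` event syntax. The helper names (`supports_l3missonly`, `build_perf_record`) are hypothetical and only for illustration; nothing here comes from the patch series itself beyond the "l3missonly" name.

```python
import tempfile
from pathlib import Path

# Assumption: PMU format attributes are exposed at the usual perf sysfs path.
SYSFS_PMU_ROOT = Path("/sys/bus/event_source/devices")


def supports_l3missonly(pmu: str, root: Path = SYSFS_PMU_ROOT) -> bool:
    """Return True if the given PMU advertises an 'l3missonly' format attribute."""
    return (root / pmu / "format" / "l3missonly").is_file()


def build_perf_record(pmu: str, period: int, workload: list[str],
                      root: Path = SYSFS_PMU_ROOT) -> list[str]:
    """Build a 'perf record' argv, adding l3missonly=1 only when it is available."""
    if supports_l3missonly(pmu, root):
        event = f"{pmu}/l3missonly=1/"
    else:
        # Fall back to unfiltered IBS sampling on kernels/CPUs without the attribute.
        event = f"{pmu}//"
    return ["perf", "record", "-e", event, "-c", str(period), "--", *workload]


if __name__ == "__main__":
    # Demonstrate against a fake sysfs tree so the sketch runs on any machine.
    with tempfile.TemporaryDirectory() as tmp:
        fake_root = Path(tmp)
        fmt_dir = fake_root / "ibs_op" / "format"
        fmt_dir.mkdir(parents=True)
        # Only the file's existence matters here; the content is a placeholder.
        (fmt_dir / "l3missonly").write_text("placeholder\n")
        print(" ".join(build_perf_record("ibs_op", 100000, ["sleep", "1"], fake_root)))
```

On real hardware the `root` argument would be left at its default, and the returned list could be handed to `subprocess.run`. Checking sysfs first keeps the same script usable on older kernels and pre-Zen 4 processors, where the attribute would presumably be absent.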
See the kernel mailing list for these Zen 4 IBS patches if you are a heavy Linux perf user and want to learn more about these new capabilities.
In general, besides being useful for profiling to find possible optimizations and for debugging issues, perf instruction-based sampling is also useful for a growing number of compiler features that feed the hardware sampling results back to the compiler to help generate profile-optimized binaries. With Intel having long been at the forefront of hardware performance counters and the functionality exposed under Linux, it's good to see some IBS improvements coming with Zen 4.