Maple Tree v2 Patches For The Linux Kernel - 13~840% Faster For Malloc Threads Test Case
Last year a "request for comments" was sent out on the "Maple Tree" as a new data structure for the Linux kernel. The latest version of the Maple Tree patches was sent out today with mixed results, but where gains are being made they can be quite significant.
The Maple Tree is a data structure for storing index ranges that each map to a single pointer, designed to work well with modern CPU caches in an RCU-safe manner. Following the RFC, Oracle sent out its Maple Tree patches earlier this year with promising results, and those have now been succeeded by the "v2" patches.
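To make the "index ranges that map to a single pointer" idea concrete, here is a minimal user-space sketch in Python. It is not the Maple Tree implementation and does not mirror the kernel API: the class name `RangeMap` and its methods `store_range` and `load` are hypothetical, the storage is a plain sorted list rather than a cache-optimized tree, overlapping stores are simply rejected, and nothing here models RCU or locking.

```python
import bisect


class RangeMap:
    """Toy map from inclusive index ranges [first, last] to one value each.

    Hypothetical illustration only: a sorted list, not a B-tree-like
    structure, and with no concurrency handling at all.
    """

    def __init__(self):
        self._firsts = []  # sorted start index of every stored range
        self._lasts = []   # matching inclusive end index
        self._values = []  # matching stored object (the "pointer")

    def store_range(self, first, last, value):
        if first > last:
            raise ValueError("first must not exceed last")
        pos = bisect.bisect_left(self._firsts, first)
        # Simplification: reject overlap with the neighbouring ranges.
        if pos > 0 and self._lasts[pos - 1] >= first:
            raise ValueError("range overlaps an existing entry")
        if pos < len(self._firsts) and self._firsts[pos] <= last:
            raise ValueError("range overlaps an existing entry")
        self._firsts.insert(pos, first)
        self._lasts.insert(pos, last)
        self._values.insert(pos, value)

    def load(self, index):
        # Find the last range starting at or before index, then check its end.
        pos = bisect.bisect_right(self._firsts, index) - 1
        if pos >= 0 and index <= self._lasts[pos]:
            return self._values[pos]
        return None


ranges = RangeMap()
ranges.store_range(0x1000, 0x1FFF, "region A")
ranges.store_range(0x4000, 0x7FFF, "region B")
print(ranges.load(0x1800))  # any index inside a range yields that range's value
print(ranges.load(0x3000))  # an index in a gap yields None
```

Every index inside a stored range resolves to the same object, and a lookup in a gap returns nothing, which is the behaviour the description above refers to.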
With the Maple Tree v2 patches there is a lot of code refactoring, locking changes, RCU fixes, and a variety of other low-level improvements.
As for the performance impact of the Maple Tree patches in their current form, there are sizable wins in some synthetic/micro-benchmark test cases while the real-world performance seems to be flat:
While still using the mmap_sem, the performance seems fairly similar on real-world workloads, while there are variations in micro-benchmarks.
Increase in performance in the following micro-benchmarks in Hmean:
- wis malloc1-threads: Increase of 13% to 840%
- wis page_fault1-threads: Increase of 1% to 14%
- wis brk1-threads: Disregard, this test is invalid.
Decrease in performance in the following micro-benchmarks in Hmean:
- wis brk1-processes: Decrease of 45% due to RCU required
Mixed:
- wis pthread_mutex1-threads: +11% to -3%
- wis signal1-threads: +6% to -12%
- wis malloc1-processes: +9% to -18% (-18 at 2 processes, increases after)
- wis page_fault3-threads: +8% to -22%
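For readers unfamiliar with how such percentages are derived, the following Python sketch shows one plausible way, assuming "Hmean" denotes the harmonic mean of repeated runs, as it commonly does in kernel benchmark reports. The sample throughput numbers and the helper `percent_change` are hypothetical and are not taken from the patch series.

```python
from statistics import harmonic_mean


def percent_change(baseline, patched):
    """Relative change of patched versus baseline, in percent."""
    return (patched - baseline) / baseline * 100.0


# Hypothetical operations-per-second samples from repeated runs
# (made-up values, NOT results from the Maple Tree patches).
baseline_runs = [100.0, 110.0, 90.0]
patched_runs = [120.0, 125.0, 115.0]

baseline_hmean = harmonic_mean(baseline_runs)
patched_hmean = harmonic_mean(patched_runs)

change = percent_change(baseline_hmean, patched_hmean)
print(f"baseline Hmean: {baseline_hmean:.2f}")
print(f"patched Hmean:  {patched_hmean:.2f}")
print(f"change:         {change:+.1f}%")
```

A range such as "13% to 840%" would then correspond to this calculation repeated at different thread counts, with the smallest and largest changes reported.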
In their current form, these Maple Tree patches out of Oracle amount to 61 patches that add more than 46.9k lines of new code (with 1.6k deletions), but the vast majority of that new code is library code and, in particular, Maple Tree test code.