...in cases where a thread in one NUMA domain is communicating with a thread in another domain (e.g. buffers being passed down a GStreamer pipeline, with the respective threads being scheduled on different physical CPUs). In the worst case, the downstream thread's malloc cache gets polluted entirely with buffers from the wrong NUMA domain, so that it gets non-local memory whenever it allocates.
What's needed is either:
- Tag allocations with their NUMA domain and bypass the per-thread cache if the freed memory is from a different NUMA domain.
- Explicitly tell the kernel to schedule a subset of inter-communicating threads to run in the same NUMA domain.
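To make the first option concrete, here is a minimal sketch in C of a per-thread cache that tags each block with a domain and refuses to cache blocks freed from a foreign domain. Everything in it is hypothetical and for illustration only: `toy_alloc`, `toy_free`, `thread_cache` and `block_header` are made-up names, the domain is a plain integer passed in by the caller rather than queried from the kernel, and plain `malloc`/`free` stand in for a real node-local backing allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { BLOCK_SIZE = 256 };            /* single fixed size class, for brevity */

/* Hypothetical header placed in front of every block. */
typedef struct block_header {
    int domain;                       /* NUMA domain the block was allocated in */
    struct block_header *next;        /* link used while the block sits in a cache */
} block_header;

/* Hypothetical per-thread state; a real allocator would keep this thread-local. */
typedef struct {
    int domain;                       /* domain this thread runs in (assumed known) */
    block_header *cache;              /* free list of cached local blocks */
    size_t cached;                    /* number of blocks currently in the cache */
    size_t bypassed;                  /* frees that skipped the cache */
} thread_cache;

/* Allocate a block, preferring the thread's own cache. */
void *toy_alloc(thread_cache *tc)
{
    assert(tc != NULL);
    block_header *h = tc->cache;
    if (h != NULL) {
        tc->cache = h->next;
        tc->cached--;
    } else {
        /* Stand-in for a node-local allocation in tc->domain. */
        h = malloc(sizeof *h + BLOCK_SIZE);
        if (h == NULL)
            return NULL;
        h->domain = tc->domain;       /* tag the block with its domain */
    }
    h->next = NULL;
    return h + 1;                     /* user memory starts after the header */
}

/* Free a block; only blocks from the thread's own domain enter its cache. */
void toy_free(thread_cache *tc, void *p)
{
    assert(tc != NULL);
    if (p == NULL)
        return;
    block_header *h = (block_header *)p - 1;
    if (h->domain != tc->domain) {
        /* Foreign block: bypass the cache. Stand-in for handing it back
           to a pool owned by its home domain. */
        tc->bypassed++;
        free(h);
        return;
    }
    h->next = tc->cache;
    tc->cache = h;
    tc->cached++;
}
```

With an upstream cache in domain 0 and a downstream cache in domain 1, a buffer allocated upstream and freed downstream never lands in the downstream cache, so later downstream allocations are served only from blocks tagged with domain 1. The cost is the extra header per block and a slower free path for cross-domain buffers.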