
Intel Prepares Linux Batch TLB Flushing For Page Migration As A Big Performance Win

    Phoronix: Intel Prepares Linux Batch TLB Flushing For Page Migration As A Big Performance Win

    Intel engineer Huang Ying sent out a set of patches today to implement batch TLB flushing for page migration within the migrate_pages() function and is showing very promising results...


  • #2
    A couple questions, if anyone knows:
    1. I take it this is something that only happens on NUMA machines?
    2. Under what circumstances would the kernel decide to migrate pages?

    • #3
      Originally posted by coder View Post
      A couple questions, if anyone knows:
      1. I take it this is something that only happens on NUMA machines?
      2. Under what circumstances would the kernel decide to migrate pages?
      It is a system call that userland can invoke:

      https://man7.org/linux/man-pages/man...e_pages.2.html

      It only applies to NUMA machines.

      I am not sure whether the kernel will decide to migrate pages on its own without userland asking for it. I would need to look, but I have enough rabbit holes to keep me busy for months as it is.

      • #4
        Originally posted by coder View Post
        A couple questions, if anyone knows:
        1. I take it this is something that only happens on NUMA machines?
        2. Under what circumstances would the kernel decide to migrate pages?
        Server-class NUMA systems with multiple sockets are the target, but it should be noted that even consumer desktop CPUs are taking on NUMA-like characteristics (certain caches shared only within a single complex of a few cores, with multiple complexes per socket). As CPU designs evolve, CPU and memory set assignments may prove useful in specific cases on single-socket desktops too.

        The underlying reason to migrate pages is performance. Keeping all the pages (and caches) for a set of processes/threads on a specific node or CPU can improve performance, since memory access stays "local" and "hot". For users (typically HPC shops and hyperscalers) for whom every percent of improvement translates into real resources (money), running everything as efficiently as possible is a win, and assigning CPU and memory affinity is one way to accomplish that.
        Last edited by CommunityMember; 27 December 2022, 11:41 PM.
