New Linux Optimization Patches Reduced TLB Flushes By Over 50% In Some Cases

  • New Linux Optimization Patches Reduced TLB Flushes By Over 50% In Some Cases

    Phoronix: New Linux Optimization Patches Reduced TLB Flushes By Over 50% In Some Cases

    SK engineer Byungchul Park noticed costly migration overhead especially with TLB shoot-downs hurting performance while he was working with Compute Express Link (CXL) on Linux. That led to some optimization patches to reduce TLB flushes under some select cases that in turn led to a 50% reduction in full flushes and has the possibility of helping performance...


  • #2
    Unfortunately, I don't see even a single-digit performance change when recompiling an all-modular Linux kernel:

    Build [2] at 08/04/2023 from 12:46:36 to 12:50:48 00:04:12

    Identical to the build without the patch. Ryzen 7950X, building in tmpfs, 96GB DDR5-6000. https://t2sde.org/packages/linux



    • #3
      Originally posted by rene
      Unfortunately, I don't see even a single-digit performance change when recompiling an all-modular Linux kernel:

      Build [2] at 08/04/2023 from 12:46:36 to 12:50:48 00:04:12

      Identical to the build without the patch. Ryzen 7950X, building in tmpfs, 96GB DDR5-6000. https://t2sde.org/packages/linux
      Does your machine have more than one NUMA node? Is it equipped with a CXL device?
      Migration wouldn't happen if your machine has a single NUMA node.
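
      A quick way to answer the node question is to count the nodeN entries the kernel exposes in sysfs. This is only a minimal sketch: the helper name count_numa_nodes is made up for illustration, and it assumes the usual /sys/devices/system/node layout. CXL memory typically shows up as an additional CPU-less node, but that depends on the platform and its configuration.

```python
import os
import re


def count_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Count directory entries named node<N> under the NUMA sysfs root.

    Returns 0 if the directory does not exist (for example, on a kernel
    built without NUMA support). The function name is hypothetical.
    """
    try:
        entries = os.listdir(sysfs_root)
    except FileNotFoundError:
        return 0
    # Only node0, node1, ... count; files such as "online" or "possible" do not.
    return sum(1 for entry in entries if re.fullmatch(r"node\d+", entry))


if __name__ == "__main__":
    print("NUMA nodes visible:", count_numa_nodes())
```

      A result of 1 would suggest there is no second node to migrate pages to or from.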



      • #4
        Originally posted by hyeyoo

        Does your machine have more than one NUMA node? Is it equipped with a CXL device?
        Migration wouldn't happen if your machine has a single NUMA node.
        Eh, I meant that node-to-node migration will not happen.
        (Migration can still happen during, e.g., compaction.)

        Anyway, to benefit from the series, your workload should migrate pages frequently
        (e.g. when promoting pages from, or demoting pages to, a CXL device's memory),
        and I think that would be unlikely to happen during kernel compilation.
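
        To check whether a workload migrates pages at all, one rough approach is to compare /proc/vmstat counters before and after a run. The sketch below is illustrative only: the helper names are made up, and the counters it watches (pgmigrate_success, nr_tlb_remote_flush, nr_tlb_remote_flush_received) are only present when the kernel configuration provides them; as far as I know the TLB ones need CONFIG_DEBUG_TLBFLUSH. Missing counters are simply skipped.

```python
import subprocess

# Counters of interest; any that the running kernel does not export are skipped.
WATCHED = (
    "pgmigrate_success",
    "nr_tlb_remote_flush",
    "nr_tlb_remote_flush_received",
)


def parse_vmstat(text):
    """Parse '/proc/vmstat'-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def counter_delta(before, after, names=WATCHED):
    """Return after - before for each watched counter present in both snapshots."""
    return {n: after[n] - before[n] for n in names if n in before and n in after}


def read_vmstat(path="/proc/vmstat"):
    with open(path) as f:
        return parse_vmstat(f.read())


def measure(cmd):
    """Run cmd (a list of arguments) and return the counter deltas around it.

    The counters are system-wide, so other activity on the machine is included.
    """
    before = read_vmstat()
    subprocess.run(cmd, check=False)
    return counter_delta(before, read_vmstat())
```

        For example, measure(["make", "-j32"]) around a kernel build would show how many pages were migrated during it. If pgmigrate_success barely moves, the series has little to optimize for that workload.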



        • #5
          Phoronix, will you be doing a performance article on this? For example, on the latest 2P Xeon and EPYC models, running a variety of scalable benchmarks?



          • #6
            I'm trying to think: wouldn't a performance increase be more likely with in-order or anemic processors (Atom class)? Obviously multi-processor systems (more than one physical processor) would benefit the most, but I mean when comparing single-CPU systems.
            I really don't know. Benchmarks, please!

