Intel Linux Kernel Optimizations Show Huge Benefit For High Core Count Servers

  • Intel Linux Kernel Optimizations Show Huge Benefit For High Core Count Servers

    Phoronix: Intel Linux Kernel Optimizations Show Huge Benefit For High Core Count Servers

    Earlier this month I wrote about Intel engineers working on more big optimizations to the Linux kernel with a focus on enhancing the kernel's performance at high core counts. The numbers shared then were very promising and since then I've had more time looking at the performance impact of Intel's stellar software optimization work and its impact on real-world workloads. Here is a look at how Intel's pending kernel optimization patches are a huge deal for today's high core count servers.

  • #2
    Interesting results. The dip in performance with 120 threads is a bit concerning though. The best performance is obtained when using 1/4 of the hardware in the machine? That's not good. Do we know if it is a problem with Intel's processor or the kernel?

    • #3
      Originally posted by guspitts
      Interesting results. The dip in performance with 120 threads is a bit concerning though. The best performance is obtained when using 1/4 of the hardware in the machine? That's not good. Do we know if it is a problem with Intel's processor or the kernel?
      This is how scaling issues show up: you get lock or cacheline contention, and the contention tanks overall performance. This is not specific to one CPU; it is a generic thing that happens everywhere. What can vary per CPU is where the "knee" in the graph is, but once you hit mass contention the game is mostly over.
      That is, until the code is changed to avoid the contention. Then the knee moves much further to the right (to the next bottleneck, if any).

      The tanking happens because all the cores contending for the cache line take resources (cache line bounces, snoops, etc.) away from the task that does the work, while not making forward progress themselves.
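As a rough illustration of the "knee" described above, here is a minimal sketch using Gunther's Universal Scalability Law. This is a generic scaling model swapped in for illustration, not something fitted to the benchmarks in the article. The coefficients `ALPHA` (serialization, e.g. lock waiting) and `BETA` (coherency cost, e.g. cache line bouncing) are made-up values chosen only so the modeled peak lands near 60 threads.

```python
import math


def usl_throughput(n, alpha, beta):
    """Relative throughput at n threads under the Universal Scalability Law.

    alpha models serialization (contention), beta models coherency cost
    that grows with the number of pairs of threads exchanging cache lines.
    """
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


def usl_peak(alpha, beta):
    """Thread count at which the modeled throughput peaks (the "knee")."""
    return math.sqrt((1.0 - alpha) / beta)


# Hypothetical coefficients, not measured from any real system.
ALPHA = 0.02
BETA = 0.00027

if __name__ == "__main__":
    for n in (1, 30, 60, 120, 240):
        print(f"{n:4d} threads -> {usl_throughput(n, ALPHA, BETA):6.2f}x")
    print(f"modeled knee near {usl_peak(ALPHA, BETA):.0f} threads")
```

With a nonzero `BETA` the curve bends downward past the knee instead of merely flattening, which is the shape of the dip being discussed. Reducing contention in the code corresponds to a smaller `BETA`, which pushes the knee to the right.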
      Last edited by arjan_intel; 29 March 2023, 02:16 PM.

      • #4
        Would like to see DragonFlyBSD thrown in since it claims to be better than other kernels in high SMT situations. See if it dips as much as Linux does at high core counts. I must say this was a shocking discovery. Never dreamed 60 cores would perform better than 120 cores or 240 threads!

        • #5
          Amdahl's law
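For reference, a minimal sketch of what Amdahl's law predicts. The parallel fraction of 0.95 is a hypothetical value, not a measurement of any workload in the article. Note that Amdahl's law on its own caps the speedup at 1 / (1 - p) but never predicts that adding cores makes things slower, so it accounts for a plateau rather than a dip.

```python
def amdahl_speedup(parallel_fraction, n):
    """Speedup on n cores when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n)


# Hypothetical parallel fraction, for illustration only.
P = 0.95

if __name__ == "__main__":
    for n in (1, 60, 120, 240):
        print(f"{n:4d} cores -> {amdahl_speedup(P, n):5.2f}x")
    print(f"upper bound  -> {1.0 / (1.0 - P):5.2f}x")
```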

          • #6
            MySQL/Maria/Percona should perform much better than those MariaDB results with tuning. Take a look at how Percona tuned it for benchmarking high core count VMs:

            Comparing the performance of the new family of AMD EPYC processors when using MySQL in Google Cloud Virtual Machines.

            innodb_thread_concurrency is particularly important.

            • #7
              I hope Intel sends you one of those 960 core Sapphire Rapids boxes. Maybe ask for one. They'd dominate AMD on the benchmarks for the next few years ;-)

              • #8
                Originally posted by kylew77
                Would like to see DragonFlyBSD thrown in since it claims to be better than other kernels in high SMT situations. See if it dips as much as Linux does at high core counts. I must say this was a shocking discovery. Never dreamed 60 cores would perform better than 120 cores or 240 threads!
                Would be interesting to see. DragonFlyBSD only supports up to 128 CPU cores and 256 hardware threads, while Linux supports up to 8192 CPUs (/threads?). However, that doesn't mean it will be slower.

                • #9
                  Originally posted by Volta
                  Would be interesting to see. DragonFlyBSD only supports up to 128 CPU cores and 256 hardware threads while Linux supports up to 8192 CPUs (/threads?). However, it doesn't mean it will be slower.
                  8192 threads. The NR_CPUS thing comes from before SMT and multi-core. What we call a CPU has drifted, while to Linux it's still a single execution unit.

                  • #10
                    Originally posted by Mark Rose
                    I hope Intel sends you one of those 960 core Sapphire Rapids boxes. Maybe ask for one. They'd dominate AMD on the benchmarks for the next few years ;-)
                    I'm pretty sure that's 960 threads, but 480 cores. AFAIK, Intel Xeon only scales up to 8 sockets, for the most scalable models. I'm aware of servers with like 32 CPUs, but anything beyond 8 would need some extra glue that will surely come at a cost, latency-wise. Benchmarks like these would suffer grievously.

                    I imagine a 480-core Sapphire Rapids box must have a 30 Amp plug.
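The 480-core / 960-thread figure can be checked with a quick sketch. It assumes 60 cores per socket and 2 hardware threads per core, which are not stated in this post but match the 60 / 120 / 240 figures discussed earlier in the thread for a two-socket machine. `Topology` and `LINUX_MAX_LOGICAL_CPUS` are hypothetical names used only for this illustration; 8192 is the limit quoted earlier in the thread.

```python
from dataclasses import dataclass

# Limit quoted earlier in the thread; Linux counts each hardware thread as a CPU.
LINUX_MAX_LOGICAL_CPUS = 8192


@dataclass(frozen=True)
class Topology:
    sockets: int
    cores_per_socket: int   # assumed value, see lead-in
    threads_per_core: int   # assumed value, see lead-in

    @property
    def cores(self):
        return self.sockets * self.cores_per_socket

    @property
    def threads(self):
        return self.cores * self.threads_per_core


if __name__ == "__main__":
    for name, box in (("2-socket", Topology(2, 60, 2)),
                      ("8-socket", Topology(8, 60, 2))):
        fits = box.threads <= LINUX_MAX_LOGICAL_CPUS
        print(f"{name}: {box.cores} cores, {box.threads} threads, "
              f"within quoted Linux limit: {fits}")
```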
