Linux MGLRU Results Are Looking Great On Ampere Altra


  • Linux MGLRU Results Are Looking Great On Ampere Altra

    Phoronix: Linux MGLRU Results Are Looking Great On Ampere Altra

    One of the best features to make it into the mainline Linux kernel this year is MGLRU, the Multi-Gen LRU that overhauls the kernel's page reclamation code. The MGLRU code that premiered in Linux 6.1 has been showing off very well in a variety of benchmarks...

    https://www.phoronix.com/news/Linux-MGLRU-Ampere-Altra

  • #2
    Wow! Nothing more to say.



    • #3
      Originally posted by Jumbotron
      Wow! Nothing more to say.
      I have something to say - what are the numbers for C and Rust?



      • #4
        Originally posted by gosh000
        I have something to say - what are the numbers for C and Rust?
        I don't think it should matter, unless there's something I'm missing about Rust's memory-management behavior.

        Even then, you can't lump all C programs into a single category. It just depends on the program.

        Here are some more MGLRU benchmarks, using a dual-EPYC 75F3 (32 cores each) server:



        • #5
          Thank you, coder


          I was curious what the numbers are for languages/runtimes without a garbage collector.



          • #6
            Decades ago I took a parallel programming class. We were using an nCUBE system with I don't remember how many CPUs. One of the assignments was to write a parallel matrix multiply algorithm and look at how it scaled as we increased the number of CPUs used.

            One of the guys in the class was a Russian wunderkind. His single-CPU performance blew everyone else's full parallel performance out of the water, because he had looked at the CPU architecture and tweaked his algorithm to fit code and data entirely in the cache.

            As hardware engineers, we get excited about new CPUs that offer 15% higher IPC, but there is still so much performance to be gained from software alone. This is a great example.
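
The cache-fitting idea in the anecdote above can be sketched with loop tiling, one common way to keep a matrix multiply's working set small. This is only an illustration under assumptions: nothing in the thread says which technique the classmate actually used, and `matmul_tiled` and its `tile` parameter are hypothetical names. The function multiplies plain lists of lists one tile at a time, so each inner pass touches a small block of `a`, `b`, and `c`.

```python
def matmul_tiled(a, b, tile=32):
    """Multiply matrices stored as lists of rows, one tile at a time.

    `tile` is an assumed block size; in a compiled language it would be
    chosen so that three tile-sized blocks fit in the target cache level.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer loops pick a block; inner loops do a small multiply inside it.
    for i0 in range(0, n, tile):
        for k0 in range(0, m, tile):
            for j0 in range(0, p, tile):
                for i in range(i0, min(i0 + tile, n)):
                    row_c = c[i]
                    for k in range(k0, min(k0 + tile, m)):
                        a_ik = a[i][k]
                        row_b = b[k]
                        # Innermost loop walks rows of b and c contiguously.
                        for j in range(j0, min(j0 + tile, p)):
                            row_c[j] += a_ik * row_b[j]
    return c


if __name__ == "__main__":
    print(matmul_tiled([[1, 2], [3, 4]], [[5, 6], [7, 8]], tile=1))
```

In pure Python the interpreter overhead hides most of the cache effect, so treat this as a picture of the loop structure. The same loop nest written in C or Fortran is where the tile size starts to matter.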





            • #7
              Being resourceful helps. I recently read that Amazon Web Services contributed code to the FFmpeg project - https://aws.amazon.com/blogs/opensou...on-processors/



              • #8
                Originally posted by igxqrrl
                Decades ago I took a parallel programming class. We were using an nCUBE system with I don't remember how many CPUs. One of the assignments was to write a parallel matrix multiply algorithm and look at how it scaled as we increased the number of CPUs used.

                One of the guys in the class was a Russian wunderkind. His single-CPU performance blew everyone else's full parallel performance out of the water, because he had looked at the CPU architecture and tweaked his algorithm to fit code and data entirely in the cache.
                Perhaps one of the lessons you learned was that column-major accesses are a great way to trigger cache thrashing? This is definitely true of image processing, as image widths have an annoying tendency to be a multiple of some significant power of 2.
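
To illustrate the power-of-2 point above, here is a minimal sketch, not a measurement. It assumes a hypothetical cache with 64-byte lines and 64 sets (the geometry of a 32 KiB, 8-way cache) and a plain modulo set index; real CPUs may index their caches differently. `sets_touched` is a made-up helper name. It counts how many distinct cache sets a walk down one column touches for a given row stride.

```python
LINE_SIZE = 64   # assumed cache line size in bytes
NUM_SETS = 64    # assumed number of sets (32 KiB / 64 B / 8 ways)


def sets_touched(row_stride_bytes, rows, line_size=LINE_SIZE, num_sets=NUM_SETS):
    """Return the distinct cache sets hit when walking down one column.

    The column element in row r is assumed to live at byte offset
    r * row_stride_bytes, and the set index is (address // line_size) % num_sets.
    """
    return {(row * row_stride_bytes // line_size) % num_sets for row in range(rows)}


if __name__ == "__main__":
    # 4096-byte rows versus the same rows padded by one cache line.
    for stride in (4096, 4096 + LINE_SIZE):
        print(stride, len(sets_touched(stride, rows=64)))
```

Under these assumptions, a 4096-byte row stride sends all 64 rows to a single set, so an 8-way cache can hold only 8 of those lines at a time. Padding each row by one cache line spreads the same walk across all 64 sets.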



                • #9
                  Originally posted by gosh000
                  Being resourceful helps. I recently read that Amazon Web Services contributed code to the FFmpeg project - https://aws.amazon.com/blogs/opensou...on-processors/
                  I also see many merge requests in x264 and x265 for arm64.

