Speeding Up The Linux Kernel With Transparent Hugepage Support

  • Speeding Up The Linux Kernel With Transparent Hugepage Support

    Phoronix: Speeding Up The Linux Kernel With Transparent Hugepage Support

    Last month we reported on the 200-line Linux kernel patch that does wonders for desktop responsiveness. There was certainly much interest (over 100,000 views of both of our YouTube videos demonstrating the change), but that patch didn't speed up the system per se; rather, it improved desktop interactivity and reduced latency by creating task groups per TTY so that processes had more equal access to the CPU. There is, though, an entirely different patch-set now beginning to generate interest among early adopters that does improve kernel performance in compute- and memory-intensive applications: the Transparent Hugepage Support patch-set. Here are our initial tests of the latest kernel patches, which will hopefully find their way into the mainline Linux kernel soon.

    http://www.phoronix.com/vr.php?view=15542

  • #2
    So this Transparent Hugepage Support patch actually makes the kernel perform better in some cases, while the previous patch mostly made applications more responsive when the system was under heavy load, giving better apparent performance?


    • #3
      Wheels,

      This patch doesn't exactly make the kernel perform better. It leverages some hardware features to make applications run a little faster. It's most noticeable in apps that use a lot of memory and jump around a lot within that memory.


      • #4
        Transparent Huge Pages with KVM and with RHEL6

        THP is also useful for KVM virtual machines, as well as for bare-metal workloads.
        It's available in RHEL6 today, backported to 2.6.32.

        There's some info posted online from the KVM Forum:
        Slides here: http://www.linux-kvm.org/wiki/images...-forum-thp.pdf
        Video here: http://vimeo.com/15224470


        • #5
          AFAIK, it should be an opt-in function/mode for needed apps to call, otherwise too much waste in memory.
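
For reference, an opt-in of this kind can be sketched from user space. The snippet below is a minimal illustration, not the patch-set's own code: it assumes a Linux kernel that accepts the `MADV_HUGEPAGE` advice (as mainline THP does) and Python 3.8 or newer, which exposes `mmap.madvise`. The helper name `map_with_hugepage_hint` and the 2 MiB page size are assumptions made for the example.

```python
import mmap

# Assumed huge page size (typical for x86-64); not taken from the thread.
HUGE_PAGE_SIZE = 2 * 1024 * 1024


def map_with_hugepage_hint(length):
    """Create a private anonymous mapping and ask for huge-page backing.

    Returns (mapping, hinted). `hinted` is True only if the kernel
    accepted the MADV_HUGEPAGE advice; it is a hint, not a guarantee.
    """
    buf = mmap.mmap(
        -1,
        length,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    advice = getattr(mmap, "MADV_HUGEPAGE", None)
    if advice is None or not hasattr(buf, "madvise"):
        # Platform or Python version without the advice constant.
        return buf, False
    try:
        buf.madvise(advice)  # applies to the whole mapping by default
    except OSError:
        # Kernel built without THP, or the advice was rejected.
        return buf, False
    return buf, True
```

A caller would request a region spanning at least one huge page, for example `map_with_hugepage_hint(2 * HUGE_PAGE_SIZE)`, and use the returned mapping as ordinary memory whether or not the hint was accepted.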


          • #6
            If I remember my OS class correctly, the TLB is a cache on the CPU that maps an application's virtual memory address to the actual hardware memory location.

            This patch enables a feature on newer CPUs that allows it to map larger sections of memory at once, which means that the cache is more likely to have the correct entries in it instead of having to look it up.

            So it should mostly help apps that access lots of memory, especially if they have large working sets. It sounds like databases and virtualization are some of the main beneficiaries.
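
The effect described above can be put into rough numbers. The sketch below is only arithmetic: the 64-entry TLB, the 4 KiB and 2 MiB page sizes, and the 1 GiB working set are assumed figures for illustration, not values from the article or from any particular CPU.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

BASE_PAGE = 4 * KIB   # assumed regular page size
HUGE_PAGE = 2 * MIB   # assumed huge page size
TLB_ENTRIES = 64      # hypothetical TLB entry count


def tlb_reach(entries, page_size):
    """Bytes of memory the TLB can translate without a miss."""
    return entries * page_size


def pages_needed(working_set, page_size):
    """Number of pages (and so translations) covering a working set."""
    return -(-working_set // page_size)  # ceiling division


print(tlb_reach(TLB_ENTRIES, BASE_PAGE) // KIB, "KiB reach with 4 KiB pages")
print(tlb_reach(TLB_ENTRIES, HUGE_PAGE) // MIB, "MiB reach with 2 MiB pages")
print(pages_needed(1 * GIB, BASE_PAGE), "base pages for a 1 GiB working set")
print(pages_needed(1 * GIB, HUGE_PAGE), "huge pages for a 1 GiB working set")
```

Under these assumptions the same number of TLB entries covers 512 times more memory with huge pages, which is why workloads with large working sets are the ones expected to benefit.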


            • #7
              So, this should also speed up raytracing and 3D rendering right? If I'm not mistaken these are typically memory and compute intensive tasks.


              • #8
                Originally posted by FunkyRider
                AFAIK, it should be an opt-in function/mode for needed apps to call, otherwise too much waste in memory.
                According to the kernel docs, in most cases memory isn't wasted. If you were to allocate 4MB of RAM and use only 1 byte, it would waste RAM. If you allocate 4K of RAM and use just one byte, it would be no worse than it is now.
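
The two cases above can be worked through with a deliberately simplified model. It assumes 4 KiB base pages, 2 MiB huge pages, a huge-page-aligned allocation, and that the first touch of a large enough region faults in one whole huge page. The function name and these figures are assumptions for illustration, not the kernel's actual accounting.

```python
BASE_PAGE = 4 * 1024          # assumed regular page size
HUGE_PAGE = 2 * 1024 * 1024   # assumed huge page size


def resident_after_one_touch(alloc_size, huge_page=HUGE_PAGE, base_page=BASE_PAGE):
    """Bytes made resident by touching one byte of a fresh allocation.

    Simplified model: a region that can hold a whole (aligned) huge page
    is backed by one on first touch; anything smaller gets a base page.
    """
    if alloc_size >= huge_page:
        return huge_page
    return base_page


# 4 MiB allocated, 1 byte used: a whole huge page becomes resident.
print(resident_after_one_touch(4 * 1024 * 1024))
# 4 KiB allocated, 1 byte used: one base page, the same as without THP.
print(resident_after_one_touch(4 * 1024))
```

In this model the overhead only appears for large, sparsely touched allocations, which matches the point made above.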

                Originally posted by smitty3268
                This patch enables a feature on newer CPUs that allows it to map larger sections of memory at once
                Actually the ability to use huge pages has been around since the 386. If I remember correctly, Linus actually used them initially in his very first incarnations of Linux.

                Originally posted by devius
                So, this should also speed up raytracing and 3D rendering right? If I'm not mistaken these are typically memory and compute intensive tasks.
                Yes, it probably will. In general it should be a win, but there are a few cases where it can slow things down, according to the docs and mailing list posts.

                I have measured a speed improvement of about 1.4% when compiling xulrunner. I admit this isn't a lot, but every little bit helps with day-to-day tasks. Firefox and GNOME seem a little more lively, but I don't have any benchmarks to back that claim up.
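
For anyone wanting to check whether huge pages are actually in use while running such tests, one option is to look at the `AnonHugePages` line in `/proc/meminfo`. The sketch below assumes a kernel that reports this field (mainline THP kernels do; a patched or backported kernel may differ), and the helper name and sample text are made up for illustration.

```python
def anon_huge_pages_kib(meminfo_text):
    """Return the AnonHugePages value in KiB, or None if the field is absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("AnonHugePages:"):
            # Line format: "AnonHugePages:    123456 kB"
            return int(line.split()[1])
    return None


# Made-up sample in the /proc/meminfo format, for illustration only.
sample = "MemTotal:  4046340 kB\nAnonHugePages:  204800 kB\n"
print(anon_huge_pages_kib(sample))
```

On a live system the input would come from `open("/proc/meminfo").read()`. A value that stays at zero during a memory-heavy run suggests huge pages are not being used for that workload.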


                • #9
                  I get a nice, long list of errors when I try to compile 2.6.37-rc5 with this patch.

                  CC mm/vmstat.o
                  mm/vmstat.c:722:2: error: expected identifier or '(' before string constant
                  mm/vmstat.c: In function 'zoneinfo_show_print':
                  mm/vmstat.c:847:36: error: 'vmstat_text' undeclared (first use in this function)
                  mm/vmstat.c:847:36: note: each undeclared identifier is reported only once for each function it appears in
                  mm/vmstat.c: In function 'vmstat_start':
                  mm/vmstat.c:927:14: error: 'vmstat_text' undeclared (first use in this function)
                  mm/vmstat.c:927:14: warning: type defaults to 'int' in type name
                  mm/vmstat.c:927:14: warning: type defaults to 'int' in type name
                  mm/vmstat.c:927:14: error: negative width in bit-field '<anonymous>'
                  mm/vmstat.c: In function 'vmstat_next':
                  mm/vmstat.c:959:14: error: 'vmstat_text' undeclared (first use in this function)
                  mm/vmstat.c:959:14: warning: type defaults to 'int' in type name
                  mm/vmstat.c:959:14: warning: type defaults to 'int' in type name
                  mm/vmstat.c:959:14: error: negative width in bit-field '<anonymous>'
                  mm/vmstat.c: In function 'vmstat_show':
                  mm/vmstat.c:969:28: error: 'vmstat_text' undeclared (first use in this function)
                  make[2]: *** [mm/vmstat.o] Error 1
                  make[1]: *** [mm] Error 2
                  make[1]: *** Waiting for unfinished jobs....


                  • #10
                    I wasn't able to get it to apply to 2.6.37-rc5 either. It does apply to rc4 though.
