Speeding Up The Linux Kernel With Your GPU

  • Speeding Up The Linux Kernel With Your GPU

    Phoronix: Speeding Up The Linux Kernel With Your GPU

    Researchers at the University of Utah, sponsored in part by NVIDIA, are exploring speeding up the Linux kernel with GPU acceleration. Rather than just allowing user-space applications to utilize the immense power offered by modern graphics processors, they are looking to speed up parts of the Linux kernel by running them directly on the GPU...

    http://www.phoronix.com/vr.php?view=OTQxMQ

  • #2
    Too bad there are no plans for a CUDA state tracker. In the last two months, I started with some CUDA programming and I must say that it's much nicer to work with compared to OpenCL. From my point of view, OpenCL is the inferior choice.


    • #3
      About time someone started to look into using GPUs as general co-processors/vector units.


      • #4
        One bug in either Nvidia's drivers or in the cuda code, and what happens to the kernel?

        Hang? Oops?


        • #5
          Originally posted by curaga
          One bug in either Nvidia's drivers or in the cuda code, and what happens to the kernel?

          Hang? Oops?
          Same thing that happens with a bug in the kernel itself.


          • #6
            There aren't that many uses for GPGPU processing inside the kernel besides cryptography. The cards use a separate memory range, and the time required to set up a task on the GPU is pretty high. Most kernel calls do not operate on large portions of data; they just pass it around between user-space programs and peripheral devices, so the processing power of a GPU cannot benefit the task. In most cases a task will probably even take longer, because copying the data to the GPU, starting the GPGPU task and copying the data back heavily increases the latency.

            This is the exact same reason why it doesn't currently make sense to use GPGPU computing in most standard applications, like Microsoft Office or a web browser: the workloads are so small that a standard CPU can deliver the result faster than a GPU round-trip would take. And most CPUs nowadays have multiple cores anyway. Maybe the situation will improve once CPU and GPU are combined into a single device with a common, flat memory layout, but the GPU is still no good for small workloads.

            That's probably why they picked file system cryptography, but newer CPUs come with AES accelerators, and currently available AES-NI units already peak at up to two gigabytes per second. That's enough to saturate multiple SATA links, and AES-NI comes with no additional memory copies, setup times, etc., while completely freeing the CPU for other tasks.
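The latency argument above can be turned into a rough break-even estimate. What follows is only a sketch in Python, not anything from the article or the Utah project: the function names are made up for illustration, and the model (a fixed setup cost, two bus copies and GPU compute, against a CPU that works on the data in place) is a deliberate simplification.

```python
def cpu_time(n_bytes, cpu_bw):
    """Seconds for the CPU to process n_bytes in place at cpu_bw bytes/s."""
    return n_bytes / cpu_bw


def gpu_time(n_bytes, setup_s, bus_bw, gpu_bw):
    """Seconds for one GPU round trip.

    Assumed model: fixed setup cost + copy to the card + compute on the
    card + copy back. bus_bw and gpu_bw are in bytes/s.
    """
    return setup_s + 2 * n_bytes / bus_bw + n_bytes / gpu_bw


def break_even_bytes(setup_s, cpu_bw, bus_bw, gpu_bw):
    """Request size at which the GPU round trip matches the CPU.

    Returns None when the GPU path costs at least as much per byte as the
    CPU path, because then no request size can amortize the setup cost.
    """
    per_byte_cpu = 1 / cpu_bw
    per_byte_gpu = 2 / bus_bw + 1 / gpu_bw
    if per_byte_gpu >= per_byte_cpu:
        return None
    return setup_s / (per_byte_cpu - per_byte_gpu)
```

Plugging in the 2 GB/s AES-NI figure from the post as `cpu_bw`, along with measured numbers for a particular card and bus, would give the request size below which the CPU wins. If the two bus copies alone cost more per byte than the CPU does, the function returns `None` and the GPU never catches up.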


            • #7
              A better choice would have been OpenCL, which can run on both AMD and NVIDIA GPUs and is an open industry standard.
              The better choice is whichever one they have the means to do in an affordable amount of time. Why climb to the top of the tree if you can get the low-hanging fruit without much effort?


              • #8
                Presumably, even if the CUDA option ends up being a bit of a bust, the work on parallelising the kernel could pay off in the non-GPU kernel, given the ever-increasing core counts of systems.


                • #9
                  Give me something that works like a microkernel and drop me 20.


                  • #10
                    To utilize GPU power for filesystem decryption, the better choice would be to move the FS to userspace (FUSE) instead of GPU-stuff to the kernel.

                    In either case, much care needs to be taken to avoid compromising the key. GPU memory isn't protected much, and leftover memory usually gets assigned to the next task without being cleared first. Does either CUDA or OpenGL make any guarantees there?


                    • #11
                      Ya, let's do that. Nothing like a Windows Display Driver Model-type scheduler with a frikkin massively parallel device putting your system to sleep, taking memory lockouts and scheduling dirt naps.

                      Is it cool if a program wants to use it? Ya, 'cause it will speed things up, but you can't multitask it very well. If you put this in kernel stuff it's going to be a nightmare.
                      If they try to use it on very much stuff, it's going to make everything an unbearable type of slow.

                      I just don't understand the whole concept of promising speedups so you can slow things down and sell more ridiculously overpowered hardware.

                      They are in your Linux, infesting it with Windows 7 taint.

                      http://forums.nvidia.com/index.php?showtopic=190039


                      • #12
                        Originally posted by NSLW
                        The better choice is whichever one they have the means to do in an affordable amount of time. Why climb to the top of the tree if you can get the low-hanging fruit without much effort?
                        Because it is NVIDIA-only.


                        • #13
                          SIMD instruction set

                          Originally posted by not.sure
                          About time someone started to look into using GPUs as general co-processors/vector units.
                          But we already have vector units on our CPUs. Does eCryptfs use SSE, AltiVec or VIS (the lesser-known SPARC SIMD instruction set) to accelerate encryption?


                          • #14
                            The intent is obvious:
                            -Intel good CPU, bad GPU
                            -AMD arguably better but slower CPU, great GPU
                            -nVidia crappy CPU, good GPU

                            With the Fusion stuff coming up, nVidia can only compete with good shaders, whilst keeping the necessary CPU stuff there. So in order for nVidia to get a piece of the cake, they must do this.

                            Quite frankly, with all my AMD fanboyism aside, I really like what they're doing. They may:
                            -Improve the Linux kernel (and break some things for a short while) in a way that's accepted
                            -Give Linux a serious technological edge
                            -Shader cores are much better for kernels, I think, because a kernel is all about management, and what better way to do that than furiously multicored? In fact I like anything that's not time-sliced.
                            -Maybe give nVidia a good reason to open source or improve Gallium... Maybe...


                            • #15
                              Originally posted by sturmflut
                              There aren't that many uses for GPGPU processing inside the kernel besides cryptography. The cards use a separate memory range, and the time required to set up a task on the GPU is pretty high. Most kernel calls do not operate on large portions of data; they just pass it around between user-space programs and peripheral devices, so the processing power of a GPU cannot benefit the task. In most cases a task will probably even take longer, because copying the data to the GPU, starting the GPGPU task and copying the data back heavily increases the latency.

                              This is the exact same reason why it doesn't currently make sense to use GPGPU computing in most standard applications, like Microsoft Office or a web browser: the workloads are so small that a standard CPU can deliver the result faster than a GPU round-trip would take. And most CPUs nowadays have multiple cores anyway. Maybe the situation will improve once CPU and GPU are combined into a single device with a common, flat memory layout, but the GPU is still no good for small workloads.

                              That's probably why they picked file system cryptography, but newer CPUs come with AES accelerators, and currently available AES-NI units already peak at up to two gigabytes per second. That's enough to saturate multiple SATA links, and AES-NI comes with no additional memory copies, setup times, etc., while completely freeing the CPU for other tasks.
                              Great post - agreed 100%. In order for in-kernel GPGPU to make any sense, it needs to be demonstrated that there are existing kernel tasks (or new kernel tasks yet to come) that (a) really belong in the kernel rather than userspace, and (b) would truly benefit from being accelerated despite the big setup times.

                              The way I see it, most applications of GPGPU fail either (a) or (b). I can't think of an application besides very large-scale crypto that might pass both (a) and (b) legitimately.

                              Maybe software RAID could somehow be accelerated by the GPU, although you'd need a very large stripe size for it to be worth it. With, say, RAID-5, you might want to be able to calculate parity faster. If you factor in GPU setup latency and the GPU can still do that faster than the CPU, that's great -- go for it. But what about the vast majority of people who either don't use RAID, or use hardware RAID that offloads those calculations to dedicated hardware anyway?
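For reference, the RAID-5 parity calculation mentioned above is a byte-wise XOR across the data blocks of a stripe, and a lost block is recovered by XOR-ing the parity with the surviving blocks. The Python below is only an illustration of that arithmetic with made-up function names; it is not how any real software RAID driver is written, and it says nothing about whether a GPU would do it faster.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def parity(blocks):
    """Parity block for one stripe: XOR of all its data blocks."""
    return reduce(xor_blocks, blocks)


def rebuild(surviving_blocks, parity_block):
    """Recover the single missing data block of a stripe.

    XOR-ing the parity with every surviving block cancels them out and
    leaves the missing block.
    """
    return reduce(xor_blocks, surviving_blocks, parity_block)
```

The work per byte is a single XOR, which is why the post's point about stripe size matters: the transfer to and from the card is likely to dominate unless a lot of data is processed per request.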

                              Anyway, I'm out of ideas. I can't think of another practical application of GPGPU in the kernel. I can think of many many useful applications of GPGPU, but they all belong squarely in userspace, implemented in applications.
