Gallium3D Clover Can Now Execute OpenCL Native Kernels


  • Gallium3D Clover Can Now Execute OpenCL Native Kernels

    Phoronix: Gallium3D Clover Can Now Execute OpenCL Native Kernels

    One of the Google Summer of Code projects pertaining to Mesa / X.Org is to bring-up open-source OpenCL support with the Gallium3D driver architecture. There's long been a branch of Mesa dubbed "Clover" that provides an OpenCL state tracker for the Gallium3D driver architecture, but it hasn't been usable as there's a lot of work to be finished. This GSoC project attempts to change that and there's already been a big milestone achieved...

    http://www.phoronix.com/vr.php?view=OTU3OQ

  • #2
    It's good. I wonder if the drivers and kernel will also support PCs that have both an ATI and an NVIDIA card.
    Maybe an option to set the order of GPU use would be good. There are some people with 4 ATI GPUs and one NVIDIA in a PC.


    • #3
      one question for the dev or anyone else that can answer:

      We hit a milestone, so how many more milestones do we need in order to get a complete OpenCL implementation? In other words, what still needs to be done?


      • #4
        It's difficult to answer before things happen, but another big milestone will be when Clover is able to compile an OpenCL kernel into LLVM bitcode (a program). Then extracting a kernel, setting its arguments and launching it will be another one.

        I will be able to give more details in a few weeks.


        • #5
          How hard is it to merge clever with Mesa master?

          Does clever need any changes to other parts of the Mesa stack? And what OpenCL version do you want to support? (normal, mobile 1, 1.1, 2.0)


          • #6
            Originally posted by przemoli
            Does clever need any changes to other parts of the Mesa stack? And what OpenCL version do you want to support? (normal, mobile 1, 1.1, 2.0)
            Just to avoid further misspelling on your part:
            it's clOver, not clEver.


            • #7
              It's OpenCL 1.1. I have read that an OpenCL 2.0 version is planned, but it's not on the Khronos website, so I cannot see what it is about.


              • #8
                Great work!

                Clover is a very important piece of the puzzle.

                Shader-based decoding is another.

                GLSL is another.

                Good to see progress happening on all fronts.


                • #9
                  Congrats Denis. Keep up the good work.

                  I am eagerly awaiting the day that it can compile a kernel into LLVM bitcode. That and the ability to set all of the kernel arguments are most of what I need for my VP8 OpenCL decoder to work (along with the cl_khr_byte_addressable_store extension).


                  • #10
                    What really sucks balls about OpenCL is that you need to specifically target all kinds of different cards, even though your code will run on any OpenCL device. The problem is that it takes hardcore GPU understanding. For example, the bank size and the terminology are different between NVIDIA and ATI. Imagine programming sound cards that way >.<

                    Does the current IR successfully work as a GPU design abstraction with Clover, so that Clover converts OpenCL into general code that works just as well on NVIDIA as on ATI? That would be a massive win all over the place.


                    • #11
                      Originally posted by V!NCENT
                      What really sucks balls about OpenCL is that you need to specifically target all kinds of different cards, even though your code will run on any OpenCL device. The problem is that it takes hardcore GPU understanding. For example, the bank size and the terminology are different between NVIDIA and ATI. Imagine programming sound cards that way >.<

                      Does the current IR successfully work as a GPU design abstraction with Clover, so that Clover converts OpenCL into general code that works just as well on NVIDIA as on ATI? That would be a massive win all over the place.
                      As I understand it, if you write code in OpenCL then it will work fine on ATI, NVIDIA, multicore CPUs etc. But if you want the code to be super fast, then you need to pay close attention to things like memory layout, shared caches, and other hardware-dependent stuff, because memory bandwidth and cache misses can be significant. I think that tweaking would be very hard to automate.


                      • #12
                        Originally posted by ssam
                        As I understand it, if you write code in OpenCL then it will work fine on ATI, NVIDIA, multicore CPUs etc. But if you want the code to be super fast, then you need to pay close attention to things like memory layout, shared caches, and other hardware-dependent stuff, because memory bandwidth and cache misses can be significant. I think that tweaking would be very hard to automate.
                        The warp/work group sizes can drastically vary between hardware, and the ideal code can as well (vector programming vs other methods). During program startup, it is possible to compile the OpenCL kernels and run quick performance tests to pick an ideal method, but that assumes that you are willing to write the auto-tuning code and also to write multiple codepaths.
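
The startup auto-tuning idea above could look roughly like the sketch below. This is only an illustration under assumptions: `codepath_fn`, `time_codepath`, `pick_fastest` and `autotune` are hypothetical names, not part of OpenCL or Clover, and each candidate stands in for "build and run one variant of the kernel". It uses `clock()` for brevity; real GPU timing would more likely rely on OpenCL event profiling (`clGetEventProfilingInfo`).

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical signature for one code path, e.g. a scalar or a vectorized
 * variant of the same kernel launch. Not part of any OpenCL API. */
typedef void (*codepath_fn)(void);

/* Upper bound on candidates, chosen arbitrarily for this sketch. */
enum { MAX_CODEPATHS = 16 };

/* Time `runs` back-to-back calls of one candidate, in CPU seconds. */
double time_codepath(codepath_fn fn, int runs)
{
    clock_t start = clock();
    for (int i = 0; i < runs; ++i)
        fn();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Return the index of the smallest timing; ties go to the earliest entry. */
size_t pick_fastest(const double *seconds, size_t count)
{
    assert(count > 0);
    size_t best = 0;
    for (size_t i = 1; i < count; ++i)
        if (seconds[i] < seconds[best])
            best = i;
    return best;
}

/* Benchmark every candidate once at startup and return the winner's index. */
size_t autotune(const codepath_fn *paths, size_t count, int runs)
{
    assert(count > 0 && count <= MAX_CODEPATHS);
    double seconds[MAX_CODEPATHS];
    for (size_t i = 0; i < count; ++i)
        seconds[i] = time_codepath(paths[i], runs);
    return pick_fastest(seconds, count);
}
```

The program would call `autotune` once per device and remember the returned index. The cost is exactly what is described above: every code path still has to be written and maintained by hand.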

                        But you are right. If you write code that works on one OpenCL device (e.g. Nvidia), it should work on another device (CPU, DSP, AMD card, etc). There are extensions that can come into play, but as long as the device you are trying to execute on supports what you need, it should at least execute and produce results.
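
As a rough sketch of the "supports what you need" check: OpenCL reports a device's extensions as a space-separated string (queried with `clGetDeviceInfo` and `CL_DEVICE_EXTENSIONS`), so the host program can look for the names it depends on before building its kernels. The helper name `has_extension` is made up for this example, and the query itself is left out so the snippet builds without OpenCL headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if `name` appears as a whole space-separated token in
 * `extensions`. A bare strstr() is not enough, because one extension
 * name can be a prefix or substring of another. */
bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        ++p; /* keep scanning after a partial match */
    }
    return false;
}
```

A program that needs, say, `cl_khr_byte_addressable_store` could call this on each device's extension string and skip devices where it returns false, or fall back to a code path that does not need the extension.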

                        Performance tuning of OpenCL code is affected by the specific hardware you're running on, but the code should at least execute properly on other devices.


                        • #13
                          Originally posted by Veerappan
                          Performance tuning of OpenCL code is affected by the specific hardware you're running on, but the code should at least execute properly on other devices.
                          Am I right in thinking that even between different models from the same manufacturer you might need to do different optimisation?


                          • #14
                            Originally posted by ssam
                            Am I right in thinking that even between different models from the same manufacturer you might need to do different optimisation?
                            Yeah. Case in point would be AMD. Their r600/r700 chips were mostly 5-wide vector units, but the Cayman chips have moved to 4-wide vector units. The next architecture is supposedly going to be SIMD-based, which will lead to entirely different optimization strategies (possibly similar to Fermi, but we'll see).


                            • #15
                              It would be wonderful if Linux could have a FLOSS OpenCL implementation for CPUs.

                              Hopefully this project could become part of the kernel in the future?

                              Really looking forward to being able to use OpenCL on Linux.
