Open LLVM-Based Portable OpenCL Announced

  • Open LLVM-Based Portable OpenCL Announced

    Phoronix: Open LLVM-Based Portable OpenCL Announced

    There's a new open-source OpenCL project called "Portable OpenCL" that takes advantage of LLVM and this morning marks its first public announcement...

  • #2
    This looks a lot more promising than Clover, which marks the sad state of open-source OpenCL.


    • #3
      Originally posted by brent
      This looks a lot more promising than Clover, which marks the sad state of open-source OpenCL.
      In what way? Phoronix and the mailing list link didn't seem to provide many details about this project, so can you go into detail about what it does better than Clover?


      • #4

        I'm the developer of Clover, and this new project seems interesting. I had a quick look at the source code; it is very different from mine, but very interesting.

        I especially appreciate the way the author of this new project implemented two LLVM passes to handle the barrier() calls. The problem is that he "unrolls" the work-items and executes them one after the other, without (it seems) any threading. His code may be faster than mine, but less scalable. Mine is more concise (20 lines to handle barrier()), but less elegant in how the stack is handled.

        I think our two projects are at nearly the same stage: we both handle the full OpenCL API, but we lack the built-ins. I will hopefully have more time to work on Clover in the coming days, and adding a built-in is as simple as adding lines like these to src/runtime/builtins.def:

        func $type fmin $gentype : x:$type y:$type
            return (x < y ? x : y);
        # Native functions are implemented in C++ and are passed to the OpenCL kernels through src/core/cpu/builtins.cpp.
        native float cos float : x:float
            return std::cos(x);
        native $type cos $vecf : x:$type
            for (unsigned int i=0; i<$vecdim; ++i)
                result[i] = std::cos(x[i]);
        This implements the fmin() and cos() built-ins for any scalar or vector float type. cos() uses the C++ standard library function std::cos to compute the cosine. This special code goes through a Python script that duplicates it for every $gentype and puts the declarations and code everywhere they are needed.
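
        To make the duplication concrete, here is a rough sketch of what the two native cos entries could expand to on the C++ side. This is only an illustration, not the actual generated code: FloatVec and native_cos are hypothetical names, and a template parameter N stands in for $vecdim where the script would emit one copy per vector size.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for an OpenCL float vector type (float2, float4, ...).
template <std::size_t N>
struct FloatVec {
    float data[N];
    float& operator[](std::size_t i) { return data[i]; }
    const float& operator[](std::size_t i) const { return data[i]; }
};

// Sketch of: native float cos float : x:float
inline float native_cos(float x) {
    return std::cos(x);
}

// Sketch of: native $type cos $vecf : x:$type
// N plays the role of $vecdim; the real script duplicates the text instead.
template <std::size_t N>
FloatVec<N> native_cos(const FloatVec<N>& x) {
    FloatVec<N> result{};
    for (std::size_t i = 0; i < N; ++i)
        result[i] = std::cos(x[i]);
    return result;
}
```

        The vector version applies std::cos to each component in turn, which is all the loop in the builtins.def entry does.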


        • #5
          Are either of these at the stage where they can be used for writing multi-threaded code for a multi-core CPU? I.e., can I do what I can currently do with OpenMP?

          If so, would it be wise to start shifting OpenMP code to OpenCL? Will it run at the same speed now, and much faster one day in the future when it can be compiled for a GPU?


          • #6
            Hi all,

            I am one of the pocl developers, so maybe I can clarify a little.

            steckdenis: yep, we fully unroll the work-items now. The passes basically create one big function for the whole work-group, containing all the work-items. The goal is to expose all the static parallelism inside a work-group to the code generator (this is good for multi-issue architectures, but even scalar schedulers can benefit from it). You can still create one thread per work-group if you want, for example to target multi-core.
            The main drawback of full unrolling is the code size explosion for large work-group sizes. Switching to "loop" instead of "unroll" is on the TODO list (a previous "incarnation" of the passes did that, but we changed it in this version for code clarity).
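
            For readers who want to picture the difference, here is a small sketch of the two strategies written as plain C++ rather than as LLVM passes. It is not pocl code: the kernel, kLocalSize, run_unrolled, and run_looped are all made up for illustration, and the work-group size is assumed to be 4.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLocalSize = 4;  // assumed work-group size
using Buffer = std::array<int, kLocalSize>;

// Hypothetical kernel, in OpenCL C terms:
//   tmp[lid] = in[lid] * 2;
//   barrier(CLK_LOCAL_MEM_FENCE);
//   out[lid] = tmp[(lid + 1) % local_size];

// "Unroll": one straight-line function containing every work-item.
// Code size grows linearly with the work-group size.
Buffer run_unrolled(const Buffer& in) {
    Buffer tmp{}, out{};
    // Region before the barrier, work-items 0..3.
    tmp[0] = in[0] * 2;
    tmp[1] = in[1] * 2;
    tmp[2] = in[2] * 2;
    tmp[3] = in[3] * 2;
    // The barrier itself needs no code: every work-item has already
    // finished the first region before any starts the second.
    out[0] = tmp[1];
    out[1] = tmp[2];
    out[2] = tmp[3];
    out[3] = tmp[0];
    return out;
}

// "Loop": one loop over the work-items per barrier-delimited region.
// Code size stays constant regardless of the work-group size.
Buffer run_looped(const Buffer& in) {
    Buffer tmp{}, out{};
    for (std::size_t lid = 0; lid < kLocalSize; ++lid)
        tmp[lid] = in[lid] * 2;
    for (std::size_t lid = 0; lid < kLocalSize; ++lid)
        out[lid] = tmp[(lid + 1) % kLocalSize];
    return out;
}
```

            Both functions compute the same result; the barrier is honoured in each case because the region before it completes for all work-items before the region after it begins.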

            ssam: the "native" target, which you would use to run your kernels on the CPU, is not multi-threaded. However, a "native threaded" target is on the short-term TODO list and might be available soon.

            I just created a mailing list, [email protected], so you might want to subscribe to be informed of development progress.