Radeon Gallium3D OpenCL Is Coming Close
Phoronix: Radeon Gallium3D OpenCL Is Coming Close
Following the OpenCL Gallium3D state tracker having been merged into Mesa earlier this week, the open-source Radeon OpenCL support is coming close...
Any apps that work with it, or is it too early for that? (e.g. darktable, I think, uses OpenCL)
Awesome work anyway.
It would be really good to know how we are going to move forward from here as far as what IR language will be used in Gallium. Right now we have TGSI being used everywhere except Radeon compute, but there were all these plans to replace the entire IR stack with LLVM. This would be a start, I guess, but is that what we really want? We need to stick with something, though; that was kind of the entire point of Gallium, to unify as much of the graphics stack between different drivers as we can.
Every single OpenCL/dedicated-block encoder so far has had terrible quality, including QuickSync and whatnot. Just curious: what are you doing that needs the speed more than the quality?
Originally Posted by MonkeyPaw
I do various encodes, from HD home videos for upload to recorded TV (in HD) that I encode with x264. Last I heard, QuickSync had the best IQ compared to other GPU offloads, and it's ridiculously fast.
Originally Posted by curaga
Ah, so basically, we (we as in you; don't you just love when people say "we" but really aren't part of the "we" themselves) have no clue how demanding compute will be on the IR and in what way the IR will need to bend to operate effectively.
Can you tell me this, then: we probably hear about the IR languages more than any other component of the graphics stack below the actual end-user APIs, but how inconvenient is it really to switch from one to the other? In a game engine it would be a real job to change from OpenGL to DirectX, or even from OpenGL to OpenGL ES if done certain ways, but how much of a bother would it be for you to change the AMD compute back end from LLVM over to TGSI, if that were the more unified approach?
Also, what are the chances someone will start slinging GCC IR in there as an option, what with their plans to try to make a competing IR more like what LLVM has?
The problem is that AFAIK essentially all of the "serious" GPU compute experience has been in proprietary stacks so far, generally using proprietary IRs. The source programming languages and runtime environments are evolving as well, which makes it even harder to leverage existing experience.
Originally Posted by benjamin545
So far I haven't seen much in the way of *changing* IRs; it's more common to just translate from the new IR to whatever was being used before. If you look at the r3xx driver as an example, it was written around Mesa IR, and when TGSI was introduced the developers added a TGSI-to-Mesa-IR translator at the front of the Gallium3D driver and kept the existing shader compiler code.
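That "translator at the front" approach can be sketched roughly like this. All names and opcode shapes below are invented for illustration; they are not actual Mesa or Gallium APIs.

```python
# Hypothetical sketch of bolting a TGSI-to-Mesa-IR translator onto the
# front of an existing Mesa-IR-based shader compiler, rather than
# rewriting the compiler itself.

def tgsi_to_mesa_ir(tgsi_tokens):
    """Translate a TGSI-style opcode list into Mesa-IR-style opcodes."""
    opcode_map = {"MOV": "OPCODE_MOV", "MUL": "OPCODE_MUL", "ADD": "OPCODE_ADD"}
    return [opcode_map[op] for op in tgsi_tokens]

def legacy_compile(mesa_ir):
    """Stand-in for the pre-existing Mesa IR shader compiler."""
    return "; ".join(mesa_ir)

def gallium_compile(tgsi_tokens):
    # New Gallium3D entry point: translator on the front,
    # old compiler untouched behind it.
    return legacy_compile(tgsi_to_mesa_ir(tgsi_tokens))

print(gallium_compile(["MOV", "MUL", "ADD"]))
```

The point of the pattern is that the new IR only has to be understood at one boundary; everything downstream keeps working unmodified.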
Originally Posted by benjamin545
This wasn't a matter of inertia, though. Some of the IRs are structured as trees or linked lists which a compiler can work on directly (e.g. in optimization passes), while others like TGSI are "flat" and intended for exchange between components rather than as an internal representation worked on directly by the compiler.
That breaks the problem down into two parts:
1. Should the IR be something suitable for direct use by compiler internals, or should it be something designed primarily for transmittal between driver components?
The advantage of something "flat" like TGSI or AMDIL is that it is relatively independent of compiler internals. The disadvantage is that all but the simplest compilers will require a more structured IR internally, so translation to and from TGSI will be required at each component boundary. The mitigating factor is that while the extra translations seem like they would slow things down, they only slow down the compilation step, not runtime execution. Compilation does not usually happen every time the shader is run; at minimum it happens once at program startup, with recompilation sometimes needed when state that affects shader code changes or when the driver's cache of compiled shaders fills up.
If the choice is something "flat" then TGSI is probably the most likely choice for the open source stacks. If a flat IR is *not* chosen, then we get to question 2...
2. Assuming a structured IR is used, which one should it be?
This is where GLSL IR and LLVM IR enter the picture, and where the choice of shader compiler internals becomes a factor.
For graphics, the Intel devs were talking about feeding GLSL IR directly into the HW shader compiler.
Before you say "that's weird", remember that the high-level compiler in Mesa (the "OpenGL state tracker") generates GLSL IR directly, which is then converted into TGSI or Mesa IR for use by the HW-layer drivers, so using GLSL IR bypasses some translation steps. For graphics, "classic" HW drivers use Mesa IR today while Gallium3D HW drivers use TGSI. Bottom line: when you run a GL program on any open-source driver, the shader starts as GLSL IR and then optionally gets translated to something else.
Clover, on the other hand, starts with Clang, which generates LLVM IR directly, so the kernel starts as LLVM IR and then optionally gets translated to something else.
Once you get down to the HW driver, the shader compiler is likely to need a structured IR such as GLSL IR or LLVM IR. You can see where this is going...
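The two front-end paths described above can be modeled as a toy pipeline. Every function and stage label here is a hypothetical placeholder, not a real Mesa, Gallium, or Clang API; the point is only the shape of the flow.

```python
# GL path: GLSL source -> GLSL IR -> (TGSI or Mesa IR) -> hardware code.
# CL path: OpenCL C source -> LLVM IR -> hardware code.
# Each IR stage is represented as a nested tuple so the chain is visible.

def compile_gl(glsl_source, driver="gallium"):
    ir = ("GLSL IR", glsl_source)        # Mesa's high-level GLSL compiler
    if driver == "gallium":
        ir = ("TGSI", ir)                # Gallium3D HW drivers
    else:
        ir = ("Mesa IR", ir)             # "classic" HW drivers
    return ("hw shader", ir)             # HW backend compiles from here

def compile_cl(cl_source):
    ir = ("LLVM IR", cl_source)          # Clang front end in Clover
    return ("hw kernel", ir)             # backend may translate once more
```

Comparing the two paths makes the tension concrete: the GL side already carries GLSL IR end to end, the CL side already carries LLVM IR, and the HW backend would rather consume one structured IR than both.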
I doubt that gcc will get plumbed into the existing GL/CL driver stack but it seems pretty likely that gcc *will* end up generating GPU shader code and that runtime stacks will exist to get that code running on hardware. This may already have happened although I haven't seen anyone do it yet.
Last edited by bridgman; 05-13-2012 at 09:13 PM.
Originally Posted by benjamin545
Well, then it seems like the obvious answer is that if we can't have a structured IR that's also easily transportable between components (best of both worlds), we have to use the right IR at the right time for the right job.
I guess you have to take a step back and try to realize what the big picture is, what it is we want. Regarding Gallium3D (and I know that's excluding Intel and ancient stuff, but what can you really do about that), what we want is a strong central structure that interconnects various pieces that each do a specific job (here's an OpenCL state tracker, here's an NVIDIA generation-X driver, here's a Windows XP winsys connector). That is what Gallium3D was billed as. But it was intended initially and primarily for the Linux ecosystem, even if it wasn't locked into that specific role.
So in the Linux ecosystem, we have some paid hardcore developers and we have a lot of hobbyists. Hobbyists will never, individually on their own, design a modern graphics driver that's competitive with today's standards, and that's OK. Now, as we have seen in the Linux graphics stack over the past few years, paid hardcore developers have come a long way in creating a very competitive stack, but we really want hobbyists to be a part of that too, and while some have been, I think a lot of people who are willing and possibly able to contribute still feel overwhelmed by the complexity of it all.
Getting more to the point, I guess: if TGSI is a simpler IR to transport between various components, then as a newcomer wanting to develop a component I would find it easier to deal with TGSI. If it is then necessary to convert it to something more specific to what I am doing (which is what I've been hearing all along, that it's too hard to create one all-encompassing IR that is perfect for all state trackers and all hardware drivers), then that is what would have to be done. At least then you could make your internal IR something specific to your hardware; for instance, I'm sure the nvfx/nv30 driver, with its non-unified shader cores, is much different from the nv50 or nvc0 or whatever.
It would be best if other parts of Gallium had that same kind of mentality. Memory management, for instance: initially Gallium was sold as being able to abstract memory management completely into the winsys portion of the driver, but what I've read is that a lot of the memory management has been implemented in the hardware drivers, usually due to some feature missing from Gallium or it just being easier for whoever is doing it to do it in the driver (I'm guessing probably a lot of that comes from the initial testing and learning stages).