VP8 Over VDPAU In Gallium3D Is Emeric's Target
-
Originally posted by runeks: I guess what I'm interested in is these two steps. Perhaps only step 2.
I'm not sure what you mean by "write a separate test app with a shader implementing it" though. Why would we write a separate (test) application to implement a sub-feature of a state tracker? Or do you mean just writing an application that can be used to test whichever decoding routine we choose to optimize using shaders?
Also, in step 3: are we not writing this shader in TGSI ourselves? If so, why would we use Mesa to record "the TGSI code it generates"?
Think about optimizing a function in x264. First you'd write it in C code and make sure that's working. Then you can compile that to assembly with GCC and copy the output into an assembly section in x264. Then you can work on actually trying to optimize the assembly, instead of writing it from scratch.
I know that wouldn't always lead to optimal results, but it does seem like the quickest path to me.
PS - I am not a developer involved with Mesa, video decoding, or anything else discussed here. So I'm just giving my opinion of what will probably be done, I have no inside information.
-
Originally posted by bridgman: I think smitty meant "write a test app with a GLSL shader".
Originally posted by smitty3268: What I mean is a toy app which only has a single shader in it that does the iDCT, used to test that shader until it's working. It's easier doing it there because then you can just use standard OpenGL instead of trying to hook up a shader compiler inside the state tracker.
Originally posted by smitty3268: Think about optimizing a function in x264. First you'd write it in C code and make sure that's working. Then you can compile that to assembly with GCC and copy the output into an assembly section in x264. Then you can work on actually trying to optimize the assembly, instead of writing it from scratch.
But again, would we even gain that much trying to optimize the TGSI? Wouldn't the fact that GLSL is able to utilize hundreds of shaders make it optimized enough, or is it just far more complicated than I'm making it out to be? (It often is.)
-
Originally posted by runeks: I think I get the point. On a CPU the path would be C->asm->optimized asm while on a GPU the path would be GLSL->TGSI->optimized TGSI.
But again, would we even gain that much trying to optimize the TGSI? Wouldn't the fact that GLSL is able to utilize hundreds of shaders make it optimized enough, or is it just far more complicated than I'm making it out to be? (It often is.)
So my guess was that, to keep things simple, they would only mess with the TGSI in the state tracker, but I don't know if that's really the plan or not.
I do think the developers are quite familiar with TGSI, and I don't think they would view working directly with it as too burdensome. They are the same people who are writing the driver compilers, after all, which work directly on the TGSI and the previous Mesa IR code.
-
Originally posted by runeks: Hmm, bear with me here. When you write "shader", are you referring to the shader program? I keep thinking about the actual hardware units on the graphics card when I read "shader". So we'd write an application that implements an iDCT function in GLSL, and then input various values into this function and see that we get the correct results?
Originally posted by runeks: I think I get the point. On a CPU the path would be C->asm->optimized asm while on a GPU the path would be GLSL->TGSI->optimized TGSI. But again, would we even gain that much trying to optimize the TGSI? Wouldn't the fact that GLSL is able to utilize hundreds of shaders make it optimized enough, or is it just far more complicated than I'm making it out to be? (It often is.)
-
Try this link - I'm on dial-up, so it'll be an hour or so before I can confirm it's the right slide deck, but I *think* this deck talks about optimizing for compute-type applications (and video decode is more like compute work than 3D work):
http://developer.amd.com/gpu_asset...plications.pdf
-
@smitty3268 I think we're on the same page here. What I meant wasn't really to enable state trackers to use GLSL directly, but rather to use the TGSI that the GLSL compiler outputs as-is, i.e. to just copy and paste that TGSI code into a state tracker without optimizations. So we'd just be using the already-functioning GLSL compiler to generate the TGSI code that we'd be sticking in the state tracker.
Although I was actually about to ask how much it would take to make GLSL directly supported in state trackers instead of TGSI. But I guess that leads me to John's response (wrt. optimizing)...:
Originally posted by bridgman: In general you are mostly optimizing with respect to memory accesses (memory bandwidth is always a challenge) more than shader hardware. Algorithms like IDCT and filtering tend to have to perform a lot of reads for every write (even more so than with normal textured rendering), and a significant part of optimizing is about reducing the number of reads or making sure the read pattern is cache-friendly.
But I guess it's just a matter of learning TGSI like any other language. I've only briefly touched on RISC assembly, and that seemed like so much effort for so little. It would probably help if we had some TGSI code to start with, though, generated from GLSL.
Originally posted by bridgman: [...]
http://developer.amd.com/gpu_assets/...plications.pdf [fixed the URL]
I will definitely be digging into that at some point! Would there happen to be a recorded talk/presentation over these slides somewhere?
-
Originally posted by runeks: Although I was actually about to ask how much it would take to make GLSL directly supported in state trackers instead of TGSI. But I guess that leads me to John's response (wrt. optimizing)... <snip> And so, GLSL doesn't really cut it because it abstracts away all the memory management, right?
GLSL could probably get you pretty close, if not give you the same performance (although I haven't done enough shader work to be sure). The real issue is that a Gallium3D state tracker uses Gallium3D calls and TGSI shaders by definition, so you probably want to end up with TGSI rather than copying a big heap of code from the OpenGL state tracker (aka Mesa) to convert the shaders from GLSL to TGSI every time you wanted to decode a video.
Originally posted by runeks: But I guess it's just a matter of learning TGSI like any other language. I've only briefly touched on RISC assembly, and that seemed like so much effort for so little. It would probably help if we had some TGSI code to start with, though, generated from GLSL.
Originally posted by runeks: Sweet! Looks great! Second page says "GPGPU from real world applications - Decoding H.264 Video" so without knowing much else I'd say it's right on the money. I will definitely be digging into that at some point! Would there happen to be a recorded talk/presentation over these slides somewhere?
-
Originally posted by bridgman: Actually I was mostly responding to your question about the need to optimize.
GLSL could probably get you pretty close, if not give you the same performance (although I haven't done enough shader work to be sure). The real issue is that a Gallium3D state tracker uses Gallium3D calls and TGSI shaders by definition, so you probably want to end up with TGSI rather than copying a big heap of code from the OpenGL state tracker (aka Mesa) to convert the shaders from GLSL to TGSI every time you wanted to decode a video.
But I guess my point is that if the TGSI code produced from the GLSL code, as you say, doesn't even need any optimizations to perform really well, then maybe the ability to hook the GLSL compiler into other Gallium state trackers would ease the development of future state trackers? Or is Gallium just designed in such a way that this cannot be done without creating a mess?
The GLSL compiler lives in the Mesa state tracker, right? So if we were to utilize this feature, the GLSL compiler, in another state tracker, we would be creating a new state tracker that is dependent on another state tracker (Mesa). But I guess that's not something too distant of a concept to Linux: dependencies.
Of course, it would probably need to be some kind of just-in-time compiler with code caching in order to be effective, which quickly makes it quite a project in itself.
Originally posted by bridgman: There might be (or, more likely, a newer talk), but I would have to be at work (with something faster than 24 Kb/s download) to find it before the technology becomes obsolete.
-
The issue is that OpenGL is a Big Honkin' API and therefore needs a Big Honkin' State Tracker. Mesa is a lot bigger and more complex than the video decoder state tracker would be, and you would probably end up having a lot more code supporting GLSL than supporting video decode.
It's sort of like bringing your house into your car so you can make coffee while you drive -- OK in principle but not so good in practice.