AMD Releases Open-Source R600/700 3D Code


  • Originally posted by bridgman View Post
    Yeah

    It's probably obvious, but when I said "high performance expectations" I was talking about the graphics subsystem, where the drivers are hugely complex and proprietary drivers still have an edge.
    Still, it sounds like you're equating "open source" == "easy to do" and "proprietary" == "more sophisticated". I should point out that this is an old-fashioned mentality; for example, the highest-performance directory software in the world is open source (OpenLDAP); it's generally 5-10x faster than any/all of the proprietary directory software packages out there, and it implements the specs correctly where all the proprietary vendors cut corners. Sophistication and performance don't require closed-source proprietary developers. They don't even require the highest-paid development teams. I did the profiling and refactoring of OpenLDAP's code simply because I saw it needed to be done, not because anybody paid me to do it... It annoys me that no one has jumped in here yet re: XvMC, and I regret that I don't have the time to do it myself.

    And it still sounds like XvMC is worth investing in, given that Via already extended their implementation to work with H.264 etc; it was obviously the path that gave them the most bang (software compatibility) for their development buck. But if something like VAAPI is suddenly getting adopted, as it now appears to be, then that'd be fine instead.
    Last edited by highlandsun; 01-03-2009, 10:45 PM.

    Comment


    • What bridgman said holds true for video drivers.

      Comment


      • Originally posted by highlandsun View Post
        Still, it sounds like you're equating "open source" == "easy to do" and "proprietary" == "more sophisticated". I should point out that this is an old-fashioned mentality; for example, the highest-performance directory software in the world is open source (OpenLDAP); it's generally 5-10x faster than any/all of the proprietary directory software packages out there, and it implements the specs correctly where all the proprietary vendors cut corners. Sophistication and performance don't require closed-source proprietary developers. They don't even require the highest-paid development teams. I did the profiling and refactoring of OpenLDAP's code simply because I saw it needed to be done, not because anybody paid me to do it...
        If you read enough of my posts you'll see that I don't believe in that line of thinking (proprietary automatically = more sophisticated) at all. The problem here is the sheer size of the work relative to the size of the development community.

        Right now getting drivers with features and performance comparable to the proprietary drivers on other OSes takes more development work than the community can do on its own *or* than HW vendors can fund based on the size of the Linux client market. That means the HW vendors will need to share the costs (and the code) across multiple OSes, and so far the business realities of those OTHER OSes dictate that the resulting code remain closed on Linux as well.

        In that context closed source drivers offer a way to tap into more development resources than we could get access to otherwise; nothing more.

        Originally posted by highlandsun View Post
        It annoys me that no one has jumped in here yet re: XvMC, and I regret that I don't have the time to do it myself.
        Yeah, that is my whole argument in a nutshell; the expectations and demands of the Linux market are growing faster than the development community or the market share (and, of course, market share drives the funding which HW vendors and commercial distros can put in).

        One of our hopes is that by providing sample code and by continuing to work on drivers for older GPUs we will make it easier for new developers to get started and allow more people to participate in graphics driver development than we have today. The experienced volunteer X developers we have today are extremely good and can take on a project like this on their own, but we don't have anywhere near enough of them.

        Originally posted by highlandsun View Post
        And it still sounds like XvMC is worth investing in, given that Via already extended their implementation to work with H.264 etc; it was obviously the path that gave them the most bang (software compatibility) for their development buck. But if something like VAAPI is suddenly getting adopted, as it now appears to be, then that'd be fine instead.
        I don't think the developers extended the detailed XvMC API to support H.264 on Via HW - that would have been a much larger task. AFAIK they just added a "slice level" API and used that to feed into the slice level hardware on specific GPUs, bypassing all the places where the XvMC details didn't match the H.264 details. Since a lot of our GPUs don't have slice level hardware, and at the moment we do not have plans to open up the slice level hardware on the chips which *do* have it, that approach is probably not feasible unless we implement the slice level support in software inside the driver (which, I guess, is an option).
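
        For readers following along, a minimal sketch of what a slice-level entry point looks like conceptually (every name here is hypothetical; this is not Via's actual interface):

        #include <stddef.h>
        #include <stdint.h>

        /* Hypothetical slice-level entry point: the player hands the driver
         * a raw, still entropy-coded bitstream slice plus the parameter sets
         * it was parsed against, and the hardware does everything from
         * entropy decode onward.  Illustrative only. */
        struct slice_submit {
            const uint8_t *slice_data;   /* raw entropy-coded slice         */
            size_t         slice_len;
            const void    *sps, *pps;    /* sequence/picture parameter sets */
            unsigned       dest_surface; /* target decode surface           */
        };

        int hw_decode_slice(void *driver_ctx, const struct slice_submit *s);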

        I have only skimmed the code so far; need to look through it in more detail. Hey, that's what weekends are for, right?
        Last edited by bridgman; 01-04-2009, 05:49 AM.

        Comment


        • Originally posted by bridgman View Post
          I don't think Via extended the actual XvMC API to support H.264 - that would have been a much larger task. AFAIK they just added a "slice level" API and used that to feed into the slice level hardware on the chip, bypassing all the places where the XvMC details didn't match the H.264 details. Since a lot of our GPUs don't have slice level hardware, and at the moment we do not have plans to open up the slice level hardware on the chips which *do* have it, that approach is probably not feasible unless we implement the slice level support in software inside the driver (which, I guess, is an option). I have only skimmed the code so far; need to look through it in more detail. Hey, that's what weekends are for, right?
          If the Via XvMC is "slice level" based, would using a slice-level approach in the AMD drivers allow cross-driver code reuse? After all, if one of the problems is the lack of developers, maximising the use of existing code is a good idea...

          Comment


          • It would definitely allow code sharing up in the player. I doubt there would be much code to be shared in the driver, assuming the Via driver just stuffs slice level info into the chip as I suspect.

            I think in the end we're going to have to pin all the APIs up on a wall and game out what happens if we use each one. We (and the open source devs for other HW) really need to choose between two options:

            1. Choose or create an API which matches the functionality we plan to offload to the chip (practically speaking, this means we do *not* offload entropy decode, probably *do* offload IDCT, and definitely offload everything from that point)

            2. Standardize on a slice-level API, recognizing that this means we will need to duplicate some existing CPU code for the things we won't be offloading to the chip.
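
            To make the split concrete, here is a rough sketch of the pipeline stages under discussion (stage names are illustrative):

            /* Rough decode pipeline stages.  Option 1 draws the driver API
             * boundary after STAGE_ENTROPY (the CPU keeps entropy decode,
             * the GPU probably gets IDCT and definitely everything later);
             * option 2 draws it before STAGE_ENTROPY, i.e. at the slice
             * level.  Names are illustrative only. */
            enum decode_stage {
                STAGE_ENTROPY,   /* CABAC/CAVLC or VLC bitstream decode */
                STAGE_IDCT,      /* inverse transform of coefficients   */
                STAGE_MOCOMP,    /* motion compensation from ref frames */
                STAGE_DEBLOCK    /* in-loop deblocking filter (H.264)   */
            };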

            So far the second option seems conceptually simpler but more work unless we can borrow the entropy decoding stage from an existing software decoder; that gets tricky because all of the software decoders seem to be GPL or LGPL licensed. You can move code from an MIT-licensed X driver to a GPL-licensed software decoder but you can't move from a GPL-licensed SW decoder to an MIT-licensed driver without contacting all the copyright holders and getting their agreement to relicense (which, in practice, rarely seems to happen).

            Putting GPU acceleration into an existing SW decoder (say libavcodec) seems like an obvious option but is tricky because the codec would then need to become a drm client and follow the DRI protocols; practically speaking you end up having to define an API to get into the X driver stack anyway. This is where Gallium3D could get interesting, assuming that the final implementation automatically brings along the drm/dri integration so the codec could just say "this state, this shader, go".
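
            Purely as a sketch of that idea (none of these names are real Gallium3D entry points; everything here is hypothetical):

            /* Hypothetical interface in the spirit of "this state, this
             * shader, go": the state tracker hides all drm/DRI setup, so
             * the codec never has to become a DRI client itself. */
            struct vid_ctx;                    /* opaque; drm/DRI inside */
            struct vid_ctx *vid_context_create(void);
            void vid_set_state(struct vid_ctx *c, const void *state);
            void vid_bind_shader(struct vid_ctx *c, const void *shader);
            void vid_run(struct vid_ctx *c, const void *in, void *dest);

            static void decode_block(struct vid_ctx *ctx, const void *state,
                                     const void *idct_shader,
                                     const void *coeffs, void *dest)
            {
                vid_set_state(ctx, state);          /* "this state"  */
                vid_bind_shader(ctx, idct_shader);  /* "this shader" */
                vid_run(ctx, coeffs, dest);         /* "go"          */
            }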

            BTW I confirmed over the weekend that our older IDCT hardware will *not* be able to support H.264 acceleration; looks like H.264 uses a modified IDCT with different coefficients from the ones hard-wired into our older MPEG2 hardware. That's why I said "probably" for offloading IDCT; I think it will work OK on shaders but I haven't actually seen anyone do it yet.
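
            For reference, this is the kind of transform H.264 requires; the standard 4x4 inverse integer transform is built entirely from adds and shifts, whereas the MPEG2 8x8 IDCT multiplies by hard-wired cosine coefficients, which is why the old fixed-function block can't be reused. A sketch, not driver code:

            /* One 1-D pass of the standard H.264 4x4 inverse integer
             * transform: adds and shifts only, no cosine multipliers. */
            static void h264_itrans_1d(int *a, int *b, int *c, int *d)
            {
                int e0 = *a + *c;
                int e1 = *a - *c;
                int e2 = (*b >> 1) - *d;
                int e3 = *b + (*d >> 1);
                *a = e0 + e3;
                *b = e1 + e2;
                *c = e1 - e2;
                *d = e0 - e3;
            }

            /* Rows, then columns, then the final (x + 32) >> 6 rounding. */
            void h264_idct4x4(int blk[4][4])
            {
                for (int i = 0; i < 4; i++)
                    h264_itrans_1d(&blk[i][0], &blk[i][1], &blk[i][2], &blk[i][3]);
                for (int i = 0; i < 4; i++)
                    h264_itrans_1d(&blk[0][i], &blk[1][i], &blk[2][i], &blk[3][i]);
                for (int i = 0; i < 4; i++)
                    for (int j = 0; j < 4; j++)
                        blk[i][j] = (blk[i][j] + 32) >> 6;
            }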
            Last edited by bridgman; 01-04-2009, 04:16 AM.

            Comment


            • Originally posted by bridgman View Post
              2. Standardize on a slice-level API, recognizing that this means we will need to duplicate some existing CPU code for the things we won't be offloading to the chip.
              Surely the non-offloaded parts will mostly be the same for multiple chips?

              Comment


              • Yes, I think the code would be largely common across any chip using shaders rather than dedicated hardware. I was thinking of "duplicate" in the sense that this code has already been implemented in a number of GPL-licensed software decoders.

                Comment


                • Ooops, sorry, I misunderstood what you meant.

                  I was just thinking that if the number of developers is the big limitation on open-source drivers, anything that gets more use out of the available code is a good thing.

                  Comment


                  • Yep. The tricky part is that if the "most code-sharing-y approach" also requires the most work to be done before showing any useful results, the choice becomes harder.

                    The slice level approach also means that the driver devs need to maintain the front end (e.g. entropy decode) code in each driver, whereas if we had a slightly lower level API then that code would be maintained once in the player or whatever decoder sat between the player and the driver API.

                    In other words, there's a practical difference between "sharing a single copy" and "having one set of code which can more or less be used by a bunch of different drivers, each with their own copy, each being maintained independently by different developers and probably drifting in slightly different directions over time". Going with a slice level API is the second case, unfortunately.
                    Last edited by bridgman; 01-04-2009, 06:00 AM.

                    Comment


                    • Originally posted by bridgman View Post
                      So far the second option seems conceptually simpler but more work unless we can borrow the entropy decoding stage from an existing software decoder; that gets tricky because all of the software decoders seem to be GPL or LGPL licensed. You can move code from an MIT-licensed X driver to a GPL-licensed software decoder but you can't move from a GPL-licensed SW decoder to an MIT-licensed driver without contacting all the copyright holders and getting their agreement to relicense (which, in practice, rarely seems to happen).
                      GPL is maybe tricky, but is there a problem with LGPL? You could keep the software decoding stages in their own library and thus it wouldn't affect any other bits. I thought that was the general idea of the LGPL = Library GPL: any changes you make to the library itself must be released, but nothing else. Throw in a configuration flag to build with (LGPL dep) or without (MIT) video support and the rest is still free game. You're talking to someone who thinks the X stack being under the MIT license is a contributing reason why it's gotten no further than it has, though, so I'm obviously biased.

                      Comment


                      • I'm not sure what the current thinking is re: LGPL in xorg drivers but will ask. I guess the best approach would be to make a subset library from the current decoder which only handled the work we did not offload to the GPU, then link the binary in - that would also allow multiple drivers to share the same lib.

                        Interesting idea - thanks !
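
                        A minimal sketch of what such a subset library's interface might look like (every name here is hypothetical):

                        #include <stddef.h>
                        #include <stdint.h>

                        /* Hypothetical LGPL "decode front-end" subset library carved out
                         * of an existing software decoder: it performs only the entropy
                         * decode stage and hands back coefficient/motion data for the
                         * driver to offload.  Linking the MIT driver against it as a
                         * shared library keeps the LGPL obligations inside the library,
                         * and multiple drivers can share the same lib. */
                        struct frontend_output {
                            int16_t *coeffs;       /* dequantized transform coefficients */
                            void    *motion_vecs;  /* per-macroblock motion vectors      */
                            unsigned mb_count;
                        };

                        /* Entropy-decode one slice; returns 0 on success. */
                        int frontend_decode_slice(const uint8_t *slice, size_t len,
                                                  struct frontend_output *out);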

                        Comment


                        • bridgman> The only discussion in the thread was about whether it was
                          bridgman> worth implementing XvMC, which is currently MPEG2-only

                          Yes

                          bridgman> whether MPEG2 decoding placed enough load on the system to
                          bridgman> justify implementing XvMC

                          Yes

                          bridgman> I think we all agree that support is needed for the more demanding
                          bridgman> formats, particularly H.264. The question *there* is whether that
                          bridgman> is a higher priority than 3D support, which is what we are working on now.

                          For Rage, Radeon, FireMV-2D: video decoding 1st, then power management, then 3D

                          For FirePro-3D, FireGL-3D: 3D 1st, then power management, then video decoding

                          Have I left out any video chip families?

                          --------------

                          smitty3268> I think the FFMPEG devs probably have a better idea about how to
                          smitty3268> write a codec than AMD does

                          Wow! Given that ffmpeg core dumps constantly, you must have a *really* low
                          opinion of AMD.

                          --------------

                          bridgman> I just spent another half hour going through [ ... ]
                          smitty3268> I wouldn't worry about trying to decode that rambling

                          Obviously bridgman needs rambling decode acceleration. That was
                          a half hour that could have been spent on XvMC.

                          Comment


                          • Originally posted by Dieter View Post
                            Yes
                            Yes
                            Just to be clear, we're dealing with finite resources here so the question is not "would it be nice to have MPEG2 accel?" (even I can answer that one) it's "should the community work on MPEG2 accel instead of H.264/VC-1 accel?", i.e. which should be worked on first?

                            Originally posted by Dieter View Post
                            For Rage, Radeon, FireMV-2D: video decoding 1st, then power management, then 3D

                            For FirePro-3D, FireGL-3D: 3D 1st, then power management, then video decoding
                            The GPU programming is pretty much the same for the two families, so the sequence of implementation would be the same for both. If you were combining the families, what would the sequence be?

                            Originally posted by Dieter View Post
                            Obviously bridgman needs rambling decode acceleration. That was a half hour that could have been spent on XvMC.
                            Agreed, but that is another example of a workload which needs specialized hardware and is difficult to parallelize.
                            Last edited by bridgman; 01-07-2009, 01:05 PM.

                            Comment


                            • Originally posted by bridgman View Post
                              Just to be clear, we're dealing with finite resources here so the question is not "would it be nice to have MPEG2 accel?" (even I can answer that one) it's "should the community work on MPEG2 accel instead of H.264/VC-1 accel?", i.e. which should be worked on first?
                              I vote for H.264/VC-1 acceleration, although I desperately need power management and faster 3D acceleration on my M56 chip (R500-based Mobility Radeon X1600).

                              Why don't you create a forum poll, so everyone could give their own opinion about what should be done first?

                              Comment


                              • People don't want a poll this late in the game; they want and NEED a real subset AVC decode and related library ASAP, perhaps as a temporary stop-gap measure until it all settles down later if need be, PLUS development headers and DOCUMENTATION, and sample fully working code showing anyone how to use it ASAP/TODAY.

                                Some basic benchmark code/charts for each proposed code example might be nice too, so you can decide at a glance which routine or usage suits your requirements for code review and insertion into the likes of FFmpeg etc.

                                It's been said that "the API is the least of the problems", and that's true to some degree, but bridgman has stated he believes there's enough documentation out there right now.

                                Presumably that means there's enough information right now for someone here to take parts of the ATI/AMD API(s) and make a VDPAU equivalent?

                                Call it an alpha AVIVO.lib for AVC, VC-1, Dirac, and even MPEG2 if it's only another entry point in the lib API.
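
                                For illustration, a sketch of what that single-library surface could look like (every name here is hypothetical):

                                #include <stddef.h>
                                #include <stdint.h>

                                /* Hypothetical sketch of the proposed "alpha AVIVO.lib":
                                 * one decode entry point with the codec selected by an
                                 * enum, so MPEG2 really is just another value.
                                 * Entirely illustrative. */
                                enum avivo_codec { AVIVO_AVC, AVIVO_VC1,
                                                   AVIVO_DIRAC, AVIVO_MPEG2 };

                                int avivo_decode_frame(void *ctx, enum avivo_codec codec,
                                                       const uint8_t *bitstream, size_t len,
                                                       unsigned dest_surface);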

                                Remember, the actual ATI hardware can officially decode all but one of these already. Some video dev reading this must be capable of running up a quick HW-assist AVC decode library based on the ATI API, in the same vein as VDPAU is being used right now in FFmpeg code, in a few days and posting it here?

                                There MUST be some test API code sitting on the PCs of the ATI/AMD devs bridgman is in contact with; he could get their permission to contribute an hour or so of it for outside use, to learn from, and to use in a basic alpha-state open AVIVO library.

                                As Prototyped outlined here:
                                http://episteme.arstechnica.com/eve/...m/467009165931

                                "....
                                In September, ATI released their Catalyst 8.9 driver with X Video Bitstream Acceleration (XvBA) libraries that could be enabled using tools that shipped with the driver, and then last month, the 8.10 driver enabled the UVD2 video acceleration by default.

                                The unfortunate thing is that they didn't also ship any development headers with the driver, with the result that the binary libraries were available, but there was no SDK or information available to media player developers to actually utilize the libraries. So XvBA currently remains a white elephant.
                                ..."


                                For PR purposes, and to get outsiders to equate the ATI/AMD library with the VDPAU subset library, I think it should be called AVIVO.lib rather than XvBA (X-Video Bitstream Acceleration), where people are confusing it with the old X-Video Motion Compensation (XvMC).


                                Remember also that it's been 4 months since the library(s) became available, so alpha/beta test code showing off the new library's use must exist on the ATI devs' machines at the very least, but still NO docs are available that I know of to explain how you might use this library or its official API for hardware-assist video decoding etc. WHY IS THAT?
                                A poll is wasting people's time; where are these X-Video Bitstream Acceleration (XvBA) docs, so FFmpeg people and the like MIGHT stand a chance of getting some parity with the current HW-assist VDPAU FFmpeg code diffs....

                                http://en.wikipedia.org/wiki/XvBA
                                "...
                                X-Video Bitstream Acceleration (XvBA), designed by AMD for its ATI Radeon GPU, is an extension of the X video extension (Xv) for the X Window System on Linux operating-systems[1].

                                XvBA API allows video programs to offload portions of the video decoding process to the GPU video-hardware. Currently, the portions designed to be offloaded by XvBA onto the GPU are motion compensation (mo comp) and inverse discrete cosine transform (iDCT), and VLD (Variable-Length Decoding) for MPEG-2, MPEG-4 AVC (H.264) and VC-1 encoded video.

                                XvBA is the Linux equivalent of the Microsoft's DirectX Video Acceleration (DxVA) API for Windows.[2]

                                ...
                                "

                                Seeing as it seems to be the fashion, and given that ATI/AMD sold these cards to us as giving access to some form of hardware-assisted video decode/playback etc. with a driver update: I have several X1550 and HD3650 cards, and I'm looking to get some HD4xxx soon if some HW-assisted code comes home sometime soon, or something else to start advocating wherever we go....


                                As it happens, the lad's HD3650 has a large 1 GB memory on it; I wonder if pre-loading/piping some video through a FIFO into the card's internal memory might improve any future HW-assisted processing!
                                Last edited by popper; 01-07-2009, 06:00 PM.

                                Comment
