There May Still Be Hope For R600g Supporting XvMC, VDPAU


  • #71
    [Rant] Where is the progress?

    Sorry for ranting / trolling this thread but,

    When will I be able to replace my Catalyst driver with R600 (classic / Gallium) and still have accelerated VDPAU/VAAPI for my video decoding needs?

    I really want to replace Catalyst with open-source Mesa, but I can't while the R600 driver lacks proper video support (and no, Xv is very bad at doing its job)...

    Sorry for the ranting, but I think I'm getting sick of the linux DRM / 3D situation. It's a big mess atm, and it's unfortunate we're still many years behind what Win(blows) does atm.

    Cheers

    p.s.: If I knew how to program drivers, of course I'd help...



    • #72
      Originally posted by evolution View Post
      I really want to replace Catalyst with open-source Mesa, but I can't while the R600 driver lacks proper video support (and no, Xv is very bad at doing its job)...
      Xv is actually really good at doing its job. Sounds like the software decoder you are using with Xv may not be doing what you need, though. Are you using a multi-threaded decoder ? If not you should definitely give it a try...

      Originally posted by evolution View Post
      Sorry for the ranting, but I think I'm getting sick of the linux DRM / 3D situation. It's a big mess atm, and it's unfortunate we're still many years behind what Win(blows) does atm.
      DRM is always going to be problematic on Linux because nobody knows how to implement robust DRM on an open source kernel. What you probably will see over time is HW vendors hacking their hardware a bit to work around that problem a bit better...

      Where do you see a big mess on the 3D side ?

      Originally posted by evolution View Post
      p.s.: If I knew how to program drivers, of course I'd help...
      Nobody "just knows" how to program drivers when they start. If you can build the drivers and edit text files then you know enough to start, and over time you will accumulate expertise and confidence.


      • #73
        Xv is actually really good at doing its job. Sounds like the software decoder you are using with Xv may not be doing what you need, though. Are you using a multi-threaded decoder ? If not you should definitely give it a try...
        Well, with ffmpeg-mt + mplayer ('mplayer -lavdopts threads=2 -vo xv -ao alsa "movie.mov"') I get the same performance as with "normal" ffmpeg + mplayer ('mplayer -lavdopts threads=1 -vo xv -ao alsa "movie.mov"'). CPU usage for "Big Buck Bunny" (a 1080p sample I use for testing multi-threaded decoding) is about 50% with Xv in both cases (single- or multi-threaded). Drivers: ATI Avivo (Xv) and Mesa Xv (from the ATI OS driver). Tested on my laptop (HD2600 + C2D@2GHz).

        I fully agree Xv does a better job than "X11" rendering, but important points such as IDCT or VLD are missing on Xv (I'm not sure, but I heard IDCT and VLD can help a lot on CPU performance, mainly if these processes are done in the GPU).

        I'd like to change to OS ATI, but as I'm thinking of buying a Fusion netbook in the near future, I think I'll have trouble watching videos with the OS ATI drivers (I'm not sure AMD's Zacate processor has enough "horsepower" to play 1080p videos with Xv...).

        DRM is always going to be problematic on Linux because nobody knows how to implement robust DRM on an open source kernel. What you probably will see over time is HW vendors hacking their hardware a bit to work around that problem a bit better...

        Where do you see a big mess on the 3D side ?
        Well, with Mesa, we're still at OGL 2.1 (OGL3 is still in an early phase), and we don't even have basic support for things such as proper texture compression (unfortunately patents don't allow us to use it without external linking...).

        Nobody "just knows" how to program drivers when they start. If you can build the drivers and edit text files then you know enough to start, and over time you will accumulate expertise and confidence.
        The best I can do atm is "bug reporting" (I really like to report bugs...). I don't have any problem describing driver problems... Unfortunately, I know very little C and assembly (so programming is a bit 'out of bounds' for me atm), and I have little spare time (I'm doing a master's degree, as you can see in my profile, and it takes up a lot of it...).

        Otherwise, I wish the OS ATI developers the best of luck, and I hope we'll see improved video support in the OS drivers in the future!

        Cheers



        • #74
          Originally posted by evolution View Post
          Well, with ffmpeg-mt + mplayer ('mplayer -lavdopts threads=2 -vo xv -ao alsa "movie.mov"') I get the same performance as with "normal" ffmpeg + mplayer ('mplayer -lavdopts threads=1 -vo xv -ao alsa "movie.mov"'). CPU usage for "Big Buck Bunny" (a 1080p sample I use for testing multi-threaded decoding) is about 50% with Xv in both cases (single- or multi-threaded). Drivers: ATI Avivo (Xv) and Mesa Xv (from the ATI OS driver). Tested on my laptop (HD2600 + C2D@2GHz).
          That sounds about right - I wouldn't expect the multi-threaded decoder to use less CPU (in fact I would expect it to use a tiny bit more for the same workload) - the big deal is that it doesn't hit a wall at 100% and fail to keep up with the video. It hits a wall at 200% on a dual-core, of course, but most high def videos seem to need "a bit more than one core" - couple that with single thread decoders and that's how software decoding got a bad name.
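To make the "wall" arithmetic above concrete, here is a minimal Python sketch. The per-frame cost and frame rate are hypothetical numbers picked for illustration, not measurements, and the model assumes the work splits perfectly across threads, which real decoders only approximate.

```python
def cpu_demand_percent(decode_ms_per_frame: float, fps: float) -> float:
    """Total CPU demand as a percentage of ONE core (100.0 == one full core)."""
    busy_seconds_per_second = decode_ms_per_frame * fps / 1000.0
    return busy_seconds_per_second * 100.0


def keeps_up(decode_ms_per_frame: float, fps: float, threads: int, cores: int) -> bool:
    """True if the decoder can sustain real-time playback.

    Assumption: ideal scaling, i.e. the work splits evenly over
    min(threads, cores) cores with no threading overhead.
    """
    ceiling = 100.0 * min(threads, cores)
    return cpu_demand_percent(decode_ms_per_frame, fps) <= ceiling


if __name__ == "__main__":
    # Hypothetical clip: 50 ms of CPU per frame at 24 fps,
    # i.e. "a bit more than one core".
    demand = cpu_demand_percent(50.0, 24.0)
    for threads in (1, 2):
        ok = keeps_up(50.0, 24.0, threads=threads, cores=2)
        print(f"threads={threads}: demand={demand:.0f}% of one core, keeps up: {ok}")
```

In this model total demand is the same whichever decoder is used. Only the ceiling changes, which is why CPU usage stays flat on an easy clip and the threaded decoder only matters once demand passes one core.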

          Originally posted by evolution View Post
          I fully agree Xv does a better job than "X11" rendering, but important points such as IDCT or VLD are missing on Xv (I'm not sure, but I heard IDCT and VLD can help a lot on CPU performance, mainly if these processes are done in the GPU).
          My point was that Xv is not a decode API and was never intended to be one. Its job is to handle the render/present part of the video pipe, ie everything *after* the decoder. The profiles I have seen for software decode still suggest that motion comp is the biggest single CPU hog after the render/present tasks, although it also seems to be most amenable to clever use of CPU SIMD instructions so it may have been optimized to the point where it is no longer the biggest task. Mo-comp still seems like a pretty good fit for running on GPU shaders, and even IDCT seems to fit better than I first expected.
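The split described above can be sketched as a small Python model. The stage list is a deliberate simplification (real codecs add steps such as inverse quantization and deblocking), and the names `Stage`, `handled_by_xv` and `cpu_stages_with_xv` are made up for this illustration, not part of any real API.

```python
from enum import Enum


class Stage(Enum):
    # Simplified playback pipeline, in processing order.
    VLD = "variable-length (entropy) decode"
    IDCT = "inverse DCT"
    MOCOMP = "motion compensation"
    COLOR_CONVERT = "YUV to RGB conversion"
    SCALE = "scaling to the output size"
    PRESENT = "present to the screen"


# Enum iteration preserves definition order, so this is the ordered pipeline.
PIPELINE = list(Stage)

# Everything up to and including motion compensation is "the decoder".
DECODE_STAGES = {Stage.VLD, Stage.IDCT, Stage.MOCOMP}


def handled_by_xv(stage: Stage) -> bool:
    """Xv only covers the render/present part, i.e. everything after decode."""
    return stage not in DECODE_STAGES


def cpu_stages_with_xv() -> list:
    """Stages left to the software decoder when only Xv is available."""
    return [s for s in PIPELINE if not handled_by_xv(s)]


if __name__ == "__main__":
    for stage in PIPELINE:
        where = "Xv (GPU)" if handled_by_xv(stage) else "software decoder (CPU)"
        print(f"{stage.name:14s} {where}")
```

Seen this way, a decode API is simply one that moves some of the first three stages off the CPU as well.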

          Originally posted by evolution View Post
          I'd like to change to OS ATI, but as I'm thinking of buying a Fusion netbook in the near future, I think I'll have trouble watching videos with the OS ATI drivers (I'm not sure AMD's Zacate processor has enough "horsepower" to play 1080p videos with Xv...).
          Yeah, that's one of the things I want to find out as well once we get more Zacates into developer hands (Alex's big wobbly prototype board seems to have given up the ghost recently). You would definitely need a dual-core CPU and a multi-threaded decoder but most of the Zacates out there are dual-core and ffmpeg-mt is tantalizingly close to getting into the ffmpeg mainline.

          Originally posted by evolution View Post
          The best I can do atm is "bug reporting" (I really like to report bugs...). I don't have any problem describing driver problems... Unfortunately, I know very little C and assembly (so programming is a bit 'out of bounds' for me atm), and I have little spare time (I'm doing a master's degree, as you can see in my profile, and it takes up a lot of it...).
          Yeah, time is always a challenge. There isn't much assembler in the drivers though, and C is one of the simpler languages out there, so it's really just a question of holding your nose and jumping in. There are lots of people who will help you get started.


          • #75
            Yeah, time is always a challenge. There isn't much assembler in the drivers though, and C is one of the simpler languages out there, so it's really just a question of holding your nose and jumping in. There are lots of people who will help you get started.
            I believe that programming is more about "knowing what you want to do" than knowing a programming language. Many people who are willing to help simply don't know what to do, and they may feel awkward going into an IRC channel or mailing list and saying "Hey, I am here to help". A publicly available getting-started guide might help.

            One thing I thought was smart was what LibreOffice did with their "Easy tasks" list. They put a short description of each task on their website, along with the requirements for solving it, and I think that got people into it. I think the graphics people need to do the same, and it might get more people involved.



            • #76
              Originally posted by bridgman View Post
              That sounds about right - I wouldn't expect the multi-threaded decoder to use less CPU (in fact I would expect it to use a tiny bit more for the same workload) - the big deal is that it doesn't hit a wall at 100% and fail to keep up with the video. It hits a wall at 200% on a dual-core, of course, but most high def videos seem to need "a bit more than one core" - couple that with single thread decoders and that's how software decoding got a bad name.



              My point was that Xv is not a decode API and was never intended to be one. Its job is to handle the render/present part of the video pipe, ie everything *after* the decoder. The profiles I have seen for software decode still suggest that motion comp is the biggest single CPU hog after the render/present tasks, although it also seems to be most amenable to clever use of CPU SIMD instructions so it may have been optimized to the point where it is no longer the biggest task. Mo-comp still seems like a pretty good fit for running on GPU shaders, and even IDCT seems to fit better than I first expected.



              Yeah, that's one of the things I want to find out as well once we get more Zacates into developer hands (Alex's big wobbly prototype board seems to have given up the ghost recently). You would definitely need a dual-core CPU and a multi-threaded decoder but most of the Zacates out there are dual-core and ffmpeg-mt is tantalizingly close to getting into the ffmpeg mainline.


              Yeah, time is always a challenge. There isn't much assembler in the drivers though, and C is one of the simpler languages out there, so it's really just a question of holding your nose and jumping in. There are lots of people who will help you get started.
              Well, I've said it before and I'll say it again: if you want AMD CPUs to do software video well on Linux, get some boards and CPUs into the hands of the x264/ffmpeg/libav assembly guys, or at least give them a remote shell on unreleased kit. It's not hard or rocket science, and you know this already. Get over to the #x264dev IRC channel, tell them you want help and that here's some kit to test with, and have them run checkasm from a current x264 git pull:
              checkasm --bench
              Encoding park_joy_2160p.y4m (http://media.xiph.org/video/derf/y4m/2160p/) down to 720p with a current x264 git pull might be an interesting generic benchmark too.

              Now that Zacate and Ontario were officially released as of 01/04/2011, it seems AMD made another boo-boo by including all the usual FP operations (up to SSE3) but not AVX, especially now that AVX assembly is in x264:

              Sat, 15 Jan 2011 18:44:45
              Add AVX functions where 3+ arg commands are useful

              Remember that x264 uses the integer SIMD/AVX operations in its code, not the FP SIMD/AVX operations, so the real-life integer results are what matter.
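For anyone who wants to check what their own CPU offers, a minimal Python sketch follows. It assumes an x86 Linux box where `/proc/cpuinfo` has a `flags` line; the helper names and the sample text in the usage note are made up for illustration and are not a dump from any real chip.

```python
def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_support(flags: set) -> dict:
    """Report a few SIMD extensions of interest to video codecs."""
    return {
        # Linux reports SSE3 under the name "pni" (Prescott New Instructions).
        "sse3": "pni" in flags,
        "ssse3": "ssse3" in flags,
        "avx": "avx" in flags,
    }


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc is not mounted
    print(simd_support(parse_cpu_flags(text)))
```

Feeding `parse_cpu_flags` a hand-written line such as `flags : fpu sse sse2 pni ssse3` gives a flag set with no `avx` in it, so `simd_support` reports AVX as missing.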

              That code would be ported to ffmpeg/libav soon enough, if only the developers could get a board to play with.

              In the case of ffmpeg-mt that's primarily Alexander Strange ("astrange"),
              and in the case of general SIMD you already know them from the logs linked above.

              As for the developer of the AVX functions, you should already know, given that you have apparently been running code profiles: that's Daniel Kang, the 17-year-old kid who came to the x264 IRC channel through the Google Code-in tasks a few months ago knowing nothing about x86/64 SIMD/AVX assembly, and yet ended up writing the very routines that give such an improvement in the code profiles of many 8/10-bit encodes.

              So he's not really up for paying out pocket money for even a cheap Zacate board and case (not that it would help new, active and prolific developers like him add AVX and related SIMD assembly code to x264 in a timely manner), never mind buying an AVX-capable quad or better AMD/Intel board and everything else needed to make a working and productive developer any time soon. He can't even participate in the current Google Summer of Code to receive the stipend that would help pay for future kit, as he's not old enough yet, even if he is the current star pupil of both the x264 and ffmpeg/libav developers.

              You would be well advised to sponsor him and the other students in your own small, private (one-off?) AMD summer code-in sponsorship right now, totally separate from anything else going on in gfx land. You would get a massive return on your investment in a few months, or perhaps even less depending on how well they do: improved video encode/decode for everyone to enjoy and advocate in the future, and some breathing room for your other projects to materialise.



              • #77
                Originally posted by evolution View Post
                The CPU usage for "big buck bunny" (a 1080p sample I use for testing multithread decoding), is about 50% CPU with Xv, in both cases (single or multi-thread modes).
                That's normal, and that's why "Big Buck Bunny" isn't suitable as a decoding test: it's so simple to decode that it doesn't even use HALF a core
                ## VGA ##
                AMD: X1950XTX, HD3870, HD5870
                Intel: GMA45, HD3000 (Core i5 2500K)



                • #78
                  Originally posted by darkbasic View Post
                  That's normal, and that's why "Big Buck Bunny" isn't suitable as a decoding test: it's so simple to decode that it doesn't even use HALF a core
                  I have better examples for you (these give me stutter and push the CPU a lot harder than BBB):

                  http://www.filesonic.com/file/150015...D_MA_fixed.mkv

                  It's a Blu-ray H.264 1080p encoded video sample... On Windows (without DXVA) my CPU goes to almost 100%; on Linux, I get about 80-90% CPU usage...

                  http://www.filesonic.com/file/119607...80p_sample.mkv

                  A Full HD video sample from the "BBC Planet Earth" series. Average CPU usage (both Windows and Linux): 85+%.

                  In both cases, multi-threaded decoders were used.

                  If you don't think I've done enough tests, I can give you some more evidence...

                  Cheers

                  p.s.: Btw, Big Buck Bunny is still one of the "popular tests" used for testing ffmpeg-mt...
