X.Org SoC: Gallium3D H.264, OpenGL 3.2, GNU/Hurd

  • #21
VP6 is the most widespread, so that one too.

    • #22
      I wasn't aware of that.

      Originally posted by bridgman View Post
      I really believe that the "missing link" so far has been someone grafting libavcodec onto the driver stack so that processing can be incrementally moved from CPU to GPU.

      Also, I forgot to mention that the other benefit of a shader-based implementation is that there are a lot of cards in use today which have a fair amount of shader power but which do not have dedicated decoder HW (ATI 5xx, for example).
I know that a while back attempts were made to use shaders to handle specific stages of the decode process, but I don't recall hearing anything further from the developer, or which codec was being worked on. A problem, IIRC, was that the shaders were fairly slow at the task, though I don't recall what hardware it was being tested on. Presumably, as you say, a more modern card would be better able to handle the load. A question I always had was about power efficiency. Offloading for high-bitrate material is certainly a necessity, but for lower-bitrate targets I wonder whether a CPU wouldn't be more efficient, especially for a simpler codec like Theora.

      • #23
The work over Gallium3D was with MPEG2, using the XvMC API, mostly by Younes Manton on Nouveau:

        http://bitblitter.blogspot.com/

Cooper then got a good chunk of that code running on the r300g ATI driver before getting dragged off to other projects.

        Somewhere in there a video API was defined and at least partially implemented, not exactly sure who did what there.

        I don't think we have any good power efficiency numbers yet re: whether CPU or GPU shaders do the offloadable work more efficiently. First priority was offloading enough work to the GPU so that the remainder could be handled by a single CPU thread, since the MT version of the CPU codecs wasn't very mature, and without the ability to use multiple CPU cores anything near 100% of a single core meant frame dropping and other yukkies.

        Since then, multithread decoders seem to have become more stable (at least more people seem to be using them), so the pull for GPU decoding has dropped somewhat. I don't know the status of the MT codecs right now, ie whether they are easily accessible to all users or whether they still need a skilled user to build and tweak 'em.
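To make that "fit the remainder in a single CPU thread" point concrete, here's a toy back-of-the-envelope sketch. The stage names are real decode stages, but every cost number below is invented for illustration: without offload the total exceeds one core (so you drop frames); with MC, IDCT, and filtering moved to shaders, the remainder fits one thread.

```python
# Illustrative only: invented per-stage costs for decoding one frame,
# expressed as a fraction of one CPU core at the target frame rate.
stage_cost = {
    "entropy_decode": 0.35,  # serial bit-twiddling, stays on the CPU
    "idct": 0.25,            # shader-friendly
    "motion_comp": 0.40,     # shader-friendly
    "deblock_filter": 0.20,  # shader-friendly
}
SHADER_FRIENDLY = {"idct", "motion_comp", "deblock_filter"}

def cpu_load(offloaded):
    """Fraction of one core still needed on the CPU after offloading."""
    return round(sum(c for s, c in stage_cost.items() if s not in offloaded), 2)

print(cpu_load(set()))            # everything on the CPU: 1.2 cores -> drops frames
print(cpu_load(SHADER_FRIENDLY))  # shader-friendly stages offloaded: 0.35 cores
```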

        • #24
          Originally posted by bridgman View Post
The work over Gallium3D was with MPEG2, using the XvMC API, mostly by Younes Manton on Nouveau:

          http://bitblitter.blogspot.com/

Cooper then got a good chunk of that code running on the r300g ATI driver before getting dragged off to other projects.

          Somewhere in there a video API was defined and at least partially implemented, not exactly sure who did what there.

          I don't think we have any good power efficiency numbers yet re: whether CPU or GPU shaders do the offloadable work more efficiently. First priority was offloading enough work to the GPU so that the remainder could be handled by a single CPU thread, since the MT version of the CPU codecs wasn't very mature, and without the ability to use multiple CPU cores anything near 100% of a single core meant frame dropping and other yukkies.

          Since then, multithread decoders seem to have become more stable (at least more people seem to be using them), so the pull for GPU decoding has dropped somewhat. I don't know the status of the MT codecs right now, ie whether they are easily accessible to all users or whether they still need a skilled user to build and tweak 'em.
The status of MT is that ffmpeg-mt blows. Better than a single thread, but it pretty much requires a 4-core chip to play a typical 10 GB movie file. The CoreAVC hacks (i.e. wine+coreavc+mplayer) work OK; they seem to degrade output quality as the CPU limits are reached rather than dropping frames and going out of sync. Still, it pretty well maxes out a 2-core CPU and gets the heat up so much that you end up with the vacuum-cleaner effect (unless you have a ***HUGE*** heat sink and a big slow fan). Vacuum-cleaner effect + movie == very bad.
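For what it's worth, here's a minimal toy sketch of why frame-parallel decoding scales with core count in the first place, and only that: each "frame" below is an independent stand-in task, whereas real H.264 streams have reference-frame dependencies, which is exactly why ffmpeg-mt has a harder time than this toy suggests.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def decode_frame(i):
    # Stand-in for one frame's decode work. Real codecs have inter-frame
    # dependencies (reference frames) that limit how well this parallelizes.
    time.sleep(0.01)
    return i

def decode_stream(n_frames, workers):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        frames = list(pool.map(decode_frame, range(n_frames)))
    return frames, time.perf_counter() - start

frames, t1 = decode_stream(16, workers=1)  # roughly 16 x 0.01 s, serialized
frames, t4 = decode_stream(16, workers=4)  # the same work spread over 4 threads
print(f"1 worker: {t1:.2f}s, 4 workers: {t4:.2f}s")
```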

          • #25
            OK, I guess it might be interesting to see if anyone is actively profiling that code to see where the CPU time is going, and how much of that time is going to "shader-friendly" tasks like MC and filtering.
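A sketch of the kind of profiling that could answer this, using Python's cProfile on stand-in functions (the stage bodies here are placeholders, not real decode code): the idea is to measure what fraction of cumulative time lands in shader-friendly stages like motion compensation versus serial work like entropy decoding.

```python
import cProfile
import pstats

def entropy_decode(n):
    # Serial, branchy bit-twiddling: a poor fit for shaders.
    return sum(i % 7 for i in range(n))

def motion_compensation(n):
    # Data-parallel arithmetic: the kind of work shaders handle well.
    return sum(i * i for i in range(n))

def decode_frame(n=100_000):
    entropy_decode(n)
    motion_compensation(3 * n)  # pretend MC dominates in this toy codec

prof = cProfile.Profile()
prof.enable()
for _ in range(5):
    decode_frame()
prof.disable()

# Pull per-function cumulative times out of the profile.
times = {}
for (fname, line, name), (cc, nc, tt, ct, callers) in pstats.Stats(prof).stats.items():
    if name in ("entropy_decode", "motion_compensation"):
        times[name] = ct

offloadable = times["motion_compensation"] / sum(times.values())
print(f"time in shader-friendly stages: {offloadable:.0%}")
```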
