Daala: A Next-Generation Video Codec From Xiph


  • #16
    @droidhacker
    Newer graphics cards can do all the heavy lifting.
    https://www.google.be/#q=video+decod...pletely+on+GPU

    OpenCL on the GPU is somewhere in between.
    GPUs are built for graphics work, and they are also more efficient than a CPU at decoding video.
    Not as efficient as an ASIC, but much more efficient than using the CPU alone.

    GPU encoder and decoder implementations are advancing.
    There is a big push to offload more work to the GPU these days.
    Have you seen the release notes for recent Adobe products? Lots of functionality has moved to the GPU.

    Comment


    • #17
      Originally posted by plonoma View Post
      @droidhacker
      Newer graphics cards can do all the heavy lifting.
      https://www.google.be/#q=video+decod...pletely+on+GPU

      OpenCL on the GPU is somewhere in between.
      GPUs are built for graphics work, and they are also more efficient than a CPU at decoding video.
      Not as efficient as an ASIC, but much more efficient than using the CPU alone.

      GPU encoder and decoder implementations are advancing.
      There is a big push to offload more work to the GPU these days.
      Have you seen the release notes for recent Adobe products? Lots of functionality has moved to the GPU.
      Even better, you can use OpenGL 4.3 compute shaders to do all the decoding without any of the painful memcpys that plague OpenCL.
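      The appeal of compute shaders here is that the transform stages of a decoder are data-parallel: every coefficient block can be inverse-transformed independently of every other block, so thousands of them can run at once. As a CPU-side illustration only (real codecs use integer-exact transforms; this is just the math a per-block shader invocation would apply), a NumPy sketch of an 8x8 inverse DCT:

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix: entry (k, i) is the k-th basis
    # function sampled at position i, scaled so that D @ D.T == I.
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] /= np.sqrt(2.0)
    return m

D = dct_matrix()

def idct2(block):
    # 2-D inverse DCT of one 8x8 coefficient block: D^T @ B @ D.
    # Each block is independent, which is what makes this stage a
    # natural fit for one GPU work-group per block.
    return D.T @ block @ D

# Round-trip one block on the CPU: forward DCT, then inverse.
rng = np.random.default_rng(0)
pixels = rng.uniform(-128, 127, (8, 8))
coeffs = D @ pixels @ D.T     # forward 2-D DCT
recovered = idct2(coeffs)     # inverse transform recovers the pixels
print(np.allclose(recovered, pixels))  # True
```

The serial parts of a decoder (notably entropy decoding) don't parallelize like this, which is why "all the decoding" on the GPU is the hard claim.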

      Comment


      • #18
        Originally posted by droidhacker View Post
        There has been a lot of talk about this idea over the years, but the problem is that it has NEVER been pushed past PARTIAL and/or THEORETICAL implementations. There was some partial GPU assistance on some older video cards, but all in all, video decoding has always been done either in software or on dedicated hardware.

        Now, that being said, this new transition may be better suited to general-purpose OpenCL acceleration. Of course, that comes at the expense of the high power consumption typical of GPUs.
        Look at the various DXVA2 levels (notice that AnandTech tests QuickSync separately, so DXVA2 isn't using Intel's dedicated hardware decoder) and madVR (the original madVR release seems to have been mostly like XVideo, but it appears to offer far more now). Not available for Linux, but apparently tremendously efficient.
        http://www.anandtech.com/show/7007/i...-perspective/5

        With OpenCL you should be able to do similar things on Linux, I'd imagine, but it just hasn't been done because there hasn't been sufficient interest from the right people.

        Comment


        • #19
          Originally posted by liam View Post
          Look at the various DXVA2 levels (notice that AnandTech tests QuickSync separately, so DXVA2 isn't using Intel's dedicated hardware decoder) and madVR (the original madVR release seems to have been mostly like XVideo, but it appears to offer far more now). Not available for Linux, but apparently tremendously efficient.
          http://www.anandtech.com/show/7007/i...-perspective/5

          With OpenCL you should be able to do similar things on Linux, I'd imagine, but it just hasn't been done because there hasn't been sufficient interest from the right people.
          DXVA is the Microsoft equivalent of VDPAU or VAAPI. It's not shader-based decoding, beyond the standard post-processing effects.

          GPU hardware is not friendly to H.264 decoding, no matter what kind of API, like OpenCL, you use.

          Comment


          • #20
            Originally posted by smitty3268 View Post
            DXVA is the Microsoft equivalent of VDPAU or VAAPI. It's not shader-based decoding, beyond the standard post-processing effects.

            GPU hardware is not friendly to H.264 decoding, no matter what kind of API, like OpenCL, you use.
            Something doesn't add up. According to the link, they were using DXVA2 (with two different rendering options) on Haswell. Three test variations were run: two with DXVA2 and one using QuickSync. Since QuickSync is how you accelerate video on Intel, what was DXVA2 using when it wasn't using QuickSync?
            http://msdn.microsoft.com/en-us/libr...=vs.85%29.aspx
            That page says DXVA can use off-host acceleration for certain parts of a codec, implying that it accelerates what it can. So it has various entry points, similar to VDPAU/VAAPI, as you say, and you can use DXVA without targeting dedicated decode hardware. Moreover, from what Bridgman has said, and from the processing pipelines I've seen, it seems the only part of decoding that can't be handled well on the GPU is the entropy coding (which, admittedly, can be a large share of the work). That appears to be what is being done in the AT article.

            I'd never heard of DXVA prior to that article, so bear with me if I misunderstand.
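            To make the entropy-coding bottleneck concrete: in an arithmetic (or range) coder, each symbol narrows an interval left behind by the previous symbol, so symbol N cannot be decoded until symbol N-1 is finished. A toy Python sketch of that serial chain (exact fractions and a static three-symbol model for clarity; nothing like a real codec's renormalized integer coder):

```python
from fractions import Fraction

# Static model: symbol -> (cumulative_low, cumulative_high) on [0, 1).
MODEL = {'a': (Fraction(0), Fraction(1, 2)),
         'b': (Fraction(1, 2), Fraction(3, 4)),
         'c': (Fraction(3, 4), Fraction(1))}

def encode(symbols):
    low, high = Fraction(0), Fraction(1)
    for s in symbols:                      # each step narrows the interval
        span = high - low                  # left by the PREVIOUS symbol:
        c_lo, c_hi = MODEL[s]              # an inherently serial chain
        low, high = low + span * c_lo, low + span * c_hi
    return (low + high) / 2                # any number inside the final interval

def decode(code, n):
    out = []
    low, high = Fraction(0), Fraction(1)
    for _ in range(n):                     # the decoder replays the same chain;
        span = high - low                  # no symbol can be found before the
        target = (code - low) / span       # previous interval update lands
        for s, (c_lo, c_hi) in MODEL.items():
            if c_lo <= target < c_hi:
                out.append(s)
                low, high = low + span * c_lo, low + span * c_hi
                break
    return ''.join(out)

msg = 'abacabca'
code = encode(msg)
print(decode(code, len(msg)) == msg)   # True
```

The loop-carried dependency on (low, high) is exactly what keeps this stage off the shader cores, while block transforms and motion compensation parallelize fine.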
            Last edited by liam; 06-26-2013, 02:59 AM.

            Comment


            • #21
              There is work on performing entropy encoding on the GPU too:
              https://www.google.be/#q=entropy+encoding+on+GPU

              There is even a patent for doing this:
              http://www.google.com/patents/US20120268469

              It seems entropy encoding still needs to be done on the CPU for now.
              That doesn't mean it will stay this way, of course.

              Comment


              • #22
                They should quietly ask the pirate scene to start releasing movies in Daala, maybe even some exclusively in Daala. AFAIK that's how DivX and Xvid became popular. Correct me if I'm missing something, but I don't know of anything else that made MPEG-4 popular. Corporations follow what people want, at least the Chinese ones do. So if Daala becomes popular among pirates, maybe Chinese movie players would start shipping with Daala support, and then the big brands would follow, like what happened with Xvid/DivX.

                Comment


                • #23
                  Originally posted by ashkbajw View Post
                  They should quietly ask the pirate scene to start releasing movies in Daala, maybe even some exclusively in Daala. AFAIK that's how DivX and Xvid became popular. Correct me if I'm missing something, but I don't know of anything else that made MPEG-4 popular. Corporations follow what people want, at least the Chinese ones do. So if Daala becomes popular among pirates, maybe Chinese movie players would start shipping with Daala support, and then the big brands would follow, like what happened with Xvid/DivX.
                  The first step would be to get Daala decode support into FFmpeg. Whichever codec among x265, VP9, and Daala gets OpenCL encoding support first will be the one that gets used. It's really slow to encode HD with x265 using only the CPU.

                  Comment


                  • #24
                    Originally posted by ashkbajw View Post
                    They should quietly ask the pirate scene to start releasing movies in Daala, maybe even some exclusively in Daala. AFAIK that's how DivX and Xvid became popular. Correct me if I'm missing something, but I don't know of anything else that made MPEG-4 popular. Corporations follow what people want, at least the Chinese ones do. So if Daala becomes popular among pirates, maybe Chinese movie players would start shipping with Daala support, and then the big brands would follow, like what happened with Xvid/DivX.
                    Most pirates are Windows users who don't give two shits about open standards. For them to use Daala, it really needs to be superior to H.264 in a quality-per-filesize sense.

                    Comment


                    • #25
                      Originally posted by plonoma View Post
                      There is work on performing entropy encoding on the GPU too:
                      https://www.google.be/#q=entropy+encoding+on+GPU

                      There is even a patent for doing this:
                      http://www.google.com/patents/US20120268469

                      It seems entropy encoding still needs to be done on the CPU for now.
                      That doesn't mean it will stay this way, of course.
                      Like in "Design and Implementation of Arithmetic Coder for CUDA" from 2010. It's a very clever approach, but it is still bottlenecked by the one non-parallel part (95% of GPU time). They get a 3-5x speedup compared to CPUs, on hardware that has 20-50 times more raw power.

                      IMHO, if you want efficient encoding you have to do it in dedicated hardware. Maybe that hardware could be designed to be more flexible/configurable (without resorting to FPGA blocks).
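                      Those numbers are roughly what Amdahl's law predicts: when a serial fraction of the work cannot be sped up, even a huge parallel advantage on the rest caps the overall gain. A back-of-the-envelope check in Python (the 25% serial fraction is an assumed figure chosen to illustrate the shape of the curve, not taken from the paper):

```python
def amdahl(serial_fraction, parallel_speedup):
    # Overall speedup when only the parallel part of the work
    # runs parallel_speedup times faster.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / parallel_speedup)

# With ~25% of the pipeline stuck serial (e.g. entropy coding),
# 20x and 50x more raw parallel power yield only modest total gains:
print(round(amdahl(0.25, 20), 2))   # 3.48
print(round(amdahl(0.25, 50), 2))   # 3.77
```

That lands squarely in the reported 3-5x range, which is why fixed-function hardware for the serial stage is so attractive.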

                      Comment
