FFmpeg Has Seen Some AVX2 Optimizations For VP9 Decoding

  • #11
    As far as I know, only Kaby Lake does proper hardware decoding of VP9 thus far, so most of us are actually decoding in software. As mentioned, AMD has a hybrid mode but it's Windows only.


    • #12
      Nvidia has had native VP9 decoding since January 2015 with the GTX 960, and native VP9 10 bit decoding since November 2016 with the GTX 1050 Ti. It works on Windows 10 with DXVA2/D3D11VA since the Anniversary Update, and on Linux with CUVID (the latter not at low power consumption, unfortunately).

      YouTube VP9 10 bit HDR videos however don't seem to be very CPU hungry, bitrate is even lower than for 8 bit SDR (at least for the few videos I've looked at).


      • #13
        While AMD's Excavator shouldn't benefit from AVX2 optimizations as much as CPUs from Intel as the SIMD units are not as wide, it should still be more efficient to do things with fewer instructions by using AVX over SSE. This should lower pressure in the rest of the execution pipeline, e.g. the decoders.


        • #14
          Originally posted by sdack
          Speeds beyond 1x do actually matter, because transcoding is also done in hardware these days and at speeds beyond 1x. You also don't want to have your CPU running at 100% usage while playing a video.
          Sure, but transcoding is a totally different use case. I'm a power user and my family does things like photo editing, but we only recently started with video editing, thanks to a new 4K-capable, easy-to-use DSLR. There are tons of users, machines, and use cases that don't involve transcoding or video encoding in any way. Most video use is decoding for playback at realtime speed.

          For video decoding, a dedicated hardware decoder DSP is always the best choice. Using AVX might speed things up by 100% and cut battery use by 50%, but the decoder chips reduce power consumption by 99 to 99.9%.


          • #15
            Originally posted by juno
            Also, the table counts "GPU or DSP based implementations – software implementations on non-CPU hardware", which is pretty useless. E.g. even AMD's most recent UVD, seen in Vega, doesn't support VP9. The power-hungry hybrid decoder only works on Windows.
            Sure, the list may be useless to you, but when you know you have the hardware for it then why would you ever want to settle for less? You know it's just a wasted piece of hardware otherwise, paid for by your good money. Whether it then decodes on the CPU using mmx, sse, avx or now avx2 is pretty much irrelevant. It's nice, but that's about it. It's still bad news for all those people who do have the hardware for VP9 decoding, but still have to let the CPU do it.


            • #16
              Kudos for making the world's fastest VP9 decoder even faster!

              (A faster decoder translates to being able to play higher resolutions and framerates, which is probably more interesting to most people than actually speeding up the video.)

              Let's hope this is transferable to AV1 (if/when ffmpeg decides to make an AV1 decoder) – the reason ffvp9 was so fast (according to that link↑) was that it shared so much optimized code with other ffmpeg codecs.
              Last edited by andreano; 27 August 2017, 01:03 PM.


              • #17
                Typo:

                Originally posted by phoronix
                Advanced Vector Extensions 2 instrunctions have been supported since Intel Haswell


                • #18
                  Originally posted by sdack
                  Sure, the list may be useless to you, but when you know you have the hardware for it then why would you ever want to settle for less? You know it's just a wasted piece of hardware otherwise, paid for by your good money.
                  Only that I don't. There is no fixed function VP9 hardware on any GCN GPU. If it runs on the CUs, sure I've paid for that, but I still can't use it as it's not supported in VA-API (nor likely in dxva, for that matter). I could maybe look for an OpenCL encoder though... That's why I said this list is useless.

                  Originally posted by sdack
                  Whether it then decodes on the CPU using mmx, sse, avx or now avx2 is pretty much irrelevant. It's nice, but that's about it. It's still bad news for all those people who do have the hardware for VP9 decoding, but still have to let the CPU do it.
                  That's another story. Complain to Nvidia if you have a VP9 encoder but can't use it because they don't support VA.


                  • #19
                    Originally posted by juno
                    If it runs on the CUs, sure I've paid for that, but I still can't use it as it's not supported in VA-API (nor likely in dxva, for that matter).
                    I think that would be possible if AMD wanted to. The HEVC 8 bit hybrid decoder on Maxwell GPUs older than the GTX 960 was/is usable via DXVA; to applications it likely looks like a normal native decoder.


                    • #20
                      Originally posted by juno
                      Only that I don't. There is no fixed function VP9 hardware on any GCN GPU. If it runs on the CUs, sure I've paid for that, but I still can't use it as it's not supported in VA-API (nor likely in dxva, for that matter). I could maybe look for an OpenCL encoder though... That's why I said this list is useless.

                      That's another story. Complain to Nvidia if you have a VP9 encoder but can't use it because they don't support VA.
                      Luckily for me, ffmpeg fully supports my Nvidia card, allowing me to decode and encode in hardware at amazing speeds while the CPU sits idle at <5%. I'm all happy and glad I don't have to rely on some tweaked code that still needs to run on the CPU.
