FFmpeg Lands NVDEC-Accelerated H.264 Decoding


  • FFmpeg Lands NVDEC-Accelerated H.264 Decoding

    Phoronix: FFmpeg Lands NVDEC-Accelerated H.264 Decoding

    NVIDIA has been shifting their focus from VDPAU for GPU-accelerated video decoding to instead the NVIDIA Video Codec SDK that offers NVENC for encoding and NVDEC for video decoding. FFmpeg has landed initial NVDEC support...
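As a point of reference, this is roughly how the new decode path is invoked from the command line (a sketch: the option names assume an FFmpeg build configured with NVDEC support and an NVIDIA driver that ships the Video Codec SDK libraries; `input.mp4` is a placeholder):

```shell
# Decode through NVDEC and discard the output - a quick way to verify
# that hardware decoding works at all on a given machine.
ffmpeg -hwaccel nvdec -i input.mp4 -f null -
```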


  • #2
    Cool. I've been using this to decode 10-bit 4K files in mpv for a year now. It feels much more stable than VDPAU ever did.
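For anyone who wants to reproduce that mpv setup, a minimal sketch (option names as in recent mpv builds; exact availability depends on how mpv and its FFmpeg were compiled):

```shell
# One-off: ask mpv for NVDEC decoding; the "-copy" variant copies frames
# back to system memory so software filters keep working.
mpv --hwdec=nvdec-copy video.mkv

# Or persistently, by adding a line to ~/.config/mpv/mpv.conf:
#   hwdec=nvdec
```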


  • #3
    It forces the driver into the P2 CUDA power state, which prevents downclocking below the boost clock. That wastes a lot of energy on bigger GPUs. DXVA2/D3D11VA on Windows have the opposite effect: they let the GPU clock down, while there are still no performance issues in mpv...
    NVIDIA's support for fixing bugs in cuvid was also really bad; it's much more prone to corruption with certain videos compared to the Windows APIs.


  • #4
    I have yet to understand why we use GPUs for video encoding/decoding via special circuit. We should target GPUs via general software with OCL or OMP support, or something C_amp style.


  • #5
    If you mean fixed-function units in hardware: they are specialized, and thus much faster and more efficient than general processing units for their given purpose.


  • #6
    Originally posted by aufkrawall View Post
    It forces the driver into P2 cuda state, which prevents downclocking below boostclock. Wastes lots of energy with bigger GPUs.
    Thankfully, using amdgpu via VA-API in mpv is very energy efficient - both CPU and GPU can stay at their lowest clock rates, even while a 4K/10-bit video is being decoded. A single CPU core is utilized to < 10%, and the GPU doesn't warm up.


  • #7
    Originally posted by artivision View Post
    I have yet to understand why we use GPUs for video encoding/decoding via special circuit. We should target GPUs via general software with OCL or OMP support, or something C_amp style.
    Because it's all about Windows ShadowPlay and the new streaming services, and the "special circuit" is there to achieve better gaming performance while encoding.


  • #8
    Well, ffmpeg has had cuvid support forever, which is the exact same API, just under its old name.

    This is in fact just a second implementation that uses ffmpeg's native format parsers instead of relying on the ones NVIDIA supplies, which are often not 100% compatible and lack features, for example closed-caption support.

    Once there is feature parity, the old cuvid decoder will be deprecated and slowly phased out, though that might still take years.
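The two code paths contrasted in that post can be selected explicitly on the command line (a sketch; the decoder and hwaccel names assume an FFmpeg build with both cuvid and NVDEC enabled):

```shell
# Old path: the dedicated cuvid decoder, which uses NVIDIA's own parsers.
ffmpeg -c:v h264_cuvid -i input.mp4 -f null -

# New path: ffmpeg's native H.264 parser feeding NVDEC as a generic hwaccel.
ffmpeg -hwaccel nvdec -i input.mp4 -f null -
```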


  • #9
    Originally posted by artivision View Post
    I have yet to understand why we use GPUs for video encoding/decoding via special circuit. We should target GPUs via general software with OCL or OMP support, or something C_amp style.
    Because GPUs are terribly bad at video de/encoding. It's not a task where massive parallelism helps you; it constantly gets in your way. So instead they add dedicated hardware, which is more power efficient and faster, and also leaves the GPU free to do other stuff.


  • #10
    Originally posted by artivision View Post
    I have yet to understand why we use GPUs for video encoding/decoding via special circuit. We should target GPUs via general software with OCL or OMP support, or something C_amp style.
    Because that's incredibly inefficient. GPUs get the performance they do because they are massively parallel. That makes a lot of sense for graphics, because you can break processing down on a per-pixel basis. For video decoding and encoding, large portions of the pipeline are entirely serial and cannot be parallelized at all: every calculation relies on the result of the previous one, and running a single thread on a GPU capable of thousands of calculations at a time is going to be terrible.

    You could, of course, design a new video codec that can be parallelized. That's just not how the ones currently in use behave, and it would be a lot of work to create a new codec that provides the same compression levels. Maybe the Daala/VP/etc. replacement codec that's being worked on goes in that direction; I'm not sure. The current VP9/H.264/H.265-type codecs are very locked into the serial style, though.
    Last edited by smitty3268; 11 November 2017, 09:08 PM.
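The loop-carried dependency described in that post can be made concrete with a toy sketch (hypothetical code; simple delta prediction stands in for real entropy decoding or intra prediction, which have the same step-by-step dependency):

```python
# Toy illustration, NOT a real codec: a predictive "decoder" in which every
# output sample depends on the previously reconstructed sample. The
# loop-carried dependency is why this stage cannot be spread across
# thousands of GPU threads.

def encode(samples):
    """Store only each sample's difference from the previous sample."""
    residuals, prev = [], 0
    for s in samples:
        residuals.append(s - prev)
        prev = s
    return residuals

def decode(residuals):
    """Each step needs the result of the previous step - strictly serial."""
    out, prev = [], 0
    for r in residuals:
        prev = prev + r   # step i cannot start before step i-1 finishes
        out.append(prev)
    return out

samples = [10, 12, 11, 15, 15, 14]
assert decode(encode(samples)) == samples
```

A fixed-function decode block handles exactly this kind of serial state machine in dedicated silicon, which is why it beats running the same loop on thousands of GPU shader cores.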
