NVIDIA vs. Radeon VDPAU Mesa 17.2 Video Decode Performance

  • NVIDIA vs. Radeon VDPAU Mesa 17.2 Video Decode Performance

    Phoronix: NVIDIA vs. Radeon VDPAU Mesa 17.2 Video Decode Performance

    In yesterday's GeForce GT 1030 Linux review, covering a $70 USD low-profile, passively-cooled graphics card, I featured a number of NVIDIA VDPAU video acceleration benchmarks. A question came up about Radeon VDPAU performance, so here are some benchmarks on that front, though they are far from ideal.


  • #2
    Thanks for this test, it's very interesting.



    • #3
      Shouldn't you test power consumption when playing a video at 24 fps?
      I think the Radeons have an issue with not clocking up when fully loaded with VDPAU tasks. However, that doesn't really affect the media-box use case.
      If you run glxgears at the same time, you will get more decoding fps, from what I've read...



      • #4
        Hello there,

        I've done a ton of testing of GPU decoding under Linux on various AMD/Nvidia platforms over the years, and there are a few things you may wish to consider when doing any sort of testing for VDPAU/VA-API:

        1. Are you deinterlacing video when rendering 1080i content? A number of low-cost AMD GPUs work great with 720p and have no issues with 1080i as long as you don't try to deinterlace. However, as soon as you enable VDPAU deinterlacing, performance tanks and the GPU cannot keep up with the frame rate.

        2. If you are deinterlacing 1080i content, what deinterlacing mode are you using? VDPAU has a number of deinterlacing quality modes available, and the higher-quality modes can adversely affect performance (or, again, even the ability to keep up with 1080i in realtime without dropping frames).

        3. Subpicture blending can significantly hurt performance on some GPUs. Enabling closed captions, rendering text over the video (e.g. showing the title of the video), or overlaying a logo can have a significant impact on the ability to keep up with realtime without frame dropping. I've had a number of users report that "the video looks great until I enable closed captions, at which point I get video stuttering".

        4. The combination of doing deinterlacing and subpicture blending simultaneously may have some unexpected performance consequences. If the deinterlacer is putting out 2xFRAMERATE, then you're going to have to blend that image 60 times a second over the video, whereas if the deinterlacer only puts out 30FPS then you only need to do it 30 times a second. And in GPUs where cycles are scarce, trying to both deinterlace *and* blend a subpicture can exhaust all the available cycles. Hence in any benchmark it's useful to have a "worst case" scenario where you're doing 1080i H.264 with "High quality" deinterlacing and closed captions or some other SPU blending present.

        The above are just some things to consider if you are looking for ways to improve your benchmarking.
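
        For example, here is a rough sketch (Python driving mpv) of the kind of worst-case run I mean. It assumes mpv was built with VDPAU support; the sample and subtitle file names are just placeholders for your own 1080i H.264 clip and a subtitle track:

        import subprocess

        SAMPLE = "sample_1080i.ts"   # placeholder: your own 1080i H.264 clip
        SUBS = "captions.srt"        # placeholder: a subtitle track to force blending

        # Step through the VDPAU deinterlacing modes and watch mpv's "Dropped:"
        # counter (or the stats overlay) to see when the GPU stops keeping up.
        for mode in ("bob", "temporal", "temporal-spatial"):
            print(f"--- deint-mode={mode} ---")
            subprocess.run([
                "mpv", SAMPLE,
                "--hwdec=vdpau", "--vo=vdpau",                # decode and present via VDPAU
                f"--vf=vdpaupp=deint=yes:deint-mode={mode}",  # VDPAU deinterlacing
                f"--sub-file={SUBS}",                         # subpicture blending
                "--osd-level=3",                              # keep the OSD on screen as well
            ], check=True)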

        Cheers,

        Devin Heitmueller



        • #5
          Not sure if this is pedantic, but you have H.264 tests and you have MPEG-4 tests.

          H.264 is part of the MPEG-4 standard... There seems to be a ton of confusion about this from a lot of reviewers who include video encoding benchmarks.

          "MPEG-4" by itself is rather unspecific. Are you talking about MPEG-4 Part 2? Simple Profile? Advanced Simple Profile?

          I'm not sure this "problem" is even Michael's, because the benchmark itself may only use the phrase "MPEG-4". It's just not good enough, as the standard covers several video coding algorithms, and many of those have multiple profiles and levels. The data is somewhat meaningless without knowing which algorithm and profile level, specifically, are being tested.
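
          For what it's worth, here is a quick sketch of checking exactly what a clip is before quoting numbers for it; it shells out to ffprobe (assumed installed), and "testclip.mp4" is just a placeholder:

          import json
          import subprocess

          out = subprocess.run(
              ["ffprobe", "-v", "error",
               "-select_streams", "v:0",                          # first video stream only
               "-show_entries", "stream=codec_name,profile,level",
               "-of", "json",
               "testclip.mp4"],
              check=True, capture_output=True, text=True,
          ).stdout

          stream = json.loads(out)["streams"][0]
          # e.g. codec_name "h264" / profile "High", versus codec_name "mpeg4"
          # (MPEG-4 Part 2) / profile "Advanced Simple Profile"
          print(stream["codec_name"], stream.get("profile"), stream.get("level"))
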
          Last edited by Holograph; 26 May 2017, 01:09 PM.



          • #6
            Well written, Devin. Your arguments matter and Michael's tests seem a bit misleading...



            • #7
              Originally posted by pjezek
              Well written, Devin. Your arguments matter and Michael's tests seem a bit misleading...
              There is nothing misleading about them, qvdpautest is publicly available and has been for years, all the questions can be answered by looking at it, as well as the Phoronix Test Suite's test profile wrapped around it.
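
              For reference, a minimal sketch of kicking off the same test through the Phoronix Test Suite from Python; it assumes the test profile is published under the plain name "qvdpautest":

              import subprocess

              # Installs any needed dependencies, runs the benchmark, and prompts to save the result.
              # If the profile name differs, `phoronix-test-suite list-available-tests` will show it.
              subprocess.run(["phoronix-test-suite", "benchmark", "qvdpautest"], check=True)
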
              Michael Larabel
              https://www.michaellarabel.com/



              • #8
                It's known that AMD doesn't ramp up the clocks for UVD; they don't need to for playback.

                R9 285 on an old Phenom II X4 965 BE box.

                Clocks auto vs. clocks forced high:

                                             Auto clocks    Forced high
                MPEG DECODING (1920x1080)    107 frames/s   175 frames/s
                MPEG DECODING (1280x720)     234 frames/s   377 frames/s
                H264 DECODING (1920x1080)    208 frames/s   358 frames/s
                H264 DECODING (1280x720)     407 frames/s   694 frames/s
                VC1 DECODING (1440x1080)      62 frames/s   145 frames/s
                MPEG4 DECODING (1920x1080)    50 frames/s   155 frames/s



                • #9
                  Right, the clocks only ramp up enough to sustain playback in order to keep power usage low. If you want to test maximum multimedia performance, you can force the clocks high via sysfs.
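
                  Something like this rough sketch will do it (needs root); it assumes the usual radeon/amdgpu power_dpm_force_performance_level file and that your GPU is card0:

                  from pathlib import Path

                  # Assumed sysfs path for the first GPU; adjust card0 if you have more than one.
                  DPM = Path("/sys/class/drm/card0/device/power_dpm_force_performance_level")

                  def set_perf_level(level: str) -> None:
                      """Write 'auto', 'low' or 'high' to force a DPM performance level."""
                      DPM.write_text(level)

                  set_perf_level("high")      # pin the clocks high for the benchmark run
                  try:
                      pass                    # ... run qvdpautest or the playback test here ...
                  finally:
                      set_perf_level("auto")  # give clock management back to the driver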



                  • #10
                    Originally posted by ernstp
                    Shouldn't you test power consumption when playing a video at 24 fps?
                    I think the Radeons have an issue with not clocking up when fully loaded with VDPAU tasks. However, that doesn't really affect the media-box use case.
                    If you run glxgears at the same time, you will get more decoding fps, from what I've read...
                    That's pretty interesting. All of the Radeon GPUs tested performed roughly the same and the frame rates were plenty sufficient.

                    But what I don't get is why the power consumption is so bad. If the clock rates are limited, shouldn't the power consumption reflect that?

