SVT-AV1 0.5 Released As Intel's Speedy AV1 Video Encoder

  • #21
    Originally posted by edwaleni View Post
    objective analysis
    I've had luck with sitkevij/ffmpeg:

    Code:
    docker run --rm -it sitkevij/ffmpeg:3.3-vmaf -i output.mkv -s 1920x1080 -r 50 -i input.yuv -lavfi libvmaf -f null -
    Unfortunately, the bundled ffmpeg was too old to support AV1, so you would have to rebuild it. There was also a very apparent memory leak, which may or may not be fixed by upgrading ffmpeg.
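    With a newer ffmpeg built with --enable-libvmaf, the same measurement should work without the docker image. As a sketch (a guess at the equivalent invocation, not a tested recipe), here is a small Python helper that assembles the command; the file names mirror the ones above and are placeholders:

```python
# Sketch: scoring an encode against its raw source with a libvmaf-enabled
# ffmpeg (4.x or newer). Assumes ffmpeg was built with --enable-libvmaf;
# file names are placeholders mirroring the docker command quoted above.
import shlex


def vmaf_command(distorted, reference, width=1920, height=1080, fps=50):
    """Build an ffmpeg argument list that scores `distorted` vs `reference`."""
    return [
        "ffmpeg",
        "-i", distorted,            # decoded/encoded output (e.g. .mkv)
        "-s", f"{width}x{height}",  # raw YUV input needs explicit geometry
        "-r", str(fps),             # ...and an explicit frame rate
        "-i", reference,            # raw source (e.g. .yuv)
        "-lavfi", "libvmaf",        # VMAF score is printed to stderr
        "-f", "null", "-",          # discard the filtered frames
    ]


cmd = vmaf_command("output.mkv", "input.yuv")
print(shlex.join(cmd))
```

    Actually running it is just `subprocess.run(cmd, check=True)`, assuming a libvmaf-enabled ffmpeg is on the PATH.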
    Last edited by andreano; 22 May 2019, 12:10 PM.

    Comment


    • #22
      Originally posted by Spooktra View Post
      x264 --preset veryslow --tune ssim --crf 16 -o test.x264.crf16.264 orig.i420.y4m

      For the uninitiated, the "veryslow" preset is considered the "mastering"-quality preset, and CRF 15 is generally considered visually lossless relative to the source. For this test they used "veryslow" with CRF 16, which means it would be nearly impossible for the average person to tell the difference between the source and the encoded version.

      Further, the "veryslow" preset is, as the name implies, indeed very slow; encoding times are glacial.

      If SVT-AV1 matched x264+veryslow+crf 16 with -q 20 -enc-mode 3 (as the command line says) then it doesn't get much better than that because with settings as aggressive as x264's SVT-AV1 would smoke it.
      I don't see what you are trying to say here. If SVT-AV1 is as slow as "x264 --preset veryslow", while not providing better quality at the same bit rate, then clearly x264 is the better choice for that bit rate. As others already stated: A comparison of only speed without the other dimensions considered is useless.

      I would say that Intel's SVT family of encoders are the future
      Only if their combination of result dimensions becomes more competitive.
      Currently, I see a lot of cases where even H.264 is preferable to HEVC, simply because spending lots of extra encoding CPU time yields a very slim advantage in image quality, if any. Especially when plenty of bit rate is available, HEVC isn't preferable to H.264, and AV1 will certainly be much worse for a long time.

      Comment


      • #23
        Originally posted by edwaleni View Post
        Jan Ozer has an article looking at Netflix's VMAF here:
        Regardless of how great the measurement method is that Netflix uses, they will certainly use less bandwidth for encoding than necessary to avoid visible compression artifacts. And that is because of the measure "profit", which rises with each bit/s they shave off.

        Comment


        • #24
          Originally posted by Gusar View Post
          PSNR is beyond terrible as a metric (it favors blur, which is the most horrible thing you can do to human eyes), even SSIM is bad. They tell you *nothing* regarding actual video quality. What the psnr and ssim tunings in x264 do is they *disable* encoder tools that make the video look good! The ssim tuning disables psychovisual rate-distortion decisions, while the psnr tuning additionally disables adaptive quantization, those two being _the_ most important x264 tools that make it such a great encoder. Why does x264 do that? Simple, disabling these tools results in higher psnr/ssim scores. Which is completely backwards. So those two metrics are useless. An encoder comparison that deals with PSNR and SSIM is therefore completely worthless.

          So that command line in Spooktra's post - "x264 --preset veryslow --tune ssim --crf 16 -o test.x264.crf16.264 orig.i420.y4m" (emphasis mine)... yeah.

          Netflix developed a new metric that takes a different approach, the magic of machine learning. It's called VMAF. But even VMAF isn't really it. There's just no substitute for actually looking at the encoded video. So, the people here asking for an "objective quality benchmark"... it doesn't exist.
          Hysterical! For those who actually care to learn: here is what Dark Shikari, one of the two main x264 developers and the creator of mb-tree, said when he first introduced mb-tree:


          __________________________________________________
          What improvement does it give?

          I've gotten up to a 70% SSIM increase.

          You're joking, right?

          No.

          Seriously? How the hell can you get a 70% quality increase?

          Magic.
          __________________________________________________

          For those that don't know, mb-tree requires AQ in order to work, and they are two of the five Psy "optimizations" x264 offers.

          Look at this thread, post 23:



          Where this same x264 developer talks about using SSIM internally to gauge encoding quality and he has also admitted that x264 uses PSNR internally to determine quality.

          Comment


          • #25
            Originally posted by dwagner View Post
            I don't see what you are trying to say here. If SVT-AV1 is as slow as "x264 --preset veryslow", while not providing better quality at the same bit rate, then clearly x264 is the better choice for that bit rate. As others already stated: A comparison of only speed without the other dimensions considered is useless.
            As I pointed out, x264's veryslow preset is considered "mastering" quality and CRF 15 is considered visually lossless relative to the source; by definition, as far as quality is concerned, it's impossible to beat x264 with those settings from a pure quality standpoint.

            All you can do is match that quality at a smaller file size, which SVT-AV1 does, and/or beat it in terms of encoding speed, which, as evidenced by Michael's numerous tests, Intel's SVT family of encoders does easily.

            Originally posted by dwagner View Post
            Only if their combination of result dimensions becomes more competitive.
            Currently, I see a lot of cases where even H.264 is preferable to HEVC, simply because spending lots of extra encoding CPU time yields a very slim advantage in image quality, if any. Especially when plenty of bit rate is available, HEVC isn't preferable to H.264, and AV1 will certainly be much worse for a long time.
            The x265 people have already embraced Intel's SVT-HEVC and now allow you to use the Intel encoder from within the x265 framework:



            With changeset a41325fc854f, the x265 library can invoke the SVT-HEVC library for encoding through the --svt option. We have mapped presets and command-line options supported by the x265 application into the equivalent options of SVT-HEVC, and have added a few specific options that are available only when the SVT-HEVC library is invoked. This page in […]


            When a major player in the encoder market incorporates support for a competitor's product into their framework, the writing is on the wall.

            Comment


            • #26
              Originally posted by sophisticles View Post
              Where this same x264 developer talks about using SSIM internally to gauge encoding quality and he has also admitted that x264 uses PSNR internally to determine quality.
              Yes, *specific* tools inside the encoder use these metrics to make decisions. Other tools make use of "magic" lambda values that were determined by... encoding videos several times with different settings and then _looking_ at the results and deciding which lambda value works best. AQ was tuned in a huge thread at doom9.org where people were testing different settings and then also _looking_ at the resulting encodes and providing sample pictures.

              So PSNR and SSIM have their uses, but they are worthless to determine *overall* video quality. You can't do two encodes, measure their psnr/ssim and determine the winner. Because you'll be pleasantly surprised if you actually go and look at those encodes. This can easily be proven by x264's tune switch - the psnr and ssim tunings will give higher scores in those metrics but it'll be the film tuning (or animation tuning if encoding classic cell-shaded animation) that gives the best video.

              It's exactly at the doom9.org forums that people, including and especially x264 developer Dark Shikari, have been harping for years how psnr and ssim aren't a measure of video quality. So it's really funny that you link to doom9.org of all places as supposed proof that my post was "hysterical". Nothing hysterical about my post, it's exactly from doom9.org and Dark Shikari's blog that I have my knowledge from. For example, this: https://web.archive.org/web/20150119...x/archives/458
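              The "PSNR favors blur" point is easy to demonstrate on a toy signal. A sketch (synthetic pixel values, not a real encode): blurring away every bit of detail in a high-frequency pattern scores a *higher* PSNR than mild noise that leaves the pattern fully visible:

```python
import math


def psnr(orig, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(orig, test)) / len(orig)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# A high-frequency pattern: alternating dark/light pixels.
original = [100, 150] * 32

# "Blur": every pixel replaced by the local mean -- all detail is gone.
blurred = [125] * 64

# Mild noise: +/-30 around the original -- the pattern stays fully visible.
noisy = [v + 30 if i % 2 else v - 30 for i, v in enumerate(original)]

print(psnr(original, blurred))  # about 20.17 dB -- the higher score...
print(psnr(original, noisy))    # about 18.59 dB -- ...for the blurrier result
```

              The detail-free flat gray scores roughly 1.6 dB higher than the version a viewer would almost certainly prefer, which is exactly the failure mode being described.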

              Comment


              • #27
                Originally posted by Gusar View Post
                It's exactly at the doom9.org forums that people, including and especially x264 developer Dark Shikari, have been harping for years how psnr and ssim aren't a measure of video quality. So it's really funny that you link to doom9.org of all places as supposed proof that my post was "hysterical". Nothing hysterical about my post, it's exactly from doom9.org and Dark Shikari's blog that I have my knowledge from. For example, this: https://web.archive.org/web/20150119...x/archives/458
                I know all about the doom9 threads; I was a member there for years before I got banned for arguing with His Darkness. Jason AKA Fiona AKA Dark Shikari has been one of the most disingenuous individuals ever. In an effort to ruthlessly promote his software encoder he spread lie after lie, and it became gospel among those who do not have the technical background to see through his BS. This was a guy who didn't even have his Comp Sci degree yet, who passed himself off as a video encoding expert and talked crap every chance he got.

                When he stole AQ from other software encoders, sorry, I mean "invented" it, and it was found that in some cases it lowered PSNR and SSIM (both mathematically based engineering concepts, BTW), he started the FUD campaign about how PSNR and SSIM can't be trusted, that only "your eyes" can be trusted, and he started that garbage about how "your eye doesn't want to see a picture similar to the source but rather one that has similar complexity". Of course he never bothered to explain who the fuck he is to tell me what my eye wants to see, or what "similar complexity" actually means (it's jargon, mish-mash that doesn't actually mean anything). He also never explained how it is that only "your eyes" can be trusted when everyone's opinion of what looks better is different, when people's eyes perceive pictures differently, and when different quality monitors, room lighting and drivers will produce different quality results.

                Subjective measurements, like the ones Dark Shikari and all his worshipers espouse, are inherently flawed because they are just an opinion; determining the quality of an encoder by "what looks better" is like asking "which car is the nicest" or "which cheeseburger tastes the best".

                These are questions that have no valid answer, you can say which car is quicker 0-60, which has more horsepower, which one has the lower coefficient of drag and so on but you can't say this car is nicer.

                I'm telling you right now: if you encode two files and one has a PSNR of 47 dB across all three YUV channels and the other has a PSNR of 50 dB across all three, then the latter is of higher quality. The mistake people make, which leads them to believe that PSNR is not a good indicator of quality, is that they usually look only at the Y channel. For instance, the first encode may have a PSNR-Y of 47 dB but PSNR-U and PSNR-V of 45 dB, while the second has a PSNR-YUV of 46 dB; obviously the latter will be of higher quality, but people will look only at the PSNR-Y measurement and then say "see, PSNR is a poor indicator of quality".

                If people knew what they were doing they wouldn't believe such silly things.
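                For what it's worth, the plane arithmetic behind this argument can be made concrete, so anyone can check such claims for themselves. A sketch of how per-plane PSNR values combine into a single 4:2:0 figure (MSE weighted by sample count; the 47/45/46 dB values are the ones from the post above):

```python
import math

PEAK = 255.0  # 8-bit video


def psnr_from_mse(mse):
    return 10 * math.log10(PEAK ** 2 / mse)


def mse_from_psnr(p):
    return PEAK ** 2 / (10 ** (p / 10))


def combined_psnr_420(psnr_y, psnr_u, psnr_v):
    """Global PSNR over all samples of a 4:2:0 frame.

    In 4:2:0, each chroma plane holds a quarter as many samples as luma,
    so the per-plane MSEs combine with 4:1:1 weights.
    """
    mse = (4 * mse_from_psnr(psnr_y)
           + mse_from_psnr(psnr_u)
           + mse_from_psnr(psnr_v)) / 6
    return psnr_from_mse(mse)


# Encode A: strong luma, weaker chroma.  Encode B: 46 dB on every plane.
a = combined_psnr_420(47.0, 45.0, 45.0)
b = combined_psnr_420(46.0, 46.0, 46.0)
print(round(a, 2), round(b, 2))  # about 46.23 and 46.0
```

                Note that because MSEs (not dB values) are what average, and luma dominates the sample count, a single flat per-plane figure and a lopsided Y-heavy one can land within a fraction of a dB of each other.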

                Comment


                • #28
                  sophisticles Have you actually done what I wrote about, use the different tunings of x264? Because I have. Three encodes of the same video, each 2-pass target bitrate (to eliminate video size as a variable), the only difference being --tune={psnr,ssim,film}. The first one will look terrible, the second one passable, the third one will actually look good. That's not "gospel" or whatever you're on about, that's a *fact* that each person can verify for themselves.

                  Or, if you don't want to use x264 because of your personal issue with Dark Shikari, use theora. The last libtheora release is 1.1, it optimizes for PSNR. Then use the git version, which optimizes for SSIM. Do an encode with each version, 2-pass target bitrate, and look at the result. The 1.1 release produces garbage, an unusable blurry mess. The git version is passable at least. And this is again something that each person can verify for themselves, this isn't me trying to preach "gospel" or some such.

                  PS. Dark Shikari never claimed he invented AQ. Unless you can provide a link to prove it, then it's *you* who is spreading FUD. And if you think using words like "His Darkness" and "worshipers" gives any credibility to your statements, think again.

                  Comment


                  • #29
                    Just use VMAF and pretend that it works.

                    It is the best objective metric, and would be infinitely better than nothing, which is the current state of Phoronix encoder testing.
                    Last edited by andreano; 21 May 2019, 12:47 PM.

                    Comment


                    • #30
                      Originally posted by Gusar View Post
                      sophisticles Have you actually done what I wrote about, use the different tunings of x264? Because I have. Three encodes of the same video, each 2-pass target bitrate (to eliminate video size as a variable), the only difference being --tune={psnr,ssim,film}. The first one will look terrible, the second one passable, the third one will actually look good. That's not "gospel" or whatever you're on about, that's a *fact* that each person can verify for themselves.

                      Or, if you don't want to use x264 because of your personal issue with Dark Shikari, use theora. The last libtheora release is 1.1, it optimizes for PSNR. Then use the git version, which optimizes for SSIM. Do an encode with each version, 2-pass target bitrate, and look at the result. The 1.1 release produces garbage, an unusable blurry mess. The git version is passable at least. And this is again something that each person can verify for themselves, this isn't me trying to preach "gospel" or some such.

                      PS. Dark Shikari never claimed he invented AQ. Unless you can provide a link to prove it, then it's *you* who is spreading FUD. And if you think using words like "His Darkness" and "worshipers" gives any credibility to your statements, think again.
                      Go search the videohelp forums and you tell me if I have done any encoding tests.

                      As for DS, he most certainly has claimed he invented AQ and he ruthlessly promoted it and tried to market it, especially to the Main Concept folks (who btw include 3 different types of AQ in their encoder).

                      As for x264, I do use it once in a while and when I do I always turn off all psychovisual crap, if I wanted something to mess with my head I would still be with my ex-girlfriend. I nearly always use CRF 15, preset very slow and tune PSNR.

                      And my "issue" with Jason/Fiona is that he was full of crap; he lied about so many things: GPU acceleration, psychovisual optimizations, the list goes on. He took credit for shit he didn't invent or conceive of: psy optimizations were present in the DivX encoder, AQ was patented back in 1995, RDO was patented back in 1998, trellis was patented back in 1997:

                      A rate control algorithm for an MPEG-2 compliant encoder is described. The rate control algorithm has embodiments useful for constant bit rate and variable bit rate encoding. In particular, the invention relates to adaptive quantization.


                      A new method for real time implementation of rate-distortion optimized coding mode selection is disclosed that can be efficiently applied to H-263-compatible video codecs and other codecs of similar type. A normalized rate-distortion model is used to efficiently compute the rate and the distortion when encoding motion-compensated prediction error signals, instead of performing DCT, quantization and entropy-encoding. A fast algorithm is also disclosed that determines sub-optimal values of coding parameters such as the quantization parameter and the Lagrangian multiplier, λ, for the trellis search. Very good rate control and picture quality are achieved, especially when the disclosed techniques are applied in very low bitrate video coding.


                      A trellis encoder circuit comprises receiving means to receive a stream of digital bits, loading means for loading M successive data bits into a first data register from one of said receiving means and another data register, N successive data registers, each successive data register connected in series with one of said successive data registers and said first data register, means for cycling the digital bits in the last of said N successive data registers into said first data register, first multiplexer means for selecting one of plural sets of digital bits from said last data register, means for trellis encoding said one set of digital bits and providing a trellis encoded set of digital bits, and logic means for cycling the digital bits in said successive registers until all the digital bits have been trellis encoded and for reloading said successive registers from said stream of digital bits wherein N and M are integers greater than 1.


                      He has built up a cult-like following, almost becoming a folk hero by claiming to have invented and created stuff that predated his software by nearly a decade.

                      Comment
