AMD FidelityFX Super Resolution 2.0 Source Code Published


  • #31
    Originally posted by birdie View Post
    Just a little under 2MB of source code vs ~238MB for NVIDIA DLSS 2.2.
    Perhaps 95% of DLSS' size is the AI model...

    Comment


    • #32
      Originally posted by tildearrow View Post
      Perhaps 95% of DLSS' size is the AI model...
      Actual DLSS DLL is ~14MB.

      Comment


      • #33
        Originally posted by birdie View Post
        This is not what professional reviewers have concluded:
        It's decent, but it's definitely not better.
        That dude who made the video is a very well known nvidia shill. Historically, he goes out of his way and will say anything to make nvidia look good, trashes games that don't support RTX, etc.

        There is no way he's not on some kind of nvidia payroll or preferential treatment behind the scenes.
        Last edited by RealNC; 23 June 2022, 03:43 AM.

        Comment


        • #34
          Originally posted by brucethemoose View Post

          Depth buffers can be generated, see: https://github.com/baowenbo/DAIN

          This could definitely be ported to VapourSynth. But doing it in real time... well that's another matter.

          Maybe it would still be OK in mpv if you gave it a dummy or really basic depth buffer? Or maybe there is a toggle, I haven't looked at the code yet.
          mpv support could be nice, I guess. FSR1 + nvsharpen actually does a really good job on live streams, FSR does a good job on YouTube content, and I prefer fsrcnnx for movies. I wonder if any of the bits and bobs from FSR2 could be useful there. That would be neat.

          Comment


          • #35
            Originally posted by RealNC View Post

            That dude who made the video is a very well known nvidia shill. Historically, he goes out of his way and will say anything to make nvidia look good, trashes games that don't support RTX, etc.

            There is no way he's not on some kind of nvidia payroll or preferential treatment behind the scenes.
            Yet the criticized points are valid unless someone proves they are not (which will not happen in this case).
            You can criticize an author as biased by proving him one-sided or wrong. But dismissing an analysis just because you think the author is biased is poor reasoning.

            Comment


            • #36
              Originally posted by jrch2k8 View Post

              Again, I'm not saying it is perfect, but I didn't notice the issues during normal gameplay, or at least they were not noticeable enough to affect me.

              Sure, if you walk around slowly, carefully checking every frame, stopping motion and then moving violently, I'm sure issues will appear. That is true for DLSS as well, though to a lesser extent depending on the game, BUT most people won't do that.

              I also understand there are people with eyes sensitive enough to notice such things, just as there are people with ears sensitive enough to notice distortion from audio bitrate or equipment, and I'm not saying you have to use FSR2. I'm just saying that for most users doing normal gameplay it is barely noticeable and good enough.

              Does it need improvements? I'm sure it does, and it will get better over time, but for now it is good enough that the average Joe gamer doesn't need NVIDIA RTX-class hardware for decent upscaling. Those who need a bit more can put up the extra cash and go with DLSS, and we're all happy.
              You misunderstand the topic. Yes, in many cases DLSS Quality and FSR 2.0 Quality are hard to distinguish. But if they are hard to differentiate, you can drop the setting even lower. And that is the issue with FSR: while it is quite competitive at Quality vs. Quality, it isn't at Performance vs. Performance.

              The second issue is particle effects. Particle effects have no motion vectors or depth buffer that FSR can rely on. DLSS at least has a neural network to reconstruct the image in that case.
              comparison.jpg
              Where FSR 2.0 is good, it still doesn't scale as well as DLSS to low resolutions. This is a real issue because people with weak (but not totally garbage) GPUs are the ones interested in upscaling. An RTX 2060/3050 user can use DLSS Performance and will be really happy with the results most of the time. Can you use a 6600 XT/5600 XT with FSR 2.0 Performance? The answer is: not really.

              Of course you could point out that you cannot use DLSS with a GTX 1060, but the issue is that the GTX 1060 was a 1080p card, and nowadays it doesn't even drive 1080p well. So you are forced into such a low internal resolution that FSR 2.0 struggles to guess pixels, which means you will prefer to drop settings rather than use FSR.

              Ironically, FSR 2.0 is a godsend only for owners of strong Pascal cards (like the 1080 Ti). That is the only old card that doesn't suffer a major slowdown from using FSR 2.0 while still being able to drive an internal resolution high enough to make it work well.
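
To put rough numbers on the internal-resolution argument above, here is a minimal sketch. The per-axis scale factors are the ones AMD documents for the FSR 2.0 quality modes; they are not taken from this thread, and the helper names are made up for illustration.

```python
# Per-axis scale factors for the FSR 2.0 quality modes, as listed in AMD's
# FSR 2.0 documentation (an outside fact, not something stated in the thread).
FSR2_SCALE = {
    "quality": 1.5,
    "balanced": 1.7,
    "performance": 2.0,
    "ultra_performance": 3.0,
}


def internal_resolution(out_w: int, out_h: int, mode: str) -> tuple[int, int]:
    """Return the render resolution the upscaler starts from."""
    scale = FSR2_SCALE[mode]
    return round(out_w / scale), round(out_h / scale)


def pixel_fraction(out_w: int, out_h: int, mode: str) -> float:
    """Fraction of the output pixels that are actually rendered."""
    w, h = internal_resolution(out_w, out_h, mode)
    return (w * h) / (out_w * out_h)


if __name__ == "__main__":
    for mode in FSR2_SCALE:
        w, h = internal_resolution(1920, 1080, mode)
        share = pixel_fraction(1920, 1080, mode)
        print(f"{mode:>17}: {w}x{h} ({share:.0%} of output pixels)")
```

At a 1080p output, Performance mode renders at 960x540, a quarter of the output pixels, which is the "weak internal resolution" situation described for 1080p-class cards.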

              Comment


              • #37
                Originally posted by tildearrow View Post

                Perhaps 95% of DLSS' size is the AI model...
                Looks like that:

                Code:
                  18935 cuda_auto_exposure_copy_kernel.sm87.h
                  20522 cuda_clear_buffer_kernel.sm87.h
                  18925 cuda_copy_exposure_kernel.sm87.h
                  41320 cuda_downsample_kernel.sm87.h
                  85307 cuda_dump_kernel.sm87.h
                 334931 cuda_engine_input_kernel_dbg.sm87.h
                 169381 cuda_engine_input_kernel_rel_depthinv_hdr_colvar_mvhi.sm87.h
                 178981 cuda_engine_input_kernel_rel_depthinv_hdr_colvar_mvlo.sm87.h
                 182983 cuda_engine_input_kernel_rel_depthinv_hdr_depdis_mvhi.sm87.h
                 192583 cuda_engine_input_kernel_rel_depthinv_hdr_depdis_mvlo.sm87.h
                 128581 cuda_engine_input_kernel_rel_depthinv_hdr_mvdiff_mvhi.sm87.h
                 138181 cuda_engine_input_kernel_rel_depthinv_hdr_mvdiff_mvlo.sm87.h
                 173383 cuda_engine_input_kernel_rel_depthinv_ldr_colvar_mvhi.sm87.h
                 182983 cuda_engine_input_kernel_rel_depthinv_ldr_colvar_mvlo.sm87.h
                 186982 cuda_engine_input_kernel_rel_depthinv_ldr_depdis_mvhi.sm87.h
                 196582 cuda_engine_input_kernel_rel_depthinv_ldr_depdis_mvlo.sm87.h
                 127783 cuda_engine_input_kernel_rel_depthinv_ldr_mvdiff_mvhi.sm87.h
                 136582 cuda_engine_input_kernel_rel_depthinv_ldr_mvdiff_mvlo.sm87.h
                 169381 cuda_engine_input_kernel_rel_depthreg_hdr_colvar_mvhi.sm87.h
                 178981 cuda_engine_input_kernel_rel_depthreg_hdr_colvar_mvlo.sm87.h
                 182983 cuda_engine_input_kernel_rel_depthreg_hdr_depdis_mvhi.sm87.h
                 192583 cuda_engine_input_kernel_rel_depthreg_hdr_depdis_mvlo.sm87.h
                 128581 cuda_engine_input_kernel_rel_depthreg_hdr_mvdiff_mvhi.sm87.h
                 138181 cuda_engine_input_kernel_rel_depthreg_hdr_mvdiff_mvlo.sm87.h
                 173383 cuda_engine_input_kernel_rel_depthreg_ldr_colvar_mvhi.sm87.h
                 182983 cuda_engine_input_kernel_rel_depthreg_ldr_colvar_mvlo.sm87.h
                 186982 cuda_engine_input_kernel_rel_depthreg_ldr_depdis_mvhi.sm87.h
                 196582 cuda_engine_input_kernel_rel_depthreg_ldr_depdis_mvlo.sm87.h
                 127783 cuda_engine_input_kernel_rel_depthreg_ldr_mvdiff_mvhi.sm87.h
                 136582 cuda_engine_input_kernel_rel_depthreg_ldr_mvdiff_mvlo.sm87.h
                 133735 cuda_engine_output_kernel_dbg.sm87.h
                 125760 cuda_engine_output_kernel_rel_hdr_gauss3x3.sm87.h
                 182559 cuda_engine_output_kernel_rel_hdr_gauss5x5.sm87.h
                 109761 cuda_engine_output_kernel_rel_ldr_gauss3x3.sm87.h
                 165759 cuda_engine_output_kernel_rel_ldr_gauss5x5.sm87.h
                  22922 cuda_luma_convert_kernel.sm87.h
                  28918 cuda_reduce_sum_kernel.sm87.h
                  50922 cuda_upscale_sum_kernel.sm87.h
                 127371 nchw8_conv_3x3_pool_512_032_008_fp16_e5m3_kernel.sm87.h
                 114631 nhwc_bilinear_upsample_conv_1x1_conv_1x1_128_128_032_064_128_fp16_e5m3_kernel.sm87.h
                 108229 nhwc_bilinear_upsample_conv_1x1_conv_1x1_128_128_032_064_128_fp16_fp16_kernel.sm87.h
                  92230 nhwc_bilinear_upsample_conv_1x1_conv_1x1_256_032_032_048_032_e5m3_fp16_kernel.sm87.h
                  82630 nhwc_bilinear_upsample_conv_1x1_conv_1x1_256_032_032_048_032_fp16_fp16_kernel.sm87.h
                 141829 nhwc_bilinear_upsample_conv_1x1_conv_1x1_256_064_064_032_064_e5m3_e5m3_kernel.sm87.h
                 118630 nhwc_bilinear_upsample_conv_1x1_conv_1x1_256_064_064_032_064_fp16_fp16_kernel.sm87.h
                1148620 nhwc_bilinear_upsample_conv_1x1_conv_1x1_aniso_gaussian_dfn_3x3_128_032_032_048_032_e5m3_fp16_kernel.sm87.h
                1138222 nhwc_bilinear_upsample_conv_1x1_conv_1x1_aniso_gaussian_dfn_3x3_128_032_032_048_032_fp16_fp16_kernel.sm87.h
                2022225 nhwc_bilinear_upsample_conv_1x1_conv_1x1_aniso_gaussian_dfn_5x5_128_032_032_048_032_e5m3_fp16_kernel.sm87.h
                2011023 nhwc_bilinear_upsample_conv_1x1_conv_1x1_aniso_gaussian_dfn_5x5_128_032_032_048_032_fp16_fp16_kernel.sm87.h
                 579055 nhwc_bilinear_upsample_conv_1x1_conv_1x1_aniso_gaussian_dfn_fastpath_3x3_128_032_032_048_032_e5m3_fp16_kernel.sm87.h
                 937456 nhwc_bilinear_upsample_conv_1x1_conv_1x1_aniso_gaussian_dfn_fastpath_5x5_128_032_032_048_032_e5m3_fp16_kernel.sm87.h
                  60959 nhwc_conv_1x1_064_032_064_fp16_fp16_kernel.sm87.h
                 102561 nhwc_conv_1x1_128_128_032_fp16_fp16_kernel.sm87.h
                 164170 nhwc_conv_1x1_pool_256_064_032_e5m3_e5m3_kernel.sm87.h
                 122569 nhwc_conv_1x1_pool_256_064_032_fp16_fp16_kernel.sm87.h
                 134569 nhwc_conv_3x3_pool_128_128_032_e5m3_fp16_kernel.sm87.h
                 126571 nhwc_conv_3x3_pool_128_128_032_fp16_fp16_kernel.sm87.h
                  80170 nhwc_conv_3x3_pool_512_032_016_fp16_e5m3_kernel.sm87.h
                  72970 nhwc_conv_3x3_pool_512_032_016_fp16_fp16_kernel.sm87.h
                 114599 nhwc_conv_3x3_pool_conv_1x1_pool_512_064_016_fp16_e5m3_kernel.sm87.h
                 104198 nhwc_conv_3x3_pool_conv_1x1_pool_512_064_016_fp16_fp16_kernel.sm87.h
                15MB for just some CUDA weights.
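
For anyone who wants to check a total like that, a small sketch that sums the leading byte counts of a `wc -c`-style listing. The function name is hypothetical and the sample is only a four-line excerpt of the listing above; paste the whole listing in to get the full figure, which by my tally comes out at about 15 MB.

```python
def total_bytes(listing: str) -> int:
    """Sum the leading byte counts of a `wc -c`-style listing."""
    total = 0
    for line in listing.splitlines():
        parts = line.split(maxsplit=1)
        # Skip blank lines and anything that does not start with a number.
        if len(parts) == 2 and parts[0].isdigit():
            total += int(parts[0])
    return total


# Excerpt only; the real listing is much longer.
SAMPLE = """
  18935 cuda_auto_exposure_copy_kernel.sm87.h
  20522 cuda_clear_buffer_kernel.sm87.h
 334931 cuda_engine_input_kernel_dbg.sm87.h
2022225 nhwc_bilinear_upsample_conv_1x1_conv_1x1_aniso_gaussian_dfn_5x5_128_032_032_048_032_e5m3_fp16_kernel.sm87.h
"""

if __name__ == "__main__":
    size = total_bytes(SAMPLE)
    print(f"{size} bytes = {size / 1e6:.2f} MB")
```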

                Comment


                • #38
                  I wonder how difficult it really is to compute good enough motion vectors without engine support. Video encoders also get their motion vectors by doing analysis on what is equivalent to a fully rendered frame from a game. And they're pretty fast at it even in software. Add a depth buffer into the mix and you can extract motion vectors on the z axis, too. But maybe such an approximation would not be accurate enough?
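
To make the video-encoder comparison concrete, here is a minimal sketch of block matching, the basic idea behind encoder motion estimation. It assumes frames are plain lists of grayscale rows and uses an exhaustive sum-of-absolute-differences search; the function names are made up, and real encoders use much smarter search patterns and sub-pixel refinement, so treat this as an illustration only.

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(prev[py + j][px + i] - cur[cy + j][cx + i])
        for j in range(size)
        for i in range(size)
    )


def motion_vector(prev, cur, bx, by, size=4, radius=3):
    """Estimate how far the block at (bx, by) in `cur` moved since `prev`.

    Returns (dx, dy), meaning the best match in `prev` sits at
    (bx - dx, by - dy). Ties are resolved in favour of the smallest motion.
    """
    height, width = len(prev), len(prev[0])
    candidates = sorted(
        ((dx, dy)
         for dy in range(-radius, radius + 1)
         for dx in range(-radius, radius + 1)),
        key=lambda v: abs(v[0]) + abs(v[1]),
    )
    best, best_cost = (0, 0), None
    for dx, dy in candidates:
        sx, sy = bx - dx, by - dy
        # Ignore candidates whose source block would fall outside the frame.
        if not (0 <= sx <= width - size and 0 <= sy <= height - size):
            continue
        cost = sad(prev, cur, sx, sy, bx, by, size)
        if best_cost is None or cost < best_cost:
            best, best_cost = (dx, dy), cost
    return best


def make_frame(w, h, ox, oy):
    """Black frame with a 4x4 gradient patch whose top-left corner is (ox, oy)."""
    frame = [[0] * w for _ in range(h)]
    for j in range(4):
        for i in range(4):
            frame[oy + j][ox + i] = 50 + 10 * j + 3 * i
    return frame


if __name__ == "__main__":
    prev = make_frame(16, 16, 4, 5)
    cur = make_frame(16, 16, 6, 6)  # patch moved 2 right, 1 down
    print(motion_vector(prev, cur, 6, 6))
```

Note that this only matches pixel patterns. It has nothing to go on when a block is flat, newly revealed, or semi-transparent, which is one reason an approximation like this could be less accurate than vectors supplied by the engine.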

                  Comment


                  • #39
                    Originally posted by binarybanana View Post
                    I wonder how difficult it really is to compute good enough motion vectors without engine support. Video encoders also get their motion vectors by doing analysis on what is equivalent to a fully rendered frame from a game. And they're pretty fast at it even in software. Add a depth buffer into the mix and you can extract motion vectors on the z axis, too. But maybe such an approximation would not be accurate enough?
                    I know mvtools (and its various forks/mods like svp) and Nvidia Optical Flow are rather rough. Some scripts use the information for other filters, but it tends to be unreliable.

                    Not sure about the state of AV1 and such.

                    There have been some attempts to generate motion vectors with AI models as well, but nothing fast like DLSS, and to be frank the actual results outside of their test cases are not great either.

                    Comment


                    • #40
                      Originally posted by piotrj3 View Post
                      You misunderstand the topic. Yes, in many cases DLSS Quality and FSR 2.0 Quality are hard to distinguish. But if they are hard to differentiate, you can drop the setting even lower. And that is the issue with FSR: while it is quite competitive at Quality vs. Quality, it isn't at Performance vs. Performance.
                      ...
                      Where FSR 2.0 is good, it still doesn't scale as well as DLSS to low resolutions. This is a real issue because people with weak (but not totally garbage) GPUs are the ones interested in upscaling. An RTX 2060/3050 user can use DLSS Performance and will be really happy with the results most of the time. Can you use a 6600 XT/5600 XT with FSR 2.0 Performance? The answer is: not really.
                      Uh, no. The YouTube channel Hardware Unboxed blows that claim out of the water with their video "FSR 2.0, How Do Old GPUs Perform? 8 GPU Generations Benchmarked". There is lots of benchmark data to look at, and it shows quite clearly, across a number of GPUs, that your claim is utterly unjustified. For example, on a 2060 Super (an NVIDIA card), the difference between DLSS and FSR 2 at the highest quality setting is a whopping 5 FPS (85 to 90) in NVIDIA's favor. Hardly something to be tooting the performance horn about. On even older cards, like the 1650 Super and the RX 570, there are gains to be had. Not outstanding, but they are there.

                      As for FSR's relative benefits compared to DLSS: again, Hardware Unboxed does a fair job (as have other channels) with detailed 300% zoom comparisons between FSR 2 and DLSS. Again, while DLSS certainly has *some* edge, I suspect it's only 'fan bois' who'd really care.

                      I'd certainly agree FSR 2 isn't the best thing since sliced bread, but to deny its benefits is the worst kind of platform elitism. Many, many independent reviewers have shown that FSR 2 has obvious benefits for a large number of gamers, and the HU reviewer (final section of video) says it well:
                      "Despite this, even 5 year old gpu's do still run and benefit from FSR 2.0...."

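For scale, a quick sketch of what the 85 vs 90 FPS gap quoted above works out to. Reading 85 as the FSR 2 result and 90 as the DLSS result is my interpretation of "in nVidia's favor", and the helper names are made up.

```python
def frame_time_ms(fps: float) -> float:
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def relative_gain(slower_fps: float, faster_fps: float) -> float:
    """Fractional FPS advantage of the faster result over the slower one."""
    return (faster_fps - slower_fps) / slower_fps


if __name__ == "__main__":
    fsr2_fps, dlss_fps = 85.0, 90.0  # figures quoted in the post above
    gain = relative_gain(fsr2_fps, dlss_fps)
    saved = frame_time_ms(fsr2_fps) - frame_time_ms(dlss_fps)
    print(f"FPS advantage: {gain:.1%}")
    print(f"Frame time saved: {saved:.2f} ms")
```

That is roughly a 5.9% FPS advantage, or about 0.65 ms per frame.
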
                      Comment
