Blender 3.4 HIP Performance With Radeon RX 7900 Series + RDNA3 OpenCL Compute Benchmarks


  • #21
I am still not convinced about the validity of the results shown (as I posted on the previous article comparing Blender performance across AMD, NVIDIA, OptiX, CUDA, HIP, etc.).

The new Cycles implementation exposes a 'Noise Threshold' option that speeds up rendering.
And in fact the OptiX backend implements the noise threshold, per the Blender post:

The OptiX SDK includes an AI denoiser that uses an artificial-intelligence-trained network to remove noise from rendered images resulting in reduced render times. OptiX does this operation at interactive rates by taking advantage of Tensor Cores, specialized hardware designed for performing the tensor / matrix operations which are the core compute function used in Deep Learning. A change is in the works to add this feature to Cycles as another user-configurable option in conjunction with the new OptiX backend.
    Over the past few months, NVIDIA worked closely with Blender Institute to deliver a frequent user request: adding hardware-accelerated ray tracing to Cycles.


Michael: is the 'Noise Threshold' option selected when using the OptiX backend? With the CUDA backend?

Some of the files used for the benchmark (e.g. bmw) are very old -- I don't think OptiX was available when they were created, and it is not clear how Blender sets up the relevant render options.
Opening any new project since Blender 3.2 (at least), the 'Noise Threshold' checkbox is selected by default.

Granted, NVIDIA tensor cores are specialized hardware and do provide an obvious benefit. Still, in selected scenes (bmw) on Blender 3.4 with an AMD RX 6800 using HIP (ROCm 5.3.0), I obtain a 14.5 second render time with 'Noise Threshold' selected -- about 25 seconds without it.
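
For anyone who wants to check what a given .blend file actually enables, here is a minimal sketch using Blender's Python API (the 'Noise Threshold' checkbox maps to Cycles adaptive sampling; property names are from the 3.x API, so verify against your build):

Code:
# Minimal sketch: inspect the Cycles settings that drive these benchmarks.
# Run in Blender's Python console, or headless via:
#   blender -b scene.blend -P check_settings.py
import bpy

cycles = bpy.context.scene.cycles

print("Device:", cycles.device)                        # 'CPU' or 'GPU'
print("Samples:", cycles.samples)
print("Noise Threshold enabled:", cycles.use_adaptive_sampling)
print("Noise Threshold value:", cycles.adaptive_threshold)
print("Denoising enabled:", cycles.use_denoising)
print("Denoiser:", cycles.denoiser)                    # e.g. 'OPTIX' or 'OPENIMAGEDENOISE'

Old demo files carry whatever defaults the Blender version that saved them applied, which is exactly why this is worth checking per file.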



    • #22
      Originally posted by sobrus View Post

Because AMD fails to deliver. AMD seems to be concerned only about gaming; the main focus is to get Steam games running.
And if you want to do anything else, well, you're out of luck. CUDA works on every single GeForce card since 2006, even a $50 one.

Meanwhile, ROCm is an unstable piece of junk that requires literally gigabytes of HDD space to be installed (or rather, I should say: compiled from sources), is not officially supported on any consumer card (it's designed for CDNA, not RDNA), and even if you manage to get it running, it's way slower than NVIDIA.
It took AMD years to provide a working OpenCL driver that is just 2.0 compliant right now and tricky to install (it requires part of ROCm to be installed as well, lol). They don't really care at all.
So how was OpenCL supposed to be supported if there has been no card you can run it on (NVIDIA is interested only in CUDA, and Intel had no cards until this year, unless you want to run compute on an integrated one)?

We are just months past the Intel Arc debut, which had massive driver issues. And somehow it's already better at accelerating Blender (with hardware RT already in Blender 3.3!) than AMD after all those years? What a joke.
Your comment is not wrong, but it's not correct on everything either -- and you missed the main point of the message of mine that you quoted.



      • #23
Basically 3080-level performance in Blender. Good enough, but it needs to stop crashing and must become easier to install.
        ## VGA ##
        AMD: X1950XTX, HD3870, HD5870
        Intel: GMA45, HD3000 (Core i5 2500K)



        • #24
          Originally posted by sobrus View Post

Because AMD fails to deliver. AMD seems to be concerned only about gaming; the main focus is to get Steam games running.
And if you want to do anything else, well, you're out of luck. CUDA works on every single GeForce card since 2006, even a $50 one.

Meanwhile, ROCm is an unstable piece of junk that requires literally gigabytes of HDD space to be installed (or rather, I should say: compiled from sources), is not officially supported on any consumer card (it's designed for CDNA, not RDNA), and even if you manage to get it running, it's way slower than NVIDIA.
It took AMD years to provide a working OpenCL driver that is just 2.0 compliant right now and tricky to install (it requires part of ROCm to be installed as well, lol). They don't really care at all.
So how was OpenCL supposed to be supported if there has been no card you can run it on (NVIDIA is interested only in CUDA, and Intel had no cards until this year, unless you want to run compute on an integrated one)?

We are just months past the Intel Arc debut, which had massive driver issues. And somehow it's already better at accelerating Blender (with hardware RT already in Blender 3.3!) than AMD after all those years? What a joke.
ROCm isn't equivalent to CUDA alone; it's CUDA + the CUDA SDK + cuDNN. Install all of those and you also need several gigabytes of disk space. That's why ROCm is split into smaller packages, so you can install just the ones you need -- which, for most end users, is basically just hip-runtime and rocm-opencl, plus their relatively minor dependencies.

If there's one thing ROCm really still sucks at, it's documentation. With better documentation, you'd probably know that there was a working OpenCL driver before ROCm, called Orca, which was recently phased out in favour of rocm-opencl. The end-user-facing components (again: hip-runtime and rocm-opencl) work on pretty much any somewhat modern GPU. ROCm is also available on most distros these days, so there's no need to compile anything yourself -- it's even slowly entering official distro repos. And on top of all that, ROCm is open source and accepts patches.

So yeah, ROCm was a complete mess for years early on, but it has improved a lot since 5.0, and especially over the last two or three releases. The main thing holding it back is that many libraries and applications haven't been hipified yet, but the most important ones like PyTorch and TensorFlow run just fine on most consumer AMD GPUs released in the last couple of years, with good stability and decent performance.
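
As a quick illustration of that last point, here's a minimal sketch to confirm a ROCm build of PyTorch actually sees the GPU (ROCm builds reuse the torch.cuda namespace for HIP devices, and set torch.version.hip instead of torch.version.cuda):

Code:
# Minimal sanity check for PyTorch on ROCm.
import torch

print("HIP build:", torch.version.hip)            # None on CUDA/CPU-only builds
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    # Small matmul on the GPU as a smoke test.
    x = torch.randn(1024, 1024, device="cuda")
    print("Matmul sum:", (x @ x).sum().item())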



          • #25
Another review with Blender using OptiX on Windows, where AMD performance is lower:

            https://techgage.com/article/amd-rad...reator-review/

            Last edited by pinguinpc; 15 December 2022, 06:49 PM.



            • #26
              Originally posted by pinguinpc View Post
Another review with Blender using OptiX on Windows, where AMD performance is lower:

              https://techgage.com/article/amd-rad...reator-review/

That is an 'apples to oranges' comparison.

OptiX uses NVIDIA's dedicated RT and tensor hardware; HIP does not use the ray tracing hardware on AMD. HIP RT in Blender 3.5 is supposed to 'bridge the gap' from a software enablement point of view.
To the best of my knowledge, ROCm still does not use the RT hardware.



              • #27
                Michael

I just tested another of the sample Blender files you report performance on (Fishy Cat -- a Blender 2.74 blend file) from:

(link: blender.org demo files)


On an AMD RX 6800 with ROCm 5.3.0, the render time is 11.11 seconds (this is the second render; the first is 16.11 seconds) -- you report 43.26 seconds for the same hardware; see the first plot on:

(link: Phoronix article)


                Steps to reproduce:

                1. download blender zip
                2. open file 'fishy_cat.blend'
                3. select 'GPU Compute' device
                4. hit 'F12'

The above result is in line with the OptiX results you report for the 3080/3080 Ti.

                Are we sure that the results in your article make sense?

EDIT: the 'readme' in the blend file explains:

                Just press F12 (or adjust render resolution/samples to both scenes)
                - First the Cat scene will be rendered
                - Then automatically the Grain scene will be rendered
- Then the final composite pops out
The results I report above are from after the final composite pops out.

EDIT2:
1. I am on Blender 3.4 (Arch Linux)
2. I repeated the 'steps to reproduce' process 3 times just to make sure -- obtaining equivalent results (best render time is 11.0 seconds)
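
For repeatability, here's a minimal headless sketch of the same test (assuming a HIP-enabled Blender build; property names per the 3.x Python API, and note it times only the active scene rather than the full F12 multi-scene composite):

Code:
# Run as: blender -b fishy_cat.blend -P render_timed.py
import time
import bpy

# Select the HIP backend and enable GPU devices only.
prefs = bpy.context.preferences.addons["cycles"].preferences
prefs.compute_device_type = "HIP"        # 'CUDA' or 'OPTIX' on NVIDIA
prefs.get_devices()                       # refresh the device list
for dev in prefs.devices:
    dev.use = (dev.type != "CPU")

bpy.context.scene.cycles.device = "GPU"   # the 'GPU Compute' setting

start = time.perf_counter()
bpy.ops.render.render(write_still=False)
print(f"Render time: {time.perf_counter() - start:.2f}s")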
                Last edited by Grinness; 15 December 2022, 08:33 PM.



                • #28
                  Originally posted by pinguinpc View Post
OptiX (which crushes AMD so hard) is still king of the hill.
And that's without testing the GeForce 40 series.
That's plain and simple wrong: only the CPU renderer, the CUDA renderer, and the AMD HIP renderer produce the mathematically same result.

OptiX is fast, but it does not produce the same result. That's called cheating: you reduce the quality toward zero and then it is super fast.

You say OptiX and NVIDIA are king of the hill, but king of the hill of what? It's not the same result...

Come again when it is the mathematically same result.
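
Whether two backends really produce the same image is easy to check directly. A minimal sketch (assuming two PNG renders of the same scene saved from different backends -- the file names here are hypothetical -- plus NumPy and Pillow installed):

Code:
# Compare two renders pixel-by-pixel, e.g. CUDA vs. OptiX output.
import numpy as np
from PIL import Image

a = np.asarray(Image.open("render_cuda.png"), dtype=np.float64)
b = np.asarray(Image.open("render_optix.png"), dtype=np.float64)

diff = np.abs(a - b)
print("max abs diff: ", diff.max())
print("mean abs diff:", diff.mean())
print("identical:    ", bool((diff == 0).all()))

Note that even the same backend can differ across runs if the seed or denoiser settings differ, so pin those down before comparing.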
                  Phantom circuit Sequence Reducer Dyslexia



                  • #29
                    Originally posted by Grinness View Post

That is an 'apples to oranges' comparison.

OptiX uses NVIDIA's dedicated RT and tensor hardware; HIP does not use the ray tracing hardware on AMD. HIP RT in Blender 3.5 is supposed to 'bridge the gap' from a software enablement point of view.
To the best of my knowledge, ROCm still does not use the RT hardware.
Hopefully Blender 3.5 can fix something; that said, I remain very impressed with Arc (oneAPI) Blender performance.



                    • #30
                      Originally posted by Grinness View Post
                      Michael


                      Are we sure that the results in your article make sense?

2. I repeated the 'steps to reproduce' process 3 times just to make sure -- obtaining equivalent results (best render time is 11.0 seconds)
I just tried your instructions, also running Arch; going to try now on Nobara.
Kernel 6.0.12-zen1-1-zen
                      Blender 3.4
ROCm 5.3.3
RX 5700 XT
                      1st run was 11.2s
                      2nd run was 9.1s
                      3rd run was 8.8s.

                      We have to be doing something wrong or missing something.

Nobara
Kernel 6.0.10-201.fc36.x86_64
ROCm 5.2.3
Blender 3.3.1
Runs: 14.3s, 10.3s, 13.8s
                      Last edited by dfyt; 16 December 2022, 05:53 AM.

