Blender 3.3 AMD Radeon HIP vs. NVIDIA CUDA/OptiX Performance


  • #21
    Originally posted by qarium View Post
    maybe you can organise a crowdfunding campaign for the 6950 XT
    Maybe AMD could provide Michael a f*ck*ng card for free? AMD, WTF? You're losing way more money every time he publishes a review with top-of-the-line Nvidia GPUs and no AMD counterpart. Compute aside, AMD blows Nvidia away in rasterization on Linux, and yet you decide not to provide the highest-end cards. I'll never understand.
    ## VGA ##
    AMD: X1950XTX, HD3870, HD5870
    Intel: GMA45, HD3000 (Core i5 2500K)



    • #22
      I just got a Sapphire Nitro+ Pure Radeon RX 6950 XT for my Ryzen 9 5900XT system running Fedora 37 Beta. Unfortunately, HIP was not detected for some odd reason, but ROCm OpenCL improved significantly. It's a shame OpenCL support got dropped. I may try the benchmark this weekend.
      Last edited by finalzone; 23 September 2022, 07:50 PM. Reason: Fix sentences



      • #23
        I don't know if this is a coincidence or a consequence, but it looks like AMD and Intel GPU stack development is somehow related to Blender's growth: the moment Blender became an industry standard, the two companies started to push on the accelerator... 🤔



        • #24
          Originally posted by Danielsan View Post
          I don't know if this is a coincidence or a consequence, but it looks like AMD and Intel GPU stack development is somehow related to Blender's growth: the moment Blender became an industry standard, the two companies started to push on the accelerator... 🤔
          this shows how open-source software is becoming more and more important: Blender is a battleground because it is important in the industry now.

          Krita, for example, got Intel as a sponsor because Krita is going the same route as Blender...

          GIMP should be the same but is not yet important... but i think in the near future GIMP will become a complete Adobe Photoshop replacement, and that will make it important.

          as soon as software like Blender becomes a battleground, it's like game over for the closed-source ecosystem.
          Phantom circuit Sequence Reducer Dyslexia



          • #25
            Originally posted by qarium View Post

            this shows how open-source software is becoming more and more important: Blender is a battleground because it is important in the industry now.

            Krita, for example, got Intel as a sponsor because Krita is going the same route as Blender...

            GIMP should be the same but is not yet important... but i think in the near future GIMP will become a complete Adobe Photoshop replacement, and that will make it important.

            as soon as software like Blender becomes a battleground, it's like game over for the closed-source ecosystem.

            I think if “GIMP” had a name that sounded better than a slur for someone with physical disabilities it may gain some traction.

            Large companies don’t want to be seen supporting “gimped” software



            • #26
              Originally posted by Eirikr1848 View Post
              I think if “GIMP” had a name that sounded better than a slur for someone with physical disabilities it may gain some traction.
              Large companies don’t want to be seen supporting “gimped” software
              there are multiple reasons why GIMP has not become as important as Blender.

              there are many other open-source projects that have not taken off yet... Inkscape is another example.

              i think within the next 5 years we will see many of these projects go the same route as Blender.



              • #27
                I'm curious why the RX 6500 and RX 6400 were tested. My understanding was that all of the media creation parts of those two GPUs are disabled.



                • #28
                  Better late than never. After getting HIP working with the ROCm-GFX8P repository from COPR, the benchmark is finally available, running on a Sapphire Nitro+ Pure AMD Radeon RX 6950 XT.

                  Code:
                  hipinfo
                  
                  --------------------------------------------------------------------------------
                  device#                           0
                  Name:                             AMD Radeon RX 6950 XT
                  pciBusID:                         10
                  pciDeviceID:                      0
                  pciDomainID:                      0
                  multiProcessorCount:              40
                  maxThreadsPerMultiProcessor:      2048
                  isMultiGpuBoard:                  0
                  clockRate:                        2720 Mhz
                  memoryClockRate:                  1124 Mhz
                  memoryBusWidth:                   256
                  totalGlobalMem:                   15.98 GB
                  totalConstMem:                    2147483647
                  sharedMemPerBlock:                64.00 KB
                  canMapHostMemory:                 1
                  regsPerBlock:                     65536
                  warpSize:                         32
                  l2CacheSize:                      4194304
                  computeMode:                      0
                  maxThreadsPerBlock:               1024
                  maxThreadsDim.x:                  1024
                  maxThreadsDim.y:                  1024
                  maxThreadsDim.z:                  1024
                  maxGridSize.x:                    2147483647
                  maxGridSize.y:                    2147483647
                  maxGridSize.z:                    2147483647
                  major:                            10
                  minor:                            3
                  concurrentKernels:                1
                  cooperativeLaunch:                0
                  cooperativeMultiDeviceLaunch:     0
                  isIntegrated:                     0
                  maxTexture1D:                     16384
                  maxTexture2D.width:               16384
                  maxTexture2D.height:              16384
                  maxTexture3D.width:               16384
                  maxTexture3D.height:              16384
                  maxTexture3D.depth:               8192
                  isLargeBar:                       1
                  asicRevision:                     1
                  maxSharedMemoryPerMultiProcessor: 64.00 KB
                  clockInstructionRate:             1000.00 Mhz
                  arch.hasGlobalInt32Atomics:       1
                  arch.hasGlobalFloatAtomicExch:    1
                  arch.hasSharedInt32Atomics:       1
                  arch.hasSharedFloatAtomicExch:    1
                  arch.hasFloatAtomicAdd:           1
                  arch.hasGlobalInt64Atomics:       1
                  arch.hasSharedInt64Atomics:       1
                  arch.hasDoubles:                  1
                  arch.hasWarpVote:                 1
                  arch.hasWarpBallot:               1
                  arch.hasWarpShuffle:              1
                  arch.hasFunnelShift:              0
                  arch.hasThreadFenceSystem:        1
                  arch.hasSyncThreadsExt:           0
                  arch.hasSurfaceFuncs:             0
                  arch.has3dGrid:                   1
                  arch.hasDynamicParallelism:       0
                  gcnArchName:                      gfx1030
                  peers:                            
                  non-peers:                        device#0
                  
                  memInfo.total:                    15.98 GB
                  



                  • #29
                    Code:
                    $ blender -noaudio -b bmw27_gpu.blend -o /tmp/outputfile -f 1
                    Blender 3.3.1 (hash b292cfe5a936 built 2022-11-05 23:21:59)
                    Read prefs: /home/rnz/.config/blender/3.3/config/userpref.blend
                    Read blend: /home/rnz/Downloads/blender/benchmark/bmw27/bmw27_gpu.blend
                    Fra:1 Mem:62.66M (Peak 62.67M) | Time:00:00.23 | Mem:0.00M, Peak:0.00M | Scene, RenderLayer | Synchronizing object | Light
                    ...
                    Fra:1 Mem:239.91M (Peak 259.79M) | Time:00:01.12 | Remaining:01:32.20 | Mem:386.99M, Peak:386.99M | Scene, RenderLayer | Sample 1/1225
                    Fra:1 Mem:239.97M (Peak 259.79M) | Time:00:20.62 | Remaining:00:27.16 | Mem:387.05M, Peak:387.05M | Scene, RenderLayer | Sample 513/1225
                    Fra:1 Mem:251.84M (Peak 271.62M) | Time:00:48.75 | Mem:387.05M, Peak:387.05M | Scene, RenderLayer | Sample 1225/1225
                    Fra:1 Mem:251.84M (Peak 271.62M) | Time:00:48.75 | Mem:387.05M, Peak:387.05M | Scene, RenderLayer | Finished
                    ...
                    Fra:1 Mem:90.70M (Peak 271.62M) | Time:00:48.95 | Compositing | De-initializing execution
                    Saved: '/tmp/outputfile0001.png'
                     Time: 00:49.53 (Saving: 00:00.58)
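To compare runs, the total on the last line of a log like the one above can be pulled out with a few lines of Python. This is only a sketch: the function names `parse_blender_time` and `final_render_seconds` are made up for illustration, and it assumes the `Time: MM:SS.ss (Saving: MM:SS.ss)` form seen in this log (optionally with a leading `HH:`), not every format Blender may print.

```python
import re

# Matches the closing summary line, e.g. " Time: 00:49.53 (Saving: 00:00.58)".
# It must start the line, so the per-sample "| Time:00:48.75 |" fields are skipped.
_SUMMARY = re.compile(
    r"^\s*Time:\s*([\d:.]+)\s*\(Saving:\s*([\d:.]+)\)", re.MULTILINE
)


def parse_blender_time(stamp: str) -> float:
    """Convert 'MM:SS.ss' or 'HH:MM:SS.ss' into seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def final_render_seconds(log: str):
    """Return (total_seconds, saving_seconds), or None if no summary line is found."""
    match = _SUMMARY.search(log)
    if match is None:
        return None
    return parse_blender_time(match.group(1)), parse_blender_time(match.group(2))
```

Feeding it the captured stdout of the `blender -b ... -f 1` command gives the wall time as a number, which makes it easier to line results up across cards.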
                    Code:
                    $ rocminfo 
                    ROCk module is loaded
                    =====================    
                    HSA System Attributes    
                    =====================    
                    Runtime Version:         1.1
                    System Timestamp Freq.:  1000.000000MHz
                    Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
                    Machine Model:           LARGE                              
                    System Endianness:       LITTLE                             
                    
                    ==========               
                    HSA Agents               
                    ==========               
                    *******                  
                    Agent 1                  
                    *******                  
                      Name:                    AMD Ryzen 9 5900X 12-Core Processor
                      Uuid:                    CPU-XX                             
                      Marketing Name:          AMD Ryzen 9 5900X 12-Core Processor
                      Vendor Name:             CPU                                
                      Feature:                 None specified                     
                      Profile:                 FULL_PROFILE                       
                      Float Round Mode:        NEAR                               
                      Max Queue Number:        0(0x0)                             
                      Queue Min Size:          0(0x0)                             
                      Queue Max Size:          0(0x0)                             
                      Queue Type:              MULTI                              
                      Node:                    0                                  
                      Device Type:             CPU                                
                      Cache Info:              
                        L1:                      32768(0x8000) KB                   
                      Chip ID:                 0(0x0)                             
                      ASIC Revision:           0(0x0)                             
                      Cacheline Size:          64(0x40)                           
                      Max Clock Freq. (MHz):   3700                               
                      BDFID:                   0                                  
                      Internal Node ID:        0                                  
                      Compute Unit:            24                                 
                      SIMDs per CU:            0                                  
                      Shader Engines:          0                                  
                      Shader Arrs. per Eng.:   0                                  
                      WatchPts on Addr. Ranges:1                                  
                      Features:                None
                      Pool Info:               
                        Pool 1                   
                          Segment:                 GLOBAL; FLAGS: FINE GRAINED        
                          Size:                    32785892(0x1f445e4) KB             
                          Allocatable:             TRUE                               
                          Alloc Granule:           4KB                                
                          Alloc Alignment:         4KB                                
                          Accessible by all:       TRUE                               
                        Pool 2                   
                          Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
                          Size:                    32785892(0x1f445e4) KB             
                          Allocatable:             TRUE                               
                          Alloc Granule:           4KB                                
                          Alloc Alignment:         4KB                                
                          Accessible by all:       TRUE                               
                        Pool 3                   
                          Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
                          Size:                    32785892(0x1f445e4) KB             
                          Allocatable:             TRUE                               
                          Alloc Granule:           4KB                                
                          Alloc Alignment:         4KB                                
                          Accessible by all:       TRUE                               
                      ISA Info:                
                    *******                  
                    Agent 2                  
                    *******                  
                      Name:                    gfx1010                            
                      Uuid:                    GPU-XX                             
                      Marketing Name:          AMD Radeon RX 5700 XT              
                      Vendor Name:             AMD                                
                      Feature:                 KERNEL_DISPATCH                    
                      Profile:                 BASE_PROFILE                       
                      Float Round Mode:        NEAR                               
                      Max Queue Number:        128(0x80)                          
                      Queue Min Size:          64(0x40)                           
                      Queue Max Size:          131072(0x20000)                    
                      Queue Type:              MULTI                              
                      Node:                    1                                  
                      Device Type:             GPU                                
                      Cache Info:              
                        L1:                      16(0x10) KB                        
                        L2:                      4096(0x1000) KB                    
                      Chip ID:                 29471(0x731f)                      
                      ASIC Revision:           2(0x2)                             
                      Cacheline Size:          64(0x40)                           
                      Max Clock Freq. (MHz):   2100                               
                      BDFID:                   3328                               
                      Internal Node ID:        1                                  
                      Compute Unit:            40                                 
                      SIMDs per CU:            2                                  
                      Shader Engines:          4                                  
                      Shader Arrs. per Eng.:   2                                  
                      WatchPts on Addr. Ranges:4                                  
                      Features:                KERNEL_DISPATCH 
                      Fast F16 Operation:      TRUE                               
                      Wavefront Size:          32(0x20)                           
                      Workgroup Max Size:      1024(0x400)                        
                      Workgroup Max Size per Dimension:
                        x                        1024(0x400)                        
                        y                        1024(0x400)                        
                        z                        1024(0x400)                        
                      Max Waves Per CU:        40(0x28)                           
                      Max Work-item Per CU:    1280(0x500)                        
                      Grid Max Size:           4294967295(0xffffffff)             
                      Grid Max Size per Dimension:
                        x                        4294967295(0xffffffff)             
                        y                        4294967295(0xffffffff)             
                        z                        4294967295(0xffffffff)             
                      Max fbarriers/Workgrp:   32                                 
                      Pool Info:               
                        Pool 1                   
                          Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
                          Size:                    8372224(0x7fc000) KB               
                          Allocatable:             TRUE                               
                          Alloc Granule:           4KB                                
                          Alloc Alignment:         4KB                                
                          Accessible by all:       FALSE                              
                        Pool 2                   
                          Segment:                 GROUP                              
                          Size:                    64(0x40) KB                        
                          Allocatable:             FALSE                              
                          Alloc Granule:           0KB                                
                          Alloc Alignment:         0KB                                
                          Accessible by all:       FALSE                              
                      ISA Info:                
                        ISA 1                    
                          Name:                    amdgcn-amd-amdhsa--gfx1010:xnack-  
                          Machine Models:          HSA_MACHINE_MODEL_LARGE            
                          Profiles:                HSA_PROFILE_BASE                   
                          Default Rounding Mode:   NEAR                               
                          Default Rounding Mode:   NEAR                               
                          Fast f16:                TRUE                               
                          Workgroup Max Size:      1024(0x400)                        
                          Workgroup Max Size per Dimension:
                            x                        1024(0x400)                        
                            y                        1024(0x400)                        
                            z                        1024(0x400)                        
                          Grid Max Size:           4294967295(0xffffffff)             
                          Grid Max Size per Dimension:
                            x                        4294967295(0xffffffff)             
                            y                        4294967295(0xffffffff)             
                            z                        4294967295(0xffffffff)             
                          FBarrier Max Size:       32                                 
                    *** Done ***     
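The listing above is long, so here is a rough Python sketch for reducing it to one line per agent (in this post: the Ryzen 9 5900X CPU and a gfx1010 Radeon RX 5700 XT). The helper names `list_agents` and `gpu_agents` are hypothetical, and the parsing assumes the `Agent N` / `Key: value` layout shown here, which may differ between ROCm releases.

```python
import re

WANTED = ("Name", "Marketing Name", "Device Type")


def list_agents(rocminfo_text: str):
    """Return one dict per 'Agent N' section with a few identifying fields."""
    agents = []
    current = None
    for line in rocminfo_text.splitlines():
        stripped = line.strip()
        if re.fullmatch(r"Agent \d+", stripped):
            current = {}
            agents.append(current)
            continue
        if current is None or ":" not in stripped:
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        # Keep the first occurrence only: "Name" shows up again under "ISA Info".
        if key in WANTED and key not in current:
            current[key] = value
    return agents


def gpu_agents(rocminfo_text: str):
    """Filter the agents down to those reported with Device Type GPU."""
    return [a for a in list_agents(rocminfo_text) if a.get("Device Type") == "GPU"]
```

An empty list from `gpu_agents` would suggest the runtime does not see the card at all, which is a different situation from Blender failing to offer HIP for a card that is listed.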
                    Code:
                    $ yay -Ss rocm | grep -i installed
                    aur/rocm-core 5.3.3-1 (+2 0.17) (Installed)
                    aur/opencl-amd-dev 1:5.4.0-1 (+5 1.15) (Installed)
                    aur/rocm-llvm 5.4.0-1 (+11 0.26) (Installed)
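The three packages listed above come from two release series (5.3.x and 5.4.x). Whether that mix matters is not established in this thread, but spotting it by eye gets harder with more packages, so here is a small Python sketch that groups them. The names `installed_versions` and `minor_series` are made up for this example, and the line format is assumed to match the `yay -Ss` output shown here.

```python
import re

# "repo/name version ..." as printed by the listing above.
_LINE = re.compile(r"^(?P<repo>[^/\s]+)/(?P<name>\S+)\s+(?P<version>\S+)")


def installed_versions(listing: str):
    """Map package name -> upstream version, with epoch and pkgrel removed."""
    versions = {}
    for line in listing.splitlines():
        match = _LINE.match(line.strip())
        if match is None or "(Installed)" not in line:
            continue
        version = match.group("version")
        version = version.split(":", 1)[-1]   # drop an epoch such as "1:"
        version = version.rsplit("-", 1)[0]   # drop a pkgrel such as "-1"
        versions[match.group("name")] = version
    return versions


def minor_series(versions):
    """Group package names by their 'major.minor' series."""
    groups = {}
    for name, version in versions.items():
        series = ".".join(version.split(".")[:2])
        groups.setdefault(series, []).append(name)
    return groups
```

More than one key in the result of `minor_series` means the installed packages span several series, as they do in the listing above.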

