Blender 3.3 AMD Radeon HIP vs. NVIDIA CUDA/OptiX Performance

  • 0xff
    replied
    Code:
    $ blender -noaudio -b bmw27_gpu.blend -o /tmp/outputfile -f 1
    Blender 3.3.1 (hash b292cfe5a936 built 2022-11-05 23:21:59)
    Read prefs: /home/rnz/.config/blender/3.3/config/userpref.blend
    Read blend: /home/rnz/Downloads/blender/benchmark/bmw27/bmw27_gpu.blend
    Fra:1 Mem:62.66M (Peak 62.67M) | Time:00:00.23 | Mem:0.00M, Peak:0.00M | Scene, RenderLayer | Synchronizing object | Light
    ...
    Fra:1 Mem:239.91M (Peak 259.79M) | Time:00:01.12 | Remaining:01:32.20 | Mem:386.99M, Peak:386.99M | Scene, RenderLayer | Sample 1/1225
    Fra:1 Mem:239.97M (Peak 259.79M) | Time:00:20.62 | Remaining:00:27.16 | Mem:387.05M, Peak:387.05M | Scene, RenderLayer | Sample 513/1225
    Fra:1 Mem:251.84M (Peak 271.62M) | Time:00:48.75 | Mem:387.05M, Peak:387.05M | Scene, RenderLayer | Sample 1225/1225
    Fra:1 Mem:251.84M (Peak 271.62M) | Time:00:48.75 | Mem:387.05M, Peak:387.05M | Scene, RenderLayer | Finished
    ...
    Fra:1 Mem:90.70M (Peak 271.62M) | Time:00:48.95 | Compositing | De-initializing execution
    Saved: '/tmp/outputfile0001.png'
     Time: 00:49.53 (Saving: 00:00.58)
    Code:
    $ rocminfo 
    ROCk module is loaded
    =====================    
    HSA System Attributes    
    =====================    
    Runtime Version:         1.1
    System Timestamp Freq.:  1000.000000MHz
    Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
    Machine Model:           LARGE                              
    System Endianness:       LITTLE                             
    
    ==========               
    HSA Agents               
    ==========               
    *******                  
    Agent 1                  
    *******                  
      Name:                    AMD Ryzen 9 5900X 12-Core Processor
      Uuid:                    CPU-XX                             
      Marketing Name:          AMD Ryzen 9 5900X 12-Core Processor
      Vendor Name:             CPU                                
      Feature:                 None specified                     
      Profile:                 FULL_PROFILE                       
      Float Round Mode:        NEAR                               
      Max Queue Number:        0(0x0)                             
      Queue Min Size:          0(0x0)                             
      Queue Max Size:          0(0x0)                             
      Queue Type:              MULTI                              
      Node:                    0                                  
      Device Type:             CPU                                
      Cache Info:              
        L1:                      32768(0x8000) KB                   
      Chip ID:                 0(0x0)                             
      ASIC Revision:           0(0x0)                             
      Cacheline Size:          64(0x40)                           
      Max Clock Freq. (MHz):   3700                               
      BDFID:                   0                                  
      Internal Node ID:        0                                  
      Compute Unit:            24                                 
      SIMDs per CU:            0                                  
      Shader Engines:          0                                  
      Shader Arrs. per Eng.:   0                                  
      WatchPts on Addr. Ranges:1                                  
      Features:                None
      Pool Info:               
        Pool 1                   
          Segment:                 GLOBAL; FLAGS: FINE GRAINED        
          Size:                    32785892(0x1f445e4) KB             
          Allocatable:             TRUE                               
          Alloc Granule:           4KB                                
          Alloc Alignment:         4KB                                
          Accessible by all:       TRUE                               
        Pool 2                   
          Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
          Size:                    32785892(0x1f445e4) KB             
          Allocatable:             TRUE                               
          Alloc Granule:           4KB                                
          Alloc Alignment:         4KB                                
          Accessible by all:       TRUE                               
        Pool 3                   
          Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
          Size:                    32785892(0x1f445e4) KB             
          Allocatable:             TRUE                               
          Alloc Granule:           4KB                                
          Alloc Alignment:         4KB                                
          Accessible by all:       TRUE                               
      ISA Info:                
    *******                  
    Agent 2                  
    *******                  
      Name:                    gfx1010                            
      Uuid:                    GPU-XX                             
      Marketing Name:          AMD Radeon RX 5700 XT              
      Vendor Name:             AMD                                
      Feature:                 KERNEL_DISPATCH                    
      Profile:                 BASE_PROFILE                       
      Float Round Mode:        NEAR                               
      Max Queue Number:        128(0x80)                          
      Queue Min Size:          64(0x40)                           
      Queue Max Size:          131072(0x20000)                    
      Queue Type:              MULTI                              
      Node:                    1                                  
      Device Type:             GPU                                
      Cache Info:              
        L1:                      16(0x10) KB                        
        L2:                      4096(0x1000) KB                    
      Chip ID:                 29471(0x731f)                      
      ASIC Revision:           2(0x2)                             
      Cacheline Size:          64(0x40)                           
      Max Clock Freq. (MHz):   2100                               
      BDFID:                   3328                               
      Internal Node ID:        1                                  
      Compute Unit:            40                                 
      SIMDs per CU:            2                                  
      Shader Engines:          4                                  
      Shader Arrs. per Eng.:   2                                  
      WatchPts on Addr. Ranges:4                                  
      Features:                KERNEL_DISPATCH 
      Fast F16 Operation:      TRUE                               
      Wavefront Size:          32(0x20)                           
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Max Waves Per CU:        40(0x28)                           
      Max Work-item Per CU:    1280(0x500)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      Max fbarriers/Workgrp:   32                                 
      Pool Info:               
        Pool 1                   
          Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
          Size:                    8372224(0x7fc000) KB               
          Allocatable:             TRUE                               
          Alloc Granule:           4KB                                
          Alloc Alignment:         4KB                                
          Accessible by all:       FALSE                              
        Pool 2                   
          Segment:                 GROUP                              
          Size:                    64(0x40) KB                        
          Allocatable:             FALSE                              
          Alloc Granule:           0KB                                
          Alloc Alignment:         0KB                                
          Accessible by all:       FALSE                              
      ISA Info:                
        ISA 1                    
          Name:                    amdgcn-amd-amdhsa--gfx1010:xnack-  
          Machine Models:          HSA_MACHINE_MODEL_LARGE            
          Profiles:                HSA_PROFILE_BASE                   
          Default Rounding Mode:   NEAR                               
          Default Rounding Mode:   NEAR                               
          Fast f16:                TRUE                               
          Workgroup Max Size:      1024(0x400)                        
          Workgroup Max Size per Dimension:
            x                        1024(0x400)                        
            y                        1024(0x400)                        
            z                        1024(0x400)                        
          Grid Max Size:           4294967295(0xffffffff)             
          Grid Max Size per Dimension:
            x                        4294967295(0xffffffff)             
            y                        4294967295(0xffffffff)             
            z                        4294967295(0xffffffff)             
          FBarrier Max Size:       32                                 
    *** Done ***
    Code:
    $ yay -Ss rocm | grep -i installed
    aur/rocm-core 5.3.3-1 (+2 0.17) (Installed)
    aur/opencl-amd-dev 1:5.4.0-1 (+5 1.15) (Installed)
    aur/rocm-llvm 5.4.0-1 (+11 0.26) (Installed)


  • finalzone
    replied
    Better late than never. After getting HIP working with the ROCm-GFX8P repository from COPR, the benchmark is finally available, running on a Sapphire Nitro+ Pure AMD Radeon RX 6950 XT.

    Code:
    $ hipinfo
    
    --------------------------------------------------------------------------------
    device#                           0
    Name:                             AMD Radeon RX 6950 XT
    pciBusID:                         10
    pciDeviceID:                      0
    pciDomainID:                      0
    multiProcessorCount:              40
    maxThreadsPerMultiProcessor:      2048
    isMultiGpuBoard:                  0
    clockRate:                        2720 Mhz
    memoryClockRate:                  1124 Mhz
    memoryBusWidth:                   256
    totalGlobalMem:                   15.98 GB
    totalConstMem:                    2147483647
    sharedMemPerBlock:                64.00 KB
    canMapHostMemory:                 1
    regsPerBlock:                     65536
    warpSize:                         32
    l2CacheSize:                      4194304
    computeMode:                      0
    maxThreadsPerBlock:               1024
    maxThreadsDim.x:                  1024
    maxThreadsDim.y:                  1024
    maxThreadsDim.z:                  1024
    maxGridSize.x:                    2147483647
    maxGridSize.y:                    2147483647
    maxGridSize.z:                    2147483647
    major:                            10
    minor:                            3
    concurrentKernels:                1
    cooperativeLaunch:                0
    cooperativeMultiDeviceLaunch:     0
    isIntegrated:                     0
    maxTexture1D:                     16384
    maxTexture2D.width:               16384
    maxTexture2D.height:              16384
    maxTexture3D.width:               16384
    maxTexture3D.height:              16384
    maxTexture3D.depth:               8192
    isLargeBar:                       1
    asicRevision:                     1
    maxSharedMemoryPerMultiProcessor: 64.00 KB
    clockInstructionRate:             1000.00 Mhz
    arch.hasGlobalInt32Atomics:       1
    arch.hasGlobalFloatAtomicExch:    1
    arch.hasSharedInt32Atomics:       1
    arch.hasSharedFloatAtomicExch:    1
    arch.hasFloatAtomicAdd:           1
    arch.hasGlobalInt64Atomics:       1
    arch.hasSharedInt64Atomics:       1
    arch.hasDoubles:                  1
    arch.hasWarpVote:                 1
    arch.hasWarpBallot:               1
    arch.hasWarpShuffle:              1
    arch.hasFunnelShift:              0
    arch.hasThreadFenceSystem:        1
    arch.hasSyncThreadsExt:           0
    arch.hasSurfaceFuncs:             0
    arch.has3dGrid:                   1
    arch.hasDynamicParallelism:       0
    gcnArchName:                      gfx1030
    peers:                            
    non-peers:                        device#0
    
    memInfo.total:                    15.98 GB


  • shanedav4
    replied
    I'm curious as to why the RX 6500 and RX 6400 were tested. My understanding was that all of the media creation parts of those two GPUs are disabled.


  • qarium
    replied
    Originally posted by Eirikr1848
    I think if “GIMP” had a name that sounded better than a slur for someone with physical disabilities it may gain some traction.
    Large companies don’t want to be seen supporting “gimped” software

    There are multiple reasons why GIMP has not become as important as Blender.

    There are many other open-source projects that have not taken off yet; Inkscape is another example.

    I think that within the next 5 years we will see many of these projects go the same route as Blender.


  • Eirikr1848
    replied
    Originally posted by qarium

    This shows how open-source software is becoming more and more important: Blender is a battleground because it is now important in the industry.

    Krita, for example, got Intel as a sponsor because Krita is going the same route as Blender...

    GIMP should be the same but is not yet important... but I think that in the near future GIMP will become a complete Adobe Photoshop replacement and will become important because of it.

    As soon as software like Blender becomes a battleground, it's like game over for the closed-source ecosystem.

    I think if “GIMP” had a name that sounded better than a slur for someone with physical disabilities it may gain some traction.

    Large companies don’t want to be seen supporting “gimped” software


  • qarium
    replied
    Originally posted by Danielsan
    I don't know if this is a coincidence or a consequence, but it looks like AMD and Intel GPU stack development is somehow related to Blender's growth: the moment the latter became an industry standard, the two companies started to push on the accelerator... 🤔

    This shows how open-source software is becoming more and more important: Blender is a battleground because it is now important in the industry.

    Krita, for example, got Intel as a sponsor because Krita is going the same route as Blender...

    GIMP should be the same but is not yet important... but I think that in the near future GIMP will become a complete Adobe Photoshop replacement and will become important because of it.

    As soon as software like Blender becomes a battleground, it's like game over for the closed-source ecosystem.


  • Danielsan
    replied
    I don't know if this is a coincidence or a consequence, but it looks like AMD and Intel GPU stack development is somehow related to Blender's growth: the moment the latter became an industry standard, the two companies started to push on the accelerator... 🤔


  • finalzone
    replied
    I just got a Sapphire Nitro+ Pure Radeon RX 6950 XT for my Ryzen 9 5900XT system running Fedora 37 Beta. Unfortunately, HIP failed to be detected for some odd reason, but ROCm OpenCL got a significant improvement; it's a shame OpenCL support got dropped. I may try the benchmark this weekend.
    Last edited by finalzone; 23 September 2022, 07:50 PM. Reason: Fix sentences


  • darkbasic
    replied
    Originally posted by qarium
    maybe you can organise a crowdfunding for the 6950xt

    Maybe AMD could provide Michael a f*ck*ng card for free? AMD WTF? You're losing way more money every time he publishes a review with top-of-the-line Nvidia GPUs and no AMD counterpart. Compute aside, AMD blows Nvidia away in rasterization on Linux, and yet you decide not to provide the highest-end cards. I'll never understand.


  • Mr.Elendig
    replied
    Originally posted by Danny3
    Wow, AMD looks really bad!

    Especially for us RDNA1 users. We were thrown under the bus.
