Does AMD unofficially support APUs with amdgpu-pro driver?

    At last it works. I was trying to get OpenCL running on an AMD Ryzen 3 3200G APU and had a lot of trouble with AMD ROCm, so I uninstalled it and installed the headless amdgpu-pro OpenCL drivers instead:

    1. Download amdgpu-pro from the AMD website. I chose amdgpu-pro-20.40-1147287-sle-15.2.tar.xz for Vega64.
    2. Run amdgpu-pro-install --opencl=legacy,pal --headless
       You may add "-y" for a non-interactive install and "--no-dkms" for a non-DKMS install. I don't know whether it is needed or not.
    3. Reboot after the install.

    Amdgpu-pro has OpenCL image ...


    AMD Ryzen 3 3200G, openSUSE Leap 15.2.

    With amdgpu-pro OpenCL headless driver:

    Code:
    :~> clpeak
    
    Platform: AMD Accelerated Parallel Processing
    Device: gfx902
    Driver version : 3180.7 (PAL,HSAIL) (Linux x64)
    Compute units : 8
    Clock frequency : 1250 MHz
    
    Global memory bandwidth (GBPS)
    float : 42.63
    float2 : 44.12
    float4 : 45.13
    float8 : 46.01
    float16 : 45.59
    
    Single-precision compute (GFLOPS)
    float : 1252.46
    float2 : 1251.55
    float4 : 1249.05
    float8 : 1238.10
    float16 : 1225.67
    
    half-precision compute (GFLOPS)
    half : 1250.89
    half2 : 2421.75
    half4 : 2409.56
    half8 : 2381.20
    half16 : 2237.11
    
    Double-precision compute (GFLOPS)
    double : 79.14
    double2 : 79.10
    double4 : 79.01
    double8 : 78.85
    double16 : 78.41
    
    Integer compute (GIOPS)
    int : 252.97
    int2 : 252.96
    int4 : 252.96
    int8 : 252.96
    int16 : 252.94
    
    Transfer bandwidth (GBPS)
    enqueueWriteBuffer : 42.38
    enqueueReadBuffer : 14.94
    enqueueMapBuffer(for read) : 35320.46
    memcpy from mapped ptr : 14.29
    enqueueUnmap(after write) : 78951.60
    memcpy to mapped ptr : 14.34
    
    Kernel launch latency : 48.96 us
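
As a rough sanity check on the single-precision numbers above, the theoretical FP32 peak can be estimated from the reported compute units and clock. This is only a sketch: the 64 lanes per compute unit and the 2 FLOPs per lane per cycle (one fused multiply-add) are assumptions about the GCN/Vega architecture, not values printed by clpeak, and the function name is made up for illustration.

```python
def peak_gflops(compute_units, clock_mhz, lanes_per_cu=64, flops_per_lane=2):
    """Estimate peak FP32 throughput in GFLOPS.

    lanes_per_cu and flops_per_lane are assumed architecture constants
    (64 lanes per CU, one FMA = 2 FLOPs per lane per cycle).
    """
    return compute_units * lanes_per_cu * flops_per_lane * clock_mhz / 1000.0


# Values reported by clpeak in the amdgpu-pro run above.
peak = peak_gflops(compute_units=8, clock_mhz=1250)
measured = 1252.46  # "float" row of single-precision compute

efficiency = measured / peak
print(f"estimated peak: {peak:.0f} GFLOPS, measured: {measured} GFLOPS "
      f"({efficiency:.1%} of estimate)")
```

Under those assumptions the measured single-precision result sits close to the estimate, which suggests the amdgpu-pro run is using the GPU at its full reported clock.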

    With ROCm driver (v. 3.3 and 3.10):

    Code:
    :~> clpeak
    
    Platform: AMD Accelerated Parallel Processing
    Device: gfx902+xnack
    Driver version : 3212.0 (HSA1.1,LC) (Linux x64)
    Compute units : 11
    Clock frequency : 1250 MHz
    
    Global memory bandwidth (GBPS)
    float : 12.66
    float2 : 12.75
    float4 : 12.76
    float8 : 12.74
    float16 : 12.20
    
    Single-precision compute (GFLOPS)
    float : 202.43
    float2 : 202.28
    float4 : 201.77
    float8 : 200.75
    float16 : 198.48
    
    half-precision compute (GFLOPS)
    half : 202.33
    half2 : 396.88
    half4 : 394.08
    half8 : 389.46
    half16 : 384.83
    
    Double-precision compute (GFLOPS)
    double : 12.76
    double2 : 12.76
    double4 : 12.71
    double8 : 12.71
    double16 : 12.58
    
    Integer compute (GIOPS)
    int : 40.81
    int2 : 40.81
    int4 : 40.81
    int8 : 40.80
    int16 : 40.79
    
    Transfer bandwidth (GBPS)
    enqueueWriteBuffer : 2.11
    enqueueReadBuffer : 5.34
    enqueueMapBuffer(for read) : 33925.49
    memcpy from mapped ptr : 5.23
    enqueueUnmap(after write) : 70409.30
    memcpy to mapped ptr : 5.41
    
    Kernel launch latency : -500964480.00 us
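
To put the two runs side by side, here is a small sketch that computes how many times faster the amdgpu-pro run is than the ROCm run for a handful of rows. The numbers are copied from the two clpeak outputs above; the dictionary names and the choice of rows are just for illustration.

```python
# Selected rows from the amdgpu-pro run (values copied from the output above).
amdgpu_pro = {
    "global bandwidth float (GBPS)": 42.63,
    "single float (GFLOPS)": 1252.46,
    "half (GFLOPS)": 1250.89,
    "double (GFLOPS)": 79.14,
    "int (GIOPS)": 252.97,
    "enqueueWriteBuffer (GBPS)": 42.38,
    "enqueueReadBuffer (GBPS)": 14.94,
}

# The same rows from the ROCm run.
rocm = {
    "global bandwidth float (GBPS)": 12.66,
    "single float (GFLOPS)": 202.43,
    "half (GFLOPS)": 202.33,
    "double (GFLOPS)": 12.76,
    "int (GIOPS)": 40.81,
    "enqueueWriteBuffer (GBPS)": 2.11,
    "enqueueReadBuffer (GBPS)": 5.34,
}

# Ratio > 1 means the amdgpu-pro run scored higher on that row.
ratios = {name: amdgpu_pro[name] / rocm[name] for name in amdgpu_pro}

for name, ratio in ratios.items():
    print(f"{name:32s} {ratio:6.2f}x")
```

With these figures the compute rows all come out at roughly 6x in favour of amdgpu-pro, while the bandwidth and transfer rows vary more from one row to the next.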