Radeon ROCm 3.5 Released With New Features But Still No Navi Support

  • vlad
    replied
    Originally posted by bridgman

    Hawaii (gfx7) had an early version of MEC with a fixed microcode store in the MEC block, while the later gfx8 parts could support larger microcode images by executing directly out of VRAM via an instruction cache. I had not heard about Hawaii support being broken until today but will take a look. It was always a challenge to fit even stripped-down ROCm functionality into the fixed microcode store so it's possible that we just outgrew it.
    Support for gfx7 in ROCm would be very valuable for people with S8150 and similar cards. These GPUs are quite capable, and with HIP support they could be very efficient for many.
    E.g. see https://github.com/RadeonOpenCompute...ment-647033796


  • ekondis
    replied
    Originally posted by Djhg2000
    Well, datacenters would bring a lot more money and developer attention to ROCm. Having a stable ROCm to keep developing an OpenCL layer on top of isn't a bad idea IMHO.
    This isn't a bad idea, but SPIR-V is still not supported. The lack of support for a common intermediate language is forcing the maintenance of different SYCL runtimes for each available backend, i.e. SPIR-V, CUDA/PTX and ROCm. Doesn't seem effective.


  • Djhg2000
    replied
    Well, datacenters would bring a lot more money and developer attention to ROCm. Having a stable ROCm to keep developing an OpenCL layer on top of isn't a bad idea IMHO.

    Besides, it's better to get Vega really stable first and then move on to Navi. The Navi support probably isn't official because it's not that stable yet. For instance, DaVinci Studio has a hard dependency on OpenCL and works fine through ROCm on my Polaris card, but on my Navi card I get graphical corruption and after clicking any UI element I get an application crash.


  • coder
    replied
    Originally posted by Aeder
    Why does this sound to me like they are aiming for the datacenter while gaining 0 traction among devs? Is there some information I'm missing?
    Their current strategy seems to be using their HIP CUDA-workalike API plus code translation tools to help people port existing CUDA codebases. It remains to be seen how successful that will be, but it represents a departure from the OpenCL-centric strategy they had until a few years ago.

    This leaves Intel as the lone OpenCL holdout - the last one truly embracing it as a central pillar of their GPU compute strategy.
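
To make the "code translation" idea above concrete, here is a deliberately tiny sketch of source-to-source renaming from CUDA runtime names to their HIP equivalents. It is an illustration only, not AMD's tooling: the function name `toy_hipify` and the short mapping table are assumptions made for this example, and the real HIPIFY tools (hipify-perl, hipify-clang) cover far more of the API and handle cases that plain renaming cannot.

```python
import re

# A handful of CUDA runtime names and their HIP counterparts.
# The table is intentionally incomplete; it only illustrates the idea.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Match whole identifiers only, trying longer names first, so that
# e.g. "my_cudaMalloc_wrapper" is left untouched.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")\b"
)


def toy_hipify(source: str) -> str:
    """Rename known CUDA runtime identifiers to HIP ones (hypothetical helper)."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], source)


if __name__ == "__main__":
    cuda_snippet = (
        "#include <cuda_runtime.h>\n"
        "cudaMalloc(&d_buf, n);\n"
        "cudaMemcpy(d_buf, h_buf, n, cudaMemcpyHostToDevice);\n"
        "cudaFree(d_buf);\n"
    )
    print(toy_hipify(cuda_snippet))
```

Because the two APIs mirror each other closely, much of a port is mechanical renaming like this; the parts that are not mechanical are where the real porting effort goes.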


  • Aeder
    replied
    Originally posted by JustRob
    According to AMD, ROCm will support HPC, but the focus will be on the Instinct line over supporting every GPU.
    There's a lot that goes into supporting each card and the effort can't be made for every one all at once.
    Why does this sound to me like they are aiming for the datacenter while gaining 0 traction among devs? Is there some information I'm missing?


  • Djhg2000
    replied
    With the AMD packages the library search path isn't updated, so on my system I just made a new file "/etc/OpenCL/vendors/amdocl64_ap.icd" with the following content:
    Code:
    /opt/rocm-3.5.0/opencl/lib/libamdocl64.so
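
For anyone who would rather script that step, here is a minimal sketch in Python. It assumes the usual Linux ICD loader convention, where each `.icd` file in the vendors directory is a plain text file naming one driver library. The helper names `register_icd` and `list_icds` are made up for this example, and the demo writes to a temporary directory rather than `/etc/OpenCL/vendors`, so it runs without root.

```python
from pathlib import Path
import tempfile


def register_icd(library_path: str, vendors_dir: str, icd_name: str) -> Path:
    """Write an .icd file that points the ICD loader at an OpenCL driver library."""
    vendors = Path(vendors_dir)
    vendors.mkdir(parents=True, exist_ok=True)
    icd_file = vendors / icd_name
    # An .icd file is plain text: a single line with the library path or name.
    icd_file.write_text(library_path + "\n")
    return icd_file


def list_icds(vendors_dir: str) -> dict:
    """Map each .icd file name in vendors_dir to the library it refers to."""
    return {
        path.name: path.read_text().strip()
        for path in sorted(Path(vendors_dir).glob("*.icd"))
    }


if __name__ == "__main__":
    # A temporary directory stands in for /etc/OpenCL/vendors here.
    with tempfile.TemporaryDirectory() as tmp:
        register_icd(
            "/opt/rocm-3.5.0/opencl/lib/libamdocl64.so", tmp, "amdocl64_ap.icd"
        )
        print(list_icds(tmp))
```

Pointing `vendors_dir` at the real vendors directory would need root, and the library path has to match whatever ROCm version is actually installed.
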
    Now even the Mesa version of clinfo works just fine on my Navi card:
    Code:
    Number of devices 1
    Device Name gfx1012
    Device Vendor Advanced Micro Devices, Inc.
    Device Vendor ID 0x1002
    Device Version OpenCL 2.0
    Driver Version 3137.0 (HSA1.1,LC)
    Device OpenCL C Version OpenCL C 2.0
    Device Type GPU
    Device Board Name (AMD) Navi 14 [Radeon RX 5500/5500M / Pro 5500M]
    Device Topology (AMD) PCI-E, 0b:00.0
    Device Profile FULL_PROFILE
    Device Available Yes
    Compiler Available Yes
    Linker Available Yes
    Max compute units 11
    SIMD per compute unit (AMD) 4
    SIMD width (AMD) 32
    SIMD instruction width (AMD) 1
    Max clock frequency 1885MHz
    Graphics IP (AMD) 10.12
    Device Partition (core)
    Max number of sub-devices 11
    Supported partition types None
    Supported affinity domains (n/a)
    Max work item dimensions 3
    Max work item sizes 1024x1024x1024
    Max work group size 256
    Preferred work group size (AMD) 256
    Max work group size (AMD) 1024
    Preferred work group size multiple 32
    Wavefront width (AMD) 32
    Preferred / native vector sizes
    char 4 / 4
    short 2 / 2
    int 1 / 1
    long 1 / 1
    half 1 / 1 (cl_khr_fp16)
    float 1 / 1
    double 1 / 1 (cl_khr_fp64)
    Half-precision Floating-point support (cl_khr_fp16)
    Denormals No
    Infinity and NANs No
    Round to nearest No
    Round to zero No
    Round to infinity No
    IEEE754-2008 fused multiply-add No
    Support is emulated in software No
    Single-precision Floating-point support (core)
    Denormals Yes
    Infinity and NANs Yes
    Round to nearest Yes
    Round to zero Yes
    Round to infinity Yes
    IEEE754-2008 fused multiply-add Yes
    Support is emulated in software No
    Correctly-rounded divide and sqrt operations Yes
    Double-precision Floating-point support (cl_khr_fp64)
    Denormals Yes
    Infinity and NANs Yes
    Round to nearest Yes
    Round to zero Yes
    Round to infinity Yes
    IEEE754-2008 fused multiply-add Yes
    Support is emulated in software No
    Address bits 64, Little-Endian
    Global memory size 4278190080 (3.984GiB)
    Global free memory (AMD) 4177920 (3.984GiB)
    Global memory channels (AMD) 4
    Global memory banks per channel (AMD) 4
    Global memory bank width (AMD) 256 bytes
    Error Correction support No
    Max memory allocation 3636461568 (3.387GiB)
    Unified memory for Host and Device No
    Shared Virtual Memory (SVM) capabilities (core)
    Coarse-grained buffer sharing Yes
    Fine-grained buffer sharing Yes
    Fine-grained system sharing No
    Atomics No
    Minimum alignment for any data type 128 bytes
    Alignment of base address 1024 bits (128 bytes)
    Preferred alignment for atomics
    SVM 0 bytes
    Global 0 bytes
    Local 0 bytes
    Max size for global variable 3636461568 (3.387GiB)
    Preferred total size of global vars 4278190080 (3.984GiB)
    Global Memory cache type Read/Write
    Global Memory cache size 16384 (16KiB)
    Global Memory cache line size 64 bytes
    Image support Yes
    Max number of samplers per kernel 29504
    Max size for 1D images from buffer 65536 pixels
    Max 1D or 2D image array size 2048 images
    Base address alignment for 2D image buffers 256 bytes
    Pitch alignment for 2D image buffers 256 pixels
    Max 2D image size 16384x16384 pixels
    Max 3D image size 2048x2048x2048 pixels
    Max number of read image args 128
    Max number of write image args 8
    Max number of read/write image args 64
    Max number of pipe args 16
    Max active pipe reservations 16
    Max pipe packet size 3636461568 (3.387GiB)
    Local memory type Local
    Local memory size 65536 (64KiB)
    Local memory syze per CU (AMD) 65536 (64KiB)
    Local memory banks (AMD) 32
    Max number of constant args 8
    Max constant buffer size 3636461568 (3.387GiB)
    Preferred constant buffer size (AMD) 16384 (16KiB)
    Max size of kernel argument 1024
    Queue properties (on host)
    Out-of-order execution No
    Profiling Yes
    Queue properties (on device)
    Out-of-order execution Yes
    Profiling Yes
    Preferred size 262144 (256KiB)
    Max size 8388608 (8MiB)
    Max queues on device 1
    Max events on device 1024
    Prefer user sync for interop Yes
    Number of P2P devices (AMD) 0
    P2P devices (AMD) <printDeviceInfo:147: get number of CL_DEVICE_P2P_DEVICES_AMD : error -30>
    Profiling timer resolution 1ns
    Profiling timer offset since Epoch (AMD) 0ns (Thu Jan 1 01:00:00 1970)
    Execution capabilities
    Run OpenCL kernels Yes
    Run native kernels No
    Thread trace supported (AMD) No
    Number of async queues (AMD) 8
    Max real-time compute queues (AMD) 8
    Max real-time compute units (AMD) 11
    printf() buffer size 4194304 (4MiB)
    Built-in kernels (n/a)
    Device Extensions cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
    
    NULL platform behavior
    clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) AMD Accelerated Parallel Processing
    clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [AMD]
    clCreateContext(NULL, ...) [default] Success [AMD]
    clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
    Platform Name AMD Accelerated Parallel Processing
    Device Name gfx1012
    clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
    clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
    Platform Name AMD Accelerated Parallel Processing
    Device Name gfx1012
    clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
    clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
    clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
    Platform Name AMD Accelerated Parallel Processing
    Device Name gfx1012
    
    ICD loader properties
    ICD loader Name OpenCL ICD Loader
    ICD loader Vendor OCL Icd free software
    ICD loader Version 2.2.12
    ICD loader Profile OpenCL 2.2


  • oleid
    replied
    Originally posted by atomsymbol

    I don't understand how the conclusion was reached that there is no Navi support in ROCm 3.5, because when I run rocminfo or clinfo I get the following:

    Code:
    $ rocminfo
    *******
    Agent 1
    *******
    Name: AMD Ryzen 7 3700X 8-Core Processor
    
    *******
    Agent 2
    *******
    Name: gfx1012
    Marketing Name: Navi 14 [Radeon RX 5500/5500M / Pro 5500M]
    Code:
    $ clinfo
    Platform Name AMD Accelerated Parallel Processing
    Number of devices 1
    Device Name gfx1012
    Device Vendor Advanced Micro Devices, Inc.
    Device Vendor ID 0x1002
    Device Version OpenCL 2.0
    Driver Version 3137.0 (HSA1.1,LC)
    Device OpenCL C Version OpenCL C 2.0
    Device Type GPU
    Device Board Name (AMD) Navi 14 [Radeon RX 5500/5500M / Pro 5500M]
    Are you able to run anything OpenCL-y?


  • Djhg2000
    replied
    Originally posted by tildearrow

    I would like to ask. Do you do any of these with your AMD card?:

    - play AAA games? (if so, did it hang for you?)
    - record the desktop using VA-API? (if so, did it hang for you?)
    - leave it on (without logging out) for more than 2 days? (if so, did it hang for you?)
    If you're asking about Navi cards (from context it seems to be specifically about RX 5600 XT though?) then I'm doing points 1 and 3 pretty much every day, no hangs at all. Maybe I'll get around to testing point 2 for you.

    The only issue I've had with my Ryzen 3900X and RX 5500 XT was a couple of days ago, when the DC input to my DC-ATX PSU broke from running too hot. It was quite undramatic though: I unplugged it to do a hard reset of all the small microcontrollers because I got random kernel faults and general odd behavior. I wanted to make sure it booted in a clean state before I moved on to testing RAM, and the machine simply wouldn't turn on after unplugging and plugging it back in. I probed around with a multimeter until I looked at the connector itself; the inside, which is supposed to be shiny metal, was now black and had a rough surface. I thought I might be remembering wrong until I scraped it with a pair of tweezers and found the metal surface. I swapped in a regular ATX PSU for now and it's been stable ever since, but now I can't close the lid, so I'm looking for a new connector to solder on. I'm thinking some oddball D-sub connector, since they're extremely capable connectors while still being dirt cheap (for instance, Molex makes D-sub connectors for a couple of bucks rated for 7.5 A per pin).

    So there you have a story about system instability with a Navi card but I can hardly blame the Navi card for that one...


  • bug77
    replied
    Originally posted by tildearrow

    I would like to ask. Do you do any of these with your AMD card?:

    - play AAA games? (if so, did it hang for you?)
    - record the desktop using VA-API? (if so, did it hang for you?)
    - leave it on (without logging out) for more than 2 days? (if so, did it hang for you?)
    I'm not sure why you're asking me, I'm on a GTX 1060. And I don't game anymore.
    Any other AMD owners, feel free to chime in though.


  • coder
    replied
    Originally posted by bridgman
    ... with 285/380 (Tonga, early gfx8). It's Tonga that was "on the edge"
    Whoa. Cool. I almost bought a 285 a couple of years ago, since it was the last with analog out. Instead, I got a GTX 980 Ti.

    I'm now finally using all LCD monitors, though one has an analog input that I use with my analog KVM. Replacing that is the next project...
