Radeon ROCm 3.5 Released With New Features But Still No Navi Support


  • #91
    Originally posted by llukas View Post
    https://medium.com/@samnco/using-the...4-4eee72d56791

    2GB of RAM doesn't allow running much, but it is definitely supported.
    Thanks for the info.

    As stated in the second sentence of that post, Nvidia doesn't list it as supporting CUDA. It's not surprising that it actually does, but I assumed they had artificially blocked it from working.

    Anyhow, since your reply only addressed the GT 1030, I wish you hadn't also quoted the part about Jetson.

    Comment


    • #92
      Originally posted by bridgman View Post
      ... with 285/380 (Tonga, early gfx8). It's Tonga that was "on the edge"
      Whoa. Cool. I almost bought a 285 a couple of years ago, since it was the last card with analog out. Instead, I got a GTX 980 Ti.

      I'm now finally using all LCD monitors, though one has an analog input that I use with my analog KVM. Replacing that is the next project...

      Comment


      • #93
        Originally posted by tildearrow View Post

        I would like to ask. Do you do any of these with your AMD card?:

        - play AAA games? (if so, did it hang for you?)
        - record the desktop using VA-API? (if so, did it hang for you?)
        - leave it on (without logging out) for more than 2 days? (if so, did it hang for you?)
        I'm not sure why you're asking me; I'm on a GTX 1060, and I don't game anymore.
        Any other AMD owners, feel free to chime in though.

        Comment


        • #94
          Originally posted by tildearrow View Post

          I would like to ask. Do you do any of these with your AMD card?:

          - play AAA games? (if so, did it hang for you?)
          - record the desktop using VA-API? (if so, did it hang for you?)
          - leave it on (without logging out) for more than 2 days? (if so, did it hang for you?)
          If you're asking about Navi cards (from context it seems to be specifically about the RX 5600 XT, though?), then I'm doing points 1 and 3 pretty much every day, with no hangs at all. Maybe I'll get around to testing point 2 for you.

          The only issue I've had with my Ryzen 3900X and RX 5500 XT came a couple of days ago, when the DC input to my DC-ATX PSU failed from running too hot. It was quite undramatic, though. I had been getting random kernel faults and generally odd behavior, so I unplugged the machine to hard-reset all the small microcontrollers; I wanted to make sure it booted in a clean state before moving on to testing RAM. After I plugged it back in, the machine simply wouldn't turn on. I probed around with a multimeter until I looked at the connector itself: the inside, which is supposed to be shiny metal, was black and had a rough surface. I thought I might be remembering wrong until I scraped it with a pair of tweezers and found the metal surface underneath. I've swapped in a regular ATX PSU for now and it's been stable ever since, but now I can't close the lid, so I'm looking for a new connector to solder on. I'm thinking of some oddball D-sub connector, since they're extremely capable while still being dirt cheap (for instance, Molex makes D-sub connectors for a couple of bucks rated for 7.5 A per pin).

          So there you have a story about system instability with a Navi card, but I can hardly blame the Navi card for that one...

          Comment


          • #95
            Originally posted by atomsymbol

            I don't understand how the conclusion was reached that there is no Navi support in ROCm 3.5, because when I run rocminfo or clinfo I get the following:

            Code:
            $ rocminfo
            *******
            Agent 1
            *******
            Name: AMD Ryzen 7 3700X 8-Core Processor
            
            *******
            Agent 2
            *******
            Name: gfx1012
            Marketing Name: Navi 14 [Radeon RX 5500/5500M / Pro 5500M]
            Code:
            $ clinfo
            Platform Name AMD Accelerated Parallel Processing
            Number of devices 1
            Device Name gfx1012
            Device Vendor Advanced Micro Devices, Inc.
            Device Vendor ID 0x1002
            Device Version OpenCL 2.0
            Driver Version 3137.0 (HSA1.1,LC)
            Device OpenCL C Version OpenCL C 2.0
            Device Type GPU
            Device Board Name (AMD) Navi 14 [Radeon RX 5500/5500M / Pro 5500M]
            Are you able to run anything OpenCL-y?
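
For anyone who wants to script the check from the quoted output, here is a minimal Python sketch that pulls the agents out of rocminfo-style text and keeps the ones that look like GPUs. The function names (`parse_rocminfo_agents`, `gpu_agents`) are made up for illustration, and the "name starts with gfx" test is an assumption based only on the output quoted above. Keep in mind that a device showing up in the list only tells you it was enumerated; it doesn't tell you whether kernels actually run on it.

```python
import re

def parse_rocminfo_agents(text):
    """Return one dict per agent found in rocminfo-style text."""
    agents = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if re.fullmatch(r"Agent \d+", line):
            # A new "Agent N" header starts a new record.
            current = {}
            agents.append(current)
        elif current is not None and ":" in line:
            # Split "Key: value" on the first colon only.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return agents

def gpu_agents(agents):
    # Assumption: GPU agents report an ISA-style name beginning with "gfx",
    # as in the quoted output. CPU agents report the processor name instead.
    return [a for a in agents if a.get("Name", "").startswith("gfx")]

# Sample text copied from the quoted post.
SAMPLE = """\
*******
Agent 1
*******
Name: AMD Ryzen 7 3700X 8-Core Processor

*******
Agent 2
*******
Name: gfx1012
Marketing Name: Navi 14 [Radeon RX 5500/5500M / Pro 5500M]
"""

if __name__ == "__main__":
    for agent in gpu_agents(parse_rocminfo_agents(SAMPLE)):
        print(agent["Name"], "->", agent.get("Marketing Name", "unknown"))
```

To use it on a real system, you would feed it the captured text of `rocminfo` instead of `SAMPLE`.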

            Comment


            • #96
              With the AMD packages the library search path isn't updated, so on my system I just made a new file "/etc/OpenCL/vendors/amdocl64_ap.icd" with the following content:
              Code:
              /opt/rocm-3.5.0/opencl/lib/libamdocl64.so
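
The file described above is just a one-line text file: the OpenCL ICD loader looks in `/etc/OpenCL/vendors` for `.icd` files, and each one names a vendor library to load. A minimal Python sketch of creating such a file follows. The helper name `write_icd` is hypothetical, and the example writes into a temporary directory rather than `/etc/OpenCL/vendors` (which would need root), so treat it as an illustration rather than an install script.

```python
from pathlib import Path
import tempfile

def write_icd(vendors_dir, icd_name, library_path):
    """Write an ICD file whose only content is the OpenCL library path."""
    vendors = Path(vendors_dir)
    vendors.mkdir(parents=True, exist_ok=True)
    icd_file = vendors / icd_name
    # The ICD loader expects the library path (or name) on a single line.
    icd_file.write_text(library_path + "\n")
    return icd_file

if __name__ == "__main__":
    # A temporary directory stands in for /etc/OpenCL/vendors here.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_icd(
            tmp,
            "amdocl64_ap.icd",
            "/opt/rocm-3.5.0/opencl/lib/libamdocl64.so",
        )
        print(path.name)
        print(path.read_text(), end="")
```

The file name and library path are the ones from the post; on another ROCm version the install prefix would presumably differ.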
              Now even the Mesa version of clinfo works just fine on my Navi card:
              Code:
              Number of devices 1
              Device Name gfx1012
              Device Vendor Advanced Micro Devices, Inc.
              Device Vendor ID 0x1002
              Device Version OpenCL 2.0
              Driver Version 3137.0 (HSA1.1,LC)
              Device OpenCL C Version OpenCL C 2.0
              Device Type GPU
              Device Board Name (AMD) Navi 14 [Radeon RX 5500/5500M / Pro 5500M]
              Device Topology (AMD) PCI-E, 0b:00.0
              Device Profile FULL_PROFILE
              Device Available Yes
              Compiler Available Yes
              Linker Available Yes
              Max compute units 11
              SIMD per compute unit (AMD) 4
              SIMD width (AMD) 32
              SIMD instruction width (AMD) 1
              Max clock frequency 1885MHz
              Graphics IP (AMD) 10.12
              Device Partition (core)
              Max number of sub-devices 11
              Supported partition types None
              Supported affinity domains (n/a)
              Max work item dimensions 3
              Max work item sizes 1024x1024x1024
              Max work group size 256
              Preferred work group size (AMD) 256
              Max work group size (AMD) 1024
              Preferred work group size multiple 32
              Wavefront width (AMD) 32
              Preferred / native vector sizes
              char 4 / 4
              short 2 / 2
              int 1 / 1
              long 1 / 1
              half 1 / 1 (cl_khr_fp16)
              float 1 / 1
              double 1 / 1 (cl_khr_fp64)
              Half-precision Floating-point support (cl_khr_fp16)
              Denormals No
              Infinity and NANs No
              Round to nearest No
              Round to zero No
              Round to infinity No
              IEEE754-2008 fused multiply-add No
              Support is emulated in software No
              Single-precision Floating-point support (core)
              Denormals Yes
              Infinity and NANs Yes
              Round to nearest Yes
              Round to zero Yes
              Round to infinity Yes
              IEEE754-2008 fused multiply-add Yes
              Support is emulated in software No
              Correctly-rounded divide and sqrt operations Yes
              Double-precision Floating-point support (cl_khr_fp64)
              Denormals Yes
              Infinity and NANs Yes
              Round to nearest Yes
              Round to zero Yes
              Round to infinity Yes
              IEEE754-2008 fused multiply-add Yes
              Support is emulated in software No
              Address bits 64, Little-Endian
              Global memory size 4278190080 (3.984GiB)
              Global free memory (AMD) 4177920 (3.984GiB)
              Global memory channels (AMD) 4
              Global memory banks per channel (AMD) 4
              Global memory bank width (AMD) 256 bytes
              Error Correction support No
              Max memory allocation 3636461568 (3.387GiB)
              Unified memory for Host and Device No
              Shared Virtual Memory (SVM) capabilities (core)
              Coarse-grained buffer sharing Yes
              Fine-grained buffer sharing Yes
              Fine-grained system sharing No
              Atomics No
              Minimum alignment for any data type 128 bytes
              Alignment of base address 1024 bits (128 bytes)
              Preferred alignment for atomics
              SVM 0 bytes
              Global 0 bytes
              Local 0 bytes
              Max size for global variable 3636461568 (3.387GiB)
              Preferred total size of global vars 4278190080 (3.984GiB)
              Global Memory cache type Read/Write
              Global Memory cache size 16384 (16KiB)
              Global Memory cache line size 64 bytes
              Image support Yes
              Max number of samplers per kernel 29504
              Max size for 1D images from buffer 65536 pixels
              Max 1D or 2D image array size 2048 images
              Base address alignment for 2D image buffers 256 bytes
              Pitch alignment for 2D image buffers 256 pixels
              Max 2D image size 16384x16384 pixels
              Max 3D image size 2048x2048x2048 pixels
              Max number of read image args 128
              Max number of write image args 8
              Max number of read/write image args 64
              Max number of pipe args 16
              Max active pipe reservations 16
              Max pipe packet size 3636461568 (3.387GiB)
              Local memory type Local
              Local memory size 65536 (64KiB)
              Local memory syze per CU (AMD) 65536 (64KiB)
              Local memory banks (AMD) 32
              Max number of constant args 8
              Max constant buffer size 3636461568 (3.387GiB)
              Preferred constant buffer size (AMD) 16384 (16KiB)
              Max size of kernel argument 1024
              Queue properties (on host)
              Out-of-order execution No
              Profiling Yes
              Queue properties (on device)
              Out-of-order execution Yes
              Profiling Yes
              Preferred size 262144 (256KiB)
              Max size 8388608 (8MiB)
              Max queues on device 1
              Max events on device 1024
              Prefer user sync for interop Yes
              Number of P2P devices (AMD) 0
              P2P devices (AMD) <printDeviceInfo:147: get number of CL_DEVICE_P2P_DEVICES_AMD : error -30>
              Profiling timer resolution 1ns
              Profiling timer offset since Epoch (AMD) 0ns (Thu Jan 1 01:00:00 1970)
              Execution capabilities
              Run OpenCL kernels Yes
              Run native kernels No
              Thread trace supported (AMD) No
              Number of async queues (AMD) 8
              Max real-time compute queues (AMD) 8
              Max real-time compute units (AMD) 11
              printf() buffer size 4194304 (4MiB)
              Built-in kernels (n/a)
              Device Extensions cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
              
              NULL platform behavior
              clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) AMD Accelerated Parallel Processing
              clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [AMD]
              clCreateContext(NULL, ...) [default] Success [AMD]
              clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
              Platform Name AMD Accelerated Parallel Processing
              Device Name gfx1012
              clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
              clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
              Platform Name AMD Accelerated Parallel Processing
              Device Name gfx1012
              clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
              clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
              clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
              Platform Name AMD Accelerated Parallel Processing
              Device Name gfx1012
              
              ICD loader properties
              ICD loader Name OpenCL ICD Loader
              ICD loader Vendor OCL Icd free software
              ICD loader Version 2.2.12
              ICD loader Profile OpenCL 2.2

              Comment


              • #97
                Originally posted by JustRob View Post
                According to AMD, ROCm will support HPC, but the focus will be on the Instinct line rather than on supporting every GPU.
                There's a lot that goes into supporting each card and the effort can't be made for every one all at once.
                Why does this sound to me like they are aiming for the datacenter while gaining 0 traction among devs? Is there some information I'm missing?

                Comment


                • #98
                  Originally posted by Aeder View Post
                  Why does this sound to me like they are aiming for the datacenter while gaining 0 traction among devs? Is there some information I'm missing?
                  Their current strategy seems to be using their HIP CUDA-workalike API plus code translation tools to help people port existing CUDA codebases. It remains to be seen how successful that will be, but it represents a departure from the OpenCL-centric strategy they had until a few years ago.

                  This leaves Intel as the lone OpenCL holdout - the last one truly embracing it as a central pillar of their GPU compute strategy.
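
To give a rough idea of what "code translation" means here: much of the HIP runtime API mirrors CUDA's with a `hip` prefix in place of `cuda`, so a large part of porting is mechanical renaming. The toy Python sketch below shows that idea on a handful of names. It is not AMD's actual tooling, the function name `toy_hipify` is made up, and the mapping table is a tiny hand-picked subset.

```python
import re

# A few CUDA runtime identifiers and their HIP counterparts (small subset).
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cuda_runtime.h": "hip/hip_runtime.h",
}

# Longest names first, and word boundaries so that identifiers which merely
# contain a CUDA name (e.g. my_cudaMalloc_wrapper) are left alone.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")\b"
)

def toy_hipify(source):
    """Rename the known CUDA identifiers in a source string to HIP ones."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)

if __name__ == "__main__":
    cuda_src = (
        "#include <cuda_runtime.h>\n"
        "cudaMalloc(&d, n);\n"
        "cudaMemcpy(d, h, n, cudaMemcpyHostToDevice);\n"
        "cudaFree(d);\n"
    )
    print(toy_hipify(cuda_src), end="")
```

Real ports also have to deal with kernel launch syntax, libraries, and hardware differences, which a rename table like this doesn't touch.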

                  Comment


                  • #99
                    Well, datacenters would bring a lot more money and developer attention to ROCm. Having a stable ROCm to keep developing an OpenCL layer on top of isn't a bad idea IMHO.

                    Besides, it's better to get Vega really stable first and then move on to Navi. The Navi support probably isn't official because it's not that stable yet. For instance, DaVinci Studio has a hard dependency on OpenCL and works fine through ROCm on my Polaris card, but on my Navi card I get graphical corruption and after clicking any UI element I get an application crash.

                    Comment


                    • Originally posted by Djhg2000 View Post
                      Well, datacenters would bring a lot more money and developer attention to ROCm. Having a stable ROCm to keep developing an OpenCL layer on top of isn't a bad idea IMHO.
                      This isn't a bad idea, but SPIR-V is still not supported. The lack of support for a common intermediate language is forcing the maintenance of a different SYCL runtime for each available backend, i.e. SPIR-V, CUDA/PTX and ROCm. That doesn't seem effective.

                      Comment
