AMDGPU-PRO 16.60 Released


  • #31
    Originally posted by Marc Driftmeyer
    Still completely useless for Debian who has punted on OpenCL.
    Maybe we should open a bug against Debian to just not build that Clover shit... what do you think?



    • #32
      Originally posted by Pontostroy
      16.60 cannot build the kernel module on openSUSE 42.2 (16.50 built without any problem)
      Wondering how such a regression is possible when Bridgman said this is just a "bug fix" release

      AMD proved it is possible to make things worse than with fglrx... I dunno why people complained before
      Last edited by dungeon; 27 January 2017, 03:50 AM.



      • #33
        Originally posted by dungeon

        Wondering how such a regression is possible when Bridgman said this is just a "bug fix" release

        AMD proved it is possible to make things worse than with fglrx... I dunno why people complained before
        16.60 Vulkan and OpenCL work on stock 4.10-rc5 (with GCN 1.0 and GCN 1.1), so the pro kernel module is no longer needed. Really good work, AMD.



        • #34
          Decided to read the script more closely and manually installed the following on Debian Sid:
          1. clinfo-amdgpu-pro_16.60-379184_amd64.deb
          2. libopencl1-amdgpu-pro_16.60-379184_amd64.deb
          3. libdrm2-amdgpu-pro_2.4.70-379184_amd64.deb
          4. libdrm-amdgpu-pro-amdgpu1_2.4.70-379184_amd64.deb
          5. libdrm-amdgpu-pro-radeon1_2.4.70-379184_amd64.deb

          Once you install the pro-radeon1 package, clinfo (package #1) produces the expected output, even on Linux 4.9, the latest Debian kernel in Sid.
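The manual install described above could be scripted roughly as follows. This is only a sketch: the package names are the ones listed in the post, but the `pkg_dir` parameter and the `build_install_command` / `install` helpers are made up for illustration, and the files are assumed to already be unpacked from the AMDGPU-PRO archive. By default it only prints the `dpkg -i` command rather than running it.

```python
import subprocess
from pathlib import Path

# Package file names exactly as listed in the post above.
PACKAGES = [
    "clinfo-amdgpu-pro_16.60-379184_amd64.deb",
    "libopencl1-amdgpu-pro_16.60-379184_amd64.deb",
    "libdrm2-amdgpu-pro_2.4.70-379184_amd64.deb",
    "libdrm-amdgpu-pro-amdgpu1_2.4.70-379184_amd64.deb",
    "libdrm-amdgpu-pro-radeon1_2.4.70-379184_amd64.deb",
]


def build_install_command(pkg_dir="."):
    """Return a single `dpkg -i` command covering all five packages.

    pkg_dir is an assumption: wherever the .deb files were unpacked.
    """
    paths = [str(Path(pkg_dir) / name) for name in PACKAGES]
    return ["dpkg", "-i", *paths]


def install(pkg_dir=".", dry_run=True):
    """Print the command; execute it (needs root) only when dry_run is False."""
    cmd = build_install_command(pkg_dir)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    install()  # dry run: prints the command without touching the system
```

Handing all five files to one `dpkg -i` call avoids having to install them one at a time in a particular order; calling `install(dry_run=False)` as root would actually run it.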

          Ran clinfo:

          Code:
            Number of platforms                               2
            Platform Name                                   Clover
            Platform Vendor                                 Mesa
            Platform Version                                OpenCL 1.1 Mesa 17.0.0-rc2
            Platform Profile                                FULL_PROFILE
            Platform Extensions                             cl_khr_icd
            Platform Extensions function suffix             MESA
          
            Platform Name                                   AMD Accelerated Parallel Processing
            Platform Vendor                                 Advanced Micro Devices, Inc.
            Platform Version                                OpenCL 2.0 AMD-APP (2264.10)
            Platform Profile                                FULL_PROFILE
            Platform Extensions                             cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
            Platform Extensions function suffix             AMD
          
            Platform Name                                   Clover
          Number of devices                                 1
            Device Name                                     AMD POLARIS10 (DRM 3.8.0 / 4.9.0-1-amd64, LLVM 3.9.1)
            Device Vendor                                   AMD
            Device Vendor ID                                0x1002
            Device Version                                  OpenCL 1.1 Mesa 17.0.0-rc2
            Driver Version                                  17.0.0-rc2
            Device OpenCL C Version                         OpenCL C 1.1
            Device Type                                     GPU
            Device Profile                                  FULL_PROFILE
            Max compute units                               36
            Max clock frequency                             1338MHz
            Max work item dimensions                        3
            Max work item sizes                             256x256x256
            Max work group size                             256
            Preferred work group size multiple              ./generic/lib/workitem/get_global_id.cl:4:30: in function sum void (float addrspace(1)*, float addrspace(1)*, float addrspace(1)*): unsupported call to function get_local_size
          
            Preferred / native vector sizes                
              char                                                16 / 16      
              short                                                8 / 8      
              int                                                  4 / 4      
              long                                                 2 / 2      
              half                                                 0 / 0        (n/a)
              float                                                4 / 4      
              double                                               2 / 2        (cl_khr_fp64)
            Half-precision Floating-point support           (n/a)
            Single-precision Floating-point support         (core)
              Denormals                                     No
              Infinity and NANs                             Yes
              Round to nearest                              Yes
              Round to zero                                 No
              Round to infinity                             No
              IEEE754-2008 fused multiply-add               No
              Support is emulated in software               No
              Correctly-rounded divide and sqrt operations  No
            Double-precision Floating-point support         (cl_khr_fp64)
              Denormals                                     Yes
              Infinity and NANs                             Yes
              Round to nearest                              Yes
              Round to zero                                 Yes
              Round to infinity                             Yes
              IEEE754-2008 fused multiply-add               Yes
              Support is emulated in software               No
              Correctly-rounded divide and sqrt operations  No
            Address bits                                    64, Little-Endian
            Global memory size                              16848125952 (15.69GiB)
            Error Correction support                        No
            Max memory allocation                           15163313356 (14.12GiB)
            Unified memory for Host and Device              Yes
            Minimum alignment for any data type             128 bytes
            Alignment of base address                       1024 bits (128 bytes)
            Global Memory cache type                        None
            Image support                                   No
            Local memory type                               Local
            Local memory size                               32768 (32KiB)
            Max constant buffer size                        2147483647 (2GiB)
            Max number of constant args                     16
            Max size of kernel argument                     1024
            Queue properties                                
              Out-of-order execution                        No
              Profiling                                     Yes
            Profiling timer resolution                      0ns
            Execution capabilities                          
              Run OpenCL kernels                            Yes
              Run native kernels                            No
            Device Available                                Yes
            Compiler Available                              Yes
            Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64
          
            Platform Name                                   AMD Accelerated Parallel Processing
          Number of devices                                 2
            Device Name                                     Ellesmere
            Device Vendor                                   Advanced Micro Devices, Inc.
            Device Vendor ID                                0x1002
            Device Version                                  OpenCL 1.2 AMD-APP (2264.10)
            Driver Version                                  2264.10
            Device OpenCL C Version                         OpenCL C 1.2
            Device Type                                     GPU
            Device Profile                                  FULL_PROFILE
            Device Board Name (AMD)                         AMD Radeon (TM) RX 480 Graphics
            Device Topology (AMD)                           PCI-E, 01:00.0
            Max compute units                               36
            SIMD per compute unit (AMD)                     4
            SIMD width (AMD)                                16
            SIMD instruction width (AMD)                    1
            Max clock frequency                             1338MHz
            Graphics IP (AMD)                               8.0
            Device Partition                                (core)
              Max number of sub-devices                     36
              Supported partition types                     none specified
            Max work item dimensions                        3
            Max work item sizes                             256x256x256
            Max work group size                             256
            Preferred work group size multiple              64
            Wavefront width (AMD)                           64
            Preferred / native vector sizes                
              char                                                 4 / 4      
              short                                                2 / 2      
              int                                                  1 / 1      
              long                                                 1 / 1      
              half                                                 1 / 1        (cl_khr_fp16)
              float                                                1 / 1      
              double                                               1 / 1        (cl_khr_fp64)
            Half-precision Floating-point support           (cl_khr_fp16)
              Denormals                                     No
              Infinity and NANs                             No
              Round to nearest                              No
              Round to zero                                 No
              Round to infinity                             No
              IEEE754-2008 fused multiply-add               No
              Support is emulated in software               No
              Correctly-rounded divide and sqrt operations  No
            Single-precision Floating-point support         (core)
              Denormals                                     No
              Infinity and NANs                             Yes
              Round to nearest                              Yes
              Round to zero                                 Yes
              Round to infinity                             Yes
              IEEE754-2008 fused multiply-add               Yes
              Support is emulated in software               No
              Correctly-rounded divide and sqrt operations  Yes
            Double-precision Floating-point support         (cl_khr_fp64)
              Denormals                                     Yes
              Infinity and NANs                             Yes
              Round to nearest                              Yes
              Round to zero                                 Yes
              Round to infinity                             Yes
              IEEE754-2008 fused multiply-add               Yes
              Support is emulated in software               No
              Correctly-rounded divide and sqrt operations  No
            Address bits                                    64, Little-Endian
            Global memory size                              4913172480 (4.576GiB)
            Global free memory (AMD)                        4789696 (4.568GiB)
            Global memory channels (AMD)                    8
            Global memory banks per channel (AMD)           16
            Global memory bank width (AMD)                  256 bytes
            Error Correction support                        No
            Max memory allocation                           4244635648 (3.953GiB)
            Unified memory for Host and Device              No
            Minimum alignment for any data type             128 bytes
            Alignment of base address                       2048 bits (256 bytes)
            Global Memory cache type                        Read/Write
            Global Memory cache size                        16384
            Global Memory cache line                        64 bytes
            Image support                                   Yes
              Max number of samplers per kernel             16
              Max size for 1D images from buffer            134217728 pixels
              Max 1D or 2D image array size                 2048 images
              Base address alignment for 2D image buffers   256 bytes
              Pitch alignment for 2D image buffers          256 bytes
              Max 2D image size                             16384x16384 pixels
              Max 3D image size                             2048x2048x2048 pixels
              Max number of read image args                 128
              Max number of write image args                8
            Local memory type                               Local
            Local memory size                               32768 (32KiB)
            Local memory syze per CU (AMD)                  65536 (64KiB)
            Local memory banks (AMD)                        32
            Max constant buffer size                        4244635648 (3.953GiB)
            Max number of constant args                     8
            Max size of kernel argument                     1024
            Queue properties                                
              Out-of-order execution                        No
              Profiling                                     Yes
            Prefer user sync for interop                    Yes
            Profiling timer resolution                      1ns
            Profiling timer offset since Epoch (AMD)        1485405197998233092ns (Wed Jan 25 20:33:17 2017)
            Execution capabilities                          
              Run OpenCL kernels                            Yes
              Run native kernels                            No
              Thread trace supported (AMD)                  Yes
              SPIR versions                                 1.2
            printf() buffer size                            1048576 (1024KiB)
            Built-in kernels                                
            Device Available                                Yes
            Compiler Available                              Yes
            Linker Available                                Yes
            Device Extensions                               cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event
          
            Device Name                                     AMD FX(tm)-8350 Eight-Core Processor
            Device Vendor                                   AuthenticAMD
            Device Vendor ID                                0x1002
            Device Version                                  OpenCL 1.2 AMD-APP (2264.10)
            Driver Version                                  2264.10 (sse2,avx,fma4)
            Device OpenCL C Version                         OpenCL C 1.2
            Device Type                                     CPU
            Device Profile                                  FULL_PROFILE
            Device Board Name (AMD)                        
            Device Topology (AMD)                           (n/a)
            Max compute units                               8
            Max clock frequency                             1400MHz
            Device Partition                                (core, cl_ext_device_fission)
              Max number of sub-devices                     8
              Supported partition types                     equally, by counts, by affinity domain
              Supported affinity domains                    L3 cache, L2 cache, L1 cache, next partitionable
              Supported partition types (ext)               equally, by counts, by affinity domain
              Supported affinity domains (ext)              L3 cache, L2 cache, L1 cache, next fissionable
            Max work item dimensions                        3
            Max work item sizes                             1024x1024x1024
            Max work group size                             1024
            Preferred work group size multiple              1
            Preferred / native vector sizes                
              char                                                16 / 16      
              short                                                8 / 8      
              int                                                  4 / 4      
              long                                                 2 / 2      
              half                                                 4 / 4        (n/a)
              float                                                8 / 8      
              double                                               4 / 4        (cl_khr_fp64)
            Half-precision Floating-point support           (n/a)
            Single-precision Floating-point support         (core)
              Denormals                                     Yes
              Infinity and NANs                             Yes
              Round to nearest                              Yes
              Round to zero                                 Yes
              Round to infinity                             Yes
              IEEE754-2008 fused multiply-add               Yes
              Support is emulated in software               No
              Correctly-rounded divide and sqrt operations  Yes
            Double-precision Floating-point support         (cl_khr_fp64)
              Denormals                                     Yes
              Infinity and NANs                             Yes
              Round to nearest                              Yes
              Round to zero                                 Yes
              Round to infinity                             Yes
              IEEE754-2008 fused multiply-add               Yes
              Support is emulated in software               No
              Correctly-rounded divide and sqrt operations  No
            Address bits                                    64, Little-Endian
            Global memory size                              33701552128 (31.39GiB)
            Error Correction support                        No
            Max memory allocation                           8425388032 (7.847GiB)
            Unified memory for Host and Device              Yes
            Minimum alignment for any data type             128 bytes
            Alignment of base address                       1024 bits (128 bytes)
            Global Memory cache type                        Read/Write
            Global Memory cache size                        16384
            Global Memory cache line                        64 bytes
            Image support                                   Yes
              Max number of samplers per kernel             16
              Max size for 1D images from buffer            65536 pixels
              Max 1D or 2D image array size                 2048 images
              Max 2D image size                             8192x8192 pixels
              Max 3D image size                             2048x2048x2048 pixels
              Max number of read image args                 128
              Max number of write image args                64
            Local memory type                               Global
            Local memory size                               32768 (32KiB)
            Max constant buffer size                        65536 (64KiB)
            Max number of constant args                     8
            Max size of kernel argument                     4096 (4KiB)
            Queue properties                                
              Out-of-order execution                        No
              Profiling                                     Yes
            Prefer user sync for interop                    Yes
            Profiling timer resolution                      1ns
            Profiling timer offset since Epoch (AMD)        1485405197998233092ns (Wed Jan 25 20:33:17 2017)
            Execution capabilities                          
              Run OpenCL kernels                            Yes
              Run native kernels                            Yes
              SPIR versions                                 1.2
            printf() buffer size                            65536 (64KiB)
            Built-in kernels                                
            Device Available                                Yes
            Compiler Available                              Yes
            Linker Available                                Yes
            Device Extensions                               cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_spir cl_khr_gl_event
          
          NULL platform behavior
            clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
            clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
            clCreateContext(NULL, ...) [default]            No platform
            clCreateContext(NULL, ...) [other]              Success [MESA]
            clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No platform
            clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
            clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No platform
            clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No platform
            clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform
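A listing like the one above can also be checked by script instead of by eye. The following is a minimal sketch, assuming clinfo's human-readable layout shown in this post (field name, a run of spaces, then the value); `SAMPLE` is a handful of lines copied from the listing, and the helper names `devices_by_platform` and `gpu_names` are invented for illustration.

```python
import re

# A few lines copied from the clinfo listing above (trimmed for brevity).
SAMPLE = """\
  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD POLARIS10 (DRM 3.8.0 / 4.9.0-1-amd64, LLVM 3.9.1)
  Device Type                                     GPU
  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 2
  Device Name                                     Ellesmere
  Device Type                                     GPU
  Device Name                                     AMD FX(tm)-8350 Eight-Core Processor
  Device Type                                     CPU
"""

# Only these three fields are needed to see which device sits on which platform.
FIELD = re.compile(r"^\s*(Platform Name|Device Name|Device Type)\s{2,}(.+?)\s*$")


def devices_by_platform(text):
    """Map each platform name to a list of (device name, device type) tuples."""
    result = {}
    platform = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue
        key, value = match.groups()
        if key == "Platform Name":
            platform = value
            result.setdefault(platform, [])
        elif platform is None:
            continue  # device line before any platform header: ignore
        elif key == "Device Name":
            result[platform].append([value, None])
        elif key == "Device Type" and result[platform]:
            result[platform][-1][1] = value
    return {name: [tuple(dev) for dev in devs] for name, devs in result.items()}


def gpu_names(text, platform):
    """Return the names of GPU devices reported under the given platform."""
    devices = devices_by_platform(text).get(platform, [])
    return [name for name, kind in devices if kind == "GPU"]


if __name__ == "__main__":
    print(gpu_names(SAMPLE, "AMD Accelerated Parallel Processing"))
```

To run it against a live system, the text could come from something like `subprocess.run(["clinfo"], capture_output=True, text=True).stdout` in place of `SAMPLE`.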
          Restarted Blender 2.78a+nightly git.

          User Preferences -> Cycles -> OpenCL: Ellesmere shows up.

          Test OpenCL and it renders.

          Slow as molasses but now the OpenCL kernel doesn't crash.
          Last edited by Marc Driftmeyer; 09 March 2017, 12:51 AM.



          • #35
            Did they include the 32bit OpenGL libs for Suse this time? Or was it again forgotten?



            • #36
              Originally posted by mibo
              Did they include the 32bit OpenGL libs for Suse this time? Or was it again forgotten?
              Nope, so OpenGL is useless.



              • #37
                Originally posted by dungeon

                Wondering how such a regression is possible when Bridgman said this is just a "bug fix" release

                AMD proved it is possible to make things worse than with fglrx... I dunno why people complained before
                That's easy: somebody submitted "builds kernel module on openSUSE 42.2" as a bug. And they fixed it.



                • #38
                  Originally posted by Pontostroy

                  Nope, so OpenGL is useless.
                  At least for people who want to play games, most of which are still 32-bit.

                  AMD, please include the 32bit libs for Suse.



                  • #39
                    I don't know if anyone from AMD is reading this, but the AMDGPU-PRO driver still lacks important functionality that used to be part of Catalyst:

                    - Tear-free desktop: this is extremely useful for watching movies without tearing. It worked great on the 380/380X, so why can't it be supported on Polaris?
                    - More generally, the lack of a control center is a huge step back
                    - cannot easily set a max target frequency to lower power consumption. In another thread I explained how to patch the driver source to achieve this, it'd be great to have a proper /sys interface to under/overclock and under/overvolt.

                    There are other regressions. I for one very much welcome AMD's open-source efforts, but some visibility for the closed-source driver roadmap would be great for us Linux users.

                    By the way Catalyst worked great on a number of distros including Debian: another huge step back.
                    Last edited by cde1; 27 January 2017, 07:47 AM.



                    • #40
                      Originally posted by debianxfce
                      <snip>
                      Latest mesa:
                      Padoka PPA:
                      https://launchpad.net/~paulo-miguel-d...
                      <snip>
                      Since the original question was about making this work on *buntu, I believe that's all that's needed.

