Support for OpenCL Extension cl_khr_int64_base_atomics in AMD ATI Radeon HD series

  • Support for OpenCL Extension cl_khr_int64_base_atomics in AMD ATI Radeon HD series

    I have an ATI Radeon HD 5970 graphics card in a 64-bit CentOS 6.2 system with an Intel Core i7 processor. I have installed the ATI 12.2 drivers and AMD APP SDK 2.6. I want to use the cl_khr_int64_base_atomics extension (OpenCL 1.1) for 64-bit integer computation on my GPU device, but the device does not support it. The same extension is, however, supported by the CPU device. Can I somehow use 64-bit integers with my GPU device? (I am in desperate need of this.) Is there any workaround? Will upcoming versions of the ATI drivers or the AMD SDK support this? Is this extension supported by any other graphics card?

    For details, the output of the clinfo command on my system is given below:

    Code:
    Number of platforms:				 1
      Platform Profile:				 FULL_PROFILE
      Platform Version:				 OpenCL 1.1 AMD-APP (898.1)
      Platform Name:				 AMD Accelerated Parallel Processing
      Platform Vendor:				 Advanced Micro Devices, Inc.
      Platform Extensions:				 cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
    
    
      Platform Name:				 AMD Accelerated Parallel Processing
    Number of devices:				 3
      Device Type:					 CL_DEVICE_TYPE_GPU
      Device ID:					 4098
      Board name:					 ATI Radeon HD 5900 Series
      Device Topology:				 PCI[ B#4, D#0, F#0 ]
      Max compute units:				 20
      Max work items dimensions:			 3
        Max work items[0]:				 256
        Max work items[1]:				 256
        Max work items[2]:				 256
      Max work group size:				 256
      Preferred vector width char:			 16
      Preferred vector width short:			 8
      Preferred vector width int:			 4
      Preferred vector width long:			 2
      Preferred vector width float:			 4
      Preferred vector width double:		 2
      Native vector width char:			 16
      Native vector width short:			 8
      Native vector width int:			 4
      Native vector width long:			 2
      Native vector width float:			 4
      Native vector width double:			 2
      Max clock frequency:				 725Mhz
      Address bits:					 32
      Max memory allocation:			 134217728
      Image support:				 Yes
      Max number of images read arguments:		 128
      Max number of images write arguments:		 8
      Max image 2D width:				 8192
      Max image 2D height:				 8192
      Max image 3D width:				 2048
      Max image 3D height:				 2048
      Max image 3D depth:				 2048
      Max samplers within kernel:			 16
      Max size of kernel argument:			 1024
      Alignment (bits) of base address:		 2048
      Minimum alignment (bytes) for any datatype:	 128
      Single precision floating point capability
        Denorms:					 No
        Quiet NaNs:					 Yes
        Round to nearest even:			 Yes
        Round to zero:				 Yes
        Round to +ve and infinity:			 Yes
        IEEE754-2008 fused multiply-add:		 Yes
      Cache type:					 None
      Cache line size:				 0
      Cache size:					 0
      Global memory size:				 536870912
      Constant buffer size:				 65536
      Max number of constant args:			 8
      Local memory type:				 Scratchpad
      Local memory size:				 32768
      Kernel Preferred work group size multiple:	 64
      Error correction support:			 0
      Unified memory for Host and Device:		 0
      Profiling timer resolution:			 1
      Device endianess:				 Little
      Available:					 Yes
      Compiler available:				 Yes
      Execution capabilities:				 
        Execute OpenCL kernels:			 Yes
        Execute native function:			 No
      Queue properties:				 
        Out-of-Order:				 No
        Profiling :					 Yes
      Platform ID:					 0x7f118dd36480
      Name:						 Cypress
      Vendor:					 Advanced Micro Devices, Inc.
      Device OpenCL C version:			 OpenCL C 1.1 
      Driver version:				 CAL 1.4.1703
      Profile:					 FULL_PROFILE
      Version:					 OpenCL 1.1 AMD-APP (898.1)
      Extensions:					 cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_amd_meminfo
    
    
      Device Type:					 CL_DEVICE_TYPE_GPU
      Device ID:					 4098
      Board name:					 ATI Radeon HD 5900 Series
      Device Topology:				 PCI[ B#5, D#0, F#0 ]
      Max compute units:				 20
      Max work items dimensions:			 3
        Max work items[0]:				 256
        Max work items[1]:				 256
        Max work items[2]:				 256
      Max work group size:				 256
      Preferred vector width char:			 16
      Preferred vector width short:			 8
      Preferred vector width int:			 4
      Preferred vector width long:			 2
      Preferred vector width float:			 4
      Preferred vector width double:		 2
      Native vector width char:			 16
      Native vector width short:			 8
      Native vector width int:			 4
      Native vector width long:			 2
      Native vector width float:			 4
      Native vector width double:			 2
      Max clock frequency:				 725Mhz
      Address bits:					 32
      Max memory allocation:			 134217728
      Image support:				 Yes
      Max number of images read arguments:		 128
      Max number of images write arguments:		 8
      Max image 2D width:				 8192
      Max image 2D height:				 8192
      Max image 3D width:				 2048
      Max image 3D height:				 2048
      Max image 3D depth:				 2048
      Max samplers within kernel:			 16
      Max size of kernel argument:			 1024
      Alignment (bits) of base address:		 2048
      Minimum alignment (bytes) for any datatype:	 128
      Single precision floating point capability
        Denorms:					 No
        Quiet NaNs:					 Yes
        Round to nearest even:			 Yes
        Round to zero:				 Yes
        Round to +ve and infinity:			 Yes
        IEEE754-2008 fused multiply-add:		 Yes
      Cache type:					 None
      Cache line size:				 0
      Cache size:					 0
      Global memory size:				 536870912
      Constant buffer size:				 65536
      Max number of constant args:			 8
      Local memory type:				 Scratchpad
      Local memory size:				 32768
      Kernel Preferred work group size multiple:	 64
      Error correction support:			 0
      Unified memory for Host and Device:		 0
      Profiling timer resolution:			 1
      Device endianess:				 Little
      Available:					 Yes
      Compiler available:				 Yes
      Execution capabilities:				 
        Execute OpenCL kernels:			 Yes
        Execute native function:			 No
      Queue properties:				 
        Out-of-Order:				 No
        Profiling :					 Yes
      Platform ID:					 0x7f118dd36480
      Name:						 Cypress
      Vendor:					 Advanced Micro Devices, Inc.
      Device OpenCL C version:			 OpenCL C 1.1 
      Driver version:				 CAL 1.4.1703
      Profile:					 FULL_PROFILE
      Version:					 OpenCL 1.1 AMD-APP (898.1)
      Extensions:					 cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_amd_meminfo
    
    
      Device Type:					 CL_DEVICE_TYPE_CPU
      Device ID:					 4098
      Board name:					 
      Max compute units:				 12
      Max work items dimensions:			 3
        Max work items[0]:				 1024
        Max work items[1]:				 1024
        Max work items[2]:				 1024
      Max work group size:				 1024
      Preferred vector width char:			 16
      Preferred vector width short:			 8
      Preferred vector width int:			 4
      Preferred vector width long:			 2
      Preferred vector width float:			 4
      Preferred vector width double:		 0
      Native vector width char:			 16
      Native vector width short:			 8
      Native vector width int:			 4
      Native vector width long:			 2
      Native vector width float:			 4
      Native vector width double:			 0
      Max clock frequency:				 1596Mhz
      Address bits:					 64
      Max memory allocation:			 3119875072
      Image support:				 Yes
      Max number of images read arguments:		 128
      Max number of images write arguments:		 8
      Max image 2D width:				 8192
      Max image 2D height:				 8192
      Max image 3D width:				 2048
      Max image 3D height:				 2048
      Max image 3D depth:				 2048
      Max samplers within kernel:			 16
      Max size of kernel argument:			 4096
      Alignment (bits) of base address:		 1024
      Minimum alignment (bytes) for any datatype:	 128
      Single precision floating point capability
        Denorms:					 Yes
        Quiet NaNs:					 Yes
        Round to nearest even:			 Yes
        Round to zero:				 Yes
        Round to +ve and infinity:			 Yes
        IEEE754-2008 fused multiply-add:		 Yes
      Cache type:					 Read/Write
      Cache line size:				 64
      Cache size:					 32768
      Global memory size:				 12479500288
      Constant buffer size:				 65536
      Max number of constant args:			 8
      Local memory type:				 Global
      Local memory size:				 32768
      Kernel Preferred work group size multiple:	 1
      Error correction support:			 0
      Unified memory for Host and Device:		 1
      Profiling timer resolution:			 1
      Device endianess:				 Little
      Available:					 Yes
      Compiler available:				 Yes
      Execution capabilities:				 
        Execute OpenCL kernels:			 Yes
        Execute native function:			 Yes
      Queue properties:				 
        Out-of-Order:				 No
        Profiling :					 Yes
      Platform ID:					 0x7f118dd36480
      Name:						 Intel(R) Core(TM) i7 CPU       X 980  @ 3.33GHz
      Vendor:					 GenuineIntel
      Device OpenCL C version:			 OpenCL C 1.1 
      Driver version:				 2.0
      Profile:					 FULL_PROFILE
      Version:					 OpenCL 1.1 AMD-APP (898.1)
      Extensions:					 cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt
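
In the listing above, cl_khr_int64_base_atomics appears only in the Extensions line of the CPU device, not in those of the two Cypress GPU devices. The same check can be done in code instead of by eye. Below is a minimal sketch in plain C, assuming the extension string has already been obtained (in a real OpenCL host program it would come from clGetDeviceInfo with CL_DEVICE_EXTENSIONS). The helper name has_extension is hypothetical, not part of any SDK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` occurs as a whole
 * space-separated token in `ext_list` (the format clinfo prints and
 * CL_DEVICE_EXTENSIONS returns). A bare strstr() would also accept
 * partial matches, so the token boundaries are checked explicitly. */
static bool has_extension(const char *ext_list, const char *name)
{
    size_t len = strlen(name);
    const char *p = ext_list;

    if (len == 0) {
        return false;
    }
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token) {
            return true;
        }
        p += len; /* skip this partial match and keep searching */
    }
    return false;
}
```

Calling has_extension(extensions, "cl_khr_int64_base_atomics") for each device before building a kernel that uses the 64-bit atomic functions lets the host fall back to another device or code path when the extension is missing.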

  • #2
    You might want to try the OpenCL forum on the AMD developer site.

    My first thought, though, is that the GPU hardware doesn't natively support 64-bit integer operations, but take that with a grain of salt.

    EDIT: Never mind, I see you already did.
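
To illustrate what "not natively supported" would mean in practice: on hardware with only 32-bit integer units, a 64-bit addition can still be composed from two 32-bit additions plus a carry. The following is a minimal sketch in plain C, not OpenCL kernel code; the names u64_pair, from_u64, to_u64 and add64_from_32 are hypothetical. It is not atomic, so it does not replace what cl_khr_int64_base_atomics provides, which is atomic operations on 64-bit values.

```c
#include <stdint.h>

/* Hypothetical representation of a 64-bit value as two 32-bit halves. */
typedef struct {
    uint32_t lo; /* low 32 bits */
    uint32_t hi; /* high 32 bits */
} u64_pair;

static u64_pair from_u64(uint64_t v)
{
    u64_pair r;
    r.lo = (uint32_t)(v & 0xFFFFFFFFu);
    r.hi = (uint32_t)(v >> 32);
    return r;
}

static uint64_t to_u64(u64_pair v)
{
    return ((uint64_t)v.hi << 32) | (uint64_t)v.lo;
}

/* 64-bit add using only 32-bit arithmetic. Unsigned overflow wraps,
 * so the low sum being smaller than an operand signals a carry. */
static u64_pair add64_from_32(u64_pair a, u64_pair b)
{
    u64_pair r;
    uint32_t carry;

    r.lo = a.lo + b.lo;
    carry = (r.lo < a.lo) ? 1u : 0u;
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

Because the low and high halves are updated in separate steps, another work-item could observe a half-updated value. That is why this only covers ordinary 64-bit arithmetic and not the atomic case asked about in the first post.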