CUDA-Python Reaches "GA" With NVIDIA CUDA 11.5 Release, __int128 Preview
NVIDIA has released CUDA 11.5 today as the latest version of its popular but proprietary compute stack/platform. Notable with CUDA 11.5 is that CUDA-Python has reached general availability status.
NVIDIA CUDA 11.5 was posted today along with updated device drivers for Windows and Linux systems. Some of the CUDA 11.5 highlights include:
- CUDA-Python has reached "GA" as the Python bindings for CUDA. More details along with tests and samples of CUDA-Python can be found via GitHub.
- A preview release of the "__int128" data type. This is a limited preview for now; support for math operations, libraries, and more is to come later.
- Native support for signed and unsigned normalized 8-bit and 16-bit types.
- Improved interoperability with graphics frameworks.
- CUDA now supports per-process memory access policies around multi-process sharing of GPUs.
- cubins larger than 2GB can now be linked.
- The device-side caching behavior is now configurable.
- The CUDA compiler now supports "-arch=all" and "-arch=all-major" options for generating code for multiple architectures at the same time.
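To give a feel for what the "__int128" preview is for, here is a minimal sketch of a 64x64-bit multiply that keeps the full 128-bit product. The helper names (mul_hi_u64, mul_lo_u64) and the HD macro are made up for illustration and are not part of CUDA. The sketch assumes a host compiler that provides __int128 (GCC and Clang do); with nvcc the same helpers are marked __host__ __device__, though which operators the limited 11.5 preview covers in device code should be checked against the release notes.

```cpp
#include <cstdint>

// Hypothetical convenience macro: under nvcc the helpers become callable
// from both host and device code; under a plain host compiler it is empty.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Upper 64 bits of the full 128-bit product of two unsigned 64-bit values.
HD inline std::uint64_t mul_hi_u64(std::uint64_t a, std::uint64_t b) {
    // Widen one operand first so the multiplication is done in 128 bits.
    unsigned __int128 product = static_cast<unsigned __int128>(a) * b;
    return static_cast<std::uint64_t>(product >> 64);
}

// Lower 64 bits of the same 128-bit product.
HD inline std::uint64_t mul_lo_u64(std::uint64_t a, std::uint64_t b) {
    unsigned __int128 product = static_cast<unsigned __int128>(a) * b;
    return static_cast<std::uint64_t>(product);
}
```

Without a 128-bit type, the upper half of such a product has to come from an intrinsic or from splitting the operands into 32-bit pieces by hand.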
Overall, today's CUDA 11.5 release is a fairly robust update to the CUDA 11 series. Windows/Linux downloads for CUDA 11.5 are available from developer.nvidia.com. More details on the CUDA 11.5 changes can be found via the release notes.