Radeon ROCm 1.9.1 vs. NVIDIA OpenCL Linux Plus RTX 2080 TensorFlow Benchmarks
Following last week's GeForce RTX 2080 Linux gaming benchmarks, and now that I have that non-Ti variant on hand, I carried out some fresh GPU compute benchmarks of the higher-end NVIDIA GeForce and AMD Radeon graphics cards. Here's a look at the OpenCL performance between the competing vendors plus some fresh CUDA benchmarks as well as NVIDIA GPU Cloud TensorFlow Docker benchmarks.
This article provides a fresh look at the Linux GPU compute performance for NVIDIA and AMD. On the AMD side was the Linux 4.19 kernel paired with the ROCm 1.9.1 binary packages for Ubuntu 18.04 LTS. ROCm continues to run well on the mainline kernel with the latest releases, whereas previously it relied upon the out-of-tree/DKMS kernel modules for compute support on the discrete Radeon GPUs. ROCm 2.0 is still supposed to be released before year's end, so there will be some fresh benchmarks coming up with that OpenCL 2.0+ implementation when the time comes. The Radeon GPUs tested were the RX Vega 56 and RX Vega 64, with the R9 Fury tossed in for some historical context.
On the NVIDIA side was their newest 415.22 driver release paired with the CUDA 10.0 compute stack on this same Ubuntu 18.04 system with an Intel Core i9 9900K processor. The NVIDIA cards tested were the GeForce GTX 1070, GTX 1070 Ti, GTX 1080, GTX 1080 Ti, RTX 2070, RTX 2080, and RTX 2080 Ti. Unfortunately, in the case of the GeForce RTX 2070, there are only partial results available. I've run into either a 415.22 driver issue only affecting this EVGA RTX 2070 or the card has developed a hardware defect: when benchmarking it for the GPU TensorFlow tests, it now hits NaN errors during training and shows other stability issues in general, including in OpenGL/Vulkan games. That EVGA RTX 2070 issue is currently being explored, but at least all of the other cards tested were running stable.
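For readers who want to catch this kind of failure in their own runs, here is a minimal sketch of a guard that flags the first non-finite loss value in a training log. It is not part of the benchmark setup used here; the function name `first_nan_step` and the sample loss values are hypothetical and purely illustrative.

```python
import math

def first_nan_step(losses):
    """Return the index of the first non-finite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        # math.isfinite() is False for both NaN and +/-infinity.
        if not math.isfinite(loss):
            return step
    return None

# Hypothetical loss values for illustration, not measurements from this article.
example_losses = [2.31, 1.87, 1.52, float("nan"), float("nan")]

bad_step = first_nan_step(example_losses)
if bad_step is not None:
    print(f"non-finite loss at step {bad_step}; results from this run should be discarded")
```

A check like this only tells you that training diverged, not why. Telling a driver bug apart from failing hardware still takes swapping the driver version or the card.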
First up, via the Phoronix Test Suite, are the cross-vendor OpenCL tests, followed by the NVIDIA CUDA and TensorFlow benchmarks for this fresh comparison.