NVIDIA GeForce RTX 2080 Ti To GTX 980 Ti TensorFlow Benchmarks With ResNet-50, AlexNet, GoogLeNet, Inception, VGG-16
-
Originally posted by LukePoga:
I too think they are about right from what I've seen, but something is fishy. A TC can do 1024 calcs per clock, and there are 500/640 of them. Think about that. They are not being kept busy, because you should get 5X (100TF vs 20TF) performance for matrix mul, which is the bulk of deep learning processing time.
"Eight Tensor Cores in an SM perform a total of 512 FP16 multiply and accumulate operations per clock, or 1024 total FP operations per clock."
There are 8 TC in an SM, and there are 68 SMs in a 2080ti. So it's 640 TC in Titan V and 544 in a 2080ti.
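To make the arithmetic above easy to check, here is a minimal Python sketch. The per-SM figures come from the whitepaper quote and the 68 SM count from this post; the Titan V count of 80 SMs is derived from 640 / 8, and the function names are only illustrative.

```python
# Per-SM figures taken from the whitepaper quote above.
TENSOR_CORES_PER_SM = 8
FMA_PER_SM_PER_CLOCK = 512   # FP16 multiply-accumulate ops per SM per clock
FP_OPS_PER_FMA = 2           # one multiply plus one add


def tensor_core_count(sm_count: int) -> int:
    """Total tensor cores on a GPU with the given number of SMs."""
    return sm_count * TENSOR_CORES_PER_SM


def fp_ops_per_clock(sm_count: int) -> int:
    """Total tensor FP operations per clock across the whole GPU."""
    return sm_count * FMA_PER_SM_PER_CLOCK * FP_OPS_PER_FMA


# 68 SMs is stated in the post; 80 SMs is derived from 640 TC / 8 per SM.
cards = {"RTX 2080 Ti": 68, "Titan V": 80}

for name, sms in cards.items():
    print(f"{name}: {tensor_core_count(sms)} tensor cores, "
          f"{fp_ops_per_clock(sms)} FP ops per clock")
```

Note that the 1024 figure is per SM (8 tensor cores together), not per tensor core, which is where the numbers in the quoted post go wrong.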
-
Originally posted by LukeP:
Any updates on why the 2080 Ti is running at half speed? Keen to know if it will be fixed before purchasing.
I recommend finding a site that's exclusively focused on deep learning. Or perhaps Nvidia's own developer forums.
-
I already posted on devtalk. No one seems to know why.
One possibility (speculation only) is that Nvidia pulled a swifty and put in inference cores, not the full tensor cores. In that case it would be false advertising of TFLOPS, which could result in mass refunds. Hopefully that's not true and it will be resolved at some stage.
It really would make a good news article IMHO lol.
-
Originally posted by coder:
Have others reproduced Michael's results?
I'd be curious whether the same test setup can reproduce any official results from Nvidia.
The reason is that the consumer RTX tensor cores can only do FP32 accumulate at half speed. Titan V tensor cores, and in fact Quadro RTX tensor cores(!!), do it at full speed.
So they did gimp the tensor cores for the consumer models of RTX.
https://www.purepc.pl/image/mini_rec...fikacja_16.png
It is documented in their Turing Whitepaper.
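For a rough sense of what half-rate FP32 accumulate means in TFLOPS, here is a back-of-the-envelope Python sketch. It uses the 1024 FP ops per SM per clock and the 68 SMs quoted earlier in the thread. The 1545 MHz boost clock is my assumption (it is not stated in this thread), and the function name and `accumulate_rate` parameter are only illustrative.

```python
FP_OPS_PER_SM_PER_CLOCK = 1024  # from the whitepaper quote earlier in the thread


def peak_tensor_tflops(sm_count: int, clock_mhz: float,
                       accumulate_rate: float = 1.0) -> float:
    """Theoretical peak tensor TFLOPS.

    accumulate_rate scales the peak: 1.0 for full-rate accumulate,
    0.5 for the half-rate FP32 accumulate described in the post.
    """
    ops_per_second = sm_count * FP_OPS_PER_SM_PER_CLOCK * clock_mhz * 1e6
    return ops_per_second * accumulate_rate / 1e12


SMS_2080_TI = 68          # stated earlier in the thread
ASSUMED_BOOST_MHZ = 1545  # assumption: reference boost clock, not from this thread

full = peak_tensor_tflops(SMS_2080_TI, ASSUMED_BOOST_MHZ)
half = peak_tensor_tflops(SMS_2080_TI, ASSUMED_BOOST_MHZ, accumulate_rate=0.5)
print(f"FP16 accumulate: {full:.1f} TFLOPS")
print(f"FP32 accumulate: {half:.1f} TFLOPS")
```

With that assumed clock this comes out to about 107.6 TFLOPS at full rate and about 53.8 TFLOPS with half-rate FP32 accumulate. Real clocks vary under load, so treat it as a ballpark only.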
Last edited by LukeP; 25 October 2018, 06:27 AM.
-
Originally posted by LukeP:
I was trying very hard to find out why 2080 Ti tensor cores were half as fast as Titan V tensor cores.
The reason is that they can only do FP32 accumulate at half speed. Titan V tensor cores, and in fact Quadro RTX tensor cores(!!), do it at full speed.
Let's hope they unlock it on the Titan card, although I'm fearing they probably won't.
-
Question about the RTX 2080Ti which may be a wee bit off-topic here:
There have been recent reports about the cards degrading rapidly (see link). I have a machine with dual RTX 2080Ti on order to use for machine learning. Now I'm worried my cards will be RIP before I know it.
phoronix: Michael, have you been hitting your RTX 2080 Ti hard enough since you got it to assess whether there is an issue with your sample?
I know that this would be rather anecdotal evidence with a sample size of 1, just trying to be reassured here.