Radeon GCC Compute Back-End Approved For Merging In The Upcoming GCC9 Compiler

  • #11
    Originally posted by juno View Post
    Hawaii was considerably *smaller* and more efficient than GK110.
    Okay, 6.2 B vs. 7.0 B transistors. Both rated at 250 W. I'll grant you that it was a bit smaller, but the 512-bit memory bus was not a cheap or power-efficient thing to do.

    Of course, there's only so much insight one can gain by analyzing products from nearly six years ago!

    Originally posted by juno View Post
    Neither is the typical GTX 980 Ti lower-spec to a typical Fury.
    Well, let's look at the two most consequential aspects.

    Memory bandwidth: 336 vs. 512 GB/sec

    Single-precision: 5.6 vs. 8.6 TFLOPS

    Now, tell me: in what way was the GTX 980 Ti not lower spec? It even used less power!

    And yet, gaming performance was pretty much neck-and-neck.

    Originally posted by juno View Post
    If all your sources are this reliable, I'm sure GCN won't go anywhere.
    At this point, you're just trolling. I guess I shouldn't be surprised, given how much of your inner AMD fan you revealed in your post before this one.



    • #12
      Originally posted by coder View Post
      Okay, 6.2 B vs. 7.0 B transistors. Both rated at 250 W. I'll grant you that it was a bit smaller, but the 512-bit memory bus was not a cheap or power-efficient thing to do.

      Of course, there's only so much insight one can gain by analyzing products from nearly six years ago!
      561 vs. 438 mm² is a 28% difference in size. Hawaii's 512-bit interface was actually smaller than Tahiti's 384-bit one because it was more efficient: it targeted a wider bus at lower clocks while also allowing more memory (the AMD cards maxed out at 32 GiB). That was a deliberate design decision, not brute force, and it resulted in products Nvidia could not counter.

      Originally posted by coder View Post
      Well, let's look at the two most consequential aspects.

      Memory bandwidth: 336 vs. 512 GB/sec

      Single-precision: 5.6 vs. 8.6 TFLOPS

      Now, tell me: in what way was the GTX 980 Ti not lower spec? It even used less power!

      And yet, gaming performance was pretty much neck-and-neck.
      Of course you could enumerate arbitrary numbers as you wish to make your point. Or you could look up real-world numbers. To put it in your own words, "at this point, you're just trolling". You take the 980 Ti base clock (1000 MHz) and the Fury X (you said "Fury" before, now it's somehow an X) boost clock (1050 MHz) and compare their numbers, which is, of course, nonsense. You won't see these numbers in any benchmark you run, so the results and your conclusions are useless. A typical 980 Ti boosts to 1200 MHz out of the box, a good one (Gigabyte Gaming) to 1300 MHz in games. A typical Fury will boost to 1000 MHz (actually more likely below that, but I don't have to "fake" numbers to make my point), a good one (e.g. Fury Nitro) to 1050 MHz.

      That leaves 6.9 TFLOPS for the Fury and 6.8 TFLOPS for the 980 Ti. Yeah, that's really, really lower spec. It's nothing less than a wonder that marvellous Nvidia can keep up with this oh-so-higher-spec card while the Nvidia card is also only rasterizing 80% more triangles per second. Really. Apart from polygon throughput, there are other fixed-function-driven values that are relevant to gaming performance, such as pixel fill rate, texturing performance, memory size and tessellation performance. You should take a look at those numbers once in a while and not just compare FLOPS.

      It's true that Fury's memory bandwidth was in its own league. That's a consequence of using HBM and needing four stacks to reach even 4 GiB at the time, and it's really the one and only discipline where the Fury is considerably "higher spec". Nvidia's more efficient fixed-function units compensate for that disadvantage. Maxwell's increase in bandwidth efficiency is something AMD hasn't caught up with even today, and nobody denies that. But taking false numbers just to make a point and claim that AMD has so much stronger chips and so much lower performance is not a discussion, it's just spreading misinformation.

      Originally posted by coder View Post
      At this point, you're just trolling. I guess I shouldn't be surprised, given how much of your inner AMD fan you revealed in your post before this one.
      It's not a surprise that a Linux user prefers a Linux-friendly company over a Linux-hostile one. Even one that's toxic to the entire market, FWIW. That being said, I'm not a fan. I just dislike AMD less.

      Now you can call me a fan because of that. I think it's more likely that you are just mad because your argument is invalid.
      Last edited by juno; 16 January 2019, 05:51 AM.



      • #13
        Originally posted by juno View Post
        561 vs. 438 mm² is a 28% difference in size. Hawaii's 512-bit interface was actually smaller than Tahiti's 384-bit one because it was more efficient: it targeted a wider bus at lower clocks while also allowing more memory (the AMD cards maxed out at 32 GiB). That was a deliberate design decision, not brute force, and it resulted in products Nvidia could not counter.
        The chips are from different fabs, so it makes more sense to compare transistor counts. And that wide memory interface is costly at the board level. It might have better perf/watt, but I doubt net power consumption is actually lower. It's telling that neither AMD nor Nvidia has made a 512-bit card since. And Nvidia had a whole generation between then and their adoption of HBM.

        But the bigger point here is how you're latching on to the last time, nearly 6 years ago, when AMD was competitive. It's as if to underscore my observation that Maxwell was really where Nvidia started to pull ahead.

        Originally posted by juno View Post
        Of course you could enumerate arbitrary numbers as you wish to make your point.
        For GPUs, it doesn't get more fundamental than raw compute horsepower and bandwidth. Hardly arbitrary. If you don't understand even that much, then this exchange is just a waste of time.

        Originally posted by juno View Post
        To put it in your own words, "at this point, you're just trolling".
        Trolling by quoting relevant facts? That's an interesting definition of trolling. I'd hate to see what you consider a productive discussion - probably one where everyone simply agrees and no one learns anything.

        Originally posted by juno View Post
        You take the 980 Ti base clock (1000 MHz) and the Fury X (you said "Fury" before, now it's somehow an X) boost clock (1050 MHz) and compare their numbers, which is, of course, nonsense. You won't see these numbers in any benchmark you run, so the results and your conclusions are useless. A typical 980 Ti boosts to 1200 MHz out of the box, a good one (Gigabyte Gaming) to 1300 MHz in games. A typical Fury will boost to 1000 MHz
        My intent was to compare base to base. I didn't see a boost clock for Fury X.

        But, if you'd rather compare boost vs. boost, you'll find boosting a 980 Ti to 1.2 GHz still leaves a pretty substantial gap.

        980 Ti @ 1.2 GHz: 6.7 TFLOPS
        980 Ti @ 1.3 GHz: 7.3 TFLOPS
        Fury X @ 1.0 GHz: 8.2 TFLOPS

        So, under your preferred conditions, the Fury X still has either 22% or 12% more compute power. And that's still not even considering its monstrous 52% memory bandwidth advantage.
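
        If you want to double-check those percentages, they're just simple ratios over the figures quoted above. A quick sanity check (using only the numbers already stated in this post, nothing new):

# TFLOPS and memory bandwidth figures exactly as quoted above
tflops_980ti_12, tflops_980ti_13, tflops_fury_x = 6.7, 7.3, 8.2
bw_980ti, bw_fury_x = 336, 512  # GB/s

print(f"{tflops_fury_x / tflops_980ti_12 - 1:.0%}")  # ~22% more compute than a 1.2 GHz 980 Ti
print(f"{tflops_fury_x / tflops_980ti_13 - 1:.0%}")  # ~12% more than a 1.3 GHz one
print(f"{bw_fury_x / bw_980ti - 1:.0%}")             # ~52% more memory bandwidth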

        Originally posted by juno View Post
        That leaves 6.9 TFLOPS for the Fury
        That's not what I got. Show your math.

        But we could also repeat the same exercise with Polaris and Vega. In each case, if you compare the raw compute, raw bandwidth, and power consumption numbers of comparably-performing AMD and Nvidia products, the AMD GPUs consistently achieve less with more.

        The RX 580 should easily trade blows with the GTX 1070. Instead, it has to face the GTX 1060. Vega 64 looks spec'd to go up against the GTX 1080 Ti, but it can only hold its own against the GTX 1080.

        Originally posted by juno View Post
        Apart from polygon throughput, there are other fixed-function-driven values that are relevant to gaming performance, such as pixel fill rate, texturing performance, memory size and tessellation performance. You should take a look at those numbers once in a while and not just compare FLOPS.
        Where are you even going with this? Are you making excuses for AMD building worse-performing hardware? If they have an imbalance in their hardware resources, it's their own fault. I sort of get how one can mispredict the adoption of certain things like tessellation, but they've had several generations' worth of optimizations and still lag behind.

        If recent AMD GPUs could consistently deliver on compute workloads in line with their specs, then your argument about fixed-function hardware in Nvidia GPUs would be more credible. But, even on the compute front, we see more of the same, with Nvidia often matching up to higher-spec'd AMD parts.

        Originally posted by juno View Post
        It's not a surprise that a Linux user prefers a Linux-friendly company over a Linux-hostile one.
        It's not doing anyone a favor to pretend that AMD hasn't been falling behind.

        GCN was a step forward, when it launched. It has a lot of limitations, though, if you delve into the core architecture. I wonder if it hasn't run out of steam. Perhaps most noteworthy is that they don't seem to be able to scale past 4096 shaders, which is increasingly becoming a problem.

        Originally posted by juno View Post
        I think it's more likely that you are just mad because your argument is invalid.
        There are actual reasons why Nvidia can sell their hardware for so much more money, and consequently why they're raking in the profits. And it's not all down to marketing and conspiracies. If you can't see that, then I'll leave you to your alternative facts.



        • #14
          Originally posted by coder View Post
          The chips are from different fabs
          Wrong.
          Originally posted by coder View Post
          so it makes more sense to compare transistor counts.
          Go ahead. GK110: 7.08 billion, Hawaii 6.2 billion. Hawaii was still more efficient. But we've already been over this, so why are you belabouring it?
          Originally posted by coder View Post
          It might have better perf/watt, but I doubt net power consumption is actually lower. It's telling that neither AMD nor Nvidia has made a 512-bit card since.
          Why should the power consumption be lower? It was delivering more than 20% more bandwidth, after all.

          The reason we haven't seen a 512-bit bus since is that we're simply not stuck with GDDR5 anymore. We have (had) GDDR5X, GDDR6 and HBM. I don't think you missed that.

          Originally posted by coder View Post
          But the bigger point here is how you're latching on to the last time, nearly 6 years ago, when AMD was competitive. It's as if to underscore my observation that Maxwell was really where Nvidia started to pull ahead.
          That's just me correcting your false "facts". Also, nobody said that Nvidia didn't pull ahead with Maxwell. In fact, that's the exact thing I wrote in my previous post. You should perhaps read more carefully.

          Originally posted by coder View Post
          For GPUs, it doesn't get more fundamental than raw compute horsepower and bandwidth. Hardly arbitrary.
          You are contradicting yourself with this very statement: Maxwell *decreased* the "raw compute horsepower" (both FLOPS and memory bandwidth) while increasing gaming performance.

          I also didn't say the metric was arbitrary. Your choice of values was. Comparing boost vs base is just bullshit.

          Originally posted by coder View Post
          If you don't understand even that much, then this exchange is just a waste of time.
          Dunning-Kruger at its best.

          Originally posted by coder View Post
          My intent was to compare base to base. I didn't see a boost clock for Fury X.

          But, if you'd rather compare boost vs. boost
          Me? It's you who should want to compare boost vs. boost, because those are the real-life numbers when you talk about performance comparisons. If you look at any benchmark result, you're looking at boost vs. boost. So you didn't know the Fury has a "boost clock"?

          A Fury X is still not a Fury; that's a difference of 20%, so please be more precise in your statements. AMD boost/base != Nvidia boost/base. The power management works differently, and so does the nomenclature. You could do a little research on that before you make a fool of yourself again. Maybe there is some truth in your words about the willingness to learn something?

          Originally posted by coder View Post
          I'd hate to see what you consider a productive discussion - probably one where everyone simply agrees and no one learns anything.
          So here you go: no modern GPU from either manufacturer runs at fixed clocks. They dynamically adjust their clocks based on load, power and temperature constraints. Nvidia specifies a base clock and a typical boost clock. AMD specifies a base and a boost clock. The base clocks are fairly meaningless, because if the GPU gets too hot it will still drop below that value; but it should be able to stay above it in most situations.

          AMD's boost clock (pre-Vega) is the maximum the GPU ever reaches, so typically the actual clock under load is below that. With Vega, they stopped calling the DPM 7 value (the highest state) the boost clock; it's now the value of DPM state 6.
          Nvidia's "typical boost" is a statistical value. Their power management is very fine-grained and varies from chip to chip. A "better" GPU (fab quality varies quite a bit) will reach higher clocks and might go way beyond the typical boost; a worse one will "only" reach the typical boost or go slightly below it. Read some detailed reviews and user reports to get a feeling for that. That's all without manual overclocking, of course.
          So to get a meaningful comparison, always compare boost vs. boost or even better, compare the exact values for the individual device and benchmark. Never compare base vs. base as those values are useless.
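
          And if you want to know what your own card actually clocks under load instead of trusting the spec sheet, you can just ask the driver. On Linux, amdgpu exposes the sclk DPM table via sysfs and nvidia-smi reports the current SM clock. A rough sketch (it assumes the GPU is card0, so adjust the path for your system, and run whichever half applies to your card):

import subprocess

# amdgpu: the currently selected sclk DPM state is the line ending in '*'
with open("/sys/class/drm/card0/device/pp_dpm_sclk") as f:
    for line in f:
        if line.strip().endswith("*"):
            print("amdgpu sclk:", line.strip())

# Nvidia: query the current SM clock through nvidia-smi
out = subprocess.run(["nvidia-smi", "--query-gpu=clocks.sm", "--format=csv,noheader"],
                     capture_output=True, text=True)
print("nvidia sm clock:", out.stdout.strip())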

          On top of that, so far we've only been talking about reference cards' values. But many people actually buy cards from AIB vendors. Those have better coolers and therefore automatically (thanks to the boost mechanism) higher clocks, even if they're not factory-overclocked versions. Then people start comparing, e.g., a GIGABYTE SUPERCLOCKED 1070 with a huge cooler against an RX 580 and claim that they should be equally fast, while comparing the base clocks of the reference cards. And that's the exact reason why so many people think AMD has so much more raw power. Those times ended with the first GCN iterations.

          Originally posted by coder View Post
          That's not what I got. Show your math.
          Yeah, I actually made a mistake there. It should've been 7.2.
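
          The math itself is trivial: peak FP32 is shaders × 2 FLOPs per clock (one FMA) × clock. A minimal sketch, assuming the usual shader counts of 3584 for the Fury and 2816 for the 980 Ti (those counts aren't stated anywhere in this thread):

def fp32_tflops(shaders, clock_ghz):
    # peak FP32 = shaders * 2 FLOPs per clock (one fused multiply-add) * clock
    return 2 * shaders * clock_ghz / 1000.0

print(f"Fury   @ 1.0 GHz: {fp32_tflops(3584, 1.0):.1f} TFLOPS")  # ~7.2, not 6.9
print(f"980 Ti @ 1.2 GHz: {fp32_tflops(2816, 1.2):.1f} TFLOPS")  # ~6.8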

          Originally posted by coder View Post
          The RX 580 should easily trade blows with the GTX 1070. Instead, it has to face the GTX 1060. Vega 64 looks spec'd to go up against the GTX 1080 Ti, but it can only hold its own against the GTX 1080.
          So I've already shown you your fallacy, and you continue to use the wrong numbers?

          RX 580: 6.5 TFLOPS, 256 GB/s, 5.6 billion tri/s
          GTX 1070: 6.9 TFLOPS, 256 GB/s, 7.2 billion tri/s

          GTX 1080 Ti: 11.4 TFLOPS, 484.4 GB/s, 9.3 billion tri/s
          Vega 64: 11.5 TFLOPS, 483.8 GB/s, 5.6 billion tri/s

          No, those "counterparts" of AMD should not "easily trade blows". And the compute performance is exactly where it would be expected.
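
          To make the gaps explicit as percentages (same numbers as above, just expressed as ratios):

# TFLOPS, memory bandwidth (GB/s), geometry throughput (billion tri/s), as listed above
specs = {
    "RX 580":      (6.5, 256.0, 5.6),
    "GTX 1070":    (6.9, 256.0, 7.2),
    "Vega 64":     (11.5, 483.8, 5.6),
    "GTX 1080 Ti": (11.4, 484.4, 9.3),
}

def gap(slower, faster, i):
    # relative advantage of `faster` over `slower` in metric i
    return specs[faster][i] / specs[slower][i] - 1

# GTX 1070 vs RX 580: ~+6% compute, equal bandwidth, ~+29% geometry throughput
print(f"{gap('RX 580', 'GTX 1070', 0):+.0%} compute, {gap('RX 580', 'GTX 1070', 2):+.0%} geometry")
# GTX 1080 Ti vs Vega 64: ~-1% compute, equal bandwidth, ~+66% geometry throughput
print(f"{gap('Vega 64', 'GTX 1080 Ti', 0):+.0%} compute, {gap('Vega 64', 'GTX 1080 Ti', 2):+.0%} geometry")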

          Originally posted by coder View Post
          There are actual reasons why Nvidia can sell their hardware for so much more money, and consequently why they're raking in the profits. And it's not all down to marketing and conspiracies. If you can't see that, then I'll leave you to your alternative facts.
          At least you're an expert in alternative facts.
          I don't dispute that Nvidia is doing a good job. But there is also a reason why they lost nearly 50% of their value over the past half year. Their current lineup is just massively overpriced, and the times when you could demand any price for your GPU thanks to mining seem to be gone.

          Just in case you haven't got that yet and still think I'm a fan: I'm not accusing you of being a fan of anything (apparently that kind of thinking exists among fanboys). I'm accusing you of taking wrong numbers and making wrong assumptions.
          Last edited by juno; 17 January 2019, 08:00 AM.



          • #15
            Originally posted by juno View Post
            Wrong.

            Go ahead. GK110: 7.08 billion, Hawaii 6.2 billion.
            Fair enough. I thought it was made by GloFo, but I see it was TSMC. Anyway, I already stated the transistor counts.

            Originally posted by juno View Post
            Hawaii was still more efficient.
            Wrong.

            https://www.anandtech.com/show/7457/...290x-review/19

            Originally posted by juno View Post
            But we've already been over this, so why are you belabouring it?
            Actually, Kepler was brought up by you, and I shouldn't have taken the bait, as it was irrelevant to my original and core point.

            However, if you look at what they were actually able to charge for it, relative to what Nvidia got for the GK110, then Hawaii still looks too big.

            Originally posted by juno View Post
            The reason we haven't seen a 512-bit bus since is that we're simply not stuck with GDDR5 anymore. We have (had) GDDR5X, GDDR6 and HBM. I don't think you missed that.
            This is a bizarre point to make. GDDR5X came in 2016 and GDDR6 in 2018. But, by 2016, both companies had already embraced HBM/HBM2 for the high end.

            My point was that if 512-bit were such a good idea, you'd have expected Nvidia to use it for Maxwell, if not before.

            Originally posted by juno View Post
            I also didn't say the metric was arbitrary. Your choice of values was. Comparing boost vs base is just bullshit.
            You weren't specific about what you considered arbitrary. Anyway, I think that's behind us.

            Originally posted by juno View Post
            Me? It's you who should want to compare boost vs. boost, because those are the real-life numbers when you talk about performance comparisons. If you look at any benchmark result, you're looking at boost vs. boost.
            My goal was to put them on an equal footing. I accept that it was an imperfect comparison. However, all your nitpicking about clocks isn't enough to erase the gap I pointed out between relative specs and relative performance.

            Originally posted by juno View Post
            RX 580: 6.5 TFLOPS, 256 GB/s, 5.6 billion tri/s
            GTX 1070: 6.9 TFLOPS, 256 GB/s, 7.2 billion tri/s

            GTX 1080 Ti: 11.4 TFLOPS, 484.4 GB/s, 9.3 billion tri/s
            Vega 64: 11.5 TFLOPS, 483.8 GB/s, 5.6 billion tri/s

            No, those "counterparts" of AMD should not "easily trade blows".
            Sure, that'll do. See, the RX 580 should perform much closer to the GTX 1070, but it struggles against a GTX 1060 6 GB.

            Likewise, aside from geometry, Vega 64 was clearly spec'd to go up against GTX 1080 Ti. But, even at settings that clearly aren't geometry-limited, Vega 64 has its hands full just contending with a plain GTX 1080.

            Originally posted by juno View Post
            I don't dispute that Nvidia is doing a good job. But there is also a reason why they lost nearly 50% of their value over the past half year.
            Cryptomining. Perhaps also too much speculation on their self-driving car platform.

            AMD also had a spike and subsequent correction, but theirs was probably dampened by them having a broader range of products and serving more diverse markets.

            Originally posted by juno View Post
            Their current lineup is just massively overpriced, and the times when you could demand any price for your GPU thanks to mining seem to be gone.
            I can't argue about Nvidia's current pricing. It's too much for me.

            That said, the RT cores and Tensor cores might actually be good value for money, but they've still priced themselves out of many gaming PCs and other traditional applications.

            Originally posted by juno View Post
            I'm accusing you of taking wrong numbers and making wrong assumptions.
            Your corrections are welcome. They don't change the bigger picture, though. AMD has only been able to compete by sacrificing their margins, and that deprives them of the R&D funding needed to make the next generation more competitive. In particular, I'm sure Vega fell well short of where they needed and expected it to be.

            I hope they succeed in capitalizing on Nvidia's missteps with Turing, much in the way they're about to eat Intel's lunch due to its 10 nm stumbles.
            Last edited by coder; 17 January 2019, 10:17 PM.
