NVIDIA GeForce GTX 1060 Offers Great Performance On Linux


  • #51
    I find this partly funny, because in terms of power consumption both AMD and Nvidia made big steps forward. Even if Nvidia came out a little ahead in that regard, in my opinion they are absolutely failing to deliver, because of their drivers.
    Having to mess around with their binary legacy drivers is a huge failure for both companies. I just tried to get the Nvidia GF8200 in my HTPC working with a current kernel, because I wanted the in-tree support for my shiny new DVB card. I had serious trouble getting the Nvidia kernel module to compile, and I hated it just as much as I hated messing around with fglrx back in the day. The experience was no different, and there is no real way out.

    Meanwhile, in open-source AMD land on my desktop PC, everything just works.

    A quite satisfied R9 Fury owner.



    • #52
      Originally posted by SaucyJack View Post

      The other Nvidia shill, obviously
      I really hope someone was indeed being paid for it



      • #53
        It's clear who the winner is with the budget GPUs, and it is very easy to prove. The 480 has a 256-bit memory bus, which gives it ~256 GB/s of memory bandwidth; the 1060 has a 192-bit bus at ~190 GB/s. A good GPU needs fast memory, and drivers can't make up for the lack of it. Next come compute units. All sorts of things can make that comparison hard, but luckily there is one easy number: teraflops. The 480 delivers 5.1 TFLOPs, the 1060 3.5 TFLOPs. The 1060 simply doesn't have the hardware resources to compete. Anything else you see is either bad or inaccurate benchmarking (not always the reviewers' fault; many titles are only tuned for Nvidia hardware), paid shill bloggers, or, in the 1060's case, an extreme bit of marketing after Nvidia realized where its budget and mid-tier line would sit relative to AMD. AMD's hardware is still cheaper, and with more RAM, than the touted $250 1060 3GB option.

        Now, drivers are another issue, but we're told AMD is going to crack that soon, along with its next-gen compute engines, Vulkan and the like. I have done a lot of GPU computing on both AMD and Nvidia, and I can tell you that I could always reach the spec on AMD hardware (provided the compiler was in a good mood). On Nvidia I never quite got to 99% of the spec. So I feel the pain of the current situation still being tied to fglrx, but at computation the 480 will beat the crap out of the 1060 any way you swing it.

        Have a look at these convenient tables:
        https://en.wikipedia.org/wiki/List_o..._RX_400_Series
        https://en.wikipedia.org/wiki/List_o...orce_10_Series
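        To make the arithmetic explicit, here is a minimal Python sketch of the peak-spec math; the shader counts and clocks are assumptions taken from the Wikipedia tables linked above, not measurements:

        ```python
        # Minimal sketch of the peak-spec arithmetic behind the numbers above.
        # Shader counts and clocks are assumed ballpark reference values.

        def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
            """Peak memory bandwidth: (bus width in bits / 8) * per-pin data rate."""
            return bus_width_bits / 8 * data_rate_gbps

        def tflops_fp32(shaders, clock_ghz):
            """Peak single-precision throughput: shaders * clock * 2 (one FMA = 2 FLOPs)."""
            return shaders * clock_ghz * 2 / 1000

        # RX 480: 256-bit bus, 8 Gbps GDDR5, 2304 shaders at ~1.12 GHz base clock
        print(bandwidth_gb_s(256, 8))    # 256.0 GB/s
        print(tflops_fp32(2304, 1.12))   # ~5.2 TFLOPs (matches the ~5.1 quoted above)

        # GTX 1060: 192-bit bus, 8 Gbps GDDR5, 1280 shaders at ~1.51 GHz base clock
        print(bandwidth_gb_s(192, 8))    # 192.0 GB/s
        print(tflops_fp32(1280, 1.51))   # ~3.9 TFLOPs at the base clock
        ```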
        Last edited by nevion; 19 July 2016, 09:12 PM.



        • #54
          Originally posted by nevion View Post
          It's clear who the winner is with the budget GPUs, and it is very easy to prove. [...]
          It really isn't that simple. A video card is only as good as its driver software. What you are seeing is that AMD's drivers are just not optimized to the same level as Nvidia's, and that leads to subpar GPU utilization. Yes, the RX 480 SHOULD be significantly faster, but without a highly optimized title, getting all the power possible out of it isn't happening. The best example of this is the new Doom on Vulkan: it is very well optimized for both AMD and Nvidia, yet the AMD cards pull ahead despite BOTH brands making jumps in performance on their newest cards.

          The truth of the matter, though, is that in most games the GTX 1060 is faster. That said, as a Linux user I appreciate having working open-source drivers, and that is what made the RX 480 a no-brainer for me. Even better, with the new AMDGPU driver architecture, if I want to use the PRO driver even after they stop working on it for the RX 480, I'll still be able to run it on the latest distros.



          • #55
            Originally posted by nevion View Post
            It's clear who the winner is with the budget GPUs, and it is very easy to prove. BLAH BLAH BLAH
            It's quite obvious that the RX 480 is a much larger piece of silicon, with a lot more transistors, that draws a lot more electrical power and produces a lot more heat.
            I don't need you to regurgitate that information again, since I can find it elsewhere.

            The real question is: what is AMD actually doing with its transistor budget? Real-world performance shows a massively larger Polaris part that is at best a little ahead of a much smaller and vastly more efficient GTX 1060.

            Back in 2011 you could have played the same game with Bulldozer to "prove" that that trainwreck was actually going to dominate the industry, since it had MOAR of everything. So what.



            • #56
              I love hyperbole as much as the next guy, but when did ~15% become "massively larger" (~232 vs ~200 mm²)?

              Running DOOM on Vulkan, HardOCP reports the RX 480 as 25% faster than the 1060 at 1440p and 32% faster at 1080p... I suppose you would have no trouble agreeing that is also "massively larger"?
              Last edited by bridgman; 19 July 2016, 09:56 PM.



              • #57
                chuckula, power budget is one thing, but few (none?) have actually cited those few simple performance numbers that trump any synthetic or game benchmark in terms of hardware capability. Even among those who know where to look them up, almost no one posts that info, so it comes down to games, which matter up to a point but don't answer which hardware is better. It is of course important to be aware of whether driver software will improve performance greatly over time (it usually does), but for compute this stuff typically works well from day one. I keep my eyes open on this as a GPU programmer and enthusiast.

                I don't know what real-world performance you could mean aside from gaming, but compute already tells you a card's theoretical and practical maximums, and I have put them to real-world use, though I get that this is not the definition most users would have. The rest comes down to the driver stack and supporting software. Bulldozer is a completely different case, where more parallelism was possible given a smarter compiler: apples and oranges compared to this situation, which has played out again and again for GPUs. Better compute = better card, and AMD won in all the categories that matter (memory bandwidth, TFLOPs). Unless you want to muddy things with power draw; even then, it wins on raw performance, raw cost, and perf/$.
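                As a rough illustration of that perf/$ claim, here is a sketch using the TFLOP and bandwidth figures quoted earlier in the thread, with assumed approximate launch MSRPs (RX 480 8GB ~$239, GTX 1060 6GB ~$249):

                ```python
                # Peak-spec-per-dollar sketch. Prices are assumed approximate launch
                # MSRPs; TFLOP and bandwidth figures are the ones quoted in this thread.
                cards = {
                    "RX 480 8GB":   {"tflops": 5.1, "gb_per_s": 256, "usd": 239},
                    "GTX 1060 6GB": {"tflops": 3.5, "gb_per_s": 192, "usd": 249},
                }

                for name, c in cards.items():
                    gflops_per_dollar = c["tflops"] * 1000 / c["usd"]
                    bw_per_dollar = c["gb_per_s"] / c["usd"]
                    print(f"{name}: {gflops_per_dollar:.1f} GFLOPs/$, "
                          f"{bw_per_dollar:.2f} (GB/s)/$")
                # RX 480 8GB:   21.3 GFLOPs/$, 1.07 (GB/s)/$
                # GTX 1060 6GB: 14.1 GFLOPs/$, 0.77 (GB/s)/$
                ```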

                LinuxID10T - I thought I had already laid that out and addressed those topics. Furthermore, hardware and driver software are both the main limiting factors.
                Last edited by nevion; 19 July 2016, 09:10 PM.



                • #58
                  Originally posted by atomsymbol

                  R9 Fury has HBM and that is clocked at a lower frequency than GDDR5. This means that in some cases R9 Fury might perform slower because of increased memory latency.

                  Note: I failed to find any latency numbers about R9 Fury's HBM. Maybe bridgman can clarify this if it isn't a secret.
                  Not bridgman, but some basics: it is pretty "normal" DRAM. With lower clocks you have lower relative latencies, e.g. 5 clock cycles at 500 MHz versus 20 at 2 GHz. The numbers are made up, but I think they get the point across: absolute latencies (in ns) stay roughly the same.
                  However, Fiji does scale well with memory overclocking, and we are not sure exactly which components are on that clock domain. Maybe the caches or ROPs do suffer from the lower clock rate.
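                  A two-line sketch with those made-up numbers shows why the absolute latency comes out the same:

                  ```python
                  # Absolute latency (ns) = cycles / clock (GHz). Using the made-up
                  # numbers above: fewer cycles at a lower clock, same wall-clock time.
                  def latency_ns(cycles, clock_ghz):
                      return cycles / clock_ghz

                  print(latency_ns(5, 0.5))   # 10.0 ns at 500 MHz
                  print(latency_ns(20, 2.0))  # 10.0 ns at 2 GHz
                  ```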

                  Originally posted by GT220 View Post
                  You can't emulate Conservative Rasterization
                  At this point I stopped reading.

                  https://developer.nvidia.com/gpugems...chapter42.html


                  Originally posted by Nvidia
                  Can be done in SW or HW...
                  http://developer.download.nvidia.com...d_GDC_2015.pdf

                  Who is ignorant now? Do you have any idea what you are talking about?

                  Sure, it's faster in hardware, but it's absolutely doable in software; it depends on the implementation, too, of course. It might even be possible to run it in parallel with other tasks via asynchronous compute and save time. But conservative rasterization is not at all relevant at the moment anyway.
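                  For the curious, a brute-force software version is genuinely trivial (if slow): mark a pixel covered whenever its square overlaps the triangle at all, e.g. via a separating-axis test. This is just an illustrative sketch of the idea, not the pipeline-based approach the Nvidia links above describe:

                  ```python
                  # Brute-force software conservative rasterizer sketch: a pixel is
                  # covered if its 1x1 square overlaps the triangle at all, tested
                  # with a 2D separating-axis test (box axes + triangle edge normals).

                  def edges(tri):
                      return [(tri[i], tri[(i + 1) % 3]) for i in range(3)]

                  def overlap(tri, box_min, box_max):
                      """2D separating-axis test: triangle vs axis-aligned box."""
                      axes = [(1.0, 0.0), (0.0, 1.0)]  # box face normals
                      for (ax, ay), (bx, by) in edges(tri):
                          axes.append((-(by - ay), bx - ax))  # edge normal
                      box_pts = [(box_min[0], box_min[1]), (box_max[0], box_min[1]),
                                 (box_min[0], box_max[1]), (box_max[0], box_max[1])]
                      for nx, ny in axes:
                          t = [px * nx + py * ny for px, py in tri]
                          b = [px * nx + py * ny for px, py in box_pts]
                          if max(t) < min(b) or max(b) < min(t):
                              return False  # found a separating axis: no overlap
                      return True

                  def conservative_raster(tri, width, height):
                      """Return every pixel whose unit square touches the triangle."""
                      return [(x, y) for y in range(height) for x in range(width)
                              if overlap(tri, (x, y), (x + 1, y + 1))]

                  print(conservative_raster([(0.2, 0.2), (3.8, 0.6), (1.0, 2.9)], 5, 4))
                  ```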
                  Last edited by juno; 19 July 2016, 10:04 PM.



                  • #59
                    juno, good post. I wanted to touch on a few points there too. The HBM interface is 4096 bits wide in the Fury configuration, but latency is already pretty bad in GPUs; that's common knowledge, and everyone who programs them works around it. If we take the 512 GB/s transfer rate and divide by the 512-byte bus width, we get about a billion full-width transfers per second, i.e. roughly 1 Gbps per pin (a 500 MHz DDR clock). That says nothing about latency beyond the neighborhood of its likely minimum, but it is an interesting number to compare against the 8 Gbps per pin (2 GHz) GDDR5 on the 480's 256-bit bus. The memory controllers are supposed to do all the right coalescing to fully utilize this and get the bandwidth for all it's worth, which means latency is thrown out the window. Still, I do have to wonder whether anyone has seen memory latency have an impact, in terms of compute at least.
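                    A two-line sketch of that division, with the bandwidth and bus-width values assumed from the public spec sheets:

                    ```python
                    # Per-pin data rate = total bandwidth * 8 / bus width (bits).
                    # Spec values below are assumed from public datasheets.
                    def per_pin_gbps(bandwidth_gb_s, bus_width_bits):
                        return bandwidth_gb_s * 8 / bus_width_bits

                    print(per_pin_gbps(512, 4096))  # Fiji HBM1: 1.0 Gbps/pin (500 MHz DDR)
                    print(per_pin_gbps(256, 256))   # RX 480 GDDR5: 8.0 Gbps/pin (2 GHz)
                    ```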

                    By the way, it should be noted (and this is one of the reasons compute matters) that much of the silicon in modern GPUs goes to the general-purpose compute paths and is shared/reused by everything else the cards do (e.g. rendering graphics). It's even thought that vendors now implement parts of their graphics stacks in software on this general-purpose hardware, rather than on specialized fixed-function parts that take up silicon. What this means, again, is that everything mostly boils down to compute rates and capability.
                    Last edited by nevion; 19 July 2016, 10:21 PM.



                    • #60
                      Originally posted by LinuxID10T View Post

                      The truth of the matter, though, is that in most games the GTX 1060 is faster.
                      Only in DX11/OGL.

