The Handheld Steam Machine With Linux & AMD SoC Moves Ahead


  • bridgman replied:
    Originally posted by L_A_G
    I quite frankly don't care that much about transistor density, seeing how we're talking about embedded systems where power consumption is of paramount importance. The main factor in how much power a chip draws is the transistors switching between their "on" and "off" states. The lower power consumption you get from moving to a finer process mainly comes from reducing the current difference between the "on" and "off" states of the transistors, meaning there's less energy wasted every time a transistor goes from one state to the other.
    Yep, that's one of the ways finer processes let you reduce power consumption (strictly speaking it's lower capacitance and hence less area under the current × time curve) - the ability to operate at lower voltages is the other big one - although leakage also tends to go up with finer processes, so it's not a total win.

    Originally posted by L_A_G
    While cramming more and more transistors onto the same chip might do a lot of good for performance, it doesn't help with efficiency, which is why we've been seeing the TDPs of AMD's higher-end chips reach as high as 220 W. While I do believe you that these extra transistors have helped improve performance, I don't think they've done anywhere near as much good for power efficiency.
    Have to disagree here -- the high power chips you are talking about are the ones which stayed on the same process but were able to clock higher through a combination of process tweaks and higher operating voltage. Higher voltage and higher clocks are a double hit on power (triple if you count the V-squared term).

    Using extra transistors to implement "wider logic running at a lower clock with the same or better performance" is one very important way of reducing power consumption. Reducing the clock helps somewhat, but being able to reduce the voltage (which running at a lower clock lets you do) helps even more.

    Agree that if you only use the wider logic for more performance at the same clocks that isn't going to help, but that's not what we are doing.
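    To put rough numbers on the wider/slower point, here is a back-of-envelope sketch (in Python) of dynamic power, roughly alpha * C * V^2 * f, plus a crude leakage term. Every figure below is invented purely for illustration; nothing is measured from any real chip.

        # Toy comparison of a narrow/fast design against a wide/slow one.
        # All numbers are made-up placeholders, not data from real silicon.

        def chip_power(switched_cap_nf, volts, freq_ghz, leakage_w):
            """Dynamic switching power plus static leakage, in watts."""
            alpha = 0.2  # assumed average switching activity factor
            dynamic_w = alpha * (switched_cap_nf * 1e-9) * volts ** 2 * (freq_ghz * 1e9)
            return dynamic_w + leakage_w

        # Narrow design: less logic (less capacitance), but high clock and voltage.
        narrow = chip_power(switched_cap_nf=100.0, volts=1.3, freq_ghz=4.0, leakage_w=5.0)

        # Wide design: twice the logic, but the lower clock allows a lower voltage
        # for similar throughput, and V enters the equation squared.
        wide = chip_power(switched_cap_nf=200.0, volts=1.0, freq_ghz=2.0, leakage_w=7.0)

        print(f"narrow/fast: {narrow:.0f} W")  # ~140 W with these placeholder numbers
        print(f"wide/slow:   {wide:.0f} W")    # ~87 W with these placeholder numbers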
    Last edited by bridgman; 14 December 2015, 07:55 PM.


  • L_A_G replied:
    Originally posted by juno
    On the official product page it still says 4.0. I won't dig deeper again now, but all I know is I also checked some datasheet and it also said 4.0. Sorry if that info is wrong or outdated, or if they only said that because the drivers were not ready or whatever. If this is the case, AMD really should work on that. They should maintain something like Intel does with their ARK.
    Right... Here's the link where I checked that:
    http://www.amd.com/Documents/AMDGSer...oductBrief.pdf

    Basically, the product brief says pretty clearly on the first page that G-series SoCs for embedded use support up to OpenGL 4.2. If you're going to start making stuff up, at least don't do it in a way where it's really easy to check whether you're making it up or not.

    Originally posted by juno
    Just because the shader architecture is "GCN", the GPU doesn't have to support all the hardware features that are available on the desktop. You also don't have the same features available on a FirePRO and an mGPU, even if both are GCN, and I'm not talking about the cut-down fp64 performance.
    Of course they have time. But we are talking about the specific piece of hardware that they now advertise and now want to "sell".
    While there are some minor differences and the usual firmware sabotaging of fp64 performance to sell more workstation cards, they're still fundamentally the same architecture, so there really isn't any reason why these chips couldn't support newer OpenGL versions. On Windows, for instance, even a lot of pre-GCN cards support OpenGL 4.5, which clearly shows it's just AMD not bothering to implement driver support for these chips.

    Originally posted by juno
    Sure, but that's not what this is about. By the way, I don't think it's cheaper to build the exact same SoC, just shrunk to 14 nm, at this time. Later, when yields get closer to those of the planar 28 nm process, it will surely be cheaper.
    As I said, it's not clear cut, and the early part of a new node's life is always when you have the most problems. We are, after all, talking about a device coming out next year, not something that'll come out in a couple of years when all the yield problems have been sorted out. Apple has already started shifting more and more of its purchasing towards Samsung because of GlobalFoundries' yield problems with the 14 nm node.

    Originally posted by juno
    Sorry, that's just plain wrong. You totally underestimate the impact of architectural changes on performance and power efficiency.
    Have you never heard of Intel's Tick-Tock process? They release a new chip architecture (Tock), then shrink it (Tick), then do a Tock on the new node, then shrink it again (Tick), and so on. The bigger steps in the recent past have always been Tocks (Skylake, Haswell, Sandy Bridge(!)) while Ticks brought far smaller improvements (Ivy Bridge, Broadwell).
    Trying to sound smart by acting like I haven't heard of Intel's way of developing new chips by alternating between introducing a new microarchitecture and a new node? The last few rounds of that have been pretty disappointing, to say the least. While the power draw isn't going anywhere, at least performance has been going up, even though it's always less than 10% per new microarchitecture. This is the reason Intel CPUs hold their value so well on the second-hand market, and why I went with a second-hand i7 950 while I wait and see if Zen is any good.

    Originally posted by juno
    Also, the TDP of AMD's FX CPUs or APUs only rose when they raised the frequency a lot. And I mean a lot. You have to keep in mind that the FX CPUs only saw the first two of four Bulldozer iterations. And the clocks went higher even while the TDP remained stable.

    examples?
    Zambezi (1st gen bulldozer): FX-8150 w/ 3.6-4.2 GHz, 125 W
    Vishera (2nd gen bulldozer): FX-8370 w/ 4.0-4.3 GHz, 125 W; FX-9590 w/ 4.7-5.0 GHz, 220 W
    Of course, the Vishera were also faster and still more power efficient for the same clock speeds compared to the Zambezis. All of these are in GlobalFoundries 32 nm SOI, btw.
    Same is for the APUs:
    LLano: 3 GHz, 100 W
    Richland: 4.1-4.4 GHz, 100W
    [Kaveri: 3.9-4.1 GHz (ofc still faster), 95 W] <- this one is in TSMCs 28 nm instead of GF's 32 nm process.
    [Bristol Ridge: ??? yet to come]
    I probably shouldn't have talked about the official TDP rating but rather about actual power draw, because that can vary pretty wildly between chips with the same official TDP.

    Just look at the FX-8370 and FX-8350, which officially have the same TDP:


    Originally posted by juno
    As we are already way too off-topic, I'm not going to explain to you why Hawaii was and is actually an efficient GPU and why it has the reputation of being what you call a "mini furnace".
    I'd hardly call consuming over 100 W more than a Titan Black very "efficient":
    http://media.bestofmicro.com/P/6/505...on-Torture.png


  • juno replied:
    Originally posted by L_A_G
    According to their own website (which I just checked), the G-series parts that are out right now support up to 4.2, and since it's using literally the same architecture as on the desktop and a full Linux-based OS, it IS transferable. Not only that, this is coming out late next year, meaning they're going to have a lot of time to work on newer hardware and better drivers.
    On the official product page it still says 4.0. I won't dig deeper again now, but all I know is I also checked some datasheet and it also said 4.0. Sorry if that info is wrong or outdated, or if they only said that because the drivers were not ready or whatever. If this is the case, AMD really should work on that. They should maintain something like Intel does with their ARK.
    Just because the shader architecture is "GCN", the GPU doesn't have to support all the hardware features that are available on the desktop. You also don't have the same features available on a FirePRO and an mGPU, even if both are GCN, and I'm not talking about the cut-down fp64 performance.
    Of course they have time. But we are talking about the specific piece of hardware that they now advertise and now want to "sell".


    Originally posted by L_A_G
    A finer process also means that the price of producing a wafer goes up while yields go down. It's not so clear cut that a 14 nm process is going to make chips that much cheaper than a well-tested 28 nm process, where there are not only going to be fewer problems but also a lot more time spent ironing those bugs out.
    Sure, but that's not what this is about. By the way, I don't think it's cheaper to build the exact same SoC, just shrunk to 14 nm, at this time. Later, when yields get closer to those of the planar 28 nm process, it will surely be cheaper.

    Originally posted by L_A_G
    If you look at the TDPs of AMD's desktop processors from the Bulldozer family, you're generally going to see them getting higher and higher TDPs as time goes on. The reason Intel has been able to keep TDPs stable is that they haven't been stuck on the same node and have been able to keep introducing new nodes. You can also see this in AMD's desktop GPUs, which have been getting hotter and hotter, culminating in the mini furnace known simply as the 390X.
    Sorry, that's just plain wrong. You totally underestimate the impact of architectural changes on performance and power efficiency.
    Have you never heard of Intel's Tick-Tock process? They release a new chip architecture (Tock), then shrink it (Tick), then do a Tock on the new node, then shrink it again (Tick), and so on. The bigger steps in the recent past have always been Tocks (Skylake, Haswell, Sandy Bridge(!)) while Ticks brought far smaller improvements (Ivy Bridge, Broadwell).
    Also, the TDP of AMD's FX CPUs or APUs only rose when they raised the frequency a lot. And I mean a lot. You have to keep in mind that the FX CPUs only saw the first two of four Bulldozer iterations. And the clocks went higher even while the TDP remained stable.

    Examples?
    Zambezi (1st-gen Bulldozer): FX-8150 w/ 3.6-4.2 GHz, 125 W
    Vishera (2nd-gen Bulldozer): FX-8370 w/ 4.0-4.3 GHz, 125 W; FX-9590 w/ 4.7-5.0 GHz, 220 W
    Of course, the Visheras were also faster and still more power efficient at the same clock speeds compared to the Zambezis. All of these are on GlobalFoundries' 32 nm SOI, by the way.
    Same goes for the APUs:
    Llano: 3 GHz, 100 W
    Richland: 4.1-4.4 GHz, 100 W
    [Kaveri: 3.9-4.1 GHz (of course still faster), 95 W] <- this one is on TSMC's 28 nm instead of GF's 32 nm process.
    [Bristol Ridge: ??? yet to come]

    As we are already way too off-topic, I'm not going to explain to you why Hawaii was and is actually an efficient GPU and why it has the reputation of being what you call a "mini furnace".


  • L_A_G replied:
    Originally posted by juno
    Please read before you answer. AMD itself states the G-series SoCs support OpenGL 4.0, so your argument is invalid. So even if it works on desktop hardware, that is not transferable to this SoC. Also, if the game simply checks for Nvidia hardware or OpenGL version x.y and doesn't start if there is no match, the claim is wrong, even if it would work with tweaks. Consumers want to buy a handheld console here, not a Linux PC in a funny case to carry around.
    According to their own website (which I just checked), the G-series parts that are out right now support up to 4.2, and since it's using literally the same architecture as on the desktop and a full Linux-based OS, it IS transferable. Not only that, this is coming out late next year, meaning they're going to have a lot of time to work on newer hardware and better drivers.

    If you're going to try to make it seem like you're doing your homework better than someone else, don't do a half-assed job of it...

    Originally posted by juno
    As is the price. And die size = price.
    A finer process also means that the price of producing a wafer goes up while yields go down. It's not so clear cut that a 14 nm process is going to make chips that much cheaper than a well-tested 28 nm process, where there are not only going to be fewer problems but also a lot more time spent ironing those bugs out.
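    To illustrate why it isn't clear cut, here is a toy cost-per-good-die calculation in Python. The wafer prices, die sizes and yields below are invented placeholders, not real foundry figures; the point is only that a smaller die on a pricier, lower-yielding node can still cost more per good chip.

        # Toy model: cost per good die = wafer cost / (gross dies * yield).
        # All inputs are invented placeholders, not actual foundry numbers.
        import math

        def cost_per_good_die(wafer_cost, wafer_diam_mm, die_area_mm2, yield_frac):
            wafer_area = math.pi * (wafer_diam_mm / 2) ** 2
            gross_dies = int(wafer_area / die_area_mm2 * 0.9)  # rough 10% edge loss
            return wafer_cost / (gross_dies * yield_frac)

        mature_28nm = cost_per_good_die(wafer_cost=3000, wafer_diam_mm=300,
                                        die_area_mm2=120, yield_frac=0.85)
        early_14nm = cost_per_good_die(wafer_cost=6000, wafer_diam_mm=300,
                                       die_area_mm2=70, yield_frac=0.45)

        print(f"mature 28 nm: ${mature_28nm:.2f} per good die")  # ~$6.7 with these inputs
        print(f"early 14 nm:  ${early_14nm:.2f} per good die")   # ~$14.7 with these inputs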

    Originally posted by juno
    That's why you just don't put more transistors inside a chip. You just don't. Well, to be exact, you do, within one generation and one series of chips at different performance levels, but otherwise not. You see architectural improvements in every new generation that improve power efficiency, even if – or even because – there are more transistors being used compared to the previous generation. Just compare the iterations of Bulldozer up to Carrizo, or AMD's Tahiti vs Tonga, or Nvidia's GM200 vs GK110, GM204 vs GK104, etc.
    If you look at the TDPs of AMD's desktop processors from the Bulldozer family, you're generally going to see them getting higher and higher TDPs as time goes on. The reason Intel has been able to keep TDPs stable is that they haven't been stuck on the same node and have been able to keep introducing new nodes. You can also see this in AMD's desktop GPUs, which have been getting hotter and hotter, culminating in the mini furnace known simply as the 390X.


  • juno replied:
    Originally posted by L_A_G
    AMD's own proprietary driver has had full support for at least OpenGL 4.4, and if I recall correctly they now fully support 4.5 as well. It's only the open source drivers that have all those deficiencies in OpenGL support, and with AMD actively involved in the development of this, it's obvious they're going to use AMD's closed source drivers, not the open source ones.

    If you're thinking about how hilariously broken Alien: Isolation was when they tested it here, it was only that broken on the open source drivers. It works just fine on the proprietary ones, and you can check it on YouTube if you don't believe me.

    Also, seeing how this is coming out late next year, it's obvious that AMD's current lineup is not what they're going to use in the final production model. Most probably they're going to be looking at the 14nm parts that have yet to even be announced, rather than the ones based on the old 32nm process they've been using since 2011.
    Please read before you answer. AMD itself states the G-series SoCs support OpenGL 4.0, so your argument is invalid. So even if it works on desktop hardware, that is not transferable to this SoC. Also, if the game simply checks for Nvidia hardware or OpenGL version x.y and doesn't start if there is no match, the claim is wrong, even if it would work with tweaks. Consumers want to buy a handheld console here, not a Linux PC in a funny case to carry around.
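    To illustrate the kind of blunt launch gate meant here, a game might do nothing smarter than the following. This is a purely hypothetical Python sketch, not code from any actual title:

        # Hypothetical example of a hard vendor/version gate at launch.
        REQUIRED_GL = (4, 3)                 # minimum OpenGL version the game demands
        ALLOWED_VENDORS = ("NVIDIA",)        # hard-coded vendor whitelist

        def may_launch(vendor_string, version_string):
            major, minor = (int(x) for x in version_string.split(".")[:2])
            vendor_ok = any(v in vendor_string for v in ALLOWED_VENDORS)
            version_ok = (major, minor) >= REQUIRED_GL
            return vendor_ok and version_ok

        # A driver reporting 4.1, or a non-Nvidia vendor string, fails the gate
        # even if the hardware could actually run the game with tweaks.
        print(may_launch("X.Org / AMD", "4.1.13399"))     # False
        print(may_launch("NVIDIA Corporation", "4.5.0"))  # True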

    Originally posted by L_A_G
    I quite frankly don't care that much about transistor density, seeing how we're talking about embedded systems where power consumption is of paramount importance.
    As is the price. And die size = price.

    Originally posted by L_A_G
    The main factor in how much power a chip draws is the transistors switching between their "on" and "off" states. The lower power consumption you get from moving to a finer process mainly comes from reducing the current difference between the "on" and "off" states of the transistors, meaning there's less energy wasted every time a transistor goes from one state to the other.

    While cramming more and more transistors onto the same chip might do a lot of good for performance, it doesn't help with efficiency, which is why we've been seeing the TDPs of AMD's higher-end chips reach as high as 220 W. While I do believe you that these extra transistors have helped improve performance, I don't think they've done anywhere near as much good for power efficiency.
    That's why you just don't put more transistors inside a chip. You just don't. Well, to be exact, you do, within one generation and one series of chips at different performance levels, but otherwise not. You see architectural improvements in every new generation that improve power efficiency, even if – or even because – there are more transistors being used compared to the previous generation. Just compare the iterations of Bulldozer up to Carrizo, or AMD's Tahiti vs Tonga, or Nvidia's GM200 vs GK110, GM204 vs GK104, etc.


    Also, there are no FinFET SoCs announced. The focus will be on bigger Zen dies and Arctic Islands GPUs. But maybe bridgman knows something more he wants to tell us about.
    While it was planned for this year, like the Puma update from Kabini/Temash to Beema/Mullins, AMD hasn't even announced Peregrine Falcon yet. Not to mention the subsequent update, which will then - maybe - bring 14/16nm to the G-series.
    Last edited by juno; 14 December 2015, 01:31 PM.


  • L_A_G replied:
    I quite frankly don't care that much about transistor density, seeing how we're talking about embedded systems where power consumption is of paramount importance. The main factor in how much power a chip draws is the transistors switching between their "on" and "off" states. The lower power consumption you get from moving to a finer process mainly comes from reducing the current difference between the "on" and "off" states of the transistors, meaning there's less energy wasted every time a transistor goes from one state to the other.

    While cramming more and more transistors onto the same chip might do a lot of good for performance, it doesn't help with efficiency, which is why we've been seeing the TDPs of AMD's higher-end chips reach as high as 220 W. While I do believe you that these extra transistors have helped improve performance, I don't think they've done anywhere near as much good for power efficiency.


  • bridgman replied:
    Not to downplay the impact of 14nm, but the difference between the 32nm and 28nm processes is a lot more than the raw numbers suggest.

    The 32nm SOI process was good for speed but wasn't so good for high density; between Richland and Kaveri we almost doubled the number of transistors while keeping die size pretty close (moving to a wider/slower approach to reduce power usage), and that was before moving to the higher density libraries for Carrizo.
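    Rough arithmetic to put that in perspective, using only the ~2x transistor figure above rather than exact die specs: the node names alone would suggest far less than that.

        # How much density would the 32 nm -> 28 nm label buy on its own,
        # versus the ~2x transistors at a similar die size mentioned above?
        naive_scaling = (32 / 28) ** 2   # ideal area scaling implied by the node names
        observed_gain = 2.0              # ~2x transistors at roughly the same die size

        print(f"naive node scaling:    {naive_scaling:.2f}x")                     # ~1.31x
        print(f"observed density gain: ~{observed_gain:.1f}x")
        print(f"from libraries/design: ~{observed_gain / naive_scaling:.2f}x")    # ~1.5x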


  • L_A_G replied:
    Well, I haven't really paid all that much attention to AMD's APU offerings, or AMD's offerings in general, after how disappointing the first couple of series of Bulldozer chips turned out to be. I don't see that big of a difference between 32 and 28 nm, so mentioning it is mostly just being a bit anal. However, the jump to 14 nm FinFET (which they're definitely capable of, seeing how Apple is already using GlobalFoundries as a second source for 14 nm FinFET chips) will be pretty massive.


  • bridgman replied:
    Couple of things...

    1. From the Kickstarter blurb I think they're developing with the GX-415 but plan to ship with something newer

    2. re: 32nm, pretty sure all the APUs from Kaveri/Kabini onward have been 28nm


  • L_A_G replied:
    AMD's own proprietary driver has had full support for at least OpenGL 4.4, and if I recall correctly they now fully support 4.5 as well. It's only the open source drivers that have all those deficiencies in OpenGL support, and with AMD actively involved in the development of this, it's obvious they're going to use AMD's closed source drivers, not the open source ones.
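    For what it's worth, the quickest way to see which OpenGL version a given driver stack actually reports on a Linux box is to ask it, e.g. via glxinfo. The Python sketch below assumes the glxinfo utility (the mesa-utils package on most distros) is installed and an X session is running:

        # Print the OpenGL version strings the installed driver reports.
        import subprocess

        def reported_gl_versions():
            out = subprocess.run(["glxinfo"], capture_output=True, text=True).stdout
            return [line.strip() for line in out.splitlines()
                    if "OpenGL version string" in line
                    or "OpenGL core profile version string" in line]

        for line in reported_gl_versions():
            print(line)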

    If you're thinking about how hilariously broken Alien: Isolation was when they tested it here, it was only that broken on the open source drivers. It works just fine on the proprietary ones, and you can check it on YouTube if you don't believe me.

    Also, seeing how this is coming out late next year, it's obvious that AMD's current lineup is not what they're going to use in the final production model. Most probably they're going to be looking at the 14nm parts that have yet to even be announced, rather than the ones based on the old 32nm process they've been using since 2011.
