
Gallium3D OpenCL GSoC Near-Final Status Update


  • #41
    I think the worst idea would be to have multiple IRs. It's just asking for one IR to be well developed (the graphics one, of course) and the other to lag behind. I'm not sure which IR would be best; I like LLVM as a compiler, and I like the idea of one all-powerful solution used throughout the system. But the drivers have all been written for TGSI already, and as people have mentioned, it may not be capable of describing graphics operations efficiently.

    It may be best just to extend TGSI. It was designed with the intent of making it easy for GPU vendors to port their drivers to interface with it, but that's obviously a pipe dream. GLSL IR could be nice since it's designed to represent GLSL code well, but I'm not sure whether it represents the actual hardware capabilities well, or other functionality of the devices like compute or video decode.

    There is a lot this IR will need to be used for: obviously graphics, compute, and video decode (which is basically using compute for decode and graphics for display).
    Also, probably the best way to do remote desktops and accelerated virtual machine desktops is to use a Gallium-based driver model that passes the IR directly.

    My personal favorite option is to make everything compile to a TGSI-like IR that takes into account that it may be passed through a network layer: basically an easily streamable IR capable of describing everything graphics-wise. The driver backends then convert that into native GPU code, and anything that can't be done on the GPU gets converted to LLVM IR to be run by the CPU.

    I don't know if that last part is fully doable, of course, but it's kind of done already in a small sense with the i915g driver: since the hardware has no vertex shader unit, there is a break-out in Gallium that allows vertex shaders to be sent to a software driver. A rough sketch of that kind of split is below.
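
    A minimal sketch of the flow described above, assuming entirely hypothetical names (none of these are real Gallium interfaces): the state tracker hands the driver a serialized, streamable IR blob, the backend compiles what it can to native GPU code, and anything unsupported falls back to a CPU path built on LLVM, much like i915g handing vertex shaders to a software driver today.

    /* Hypothetical sketch only -- these types and functions are not real
     * Gallium entry points, just an illustration of the proposed split. */

    #include <stdbool.h>
    #include <stddef.h>

    struct ir_blob {            /* streamable IR, safe to pass over a network */
        const void *data;
        size_t      size;
    };

    struct shader_binary {
        void  *code;
        size_t size;
        bool   runs_on_cpu;
    };

    /* Hardware backend: would emit native GPU code, returning false for
     * IR features this GPU cannot execute.  Stubbed out here. */
    static bool gpu_backend_compile(const struct ir_blob *ir,
                                    struct shader_binary *out)
    {
        (void)ir; (void)out;
        return false;           /* pretend the GPU can't handle this shader */
    }

    /* Software path: would lower the same IR to LLVM IR and JIT it for
     * the host CPU.  Also stubbed out. */
    static bool cpu_llvm_compile(const struct ir_blob *ir,
                                 struct shader_binary *out)
    {
        (void)ir;
        out->code = NULL;
        out->size = 0;
        out->runs_on_cpu = true;
        return true;
    }

    /* Try native GPU code first; fall back to the CPU for anything the
     * GPU can't do -- the same idea as the i915g vertex shader fallback. */
    static bool compile_shader(const struct ir_blob *ir,
                               struct shader_binary *out)
    {
        if (gpu_backend_compile(ir, out))
            return true;
        return cpu_llvm_compile(ir, out);
    }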



    • #42
      Didn't LunarGLASS (http://www.lunarglass.org/) already make an LLVM IR branch compatible with GLSL (new intrinsics and such)? Currently they do GLSL IR -> LLVM IR -> TGSI -> Gallium driver. They do that because they want to evaluate whether they can make LLVM IR suitable, and so far they have been successful. Later TGSI may go away and there will be one IR for graphics and compute, much like AMDIL. After that, there could be a GLSL compiler based on Clang, the way there is an OpenCL one (ask Steckdenis). One has to think about the future, though: the future when there will be SIMD GPUs, like the AMD HD8xxx family.



      • #43
        Originally posted by Drago View Post
        The future when there will be SIMD GPUs, like the AMD HD8xxx family.
        One minor terminology point -- the AMD GPUs are all SIMD today (SIMD "this way" and VLIW "that way", eg a 16x5 or 16x4 array). In the future, they will still be SIMD, just not SIMD *and* VLIW.

        There are certainly lots of IRs to choose from, each with their own benefits and drawbacks. One more wild card is that our Fusion System Architecture (FSA) initiative will be built in part around a virtual ISA (FSAIL) designed to bridge over CPUs and GPUs, so I think we have at least 5 to choose from -- 4 if you lump LunarGLASS and LLVM IR together, since LunarGLASS builds on and extends LLVM IR. I'm assuming nobody is arguing that we should go back to Mesa IR yet.

        It's going to be an interesting few months, and an interesting XDC. I might even learn to like compilers, although that is doubtful.
        Last edited by bridgman; 20 August 2011, 11:19 AM.



        • #44
          Originally posted by bridgman View Post
          I haven't had a chance to watch the video yet (I read a lot faster than I can watch video), but it sounds like a combination of API limitations on texture updates and overhead to transfer new texture data to video memory -- so reading between the lines it sounds like he wants to "update small bits of a bunch of textures" on a regular basis, and that means lots of little operations because the API patterns don't match what the app wants to do... and lots of little operations means slow in any language.

          Will post back when I have time to slog through the video. People talk so slo...o...o...owly.
          JC is talking about MegaTexture technology. The basics are: the texture resides somewhere in memory, and the game engine can stream in a new version of the texture (or part of it). Currently you have to make a bunch of GL calls to do that; on consoles you have unified memory, so you can just rewrite the memory location. This technology is impressive, and it deserves a more efficient path on the PC, given that APUs with unified memory are coming -- or at least a GL extension for discrete GPUs. (A minimal GL sketch of that kind of partial update is below.)
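
          For reference, a partial update of one texture through plain GL looks roughly like this -- a hedged sketch, not anything MegaTexture-specific, with the texture handle, region, and pixel data assumed to come from the engine's own streaming code. Doing this for many small regions of many textures every frame is where the call overhead and the extra copy into video memory add up.

          /* Update one small region of an existing texture through the GL.
           * 'tex', the region, and 'pixels' are assumed to come from the
           * engine's texture streaming code. */
          #include <GL/gl.h>

          void update_texture_region(GLuint tex, GLint x, GLint y,
                                     GLsizei w, GLsizei h,
                                     const void *pixels /* packed RGBA8 */)
          {
              glBindTexture(GL_TEXTURE_2D, tex);
              glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

              /* The driver copies 'pixels' and schedules an upload to video
               * memory; on a unified-memory console the engine would simply
               * write the bytes in place instead. */
              glTexSubImage2D(GL_TEXTURE_2D, 0, x, y, w, h,
                              GL_RGBA, GL_UNSIGNED_BYTE, pixels);
          }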



          • #45
            Originally posted by bridgman View Post
            Originally posted by Drago View Post
            Didn't LunarGLASS (http://www.lunarglass.org/) already make an LLVM IR branch compatible with GLSL (new intrinsics and such)? Currently they do GLSL IR -> LLVM IR -> TGSI -> Gallium driver. They do that because they want to evaluate whether they can make LLVM IR suitable, and so far they have been successful. Later TGSI may go away and there will be one IR for graphics and compute, much like AMDIL. After that, there could be a GLSL compiler based on Clang, the way there is an OpenCL one (ask Steckdenis). One has to think about the future, though: the future when there will be SIMD GPUs, like the AMD HD8xxx family.
            One minor terminology point -- the AMD GPUs are all SIMD today (SIMD "this way" and VLIW "that way", eg a 16x5 or 16x4 array). In the future, they will still be SIMD, just not SIMD *and* VLIW.
            A more major point - all GPU shader architectures are SIMD.



            • #46
              Originally posted by bridgman View Post
              One minor terminology point -- the AMD GPUs are all SIMD today (SIMD "this way" and VLIW "that way", eg a 16x5 or 16x4 array). In the future, they will still be SIMD, just not SIMD *and* VLIW.
              Any better reading on the topic? I thought they were all VLIW, having recently made the jump from VLIW5 to VLIW4.
              Bridgman, you will probably want to empty your private message inbox.



              • #47
                Originally posted by Plombo View Post
                A more major point - all GPU shader architectures are SIMD.
                Maybe I am mistaken about the term SIMD. I read that there will be a major redesign with HD8xxx. Let me check.
                There you go: http://www.anandtech.com/show/4455/a...ts-for-compute
                Last edited by Drago; 20 August 2011, 11:12 AM.



                • #48
                  Originally posted by Plombo View Post
                  A more major point - all GPU shader architectures are SIMD.
                  I wasn't sufficiently sure to say "all", but certainly "most" are.

                  Drago, mailbox has room now, thanks. The "full" warning is now at the bottom of the page rather than the top.



                  • #49
                    Originally posted by Drago View Post
                    Maybe I am mistaken about the term SIMD. I read that there will be a major redesign with HD8xxx. Let me check.
                    There you go: http://www.anandtech.com/show/4455/a...ts-for-compute

                    The wording in that article is a bit confusing in places -- they talk about going "from VLIW to non-VLIW SIMD" which can be interpreted in more than one way.

                    Adding parentheses for clarity, most people would interpret the string as "(VLIW) to (non-VLIW SIMD)", implying that the SIMD-ness is new, although the correct interpretation is actually "(VLIW to non-VLIW) SIMD" ie still SIMD but with each SIMD made up of single-operation blocks rather than VLIW blocks.

                    If you look at the diagrams you'll see references to "Cayman SIMDs" and "GCN SIMDs". Getting your head around the idea of a SIMD made up of VLIW blocks is hard, although once you've done that, going away from it is easy.
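
                    One way to picture the "(VLIW to non-VLIW) SIMD" reading is sketched below -- illustrative C only, not a hardware model: in both cases 16 lanes execute in lockstep, and the difference is how much work each lane is handed per instruction.

                    /* Illustrative only -- the shape of the two designs, not how
                     * the hardware is actually programmed. */

                    #define LANES 16   /* both Cayman and GCN SIMDs are 16 lanes wide */
                    #define SLOTS 4    /* Cayman hands each lane a VLIW4 bundle       */

                    /* VLIW SIMD: each of the 16 lanes executes a bundle of up to
                     * 4 independent operations chosen by the compiler. */
                    void vliw_simd_issue(float regs[LANES][SLOTS],
                                         const float src[LANES][SLOTS])
                    {
                        for (int lane = 0; lane < LANES; lane++)      /* SIMD "this way" */
                            for (int slot = 0; slot < SLOTS; slot++)  /* VLIW "that way" */
                                regs[lane][slot] += src[lane][slot];
                    }

                    /* Non-VLIW SIMD (GCN style): each lane executes one operation
                     * per instruction -- still SIMD, just no VLIW bundle to fill. */
                    void scalar_simd_issue(float regs[LANES], const float src[LANES])
                    {
                        for (int lane = 0; lane < LANES; lane++)
                            regs[lane] += src[lane];
                    }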
                    Last edited by bridgman; 20 August 2011, 12:39 PM.



                    • #50
                      Originally posted by Qaridarium
                      Why not call it VLIW+SIMD to RISC+VLIW or CISC+VLIW?
                      That would be even more confusing, if such a thing is possible.

                      Seriously, GPU instructions have always been much closer to RISC than CISC, especially ATI/AMD GPUs. They're all single-clock, and there are relatively few instructions -- lots of opcodes but those are mostly subtle variants of the same basic function (eg 6.02*10^23 different compare operations).

                      One could argue that VLIW RISC and CISC are both "complex" from a sufficiently abstract point of view but I don't think they are generally regarded as interchangeable. The transition really is from VLIW RISC to non-VLIW RISC.

                      Originally posted by Qaridarium
                      I think the r900 will be 1 RISC core instead of a VLIW core, and 5 SIMD cores added to the RISC core, and maybe a firmware layer to CISC to protect the internal chip logic.
                      The slides presented at AFDS talked about 1 scalar unit plus 4 SIMDs (sometimes called vector ALUs) per compute unit (CU) -- see the PDF for session 2620.

                      I think it's fair to say that the instruction sets for both scalar and vector units can be considered RISC, just like the instruction set for the VLIW core, but the RISC vs CISC topic is almost as dangerous and open to debate as religion or coding standards.
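
                      A hedged illustration of those numbers (assumed from the description above, not a register-accurate model): four 16-lane SIMDs give a CU the same 64 vector lanes that a single 16x4 VLIW SIMD provided on Cayman, just without the VLIW packing.

                      /* Rough constants only -- a sketch of the CU layout described
                       * above, not a register-accurate model of the hardware. */
                      enum {
                          CU_SCALAR_UNITS = 1,   /* one scalar engine per compute unit */
                          CU_VECTOR_SIMDS = 4,   /* four SIMDs (vector ALUs) per CU    */
                          SIMD_LANES      = 16,  /* each SIMD is 16 lanes wide         */

                          /* 4 x 16 = 64 vector lanes per CU, matching the ALU count
                           * of one 16x4 VLIW SIMD on Cayman. */
                          CU_VECTOR_LANES = CU_VECTOR_SIMDS * SIMD_LANES
                      };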

                      The whole discussion is made more complicated because we used to talk about "vector operations" (eg the RGBA components of a pixel) being handled in a single vector instruction on 3xx-5xx GPUs or by using 4 of the VLIW slots on a 6xx-Cayman GPU. With GCN and beyond "vector" is being used the other way, referring to the 16 elements of the SIMD as a vector.

                      That's why the SIMD aspect seems new -- VLIW was visible to the programmer while SIMD was not, so it got talked about the most. Now that VLIW is out of the picture SIMD is the most visible thing, and we have to talk about it because a CU contains SIMDs *and* a scalar engine, so the natural terminology is "vector" for the SIMDs and "scalar" for the... um... scalar engine.

                      What we call a SIMD used to work on a 16x4 or 16x5 array of data, now it works on a 1D vector of data.

                      But it's still RISC.
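
                      A small sketch of the two uses of "vector" described above -- illustrative C under the assumptions in this post, not real shader ISA:

                      /* Old usage: a "vector operation" works on the components of one
                       * pixel (a single vector instruction on 3xx-5xx parts, or four
                       * VLIW slots on 6xx-Cayman). */
                      struct pixel { float r, g, b, a; };

                      void old_style_vector_op(struct pixel *dst, const struct pixel *src)
                      {
                          dst->r += src->r;
                          dst->g += src->g;
                          dst->b += src->b;
                          dst->a += src->a;
                      }

                      /* New usage: the "vector" runs across the 16 SIMD elements; one
                       * pixel's r, g, b and a become four separate scalar operations,
                       * each applied across all elements in lockstep. */
                      #define SIMD_ELEMENTS 16

                      void new_style_vector_op(struct pixel dst[SIMD_ELEMENTS],
                                               const struct pixel src[SIMD_ELEMENTS])
                      {
                          for (int e = 0; e < SIMD_ELEMENTS; e++) dst[e].r += src[e].r;
                          for (int e = 0; e < SIMD_ELEMENTS; e++) dst[e].g += src[e].g;
                          for (int e = 0; e < SIMD_ELEMENTS; e++) dst[e].b += src[e].b;
                          for (int e = 0; e < SIMD_ELEMENTS; e++) dst[e].a += src[e].a;
                      }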
                      Last edited by bridgman; 20 August 2011, 01:43 PM.

