"Ask ATI" dev thread


  • Originally posted by Swoopy View Post
    Interesting, I have 4 GB RAM with an AMD64X2 CPU, where the whole memory remapping thing works completely differently, but my system hangs like bug 1 described here - (X crashes/hangs system at startup).
    I fear I will have to physically remove two sticks of RAM (back to 2GB) in order to test, but my curiosity is definitely piqued.

    If it turns out the driver can't handle the AMD64 CPU's remapping above 4GB of RAM either, then that's REALLY a lot of negative brownie points for AMD/ATI!!

    I really think it has more to do with MB BIOS and kernel settings. I have two AMD64 machines with 4GB of RAM and I have no problems with them.

    Comment


    • Originally posted by Melcar View Post
      I really think it has more to do with MB BIOS and kernel settings. I have two AMD64 machines with 4GB of RAM and I have no problems with them.
      Yes, it's not a driver problem, it's a hardware/BIOS problem. As I mentioned in the other thread, my 4GB Athlon64 X2 system was crashing every time fglrx started unless I limited it to <4GB of RAM... until I upgraded the BIOS to get a fix for a remapping bug and now it works fine.

      In that case, I believe the BIOS told the OS it had remapped 512MB of RAM above the 4GB area, but didn't actually bother to remap the memory. A driver can't do much to avoid a crash if the RAM it expects to use doesn't exist.

      Comment


      • Originally posted by Swoopy View Post
        I fear I will have to physically remove two sticks of RAM (back to 2GB) in order to test, but my curiosity is definitely piqued.
        Just set the RAM limit to 3-3.5GB on the kernel boot line to limit the amount of RAM the kernel will use. That's what I was doing until I got a BIOS fix that gave me the missing 512MB back.
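
        For example, the cap can go straight onto the kernel line in the boot loader. A minimal sketch, assuming GRUB legacy and a /boot/grub/menu.lst layout (the kernel image, root device and title here are just placeholders for whatever your install uses):

        title  Linux (RAM capped at 3GB)
        root   (hd0,0)
        kernel /boot/vmlinuz root=/dev/sda1 ro quiet mem=3072M
        initrd /boot/initrd.img

        The mem=3072M option limits the kernel to the first 3GB, which sidesteps the badly-remapped region until a BIOS fix shows up.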

        Comment


        • I think this issue can be reproduced on a number of ASUS motherboards (for example P5K and P5K-E) with one of the following chipsets (P35, X38, X48) (see https://bugs.launchpad.net/ubuntu/+s...24/+bug/224404). Further, this bug doesn't affect only AMD/ATI graphics cards (see https://bugs.launchpad.net/linux/+bug/210780) - so it looks like a BIOS or kernel bug.

          =====

          Now the question that I came to ask:

          Would you be willing to allow people outside of AMD/ATI to test your Linux drivers?


          Originally posted by bridgman View Post
          We have a number of machines with 4GB or more and fglrx runs fine on them. This issue seems to be system-BIOS-specific (and possibly kernel-specific as well, not sure yet).

          Comment


          • Yep, that's what the beta program is all about. We do weekly drops as the changes accumulate between releases and collect feedback from the beta testers. The word "beta" probably isn't exactly correct since we're doing monthly incremental releases, but what the heck.

            I don't think we want to make the beta test group any larger right now (dealing with remotely reported issues takes a lot more time per issue than dealing with problems we can reproduce in house, and the beta testers do a lot of work as well), but we are trying to monitor the platform & distro mix of the beta group and tweak it periodically to make sure it fairly represents our user base.

            Comment


            • ATI drivers and pbuffers

              Hi ATI/AMD developer,

              I'm interested to know whether we should expect to see pbuffer support in the new ATI drivers?
              The old drivers (earlier than 8.40.4) supported them, while the new ones do not, which leads to problems with some applications - see http://bugs.winehq.org/show_bug.cgi?id=11826 (World of Warcraft with Wine). Because of that particular problem (white minimap in WoW with Wine), I'm forced to use the old 8.40.4 version of the driver.

              Can you confirm that the new version of the driver doesn't support pbuffers? Will it support pbuffers?

              thanks

              Comment


              • 2 more questions, mainly about documentation...

                1) Will the documentation on applying the R500 cards to GPGPU processing be released? (Somewhat discussed in this prior thread.)

                2) Will the details for implementing CrossFire be released for the R500/R600 cards after the release of the R700 series of chips?

                Comment


                • I can try to answer these ones now. I think all of the information required for a "roll your own" GPGPU implementation on 5xx has already been released. I haven't worked through the details, but I think you could use the existing drm code and just submit your own shader programs to the chip. I bet MostAwesomeDude could do it now, but he's probably sick of shader programs for a while.

                  I think the only information we have not released for full crossfire is the hardware we use to combine images from the two cards (although you could combine in software today; that's what we do for the lower end cards). It's kind of a no-brainer if you are running AFR -- you just blit to the front buffer from card A then card B then A again, instead of always blitting from the same backbuffer.

                  I haven't looked at the IP issues in the compositing hardware, so I don't know if there are any "gotchas" yet, but yeah, after 7xx seems about right. We might need to put out a bit more memory management setup info to support SW compositing with fast blits between the cards; I'll check.
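
                  To make the AFR flip concrete, here is a rough pseudo-C sketch of that blit order; the two helper functions are made-up placeholders (nothing from fglrx or the drm), just enough to show the card A / card B alternation:

                  #include <stdio.h>

                  /* Placeholder helpers standing in for the real per-card render and blit paths. */
                  static void render_frame_on(int gpu, unsigned frame) { printf("GPU %d renders frame %u\n", gpu, frame); }
                  static void blit_to_front(int gpu)                   { printf("GPU %d blits to the front buffer\n", gpu); }

                  /* Alternate Frame Rendering with software compositing: each card renders every
                   * other frame into its own back buffer, and whichever card finished the frame
                   * blits it to the shared front buffer, instead of always blitting from the
                   * same back buffer on one card. */
                  int main(void)
                  {
                      unsigned frame;
                      for (frame = 0; frame < 6; frame++) {
                          int gpu = frame % 2;      /* card A, card B, card A, ... */
                          render_frame_on(gpu, frame);
                          blit_to_front(gpu);
                      }
                      return 0;
                  }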
                  Last edited by bridgman; 11 June 2008, 10:38 PM.

                  Comment


                  • Originally posted by sksk View Post
                    Hi ATI/AMD developer,

                    I'm interested to know whether we should expect to see pbuffer support in the new ATI drivers?
                    The old drivers (earlier than 8.40.4) supported them, while the new ones do not, which leads to problems with some applications - see http://bugs.winehq.org/show_bug.cgi?id=11826 (World of Warcraft with Wine). Because of that particular problem (white minimap in WoW with Wine), I'm forced to use the old 8.40.4 version of the driver.

                    Can you confirm that the new version of the driver doesn't support pbuffers? Will it support pbuffers?

                    thanks
                    Our OpenGL architect was nice enough to answer this one. Apparently pbuffers are implemented but we recently discovered that the capability was not being exposed correctly (I don't fully understand the details). This is being fixed now so probably a couple of releases before it shows up. If the changes are small and safe we might be able to fast track them, not sure.
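
                    In case it helps anyone poking at this, the application-side check is just standard GLX 1.3 - nothing fglrx-specific. A small stand-alone sketch that asks the driver whether any pbuffer-capable fbconfigs are exposed (which is roughly the capability Wine goes looking for):

                    #include <stdio.h>
                    #include <X11/Xlib.h>
                    #include <GL/glx.h>

                    int main(void)
                    {
                        Display *dpy = XOpenDisplay(NULL);
                        if (!dpy) return 1;

                        /* Ask for fbconfigs that can back a pbuffer; a driver that doesn't
                         * expose pbuffer support simply returns none of these. */
                        int attribs[] = {
                            GLX_DRAWABLE_TYPE, GLX_PBUFFER_BIT,
                            GLX_RENDER_TYPE,   GLX_RGBA_BIT,
                            None
                        };
                        int count = 0;
                        GLXFBConfig *cfgs = glXChooseFBConfig(dpy, DefaultScreen(dpy), attribs, &count);
                        printf("pbuffer-capable fbconfigs: %d\n", count);

                        if (cfgs) XFree(cfgs);
                        XCloseDisplay(dpy);
                        return 0;
                    }

                    Build with something like gcc check_pbuffer.c -lX11 -lGL; a count of zero on the newer driver and a non-zero count on 8.40.4 would match the behaviour described above.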

                    Comment


                    • Originally posted by bridgman View Post
                      I can try to answer these ones now. I think all of the information required for a "roll your own" GPGPU implementation on 5xx has already been released. I haven't worked through the details, but I think you could use the existing drm code and just submit your own shader programs to the chip.
                      Just in case this didn't help (which is likely), I thought it might be worth doing a quick overview of GPGPU. Apologies if this is already obvious to everyone. This is where you get to use an HD38xx chip as a 320-core processor (yes, 320 floating point multiply-add operations per clock, ie 1/2 teraflop) without having to worry too much about all that yukky multi-threading.
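
                      (For the back-of-envelope arithmetic behind that number: 320 multiply-add units x 2 flops each x the HD3870's 775 MHz core clock works out to roughly 496 GFLOPS, i.e. just under half a teraflop.)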

                      GPGPU 101

                      As soon as the first programmable shaders were added to GPUs, it became possible to write programs and execute them on the GPU. I don't know the exact history, but AFAIK the first GPGPU work actually used DX and OpenGL shader languages rather than any compute-specific tools; GPGPU tools came later as a way to accomplish the same work without the overhead and learning curve of a graphics language. GPGPU (aka Stream Computing) is actually pretty simple.

                      ============================

                      A modern GPU is almost totally programmable - a typical rendering operation involves:

                      - set up shader programs and textures

                      - feed a list of vertices into the GPU, where each vertex typically has a colour and a texture coordinate for each active texture

                      - for each vertex, run the vertex shader program, modifying all of the passed-in parameters

                      - take the results and assemble into triangles

                      - explode each triangle out into pixels (aka fragments), where a pixel includes a screen location, depth information, colour information interpolated between the vertices and texture coordinates interpolated between the vertices for each texture; hand each pixel off to a different processor

                      - for each pixel run the pixel shader program; inputs come from the texture samplers, depth information and interpolated colour information

                      - result of the pixel shader program goes through z-buffer processing (don't write the pixel if it's behind something already drawn), alpha blending, and some AA processing

                      ============================

                      How the heck does this map into general purpose computation, you are asking. Pretty simple. Results go in the framebuffer as arrays. Pixel shaders perform the processing for each array element. Inputs to the calculation come from textures, with the filtering turned off (also called point sampling). Vertex shaders are ignored, and you feed in enough triangles or quads to cover the area where the results go. In essence you are fooling the GPU into performing a bunch of calculations using textures as the input arrays and the "screen" as your results array, but it works really well.

                      You can do this in OpenGL; it's just faster and easier to learn if you have GPGPU-specific APIs (CAL, CUDA, etc.) and even better if you have high-level tools (Brook, RapidMind, etc.). If you want to roll your own GPGPU stuff, you can either do it with OpenGL or just write up a program which leans on libdrm to pass the shader programs to the GPU. You need to punch a few registers (already documented AFAIK) to turn off texture filtering, and away you go.
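
                      To show what the OpenGL route looks like in practice, here is a sketch (not anything from ATI's code) of the classic render-to-texture trick: it computes out[i] = a*x[i] + y[i] with the pixel shader as the "kernel", assuming GLUT and GLEW for context/extension setup and a driver that exposes GLSL, FBOs and float textures:

                      /* gpgpu_saxpy.c - render-to-texture as computation (illustrative sketch only).
                       * Computes out[i] = a*x[i] + y[i] on the GPU: x and y live in point-sampled
                       * float textures, the fragment shader is the "kernel", and a single quad is
                       * drawn so the shader runs once per output element.
                       * Build (typical): gcc gpgpu_saxpy.c -lGL -lglut -lGLEW */
                      #include <GL/glew.h>
                      #include <GL/glut.h>
                      #include <stdio.h>

                      #define N 256   /* treat the data as an N x 1 "image" of RGBA floats */

                      static const char *kernel_src =
                          "uniform sampler2D x_tex, y_tex;                        \n"
                          "uniform float a;                                       \n"
                          "void main() {                                          \n"
                          "    vec4 x = texture2D(x_tex, gl_TexCoord[0].st);      \n"
                          "    vec4 y = texture2D(y_tex, gl_TexCoord[0].st);      \n"
                          "    gl_FragColor = a * x + y;  /* the actual 'work' */ \n"
                          "}                                                      \n";

                      static GLuint make_float_tex(const float *data)
                      {
                          GLuint tex;
                          glGenTextures(1, &tex);
                          glBindTexture(GL_TEXTURE_2D, tex);
                          /* Point sampling: we want raw array elements back, not filtered texels. */
                          glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
                          glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
                          glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F_ARB, N, 1, 0,
                                       GL_RGBA, GL_FLOAT, data);
                          return tex;
                      }

                      int main(int argc, char **argv)
                      {
                          float x[N * 4], y[N * 4], out[N * 4];
                          int i;
                          for (i = 0; i < N * 4; i++) { x[i] = (float)i; y[i] = 1.0f; }

                          /* A GL context is still needed even though nothing gets displayed. */
                          glutInit(&argc, argv);
                          glutCreateWindow("gpgpu");
                          glewInit();

                          /* The result "pixmap": a float texture on an FBO stands in for the screen. */
                          GLuint out_tex = make_float_tex(NULL), fbo;
                          glGenFramebuffersEXT(1, &fbo);
                          glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
                          glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                                                    GL_TEXTURE_2D, out_tex, 0);
                          glViewport(0, 0, N, 1);

                          /* Compile the fragment shader -- this is the GPGPU "kernel".
                           * No vertex shader: fixed-function vertex processing just passes things through. */
                          GLuint fs = glCreateShader(GL_FRAGMENT_SHADER), prog = glCreateProgram();
                          glShaderSource(fs, 1, &kernel_src, NULL);
                          glCompileShader(fs);
                          glAttachShader(prog, fs);
                          glLinkProgram(prog);
                          glUseProgram(prog);

                          /* Bind the input arrays as point-sampled textures on two texture units. */
                          glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, make_float_tex(x));
                          glActiveTexture(GL_TEXTURE1); glBindTexture(GL_TEXTURE_2D, make_float_tex(y));
                          glUniform1i(glGetUniformLocation(prog, "x_tex"), 0);
                          glUniform1i(glGetUniformLocation(prog, "y_tex"), 1);
                          glUniform1f(glGetUniformLocation(prog, "a"), 2.0f);

                          /* "Feed in enough geometry to cover the result area": one full-viewport quad. */
                          glBegin(GL_QUADS);
                          glTexCoord2f(0, 0); glVertex2f(-1, -1);
                          glTexCoord2f(1, 0); glVertex2f( 1, -1);
                          glTexCoord2f(1, 1); glVertex2f( 1,  1);
                          glTexCoord2f(0, 1); glVertex2f(-1,  1);
                          glEnd();

                          /* Read the "framebuffer" back: this is the result array. */
                          glReadPixels(0, 0, N, 1, GL_RGBA, GL_FLOAT, out);
                          printf("out[0..3] = %.1f %.1f %.1f %.1f\n", out[0], out[1], out[2], out[3]);
                          return 0;
                      }

                      Note the GL_NEAREST filtering on the input textures - that's the "turn off texture filtering" step, done through the API here rather than by punching the registers directly.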

                      ============================

                      Let's go through the graphics pipeline again, but replacing the graphics terminology with GPGPU terminology:

                      - set up shader programs and textures [vertex shaders are pass-thru, textures point to your input data arrays, texture filtering turned off]

                      - feed a list of vertices into the GPU, where each vertex typically has a colour and a texture coordinate for each texture [draw a quad or a couple of triangles; all you need to do is make sure that the pixel shader runs for each point in the result "pixmap"]

                      - for each vertex, run the vertex shader program [we don't need no steenkin' vertex shaders]

                      - take the results and assemble into triangles [GPGPU people don't like triangles, it reminds them of graphics; boo hoo]

                      - explode each triangle out into pixels (aka fragments), where a pixel includes a screen location, depth information, colour information interpolated between the vertices and texture coordinates interpolated between the vertices for each texture; hand each pixel off to a different processor [these are the inputs to your GPGPU calculation; forget about depth and colour, focus on the textures]

                      - for each pixel run the pixel shader program; inputs come from the texture samplers, depth information and interpolated colour information [this is where all the real work happens; note that each pixel shader can do multiple floating point operations per clock, one for each pixel component (RGBA), so make those ALUs work]

                      - result of the pixel shader program goes through z-buffer processing (don't write the pixel if it's behind something already drawn), alpha blending, and some AA processing [forget Z, forget alpha, forget AA, just write the results of the pixel shader into memory]

                      ============================

                      Finally, rather than viewing the results of your rendering on the monitor, read the result pixmap back (it's an array of floating point numbers) and go cure cancer or get rich or something.
                      Last edited by bridgman; 12 June 2008, 12:40 AM.

                      Comment
