AMD R600g Performance Patches Yield Mixed Results


  • agd5f
    replied
    Originally posted by bug77 View Post
    The guy said "single static rectangular texture". Even if rendered at 1000fps, does this need any memory bandwidth past the initial rendering?
    You need to read in the texture and apply it to the quad every frame, otherwise you aren't rendering anything and fps would be pointless.
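
    To make that concrete, here is a minimal sketch of the case being discussed: a single static texture drawn with fixed-function OpenGL via GLUT (the setup and names are illustrative, not from any driver or thread code). The texture data is uploaded once, but the GPU still has to sample it on every frame it draws:

        /* Illustrative sketch: one static texture, redrawn every frame. */
        #include <GL/glut.h>

        static GLuint tex;

        static void display(void)
        {
            glClear(GL_COLOR_BUFFER_BIT);
            glBindTexture(GL_TEXTURE_2D, tex);  /* sampled again each frame */
            glBegin(GL_QUADS);
            glTexCoord2f(0, 0); glVertex2f(-1, -1);
            glTexCoord2f(1, 0); glVertex2f( 1, -1);
            glTexCoord2f(1, 1); glVertex2f( 1,  1);
            glTexCoord2f(0, 1); glVertex2f(-1,  1);
            glEnd();
            glutSwapBuffers();
            glutPostRedisplay();                /* redraw as fast as possible */
        }

        int main(int argc, char **argv)
        {
            static unsigned char pixels[64 * 64 * 3]; /* dummy static image */
            glutInit(&argc, argv);
            glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB);
            glutCreateWindow("static quad");
            glGenTextures(1, &tex);
            glBindTexture(GL_TEXTURE_2D, tex);
            glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
            /* one-time upload; the texture never changes after this */
            glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 64, 64, 0,
                         GL_RGB, GL_UNSIGNED_BYTE, pixels);
            glEnable(GL_TEXTURE_2D);
            glutDisplayFunc(display);
            glutMainLoop();
            return 0;
        }

    The upload in main() happens once; the read in display() happens at whatever frame rate the loop reaches, which is why bandwidth still matters.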

  • MaxToTheMax
    replied
    Originally posted by bug77 View Post
    The guy said "single static rectangular texture". Even if rendered at 1000fps, does this need any memory bandwidth past the initial rendering?
    Uh, yes. First you need to read from texture memory into the GPU itself during rasterization. The GPU is also writing to the back buffer. Then, after the buffers are flipped, the (now) front buffer has to be encoded and sent to your monitor. So at the very minimum you have thousands of writes and even more reads for just that one rectangle.

    I think you're getting memory bandwidth confused with DMA bandwidth on your PCI-E bus. Not the same thing.
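
    As a rough back-of-envelope (purely illustrative numbers, not measurements from this thread): at 1920x1080 with 32-bit color, one frame is about 8 MB, and the texture read, color buffer write, and scanout read each touch roughly that much per frame:

        #include <stdio.h>

        int main(void)
        {
            /* Illustrative figures only: 1080p, 4 bytes per pixel. */
            const double frame_bytes = 1920.0 * 1080.0 * 4.0;
            const double fps = 1000.0;
            /* per frame: texture read + color buffer write + scanout read */
            const double traffic = frame_bytes * 3.0 * fps;
            printf("~%.1f GB/s of memory traffic\n", traffic / 1e9);
            return 0;
        }

    Even this idealized case lands in the tens of GB/s at 1000 fps, which is why the frame rate of a single quad still scales with memory bandwidth.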

  • bug77
    replied
    Originally posted by agd5f View Post
    Only if you want maximum performance.
    The guy said "single static rectangular texture". Even if rendered at 1000fps, does this need any memory bandwidth past the initial rendering?

  • agd5f
    replied
    Originally posted by bug77 View Post
    Like, seriously? You need the full bandwidth of a card to display a single texture?!?
    Only if you want maximum performance.

  • smitty3268
    replied
    Originally posted by xception View Post
    Well, guessing from what was changed, I'd say the huge difference shows up when the game is complex enough to run out of memory on the graphics card. Since they changed "VRAM|GTT" to just "VRAM", it seems quite likely that is the issue. There should be some code to detect when the video card gets close to running out of VRAM and switch from "VRAM" back to "VRAM|GTT" relocations for those workloads, while keeping VRAM-only placement for workloads that require less video memory. Another solution would be to monitor which resources are accessed more often and which less often, and place the high-access-count resources in VRAM and the rest in GTT.
    That was my guess. The high quality setting is probably using larger textures and running out of memory on the card.

  • smitty3268
    replied
    Originally posted by bug77 View Post
    Like, seriously? You need the full bandwidth of a card to display a single texture?!?
    That depends on how many fps you want to render. More bandwidth will always give you higher fps.

  • bug77
    replied
    Originally posted by agd5f View Post
    Make sure you have 2D tiling enabled, otherwise you won't be fully utilizing your memory bandwidth; it's been made the default as of mesa 9.0 and xf86-video-ati git master. Note that the EGL paths do not properly handle tiling yet.
    Like, seriously? You need the full bandwidth of a card to display a single texture?!?

  • a user
    replied
    Originally posted by xception View Post
    Well, guessing from what was changed, I'd say the huge difference shows up when the game is complex enough to run out of memory on the graphics card. Since they changed "VRAM|GTT" to just "VRAM", it seems quite likely that is the issue. There should be some code to detect when the video card gets close to running out of VRAM and switch from "VRAM" back to "VRAM|GTT" relocations for those workloads, while keeping VRAM-only placement for workloads that require less video memory. Another solution would be to monitor which resources are accessed more often and which less often, and place the high-access-count resources in VRAM and the rest in GTT.
    Wouldn't it be better to always keep it VRAM|GTT but distribute between the two more intelligently?

  • xception
    replied
    Well, guessing from what was changed, I'd say the huge difference shows up when the game is complex enough to run out of memory on the graphics card. Since they changed "VRAM|GTT" to just "VRAM", it seems quite likely that is the issue. There should be some code to detect when the video card gets close to running out of VRAM and switch from "VRAM" back to "VRAM|GTT" relocations for those workloads, while keeping VRAM-only placement for workloads that require less video memory. Another solution would be to monitor which resources are accessed more often and which less often, and place the high-access-count resources in VRAM and the rest in GTT. A hedged sketch of the first idea is below.
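
    Here is that sketch (every name here is hypothetical; this is not how the actual radeon/r600g code is structured): track how full VRAM is and fall back to VRAM|GTT placement once headroom runs out:

        #include <stdbool.h>
        #include <stdint.h>

        /* Hypothetical placement domains, loosely mirroring the VRAM/GTT
         * split discussed above; none of these names are from the driver. */
        enum placement { PLACE_VRAM, PLACE_VRAM_GTT };

        struct vram_tracker {
            uint64_t vram_size; /* total VRAM on the board        */
            uint64_t vram_used; /* bytes currently placed in VRAM */
        };

        /* Prefer VRAM for speed, but once it is nearly exhausted fall back
         * to VRAM|GTT so buffers can spill to system memory instead of
         * thrashing. The 90% threshold is an arbitrary illustrative choice. */
        static enum placement choose_placement(const struct vram_tracker *t,
                                               uint64_t buf_size)
        {
            uint64_t headroom = t->vram_size - t->vram_used;
            bool nearly_full = t->vram_used * 10 > t->vram_size * 9;

            if (nearly_full || buf_size > headroom)
                return PLACE_VRAM_GTT;
            return PLACE_VRAM;
        }

    The second idea would extend this with per-buffer access counters, promoting frequently used buffers to VRAM and demoting the rest to GTT.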

  • tmikov
    replied
    Originally posted by agd5f View Post
    Make sure you have 2D tiling enabled, otherwise you won't be fully utilizing your memory bandwidth; it's been made the default as of mesa 9.0 and xf86-video-ati git master. Note that the EGL paths do not properly handle tiling yet.
    Tiling is enabled, though I don't see a difference when I enable 2D tiling vs. 1D.

    About EGL: I have applied a simple patch which enables tiling of the framebuffer. With that, it matches the performance of running under X11. In both cases I get 130 FPS, while the blob is at about 220. (It is not actually double, sorry, but it is significantly faster.)
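
    For anyone wanting to check their own setup: with xf86-video-ati, color tiling can be set from xorg.conf (assuming a DDX and kernel new enough to support the 2D option; the Xorg log shows whether it actually took effect):

        Section "Device"
            Identifier "Radeon"
            Driver     "radeon"
            Option     "ColorTiling"   "on"
            Option     "ColorTiling2D" "on"
        EndSection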
