ATI R600/700 OSS 3D Driver Reaches Gears Milestone

  • #41
    Originally posted by bridgman View Post
    (b) get back to 1500fps from 15fps by accelerating the back-to-front buffer copy,
    See, here's where I'm fuzzy. Aren't buffer swaps supposed to, you know, swap buffers? As in not copy from the back- to the front-buffer at all but instead just swap them? Or is this just some temporary trick you use for now because getting the hardware to do a proper swap is hard?



    • #42
      I believe the idea of "buffer swap" dates back to the days when OpenGL apps usually ran full screen and an overlay was floated in front for menus etc...

      If the app is running full screen then AFAIK it is possible to do an actual swap, usually called "page flipping" in X/DRI-speak. If the app is not running full screen, however, then you need to copy from back buffer (hidden) to front buffer (screen) without affecting the other content which is already on the screen.

      That said, I don't see page flipping on fullscreen apps used as much as I would expect, not sure why. It may be that the growing use of compositors (which add their own copy step anyways) is displacing the traditional "OpenGL owns the screen" style of operation, or it may just be a lower priority than all the other big changes currently being made in the stack. A high-end GPU can do a full-screen copy in well under a millisecond anyways, and even a low-end GPU only takes a few milliseconds.
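To make the distinction concrete, here is a toy sketch (plain Python, nothing to do with the actual DRI code) of why a full-screen flip is essentially free while a windowed present has to copy pixels:

```python
# Toy model of the two presentation strategies: a page flip exchanges
# which buffer scanout reads from (O(1)), while a blit copies the
# window's pixels into the shared front buffer.

WIDTH, HEIGHT = 4, 3  # tiny "screen" for illustration

front = [[0] * WIDTH for _ in range(HEIGHT)]  # visible buffer
back = [[1] * WIDTH for _ in range(HEIGHT)]   # buffer the app renders into

def page_flip(front, back):
    """Full-screen case: just swap which buffer is scanned out."""
    return back, front  # no pixel data moves

def blit(front, back, x0, y0, w, h):
    """Windowed case: copy only the window's rectangle into the front
    buffer, leaving the rest of the screen (other windows) untouched."""
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            front[y][x] = back[y][x]

front, back = page_flip(front, back)  # instant swap, whole screen
blit(front, back, 1, 1, 2, 2)         # copies only a 2x2 window rect
```

The point of the sketch: the flip touches zero pixels regardless of resolution, while the blit's cost scales with the window area, which is why the compositor/windowed path pays a per-frame copy.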
      Last edited by bridgman; 16 July 2009, 09:19 AM.


      • #43
        Right, of course. Didn't think that all the way through. For windowed apps a copy makes sense, as they effectively share the front-buffer with the window manager (in DRI1 at least). Guess the overhead is simply smaller than I imagined. 'Twas just something about copying that triggered my premature optimization circuits.



        • #44
          Originally posted by bridgman View Post
          A high-end GPU can do a full-screen copy in well under a millisecond anyways, and even a low-end GPU only takes a few milliseconds.
          More like microseconds, actually. 1080p is ~7.9MB per frame, which works out to something between 80μs (ultra high-end GPUs with GDDR5 memory) and 4ms (ultra low-end Intel IGPs with single-channel DDR2 shared memory).
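As a sanity check on those figures, the arithmetic is just frame size divided by memory bandwidth (the bandwidth numbers below are rough illustrative assumptions, not measurements):

```python
# Back-of-the-envelope check: a 32-bpp 1080p frame divided by
# memory bandwidth gives the one-way copy time.

frame_bytes = 1920 * 1080 * 4  # 8,294,400 bytes, ~7.9 MiB

# Assumed ballpark bandwidths for illustration:
bandwidths = {
    "high-end GDDR5 (~100 GB/s)": 100e9,
    "single-channel DDR2 (~2 GB/s)": 2e9,
}

for name, bw in bandwidths.items():
    t_us = frame_bytes / bw * 1e6  # copy time in microseconds
    print(f"{name}: {t_us:.0f} us")
```

At ~100 GB/s this lands around 83μs, and at ~2 GB/s around 4.1ms, which brackets the 80μs-4ms range quoted above.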

          Fglrx seems to enable page flipping on unredirected fullscreen apps but falls back to copying when the window becomes redirected (e.g. a background window pops up or you leave fullscreen). You get a momentary flicker, but it's a good compromise between performance and visual quality. Older Windows drivers (esp. nvidia ones) used to flicker horribly whenever a notification / menu came in front of an OpenGL/D3D app.



          • #45
            Originally posted by BlackStar View Post
            Older Windows drivers (esp. nvidia ones) used to flicker horribly whenever a notification / menu came in front of an OpenGL/D3D app.
            Yeah, now it only flickers where the notification is. Sometimes a bit annoying if you're trying to concentrate and the notification jumps in and out at sub-second intervals before calming down.



            • #46
              Originally posted by nanonyme View Post
              Yeah, now it only flickers where the notification is. Sometimes a bit annoying if you're trying to concentrate and the notification jumps in and out at sub-second intervals before calming down.
              Indeed. I think this is due to the design of the compositor on XP (e.g. it shouldn't happen if you disable menu fade-in and shadow effects). As far as I can tell, Vista and higher don't suffer from this issue anymore - at least not when Aero is enabled.



              • #47
                Originally posted by tormod View Post
                From what I understand (I have edited a previous post of mine about this) you still need to use libdrm from Alex' repo. But you can _build_ your mesa with libdrm from git master.
                You need the kernel modules from my repo, but libdrm can be from my repo or drm git master.



                • #48
                  Also, if you are running a compositor, please turn it off when testing.



                  • #49
                    Originally posted by BlackStar View Post
                    More like microseconds, actually. 1080p is ~7.9MB per frame, which works out to something between 80μs (ultra high-end GPUs with GDDR5 memory) and 4ms (ultra low-end Intel IGPs with single-channel DDR2 shared memory).
                    It's more than 80μs, more like 200μs or 0.2ms on a board with 256-bit GDDR5 memory. Remember that the data has to be read and written back (so 2 accesses), and that the GPU is alternating read and write bursts. The bandwidth number you see in reviews is the theoretical maximum for infinitely long bursts.

                    I was thinking of "high end" as starting somewhere around 50-60GB/s peak bandwidth, say 3850/4850 and up, where the copy time would be 0.4-0.5 ms. I agree that if you go right up to 4870/4890 you can probably cut that in half again.
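Redoing the earlier estimate with both transfers counted and a burst-efficiency factor reproduces these numbers (the 60% efficiency and the peak bandwidth figures are illustrative assumptions, not specs):

```python
# Refined estimate: a copy is one read plus one write, and alternating
# read/write bursts never reach the theoretical peak bandwidth.

frame_bytes = 1920 * 1080 * 4   # ~7.9 MiB per 1080p frame
traffic = 2 * frame_bytes       # copy = read the back buffer + write the front

def copy_time_us(peak_bw_bytes_per_s, efficiency=0.6):
    """Copy time in microseconds at a given peak bandwidth.
    The efficiency factor (assumed 60%) models burst overhead."""
    return traffic / (peak_bw_bytes_per_s * efficiency) * 1e6

mid = copy_time_us(55e9)    # ~55 GB/s peak (3850/4850 class): ~0.5 ms
high = copy_time_us(115e9)  # ~115 GB/s peak (256-bit GDDR5): ~0.24 ms
```

With those assumptions the 50-60GB/s class lands around 0.5ms and 256-bit GDDR5 around 0.2-0.25ms, consistent with the figures quoted in the post.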
                    Last edited by bridgman; 16 July 2009, 12:20 PM.


                    • #50
                      I just retried the test with compositing disabled, but that didn't change anything.

                      EDIT: Switched libdrm repo to the one from Alex, but now I only get the software renderer recognized by glxinfo and glxgears (which works of course).
                      Last edited by LiquidAcid; 16 July 2009, 12:33 PM.

