@bridgeman the push for increased GL version support probably should have been expected, considering that many open source developers have an "implement first, make it fast later" mentality, and that implementing is easier than optimizing. I think it might also be important for getting the driver up to feature parity so that driver development can occur in sync with hardware design.
E-450 graphics performance issues
-
Yep, it's a sad fact that some 2d operations on the 3d engine suck. Add to that the likely better algorithms in the blobs.
When R600 and GF8 were released, this was publicly known, and anyone who needed good 2d just bought the previous gen (and this was on Windows too!). It's my understanding that even the latest gen loses in 2d to cards with dedicated 2d hardware, such as R500, GF7, or old cards such as the Matrox ones. (Yep, tab switching on a Matrox is still faster than on a recent AMD, as is software rendering (vesa).)
Are you using a compositor anyhow? I tried to replicate your enter-pressing test, but all it did was bring X from 1% of one core to 3% of one core. But I'm not using a compositor, nor a bloated terminal such as GNOME Terminal (mrxvt if you're curious, with antialiased fonts etc).
-
Originally posted by curaga:
Yep, it's a sad fact that some 2d operations on the 3d engine suck. Add to that the likely better algorithms in the blobs.
When R600 and GF8 were released, this was publicly known and anyone who needed good 2d just bought the previous gen (and this was on Windows too!). It's my understanding that even the latest gen loses in 2d to the cards with dedicated 2d, such as R500, GF7, or the old cards such as Matrox ones. (yep, tab switching on a Matrox is still faster than on a recent AMD, as is software (vesa) )
Are you using a compositor anyhow? I tried to replicate your enter-pressing test, but all it did was bring X from 1% of one core to 3% of one core. But I'm not using a compositor, nor a bloated terminal such as Gnome terminal (mrxvt if you're curious, with antialiased fonts etc).
-
I've now done a fair share of profiling and tracing. Fallbacks aren't the main issue holding back 2D performance. Everything important is accelerated, and generally there's no migration ping-pong, etc. It's just that the acceleration itself is very slow, and broken power management, which forces the GPU clock into the lowest and slowest power state, doesn't help.
I'm not quite sure why rendering is so slow, as I'm not very familiar with the R600+ architecture. Part of it seems to be synchronous CS flushing, at least.
-
Yes, overlapped Copy operations are slow, and sometimes unnecessarily so*. However, it's also slow without many flushes, for instance when doing a lot of small Composite operations. A good example of this case is text rendering in gnome-terminal. I've experimented with increasing the size of the VBO, and that seems to help a little, but not much. Generally, I use cairo-perf-trace to benchmark.
* When doing a copy inside a single pixmap, the DDX does a two-stage copy with two flushes even if the areas don't overlap. In this case no copy to the temporary is needed and one flush is enough. I've fixed that in my tree and it's noticeably faster in some cases, e.g. scrolling in gedit.
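To illustrate the footnote above, here is a rough Python sketch of the decision being described: only go through a temporary (two copies, two flushes) when the source and destination rectangles actually overlap. This is not the DDX code, which is written in C; `Rect`, `rects_overlap`, `plan_self_copy` and the step strings are made-up names for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


def rects_overlap(a: Rect, b: Rect) -> bool:
    """Return True if the two rectangles share at least one pixel."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def plan_self_copy(src: Rect, dx: int, dy: int) -> List[str]:
    """List the (hypothetical) steps for copying src by (dx, dy) inside one pixmap."""
    dst = Rect(src.x + dx, src.y + dy, src.w, src.h)
    if rects_overlap(src, dst):
        # Overlapping areas: stage through a temporary so the source is not
        # overwritten before it has been read. Two copies, two flushes.
        return ["copy src -> tmp", "flush", "copy tmp -> dst", "flush"]
    # Disjoint areas: a direct copy is safe and needs only one flush.
    return ["copy src -> dst", "flush"]


# Scrolling a 100x100 area by 10 pixels overlaps itself; moving it by its
# full height does not.
print(plan_self_copy(Rect(0, 0, 100, 100), 0, 10))
print(plan_self_copy(Rect(0, 0, 100, 100), 0, 100))
```

Note that a small scroll step still overlaps and keeps the two-stage path in this model; only copies between disjoint areas of the pixmap take the single-flush path.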
-
Originally posted by brent:
* The DDX does a two-stage copy blit with two flushes in a single pixmap even if the areas don't overlap. In this case no copy to the temporary is needed and one flush is enough. I've fixed that in my tree and it's noticeably faster in some cases, e.g. scrolling in gedit.