r600/r700 libdrm, mesa, and radeon performance patches
Originally posted by AzuMao
Ideally, yes, and all 3D engines should be written in OpenGL.
Sadly, we don't live in a perfect, ideal world.
-
I think that if Obscene_CNN's patches (great work, btw!) improve performance, then the upstream code is missing something. To the hardware, code niceness means nothing: if code runs faster, it is the better code for the hardware. For humans, it is the opposite. I suggest enabling some kind of auto-patching upstream, where Obscene_CNN's patches are applied as end-point optimizations, each with an explanation. That way they are not dissolved into the code base, but they are still used. In the end, either gcc gets better or Obscene_CNN's work gets incorporated into the drivers. I hope upstream understands that the performance of driver code is worth at least as much as its beauty.
-
Updated libdrm and mesa patches with inline assembly memcpy goodness for x86-64.
As usual, you must build and install libdrm before installing mesa.
libdrm patch
mesa patch
Also note that I'm going to start working on getting some small portions of my mesa patch merged upstream.
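The patches themselves are not reproduced in this thread, so the following is only a minimal sketch of what an inline-assembly copy on x86-64 can look like, not the code from the libdrm or mesa patch. The name copy_bytes is hypothetical. It assumes GCC or Clang extended asm and uses the `rep movsb` string instruction, falling back to the standard memcpy on other compilers or architectures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patches discussed here.
 * Copies n bytes from src to dst and returns dst, like memcpy. */
static inline void *copy_bytes(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    void *d = dst;
    const void *s = src;

    /* rep movsb copies RCX bytes from [RSI] to [RDI].
     * The x86-64 System V ABI guarantees the direction flag is clear
     * on function entry, so the copy runs forward. All three registers
     * are modified by the instruction, hence the "+" constraints. */
    __asm__ volatile ("rep movsb"
                      : "+D"(d), "+S"(s), "+c"(n)
                      :
                      : "memory");
    return dst;
#else
    /* Portable fallback for other targets and compilers. */
    return memcpy(dst, src, n);
#endif
}
```

Whether something like this beats the C library's memcpy depends on the CPU and on copy sizes, so it would need to be measured on the target hardware.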
-
Originally posted by Obscene_CNN
I haven't tested each likely/unlikely by itself, if that is what you mean. Most were tested as part of each section of code modification. A few were tested by themselves and did show improvement.
just measure their impact on their own. (I played a bit with "unlikely" in a few places in MIPS kernel code. GCC 4.4.3 didn't lay out the code any differently.)
I don't doubt that the other changes (batching writes) improve performance; those changes do make sense.
Originally posted by Obscene_CNN
If you think polluting the instruction cache with null-check handling code and code for handling corner cases will make things better, feel free to take them out.
predictor and a license to prefetch code to minimize pipeline stalls...
-
Originally posted by mlau
Have you actually measured it or are you just talking hypothetically?
If you think polluting the instruction cache with null-check handling code and code for handling corner cases will make things better, feel free to take them out.
I forgot to mention that, using Torcs as a benchmark, my worst-case frame rate entering the straightaway before the final turn on the Forza track was 2.7 FPS without my patch. With my patch it jumped to 3.3 FPS.
-
Originally posted by Obscene_CNN
The likely and unlikely hints do give a very, very minor speed increase. The improvement is not due to branch prediction. It is due to code location. Branches tagged as unlikely are moved out of the primary instruction flow. This reduces the instruction cache fill cycles per function and prevents the unlikely code branches from displacing useful instructions from the cache unless they are taken.
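The code-placement effect described in that quote can be sketched as follows, assuming GCC or Clang, where likely/unlikely are conventionally defined on top of __builtin_expect. The function sum_dwords is hypothetical and only illustrates the pattern of tagging a null check as the rare case; it is not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
/* Other compilers: the hints compile away to the plain condition. */
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

/* Hypothetical hot-path helper: sums n dwords into *out.
 * Returns 0 on success, -1 if a pointer argument is NULL. */
static int sum_dwords(const uint32_t *buf, size_t n, uint32_t *out)
{
    /* Rare case: the hint lets the compiler place this block away
     * from the straight-line path through the function. */
    if (unlikely(buf == NULL || out == NULL))
        return -1;

    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];

    *out = total;
    return 0;
}
```

The hint only changes what the compiler is allowed to assume, not what the function computes. Whether the cold block actually moves depends on compiler version, target, and optimization level, which fits the MIPS observation with GCC 4.4.3 elsewhere in this thread. Comparing the disassembly (for example with objdump -d) with and without the hints is one way to check.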