r600/r700 libdrm, mesa, and radeon performance patches


  • crazycheese
    replied
    Originally posted by darkbasic View Post
    Code performance should be a compiler matter...
    That does not invalidate my point, but reporting the issue to GCC is an option too.


  • crazycheese
    replied
    Originally posted by AzuMao View Post
    Ideally, yes, and all 3D engines should be written in OpenGL.


    Sadly, we don't live in a perfect, ideal world.
    That's not correct. Modern 3D engines have a render factory with support for multiple renderers, OpenGL included. The only reason that only a DirectX renderer gets implemented is the Microsoft lobby.


  • AzuMao
    replied
    Originally posted by darkbasic View Post
    Code performance should be a compiler matter...
    Ideally, yes, and all 3D engines should be written in OpenGL.


    Sadly, we don't live in a perfect, ideal world.


  • darkbasic
    replied
    Code performance should be a compiler matter...


  • crazycheese
    replied
    I think that if Obscene_CNN's patches (great work, btw!) result in improved performance, then the upstream code is missing something. To the hardware, code niceness means nothing: if the code runs faster, it is better code for the hardware. For humans, it is the opposite. I suggest enabling some kind of auto-patching upstream, where Obscene_CNN's patches are applied, with explanations, as end-point optimizations. That way they are not dissolved into the code base, but they are still used. In the end, either gcc has to get better or Obscene_CNN's work has to be incorporated into the drivers. I hope upstream understands that driver code performance is of at least equal value to code beauty.


  • Death Knight
    replied
    Thanks for the work.


  • Obscene_CNN
    replied
    Updated libdrm and mesa patches with inline assembly memcpy goodness for x86-64.

    As usual, you must build and install libdrm before installing mesa.

    libdrm patch



    mesa patch


    Also note that I'm going to start working on getting some small portions of my mesa patch merged upstream.


  • mlau
    replied
    Originally posted by Obscene_CNN View Post
    I haven't tested each likely/unlikely by itself, if that is what you mean. Most were tested as part of each section of code modification. A few were tested by themselves and did show improvement.
    I meant move only the "likely/unlikely" changes to a separate patch and
    just measure their impact on their own. (I played a bit with "unlikely"
    in a few places in MIPS kernel code. GCC 4.4.3 didn't lay out code
    any differently).

    I don't doubt that the other changes (batching writes) improve performance;
    those changes do make sense.

    Originally posted by Obscene_CNN View Post
    If you think polluting the instruction cache with null check handling code and code for handling corner cases will make things better, feel free to take them out.
    The checks have to be made either way; adding "unlikely" won't change that. And x86 CPUs have huge caches these days, with a very smart branch
    predictor and a license to prefetch code to minimize pipeline stalls...


  • Obscene_CNN
    replied
    Originally posted by mlau View Post
    Have you actually measured it or are you just talking hypothetically?
    I haven't tested each likely/unlikely by itself, if that is what you mean. Most were tested as part of each section of code modification. A few were tested by themselves and did show improvement.

    If you think polluting the instruction cache with null check handling code and code for handling corner cases will make things better, feel free to take them out.

    I forgot to mention that while using Torcs as a benchmark, my worst-case frame rate entering the straightaway before the final turn on the Forza track was 2.7 FPS without my patch. With my patch it jumped to 3.3 FPS.


  • mlau
    replied
    Originally posted by Obscene_CNN View Post
    The likely and unlikely hints do give a very, very minor speed increase. The improvement is not due to branch prediction. It is due to code location. Branches tagged as unlikely are moved out of the primary instruction flow. This reduces the instruction cache fill cycles per function and prevents the unlikely code branches from displacing useful instructions from the cache unless they are taken.
    Have you actually measured it or are you just talking hypothetically?
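
For readers unfamiliar with the hints being argued about, here is a minimal sketch of how likely/unlikely are commonly defined for GCC and how a null check might be tagged. The macro definitions follow the widely used Linux kernel convention and are an assumption about how the patches spell them; `emit_dwords` and its parameters are hypothetical and not taken from the libdrm or mesa patches. Whether the compiler actually moves the tagged branch out of the main instruction flow depends on the compiler version and optimization flags, so any benefit has to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* __builtin_expect is a GCC/Clang builtin; fall back to a no-op elsewhere. */
#if defined(__GNUC__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

/*
 * Hypothetical helper: copy `count` dwords into a buffer of `capacity` dwords.
 * The error checks are still executed on every call; the hint only tells the
 * compiler that the error paths are rare, so it may place them away from the
 * hot copy loop.
 */
int emit_dwords(uint32_t *dst, size_t capacity,
                const uint32_t *src, size_t count)
{
    if (unlikely(dst == NULL || src == NULL))
        return -1;              /* rare: bad pointer */
    if (unlikely(count > capacity))
        return -2;              /* rare: buffer too small */

    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];        /* hot path */
    return 0;
}
```

Comparing the generated assembly (for example with `gcc -O2 -S`) with and without the macros is one way to check whether the layout changes at all for a given compiler.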
