r600/r700 libdrm, mesa, and radeon performance patches


  • #51
    Originally posted by Obscene_CNN
    Actually there is some code in this patch that probably would be accepted. However, I need to break it out into separate patches. Unfortunately these parts won't give the speed boost that the core of my patch is based on.
    Something is better than nothing.



    • #52
      Also noteworthy: r300 and r500 chips will get a minor speed boost too.



      • #53
        Did you test the performance implications of the hunks adding
        "unlikely"/"likely" on their own? I highly doubt
        that they are of any use (except maybe on the MIPS R10K): modern x86
        CPUs are very smart about branch prediction, and IMO the annotations
        make the code harder to read.



        • #54
          The likely and unlikely annotations do give a very minor speed increase. The improvement is not due to branch prediction. It is due to code location. Branches tagged as unlikely are moved out of the primary instruction flow. This reduces the instruction cache fill cycles per function and prevents the unlikely code branches from displacing useful instructions from the cache unless they are taken.
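
          For readers unfamiliar with these annotations, here is a minimal sketch of how likely/unlikely are commonly defined on top of GCC's __builtin_expect. The helper name sum_ints is made up for illustration and is not taken from the patches; whether the compiler actually moves the error path out of line depends on the compiler version and optimization level.

```c
#include <stddef.h>

/* Hint macros in the style commonly used with GCC and Clang.
 * Other compilers get a no-op fallback. */
#if defined(__GNUC__) || defined(__clang__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

/* Hypothetical helper: sums n ints into *out.
 * Returns 0 on success, -1 on bad arguments. */
int sum_ints(const int *values, size_t n, long *out)
{
    if (unlikely(values == NULL || out == NULL)) {
        /* Rare error path: the compiler is free to place this
         * away from the hot loop below. */
        return -1;
    }

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += values[i];

    *out = total;
    return 0;
}
```

          The hint does not change what the function returns; it only tells the compiler which side of the branch to treat as the common case when laying out code.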



          • #55
            Originally posted by Obscene_CNN
            The likely and unlikely annotations do give a very minor speed increase. The improvement is not due to branch prediction. It is due to code location. Branches tagged as unlikely are moved out of the primary instruction flow. This reduces the instruction cache fill cycles per function and prevents the unlikely code branches from displacing useful instructions from the cache unless they are taken.
            Have you actually measured it or are you just talking hypothetically?



            • #56
              Originally posted by mlau
              Have you actually measured it or are you just talking hypothetically?
              I haven't tested each likely/unlikely by itself, if that is what you mean. Most were tested as part of each section of code modification. A few were tested by themselves and did show improvement.

              If you think polluting the instruction cache with null check handling code and code for handling corner cases will make things better, feel free to take them out.

              I forgot to mention that while using Torcs as a benchmark, my worst-case frame rate entering the straightaway before the final turn on the Forza track was 2.7 FPS without my patch. With my patch it jumped to 3.3 FPS.
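
              To put those two figures in perspective, the relative gain and the per-frame times work out roughly as follows (computed only from the 2.7 and 3.3 FPS numbers quoted above, nothing else was measured):

```latex
\frac{3.3 - 2.7}{2.7} \approx 0.22, \qquad
\frac{1000\ \text{ms}}{2.7} \approx 370\ \text{ms per frame}, \qquad
\frac{1000\ \text{ms}}{3.3} \approx 303\ \text{ms per frame}
```

              That is about a 22% higher worst-case frame rate, or about 67 ms less per frame.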



              • #57
                Originally posted by Obscene_CNN
                I haven't tested each likely/unlikely by itself, if that is what you mean. Most were tested as part of each section of code modification. A few were tested by themselves and did show improvement.
                I meant move only the "likely/unlikely" changes to a separate patch and
                just measure their impact on their own. (I played a bit with "unlikely"
                in a few places in MIPS kernel code. GCC 4.4.3 didn't lay out code
                any differently).

                I don't doubt that the other changes (batching writes) improve performance;
                these changes do make sense.

                Originally posted by Obscene_CNN
                If you think polluting the instruction cache with null check handling code and code for handling corner cases will make things better, feel free to take them out.
                The checks have to be made either way; adding "unlikely" won't change that. And x86 CPUs have huge caches these days, with a very smart branch
                predictor and a license to prefetch code to minimize pipeline stalls...



                • #58
                  Updated libdrm and mesa patches with inline assembly memcpy goodness for x86-64.

                  As usual you must build and install libdrm first before installing mesa.

                  libdrm patch



                  mesa patch


                  Also note that I'm going to start working on getting some small portions of my mesa patch merged upstream.
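
                  For anyone wondering what an inline assembly memcpy on x86-64 can look like, here is a minimal sketch using GCC extended asm and the rep movsb instruction. This is an assumption for illustration only: the function name copy_bytes is made up, and the actual patches may use a completely different instruction sequence. On other architectures or compilers the sketch falls back to the standard memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the libdrm or mesa patches.
 * Copies n bytes from src to dst and returns the original dst. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    void *ret = dst;

    /* rep movsb copies RCX bytes from [RSI] to [RDI]. The x86-64 System V
     * ABI guarantees the direction flag is clear on function entry, so the
     * copy runs forward. All three registers are modified by the
     * instruction, hence the "+" read-write constraints. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
    return ret;
#else
    /* Portable fallback for other targets or compilers. */
    return memcpy(dst, src, n);
#endif
}
```

                  Whether something like this beats the C library's memcpy depends heavily on the CPU and the copy size, so it should be benchmarked rather than assumed.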



                  • #59
                    Thanks for the work.



                    • #60
                      I think that if Obscene_CNN's patches (great work btw!) result in improved performance, then the upstream code is missing something. For hardware, code niceness does not mean anything: if code runs faster, it is better for the hardware. For humans, it is the opposite. I suggest enabling some kind of autopatching upstream, where Obscene_CNN's patches are applied, with explanations, as end-point optimizations. That way they are not dissolved, but they are still used. In the end, either gcc has to become better or Obscene_CNN's work has to be incorporated into the drivers. I hope upstream understands that the performance of driver code is at least as valuable as its beauty.
