r600/r700 libdrm, mesa, and radeon performance patches


  • #11
    Airlied has also said that the same idea with a slightly different implementation (a macro accepting a variable # of arguments) could get supported and included.

    A lot of good ideas don't get accepted the first time but evolve into something that does...
    Test signature



    • #12
      Originally posted by RealNC View Post
      What's that?
      A bashrcng plugin (like bashrcng-shmfs, which allows you to compile ebuilds in RAM instead of on the hard disk). It lets you apply a patch to an ebuild without having to use a local overlay or modify the ebuild.

      You just have to put the patch into $BASEDIR/x11-drivers/xf86-video-ati/[$PN_*.{patch,diff},$P_*.{patch,diff},$PF_*.{patch,diff}] and the patch(es) will be applied to a specific version of xf86-video-ati or to every version (depending on whether you use $PN, $P or $PF).

      Very simple, very quick. You can find it in the gechi overlay (you will have to fetch it with layman -f -o http://gechi-overlay.sf.net/layman.xml -a gechi)

      Also, you will have to activate bashrcng and bashrcng-patching with eselect bashrcng[-patching] ...
      ## VGA ##
      AMD: X1950XTX, HD3870, HD5870
      Intel: GMA45, HD3000 (Core i5 2500K)



      • #13
        Originally posted by that guy View Post
        Cool thread. I'm quite satisfied with my system's performance at the moment but I'm sure it'll please a few hardcore benchmarkers.


        Actually on Gentoo you should be able to simply add the patches to the corresponding ebuilds, which would make it easier and cleaner to apply.

        Out of curiosity, what kind of improvement could one expect in average desktop use? I assume it wouldn't speed up desktop effects as much as pure 3D stuff, right?
        Some libraries like gtk and glitz are leveraging 3d acceleration functions to accelerate 2d.

        The patches to the radeon driver will speed up 2D regardless of the effect of the 3D drivers. The most noticeable improvement I can see is in gnome-terminal while compiling packages like ncurses and gettext: the text scrolls by really fast. I also notice a little less tearing when moving windows around.

        In general these improvements will benefit those with slower CPUs or those with faster GPUs the most.



        • #14
          Originally posted by darkbasic View Post
          A bashrcng plugin (like bashrcng-shmfs, which allows you to compile ebuilds in RAM instead of on the hard disk).
          Hmm, I have tmpfs as /var/tmp/portage for that.


          It lets you apply a patch to an ebuild without having to use a local overlay or modify the ebuild.
          I do that already with Portage, why would someone need an extra tool? :P I simply put patches and override-files in /etc/portage/env and portage applies them without overlays or modified ebuilds.



          • #15
            Originally posted by RealNC View Post
            Hmm, I have tmpfs as /var/tmp/portage for that.
            But it doesn't use swap if necessary. Also, bashrcng-shmfs gives you a quick and easy way to create exceptions for heavy packages like openoffice (packages.nomem).

            I simply put patches and override-files in /etc/portage/env and portage applies them without overlays or modified ebuilds.
            Didn't know about it; I use the env directory only for per-package CFLAGS.
            ## VGA ##
            AMD: X1950XTX, HD3870, HD5870
            Intel: GMA45, HD3000 (Core i5 2500K)



            • #16
              Originally posted by darkbasic View Post
              But it doesn't use swap if necessary.
              tmpfs uses swap. It's not a ramdisk, it's an in-memory FS. It consumes only as much RAM as there are files in it, and swaps out if it fills up.

              Also, bashrcng-shmfs gives you a quick and easy way to create exceptions for heavy packages like openoffice (packages.nomem).
              This is taken care of by tmpfs swapping out.


              Didn't know about it; I use the env directory only for per-package CFLAGS.
              Just to give you the idea, this is how I apply a cleartype patch to all versions of libXft. I create this file:

              /etc/portage/env/x11-libs/libXft

              with the following contents:

              Code:
              post_src_unpack() {
                      epatch "/etc/portage/env/x11-libs/libXft-2.1.14-lcd-cleartype.diff" || die "failed to apply cleartype patch"
              }
              And place the patch in the same directory. Now portage will apply the patch. No modified ebuilds, no overlays.

              The good thing about this is that you can do pretty much whatever you want; it's like adding code to the ebuild but without even modifying it.



              • #17
                Bug fix!!

                Special thanks to Edwin Torok for finding the bug.

                use this file instead of radeon_March_17_2010.patch



                or

                at paste bin



                • #18
                  So what are the chances of this patch getting modified to be *acceptable*?



                  • #19
                    Originally posted by bridgman View Post
                    Airlied has also said that the same idea with a slightly different implementation (a macro accepting a variable # of arguments) could get supported and included.
                    I was going to write a long post about how macros accepting a variable number of arguments are impossible in C for this purpose (both with macros and with inline functions, even with gcc extensions to variadic functions/macros).

                    Then I was going to write a long post about how it's actually possible, with a whole bunch of tricks.

                    But those tricks require the compiler to do a good job: it MUST inline a function marked as inline and it MUST unroll a loop. Otherwise, it'll probably end up being slower.
                    Problem is, from a quick glance at the output of gcc -O2 -S, it looks like gcc isn't smart enough. Then again, I'm not fluent in asm and there may be optimizations in later compiler stages.

                    So before this is either adopted or dismissed, there have to be further tests. But as it's bedtime for me, I'll post what I have.


                    Example #1, manually calling the correct function:
                    Code:
                    #include <stdio.h>
                    
                    int array[100];
                    int pos = 0;
                    
                    /* static: a plain C99 "inline" definition may fail to link when not inlined */
                    static inline void doX1(int a)
                    {
                    	array[pos] = a;
                    	pos+=1;
                    	printf("X1 done\n");
                    }
                    
                    static inline void doX2(int a, int b)
                    {
                    	array[pos] = a;
                    	array[pos+1] = b;
                    	pos += 2;
                    	printf("X2 done\n");
                    }
                    
                    static inline void doX3(int a, int b, int c)
                    {
                    	array[pos] = a;
                    	array[pos+1] = b;
                    	array[pos+2] = c;
                    	pos += 3;
                    	printf("X3 done\n");
                    }
                    
                    int main()
                    {
                    	doX1(1);
                    	doX2(2,3);
                    	doX3(4,5,6);
                    
                    	int i;
                    	for (i=0;i<pos;i++)
                    		printf("array[%d] = %d\n", i, array[i]);
                    
                    	return 0;
                    }
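One way to put the three functions from Example #1 behind a single name, without va_list or a loop, is the common argument-counting preprocessor trick. This is only a sketch and not something proposed in the thread: the helper macros DOX_NARGS_, DOX_NARGS, DOX_CAT_ and DOX_CAT and the driver run_example are hypothetical names, it handles one to three arguments only, and the printf calls inside doX1..doX3 are left out for brevity.

```c
#include <assert.h>
#include <stdio.h>

int array[100];
int pos = 0;

static inline void doX1(int a)
{
	array[pos] = a;
	pos += 1;
}

static inline void doX2(int a, int b)
{
	array[pos] = a;
	array[pos + 1] = b;
	pos += 2;
}

static inline void doX3(int a, int b, int c)
{
	array[pos] = a;
	array[pos + 1] = b;
	array[pos + 2] = c;
	pos += 3;
}

/* Hypothetical helpers: DOX_NARGS(...) expands to 1, 2 or 3,
 * depending on how many arguments were passed (1 to 3 supported). */
#define DOX_NARGS_(_1, _2, _3, N, ...) N
#define DOX_NARGS(...) DOX_NARGS_(__VA_ARGS__, 3, 2, 1, 0)

/* Two-level paste so DOX_NARGS(...) is expanded before ## is applied. */
#define DOX_CAT_(a, b) a##b
#define DOX_CAT(a, b) DOX_CAT_(a, b)

/* doX(4, 5, 6) becomes doX3(4, 5, 6) during preprocessing. */
#define doX(...) DOX_CAT(doX, DOX_NARGS(__VA_ARGS__))(__VA_ARGS__)

/* Small driver (hypothetical); call it from main() to try the macro. */
void run_example(void)
{
	doX(1);
	doX(2, 3);
	doX(4, 5, 6);
	assert(pos == 6); /* six values were appended in total */

	for (int i = 0; i < pos; i++)
		printf("array[%d] = %d\n", i, array[i]);
}
```

Because the dispatch happens in the preprocessor, the compiler only ever sees an ordinary call to doX1, doX2 or doX3, so nothing depends on loop unrolling. Whether this is any faster in the real driver has not been measured here.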
                    Example #2, with macro/inline function:
                    Code:
                    #include <stdio.h>
                    #include <stdarg.h>
                    
                    int array[100];
                    int pos = 0;
                    
                    
                    /* number of elements in array a */
                    #define array_size(a) (sizeof(a) / sizeof((a)[0]))
                    
                    #define doX(...) doX_with_count(array_size( ((int []) { __VA_ARGS__ }) ),  __VA_ARGS__ )
                    
                    static inline void doX_with_count(int count, ...)
                    {
                    	va_list ap;
                    	int i;
                    	va_start(ap, count);
                    	for(i=0; i<count; i++)
                    		array[pos+i] = va_arg(ap, int);
                    	pos += count;
                    	printf("doX%d\n", i);
                    	va_end(ap);
                    }
                    
                    int main()
                    {
                    	doX(1);
                    	doX(2,3);
                    	doX(4,5,6);
                    
                    	int i;
                    	for (i=0;i<pos;i++)
                    		printf("array[%d] = %d\n", i, array[i]);
                    
                    	return 0;
                    }
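A variation on the macro/inline example above that may be friendlier to the optimizer: GCC's documentation lists variadic functions among those it will not inline, so this sketch hands the arguments over as a compound-literal array plus a count instead of going through va_list. The names doX_from_array and run_example are hypothetical, and whether the loop really gets unrolled still depends on the compiler version and flags; it has not been checked here.

```c
#include <assert.h>
#include <stdio.h>

int array[100];
int pos = 0;

#define array_size(a) (sizeof(a) / sizeof((a)[0]))

/* Non-variadic helper (hypothetical name): copies count ints into array.
 * Being an ordinary function, it is at least eligible for inlining. */
static inline void doX_from_array(const int *vals, int count)
{
	for (int i = 0; i < count; i++)
		array[pos + i] = vals[i];
	pos += count;
}

/* The arguments become a temporary int array; its element count is a
 * compile-time constant. The copy inside sizeof is never evaluated. */
#define doX(...) \
	doX_from_array((const int[]){ __VA_ARGS__ }, \
	               (int)array_size(((const int[]){ __VA_ARGS__ })))

/* Small driver (hypothetical); call it from main() to try the macro. */
void run_example(void)
{
	doX(1);
	doX(2, 3);
	doX(4, 5, 6);
	assert(pos == 6); /* six values were appended in total */

	for (int i = 0; i < pos; i++)
		printf("array[%d] = %d\n", i, array[i]);
}
```

Since count is a constant at every call site, a compiler that inlines doX_from_array has what it needs to unroll the copy, but comparing the -O2 -S output against Example #1 would be the way to confirm it.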



                    • #20
                      I should probably mention that I may have oversimplified airlied's response
                      Test signature
