Strange state of 3D perf for radeon....

  • Strange state of 3D perf for radeon....

    Hello,

    This morning, just to see, I launched two games on an Intel X3100 laptop and an ATI X1700 (256 MB) laptop, and the results surprised me a lot:

    Scorched3D: nearly the same performance
    Hedgewars: X3100: ~100 FPS, X1700: ~70 FPS

    The same graphical options were applied.

    For info:

    X3100 laptop:
    Core 2 Duo T5670 (1.8 GHz), 3 GB RAM
    Gentoo 64-bit: KDE 4.3 (KWin compositing active), xorg-server 1.6.3, kernel 2.6.31-rc6, and libdrm, mesa, xf86-video-intel from git (yesterday) - DRI2 + KMS

    X1700 laptop:
    Core 2 Duo T7200 (2 GHz), 2 GB RAM
    Gentoo 64-bit: KDE 4.3 (KWin compositing active), kernel 2.6.31-rc6, and xorg-server, libdrm, mesa, xf86-video-ati from git (yesterday) - DRI1 without KMS


    I suppose this is not common to all 3D games, but if someone has an explanation...
    Last edited by rem5; 18 August 2009, 10:59 AM.

  • #2
    The free Radeon driver only supports OpenGL 1.4 (Mesa master is now OpenGL 1.5 for Radeon r200(?)/r300/r400/r500).



    • #3
      Originally posted by Nille View Post
      The free Radeon driver only supports OpenGL 1.4 (Mesa master is now OpenGL 1.5 for Radeon r200(?)/r300/r400/r500).
      At the time of my test, it was 1.5. I know this can make a big difference, but from what I hear the major difference between OpenGL 1.5 and 2.0 is GLSL.

      So if the game doesn't use GLSL, this shouldn't be related, right?

      Even if that's the case, it's surprising to see an IGP more powerful than a discrete card (and one that was not low-end in its time).
      Last edited by rem5; 18 August 2009, 11:28 AM.
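
      A quick way to confirm what a driver actually exposes is to query it directly; glxinfo reports the same information. The little program below is only a sketch added for illustration (it is not from this thread), using the standard glGetString() calls; build it with gcc gl-check.c -lglut -lGL.

      /* Minimal sketch: print the OpenGL version and GLSL support the
       * current driver exposes.  Needs freeglut/Mesa headers installed. */
      #include <GL/glut.h>
      #include <stdio.h>

      #ifndef GL_SHADING_LANGUAGE_VERSION
      #define GL_SHADING_LANGUAGE_VERSION 0x8B8C
      #endif

      int main(int argc, char **argv)
      {
          const char *glsl;

          glutInit(&argc, argv);
          glutCreateWindow("gl-check");   /* a GL context must exist before glGetString() */

          printf("GL_RENDERER: %s\n", (const char *)glGetString(GL_RENDERER));
          printf("GL_VERSION : %s\n", (const char *)glGetString(GL_VERSION));
          /* GL_SHADING_LANGUAGE_VERSION is only valid on GL >= 2.0 drivers;
           * a 1.4/1.5 driver returns NULL for it. */
          glsl = (const char *)glGetString(GL_SHADING_LANGUAGE_VERSION);
          printf("GLSL       : %s\n", glsl ? glsl : "not supported");
          return 0;
      }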



      • #4
        There are 2 big optimizations that aren't implemented yet in the r300 3D driver: hyperz and texture tiling. Beyond that, it would probably be best to profile the application and see where the slow points are.
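
        Which profiler is useful depends on where the time goes, so a rough first check (a common trick, not something specific to this thread) is to time how long glFinish() blocks at the end of a frame. The sketch below is just an illustration; draw_scene() is a hypothetical stand-in for the game's own per-frame code. Link with -lGL -lrt.

        /* Rough CPU-vs-GPU bottleneck check, meant to be dropped into an
         * existing render loop. */
        #include <GL/gl.h>
        #include <stdio.h>
        #include <time.h>

        static double now_ms(void)
        {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
        }

        void timed_frame(void (*draw_scene)(void))
        {
            double t0, t1, t2;

            t0 = now_ms();
            draw_scene();     /* CPU side: game logic + GL command submission */
            t1 = now_ms();
            glFinish();       /* block until the GPU has finished the frame */
            t2 = now_ms();
            printf("cpu/submit %.2f ms, gpu wait %.2f ms\n", t1 - t0, t2 - t1);
        }

        If most of the frame time shows up as "gpu wait", the GPU (or the missing driver optimizations above) is the bottleneck; if it is near zero, a CPU profiler such as oprofile or sysprof is the right next step.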



        • #5
          Originally posted by agd5f View Post
          There are 2 big optimizations that aren't implemented yet in the r300 3D driver: hyperz and texture tiling. Beyond that, it would probably be best to profile the application and see where the slow points are.
          Which tool can I use to profile them? (oprofile??) And does that require any particular skills? I'm just a user with some very old programming knowledge.

          Are these two big optimizations in the works?

          And before someone suggests it: I have next to no knowledge of C/C++, and none of (graphics) driver programming...
          Last edited by rem5; 18 August 2009, 12:09 PM.



          • #6
            Originally posted by agd5f View Post
            There are 2 big optimizations that aren't implemented yet in the r300 3D driver: hyperz and texture tiling. Beyond that, it would probably be best to profile the application and see where the slow points are.
            Talking about optimizations, I wonder: do any of these options and codebases include ANY (current?) SIMD (SSE3/4, AltiVec, NEON, etc.) optimizations or glibc replacement code?


            "...
            I have proven http://freevec.org/content/libfreeve...hmarks_updated that glibc, the #1 libc used on Linux, is totally unoptimized even for common platforms (such as x86 and x86_64), and there are performance gains that could/should materialize if someone took the effort to do it....."


            "...
            Finally, with regard to glibc performance, even if we take into account that some common routines are optimised (like strlen(), memcpy(), memcmp() plus some more), most string functions are NOT optimised. Not only that, glibc only includes reference implementations that perform the operations one-byte-at-a-time! How's that for inefficient? We're not talking about dummy unused joke functions here like memfrob(), but really important string and memory functions that are used pretty much everywhere, like strcmp(), strncmp(), strncpy(), etc.


            In times where power consumption has become so much important, I would think that the first thing to do to save power is optimise the software, and what better place to start than the core parts of an operating system? I can't speak for the kernel -though I'm sure it's very optimised actually- but having looked at the glibc code extensively the past years, I can say that it's grossly unoptimised, so much it hurts, Markos"

            I assume at least someone has actually taken the time to profile all the current code bases you referred to, etc., and tried to add at least some of these massive SIMD speed improvements where they already can... if not, will you, and when?
            Last edited by popper; 18 August 2009, 09:45 PM.
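
            To make the point about one-byte-at-a-time string functions concrete, here is a minimal illustration (not code from glibc or freevec) of what a SIMD string routine looks like: an SSE2 strlen() that tests 16 bytes per iteration instead of one.

            /* Illustration only: SSE2 strlen scanning 16 bytes at a time.
             * Simplified -- it may read up to 15 bytes past the terminating
             * NUL; real implementations keep the loads 16-byte aligned so
             * such over-reads can never cross into an unmapped page. */
            #include <emmintrin.h>   /* SSE2 intrinsics */
            #include <stddef.h>

            size_t strlen_sse2(const char *s)
            {
                const __m128i zero = _mm_setzero_si128();
                size_t i = 0;

                for (;;) {
                    /* load 16 bytes and compare all of them against 0 at once */
                    __m128i chunk = _mm_loadu_si128((const __m128i *)(s + i));
                    int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
                    if (mask)                           /* some byte was NUL */
                        return i + __builtin_ctz(mask); /* index of the first NUL */
                    i += 16;
                }
            }

            On x86_64, SSE2 is part of the baseline ISA, so no extra compiler flags are needed; on 32-bit x86, build with gcc -msse2.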



            • #7
              I believe the common code in Mesa makes use of vector instructions, but I doubt the hardware drivers do. Right now 3D performance is primarily limited by how efficiently the GPU is used; it's pretty rare to be CPU limited (which is where vector CPU instructions would help).



              • #8
                Thanks. I was under the impression that, in fact, GPUs mostly sit around waiting for the CPU, and by extension its SIMD units, to actually give the GPU something to do.

                Not the other way around, where the CPU is waiting on the GPU to return something, as you would in the likes of GPU/UVD decoded-frame editing and the current AMD buzzword 'OpenCL', etc.

                Either way, new SIMD/vector code in both the CPU and GPU camps seems like a very good thing to consider as you write new code and extend and refactor the old, both for free speed increases and for the larger power savings over the long term that Markos claims above...

                thanks again...
                Last edited by popper; 18 August 2009, 09:40 PM.



                • #9
                  Depends on whether you are talking about 3D (as we are here) or video. In the case of video, it's very common for the decoder to be CPU limited unless the entire decode task is dumped onto the GPU, and in those cases vector instructions on the CPU can be a big help -- but AFAIK they are already heavily used in the decoder.

                  The drivers do render acceleration (Xv), but all the work there is done on the GPU, so vector CPU instructions don't really make a difference for the drivers.

                  If we get to the point where drivers take on the entire decode task, handing portions off to the GPU shaders and doing the rest in the driver, only then would vector CPU instructions make a difference. Personally, that seems like re-inventing the wheel to me... I would rather take an existing, well-understood (and already vectorized) decode library and add GPU hooks just for the tasks where GPUs can be effective.

                  GPUs only know how to do vector processing (SIMD on most vendors' hardware, SIMD+superscalar on ATI hardware), so there's really nothing to vectorize there.
                  Last edited by bridgman; 18 August 2009, 09:42 PM.

