
ATI R600/700 OSS 3D Driver Reaches Gears Milestone


  • Originally posted by Xheyther View Post
    Thanks a lot for your answers, you two. If I can continue asking questions: I wonder why memcpy is that slow, while copying from the back buffer to the front buffer with the GPU is that fast...

    I guess this is due to kernel protection mechanisms and the difficulty of accessing GPU memory from the CPU...
    The GPU is a specialized processor in that regard: it has its own instruction set, different from the CPU's. While the CPU has to copy one word at a time (32- or 64-bit words) to move an entire memory area (e.g. 1280x1024 pixels at 32 bits per pixel, which is 41943040 bits, or about 5 MB), the GPU has specialized instructions for this, copying (Bridgman: blitting?) bigger blocks each clock cycle.

    Being a separate processor, the GPU can do this in parallel with the CPU, leaving the CPU free to actually execute your programs and respond quickly to your mouse movements in Quake 3.
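    For scale, the arithmetic above can be checked quickly. Note that 1280x1024x32 gives the frame size in bits; dividing by 8 gives bytes, and dividing again by the word size gives how many single-word copies a CPU would need per frame (a back-of-the-envelope sketch, not real driver code):

```python
# Rough sketch: how much data a CPU-side copy of one frame moves.
WIDTH, HEIGHT = 1280, 1024        # screen resolution from the post
BITS_PER_PIXEL = 32               # 32-bit color

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
bytes_per_frame = bits_per_frame // 8

# A 64-bit CPU copying one word per load/store pair needs this many copies:
words_64bit = bytes_per_frame // 8

print(bits_per_frame)    # 41943040 bits per frame
print(bytes_per_frame)   # 5242880 bytes (5 MiB) per frame
print(words_64bit)       # 655360 word copies per frame
```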


    • Yep. The CPU also has to go over the PCIe bus to access video memory, while for the GPU the video memory is directly connected.

      If most of the graphics operations were being performed by the CPU, then we could make CPU operations go faster by mapping the video memory into the CPU address space differently (with CPU caching enabled), which would speed things up considerably -- but even then the CPU would still be a lot slower than the slowest modern GPU we make.

      These days we use the 3D engine for everything, so copying a buffer actually involves:

      - setting up a texture unit to point to the "back" buffer
      - setting up a render target to point to the "front" buffer
      - setting up a "pass-through" vertex shader and a "read a texel, write a pixel" fragment shader
      - disabling depth processing etc.
      - drawing a quad (aka rectangle) or two triangles on the 3D engine
      - flushing everything out to memory
      - setting everything up for the first drawing operation in the back buffer, usually a clear
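      The sequence above can be sketched in plain Python (no GPU involved): the "fragment shader" reads a texel from the back buffer, and rasterizing a full-screen quad runs it once per pixel of the render target. All names here are illustrative, not real driver code:

```python
def make_buffer(width, height, fill=0):
    """A 'buffer' is just a 2D array of pixel values."""
    return [[fill] * width for _ in range(height)]

def copy_shader(texture, x, y):
    # "read a texel" -- a real fragment shader would sample the texture unit
    return texture[y][x]

def draw_fullscreen_quad(render_target, texture, shader):
    # Drawing a quad covering the render target runs the shader per pixel.
    for y in range(len(render_target)):
        for x in range(len(render_target[0])):
            render_target[y][x] = shader(texture, x, y)  # "write a pixel"

back = make_buffer(4, 2)
back[1][3] = 0xFF00FF            # pretend something was rendered here
front = make_buffer(4, 2)

# texture unit -> back buffer, render target -> front buffer, then draw:
draw_fullscreen_quad(front, back, copy_shader)
print(front == back)             # True: the front buffer now mirrors the back
```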

      There used to be dedicated blitters on the chip, but these days the 3D engine outruns everything so we have been dropping the hardware from newer GPUs.

      By the way, the above sequence is also at the heart of GPGPU processing, except a texture unit points to each array of input data, the render target(s) point to the output arrays, and the fragment shader program does the actual computation for each element of the output array (aka pixel).
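      The GPGPU analogy can be sketched the same way: the input arrays play the role of textures, the output array is the render target, and the per-element function stands in for the fragment shader. A minimal illustration (sequential here; the GPU would run the iterations in parallel), with all names hypothetical:

```python
def add_shader(textures, i):
    # The "fragment shader": compute one output element from the inputs.
    a, b = textures
    return a[i] + b[i]

def run_gpgpu_pass(textures, length, shader):
    render_target = [0] * length      # the output array
    for i in range(length):           # a GPU runs these in parallel
        render_target[i] = shader(textures, i)
    return render_target

a = [1, 2, 3, 4]                      # "texture" 1
b = [10, 20, 30, 40]                  # "texture" 2
print(run_gpgpu_pass((a, b), len(a), add_shader))  # [11, 22, 33, 44]
```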
      Last edited by bridgman; 08-03-2009, 05:14 AM.


      • Thanks Bridgman, that's enlightening.


        • Originally posted by Kano View Post
          I have a script that can update Nvidia drivers just fine. No big deal to do so and current drivers work for me. I could go back too with the script. What distro do you use?
          It's a little bit OT, but the script won't help if the driver is broken and doesn't show anything on the screen. I last had this with the 185 driver from NVIDIA on some Quadro cards. The install itself works fine, but that's useless if the driver won't work.


          • Thanks Loris & Bridgman.


            • @PuckPoltergeist

              When you use Kanotix you can boot with


              as an extra boot option; then you land in text mode and can run the script with an option like -v3 (for the 173.xx drivers) or -vs (for the special drivers) if the default does not work. Absolutely no problem finding a working driver even if one crashes.


              • Originally posted by Kano View Post
                Absolutely no problem to find a working driver even if a driver crashes.
                Hell, nVidia is great! SCNR


                • So how long will it take to change this 'memcpy'? Is this a simple task or will this take some time?


                  • 1500 fps with RadeonHD 3200

                    I am getting 1500 fps with the latest git radeonhd driver and mesa and 2.6.31 kernel on RadeonHD 3200 which is RV610.

                    Colors and all and no flickering. Yay

                    I tried to run Penumbra but got a shader compilation error.

                    Gotta love gentoo

                    Keep up the good work! I bought a standalone ATI video card and then 780G motherboard specifically because I got tired of binary blobs. And I will support ATI in the future.



                    • Originally posted by Boerkel View Post
                      So how long will it take to change this 'memcpy'? Is this a simple task or will this take some time?
                      It's not trivial, if that's what you mean. It's entirely different from what was done on previous chips, though it's also similar to what's done in EXA for the same chips. Then again, the implementation might be non-trivial, since it's not obviously a good idea to put the whole state setup for the 3D engine in the kernel. Sounds to me like a good old case of having a developer sit down (for quite some time) and think about the best approach.