AMD R600g Performance Patches Yield Mixed Results


  • #21
    Originally posted by bug77 View Post
    The guy said "single static rectangular texture". Even if rendered at 1000fps, does this need any memory bandwidth past the initial rendering?
    Uh, yes. First the texture has to be read from texture memory into the GPU during rasterization. The GPU is also writing to the back buffer. Then, after the buffers are flipped, that buffer becomes the front buffer and has to be encoded and sent to your monitor every refresh. So at a very minimum you have thousands of writes and even more reads for just that one rectangle.

    I think you're getting memory bandwidth confused with DMA bandwidth on your PCI-E bus. Not the same thing.
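A back-of-the-envelope sketch of the point being made here. The display size, bit depth, refresh rate, and frame rate below are assumed numbers for illustration, not figures from the thread:

```python
# Rough memory-bandwidth estimate for rendering "a single static
# rectangular texture" fullscreen at 1000 fps.
# Assumptions: 1920x1080 display, 4 bytes/pixel, 60 Hz scanout.
width, height, bpp = 1920, 1080, 4
fps, refresh_hz = 1000, 60

frame_bytes = width * height * bpp        # one framebuffer, ~8.3 MB

texture_read = frame_bytes * fps          # texture is sampled every frame
color_write = frame_bytes * fps           # rasterizer writes the render target
scanout_read = frame_bytes * refresh_hz   # display controller reads the front buffer

total_gb_s = (texture_read + color_write + scanout_read) / 1e9
print(f"~{total_gb_s:.1f} GB/s")          # -> ~17.1 GB/s
```

Even this trivial workload touches memory on the order of tens of GB/s, which is why it can still be bound by VRAM bandwidth rather than PCI-E DMA bandwidth.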



    • #22
      Originally posted by bug77 View Post
      The guy said "single static rectangular texture". Even if rendered at 1000fps, does this need any memory bandwidth past the initial rendering?
      You need to read in the texture and apply it to the quad every frame otherwise you aren't rendering anything and fps would be pointless.



      • #23
        My test code is really basic but if anyone is interested I can post it. I posted an EGL version to the mesa list. Add it to the PTS perhaps? :-)



        • #24
          Originally posted by przemoli View Post
          As for article. Is Marek around to comment?
          I expected worse results after seeing the bug report about Unigine Heaven. Anyway, we don't have many options at the moment (I see only one: reverting the commit). The mechanism that decides where buffers are placed (VRAM or GTT) and which buffers are moved when we start to run out of memory must be overhauled. This is a bigger project and I don't have time for it right now. The kernel DRM interface might need some changes. We also need good tools to detect bottlenecks and a good GPU resource monitor. Right now if you run out of GPU memory, there's no easy way to know and definitely no way to know what is eating the memory. We're mostly blind right now.

          However, we're fighting a battle we can't win. S3TC textures need 4x to 8x less memory and would help a lot with this problem. Any driver with S3TC support has a great advantage over a driver without one.

          We could also cheat by using the BC7 format for plain RGBA8 textures. That would be a win if we implemented the BC7 encoding on the GPU.
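Marek's "4x to 8x" figure follows directly from the per-texel bit rates of the block-compressed formats. A quick sketch; the format sizes below come from the S3TC/BPTC format definitions, not from the driver code discussed here:

```python
# Per-texel storage cost of a few texture formats, in bits.
rgba8 = 32   # plain uncompressed RGBA8
dxt1 = 4     # S3TC/BC1: 64-bit block per 4x4 texels
dxt5 = 8     # S3TC/BC3: 128-bit block per 4x4 texels
bc7 = 8      # BPTC/BC7: 128-bit block per 4x4 texels

print(rgba8 // dxt1)  # DXT1 is 8x smaller than RGBA8 -> 8
print(rgba8 // dxt5)  # DXT5 is 4x smaller -> 4
print(rgba8 // bc7)   # BC7 is 4x smaller, hence the "cheat" idea -> 4
```

This is also why encoding plain RGBA8 textures to BC7 on the fly would shrink the working set fourfold, at the cost of a GPU-side encoding pass.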



          • #25
            This problem reminds me of a similar problem with r300g. Hopefully a way to improve performance without regressions will be found.

            And anyway, thanks Marek for all this endless work on the radeon drivers.



            • #26
              Originally posted by marek View Post
              I expected worse results after seeing the bug report about Unigine Heaven. Anyway, we don't have many options at the moment (I see only one: reverting the commit). The mechanism that decides where buffers are placed (VRAM or GTT) and which buffers are moved when we start to run out of memory must be overhauled. This is a bigger project and I don't have time for it right now. The kernel DRM interface might need some changes. We also need good tools to detect bottlenecks and a good GPU resource monitor. Right now if you run out of GPU memory, there's no easy way to know and definitely no way to know what is eating the memory. We're mostly blind right now.

              However, we're fighting a battle we can't win. S3TC textures need 4x to 8x less memory and would help a lot with this problem. Any driver with S3TC support has a great advantage over a driver without one.

              We could also cheat by using the BC7 format for plain RGBA8 textures. That would be a win if we implemented the BC7 encoding on the GPU.
              So would I be correct in thinking that this performance regression for Heaven/ETQW/etc *might* only affect users who haven't enabled S3TC through the external libtxc_dxtn library? Or, I guess users who are using it with applications that don't support it (or applications that just require gobs of memory capacity).



              • #27
                Originally posted by Veerappan View Post
                So would I be correct in thinking that this performance regression for Heaven/ETQW/etc *might* only affect users who haven't enabled S3TC through the external libtxc_dxtn library? Or, I guess users who are using it with applications that don't support it (or applications that just require gobs of memory capacity).
                No, the performance regression affects everybody, but users without S3TC are likely to run out of VRAM more often.



                • #28
                  Originally posted by marek View Post
                  No, the performance regression affects everybody, but users without S3TC are likely to run out of VRAM more often.
                  So if I understand this correctly, game-specific hacks may not work, because after a long play session loading many maps, even Reaction may eventually load enough texture data to trigger the bug?



                  • #29
                    Originally posted by MaxToTheMax View Post
                    So if I understand this correctly, game-specific hacks may not work, because after a long play session loading many maps, even Reaction may eventually load enough texture data to trigger the bug?
                    The number of loaded maps doesn't matter. What matters is how much memory must be accessed to render a few consecutive frames. Game-specific hacks cannot work, because memory usage depends on the level of detail and graphics quality the user sets in the options menu.
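A toy model of the point above: performance collapses when the working set of a few consecutive frames exceeds VRAM, regardless of how many maps were ever loaded. All numbers here are made up for illustration:

```python
# Hypothetical VRAM size in MB (not a figure from the thread).
VRAM_MB = 512

def thrashes(per_frame_working_set_mb: int) -> bool:
    """Buffers touched while rendering consecutive frames must fit in
    VRAM; otherwise some are demoted to GTT and re-fetched over the bus
    every frame, which is the pathological case described above."""
    return per_frame_working_set_mb > VRAM_MB

print(thrashes(300))  # working set fits in VRAM -> False
print(thrashes(700))  # working set exceeds VRAM -> True
```

Since the working set is a function of resolution and quality settings rather than of any particular game, a per-game hack can't prevent the threshold from being crossed.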



                    • #30
                      @marek

                      I think current distros already preinstall S2TC - Ubuntu 12.10 does as well (not sure about the 32-bit version, however). The Kanotix Dragonfire trial image also has it preinstalled (32- and 64-bit libs).
