AMD R600g Performance Patches Yield Mixed Results


  • #16
    Originally posted by agd5f View Post
    Make sure you have 2D tiling enabled, otherwise you won't be fully utilizing your memory bandwidth; it's been made the default as of Mesa 9.0 and xf86-video-ati git master. Note that the EGL paths do not properly handle tiling yet.
    Like, seriously? You need the full bandwidth of a card to display a single texture?!?

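For reference, 2D tiling can also be toggled explicitly in xorg.conf via the xf86-video-ati ColorTiling options (a sketch only; check the radeon(4) man page for the exact option names, defaults, and hardware/kernel requirements):

```
Section "Device"
    Identifier "Radeon"
    Driver     "radeon"
    Option     "ColorTiling"   "on"    # 1D color tiling
    Option     "ColorTiling2D" "on"    # 2D color tiling (R600+, needs recent kernel)
EndSection
```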


    • #17
      Originally posted by bug77 View Post
      Like, seriously? You need the full bandwidth of a card to display a single texture?!?
      That depends on how many fps you want to render. More bandwidth will always give you higher fps.



      • #18
        Originally posted by xception View Post
        Well, guessing from what was changed, I'd say the huge difference shows up when the game is complex enough to run the graphics card out of memory. Since they changed "VRAM|GTT" to just "VRAM", it seems quite likely that is the issue. There should be some code to detect when the video card gets close to running out of VRAM and switch from "VRAM" back to "VRAM|GTT" relocations for those workloads, while keeping VRAM-only placement for workloads that need less video memory. Another solution would be to monitor which resources are accessed more often and which less often, and place the high-access-count resources in VRAM and the rest in GTT.
        That was my guess. The high quality setting is probably using larger textures and running out of memory on the card.

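The fallback heuristic described above could be sketched like this (an illustrative sketch only; `choose_domains` and its parameters are hypothetical names, not the actual radeon/mesa API, and the real decision lives in the kernel and winsys layers):

```python
# Hypothetical sketch of the placement heuristic: keep buffers VRAM-only while
# there is headroom, and fall back to VRAM|GTT relocations once the working
# set approaches the VRAM size. Names here are invented for illustration.

def choose_domains(buffer_size, vram_used, vram_total, threshold=0.9):
    """Return the allowed placement domains for a new buffer."""
    if vram_used + buffer_size <= threshold * vram_total:
        return ["VRAM"]                # plenty of room: pin to fast memory
    return ["VRAM", "GTT"]             # near the limit: allow spilling to GTT

# Example: a 64 MiB buffer on a 1 GiB card
print(choose_domains(64 << 20, 100 << 20, 1 << 30))   # ['VRAM']
print(choose_domains(64 << 20, 900 << 20, 1 << 30))   # ['VRAM', 'GTT']
```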


        • #19
          Originally posted by bug77 View Post
          Like, seriously? You need the full bandwidth of a card to display a single texture?!?
          Only if you want maximum performance.



          • #20
            Originally posted by agd5f View Post
            Only if you want maximum performance.
            The guy said "single static rectangular texture". Even if rendered at 1000fps, does this need any memory bandwidth past the initial rendering?



            • #21
              Originally posted by bug77 View Post
              The guy said "single static rectangular texture". Even if rendered at 1000fps, does this need any memory bandwidth past the initial rendering?
              Uh, yes. First the texture has to be read from texture memory into the GPU during rasterization. The GPU also writes to the front buffer. Then, after the buffers are flipped, the (now) back buffer has to be encoded and sent to your monitor. So at the very minimum you have thousands of writes, and even more reads, for just that one rectangle.

              I think you're getting memory bandwidth confused with DMA bandwidth on your PCI-E bus. Not the same thing.

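Rough numbers make the point (a back-of-the-envelope sketch, not a measurement: it assumes a plain RGBA8 full-screen quad and ignores caches, tiling, and compression, so real traffic will be lower):

```python
# Back-of-the-envelope memory traffic for one "static" full-screen textured
# quad. Even static content costs bandwidth on every rendered frame.

width, height = 1920, 1080
bytes_per_pixel = 4                    # RGBA8
fps = 1000                             # render rate
refresh_hz = 60                        # scanout rate

frame_bytes = width * height * bytes_per_pixel           # ~8.3 MB per surface
render_traffic = 2 * frame_bytes * fps                   # texture read + color write
scanout_traffic = frame_bytes * refresh_hz               # front buffer re-read for the display
traffic_gb_s = (render_traffic + scanout_traffic) / 1e9

print(round(traffic_gb_s, 1))          # ~17.1 GB/s for a single quad at 1000 fps
```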


              • #22
                Originally posted by bug77 View Post
                The guy said "single static rectangular texture". Even if rendered at 1000fps, does this need any memory bandwidth past the initial rendering?
                You need to read in the texture and apply it to the quad every frame; otherwise you aren't rendering anything, and the fps figure would be meaningless.



                • #23
                  My test code is really basic but if anyone is interested I can post it. I posted an EGL version to the mesa list. Add it to the PTS perhaps? :-)



                  • #24
                    Originally posted by przemoli View Post
                    As for article. Is Marek around to comment?
                    I expected worse results after seeing the bug report about Unigine Heaven. Anyway, we don't have many options at the moment (I see only one: reverting the commit). The mechanism that decides where buffers are placed (VRAM or GTT) and which buffers are moved when we start to run out of memory must be overhauled. This is a bigger project and I don't have time for it right now. The kernel DRM interface might need some changes. We also need good tools to detect bottlenecks and a good GPU resource monitor. Right now if you run out of GPU memory, there's no easy way to know and definitely no way to know what is eating the memory. We're mostly blind right now.

                    However, we're fighting a battle we can't win. S3TC textures need 4x to 8x less memory and would help a lot with this problem. Any driver with S3TC support has a great advantage over a driver without one.

                    We could also cheat by using the BC7 format for plain RGBA8 textures. That would be a win if we implemented the BC7 encoding on the GPU.

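The 4x-8x figure follows directly from the block formats (simple arithmetic, no driver specifics): DXT1 stores a 4x4 pixel block in 8 bytes and DXT5 (like BC7) in 16 bytes, versus 64 bytes for the same block uncompressed in RGBA8.

```python
# Memory footprint of a 2048x2048 RGBA8 texture, uncompressed vs S3TC.
# Block sizes: DXT1 = 8 bytes per 4x4 block, DXT5 (and BC7) = 16 bytes.

w = h = 2048
rgba8 = w * h * 4                      # 4 bytes per pixel, uncompressed
blocks = (w // 4) * (h // 4)           # number of 4x4 blocks
dxt1 = blocks * 8
dxt5 = blocks * 16

print(rgba8 >> 20, dxt1 >> 20, dxt5 >> 20)
# 16 MiB uncompressed vs 2 MiB (8:1, DXT1) and 4 MiB (4:1, DXT5/BC7)
```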


                    • #25
                      This problem reminds me of a similar problem with r300g. Hopefully a way to improve performance without regressions will be found.

                      And anyway thanks Marek for all this endless work on the radeon drivers .



                      • #26
                        Originally posted by marek View Post
                        I expected worse results after seeing the bug report about Unigine Heaven. Anyway, we don't have many options at the moment (I see only one: reverting the commit). The mechanism that decides where buffers are placed (VRAM or GTT) and which buffers are moved when we start to run out of memory must be overhauled. This is a bigger project and I don't have time for it right now. The kernel DRM interface might need some changes. We also need good tools to detect bottlenecks and a good GPU resource monitor. Right now if you run out of GPU memory, there's no easy way to know and definitely no way to know what is eating the memory. We're mostly blind right now.

                        However, we're fighting a battle we can't win. S3TC textures need 4x to 8x less memory and would help a lot with this problem. Any driver with S3TC support has a great advantage over a driver without one.

                        We could also cheat by using the BC7 format for plain RGBA8 textures. That would be a win if we implemented the BC7 encoding on the GPU.
                        So would I be correct in thinking that this performance regression for Heaven/ETQW/etc *might* only affect users who haven't enabled S3TC through the external libtxc_dxtn library? Or, I guess users who are using it with applications that don't support it (or applications that just require gobs of memory capacity).



                        • #27
                          Originally posted by Veerappan View Post
                          So would I be correct in thinking that this performance regression for Heaven/ETQW/etc *might* only affect users who haven't enabled S3TC through the external libtxc_dxtn library? Or, I guess users who are using it with applications that don't support it (or applications that just require gobs of memory capacity).
                          No, the performance regression affects everybody, but users without S3TC are likely to run out of VRAM more often.



                          • #28
                            Originally posted by marek View Post
                            No, the performance regression affects everybody, but users without S3TC are likely to run out of VRAM more often.
                            So if I understand this correctly, game-specific hacks may not work, because after a long play session loading many maps, even Reaction may eventually load enough texture data to trigger the bug?



                            • #29
                              Originally posted by MaxToTheMax View Post
                              So if I understand this correctly, game-specific hacks may not work, because after a long play session loading many maps, even Reaction may eventually load enough texture data to trigger the bug?
                              The number of loaded maps doesn't matter. What matters is how much memory must be accessed to render a few consecutive frames. Game-specific hacks cannot work, because it's dependent on the pre-set level of detail/graphics quality/what you set in the options menu.



                              • #30
                                @marek

                                I think current distros already preinstall S2TC; Ubuntu 12.10 does as well (not sure about the 32-bit version, however). The Kanotix Dragonfire trial image also has it preinstalled (32- and 64-bit libs).

