RV350, compositing and horrible performance.


  • #11
    Originally posted by agd5f
    What is your question? VRAM has much better bandwidth than system memory from the GPU's perspective, so in most cases it's preferred to store buffers in VRAM. If we don't have enough VRAM to cover all the requirements, we may end up with thrashing. There's probably room for improvement with respect to the heuristics used to decide which pools we allocate from and whether we migrate or not. A TTM defragmenter would probably also be helpful. Either of these would be a good project that doesn't require low-level, GPU-specific knowledge and could provide nice performance improvements.
    Yes, that was my question. If there is no memory defragmentation and the heuristics need improvement, it's no wonder that software with high VRAM requirements, or any software on low-end hardware, will bottleneck. Thanks for clearing that up!



    • #12
      Originally posted by agd5f
      How much VRAM does your card have? You may be running out of VRAM, which causes thrashing in the GPU memory manager (i.e., migrating buffers between GART and VRAM). Modern desktop compositors use a lot of memory.
      A whopping 64 MiB of VRAM. The display is 1440 x 1050 too!

      Can I test VRAM pressure with 16 bpp or less output? If so, how would I change that?



      • #13
        Originally posted by oliver
        A whopping 64 MiB of VRAM. The display is 1440 x 1050 too!
        That's ~6 MB per screen-sized buffer at depth 24 (32 bpp). If you are using a GL compositor, you'll need a back buffer as well, so that's ~12 MB just for what you see on the screen.

        Originally posted by oliver
        Can I test VRAM pressure with 16 bpp or less output? If so, how would I change that?
        You can change the depth of your root window by specifying DefaultDepth 16 in the Screen section of your xorg.conf. With a compositor, however, apps are allowed to use whatever depth they want; it will all just end up as 16 bpp when it's composited onto the root window. Still, it will save some memory, at least for the root window and back buffer.
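As a rough check of the figures above, here is a minimal Python sketch of the buffer arithmetic. It assumes tightly packed buffers with no pitch or alignment padding (the driver will likely round sizes up), and `buffer_bytes` is a hypothetical helper, not part of any driver API.

```python
MIB = 1024 * 1024
WIDTH, HEIGHT = 1440, 1050  # resolution as stated in the thread


def buffer_bytes(width, height, bits_per_pixel):
    """Bytes for one unpadded width x height buffer (assumption: no alignment)."""
    return width * height * (bits_per_pixel // 8)


for bpp in (32, 16):
    one = buffer_bytes(WIDTH, HEIGHT, bpp)
    # A GL compositor needs a front and a back buffer, hence the factor of 2.
    print(f"{bpp} bpp: one buffer {one / MIB:.2f} MiB, "
          f"front + back {2 * one / MIB:.2f} MiB")
```

Under these assumptions, dropping the root window to 16 bpp halves the front and back buffers, which is the saving described above; it says nothing about the buffers that individual apps allocate.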



        • #14
          Originally posted by agd5f
          That's ~6 MB per screen-sized buffer at depth 24 (32 bpp). If you are using a GL compositor, you'll need a back buffer as well, so that's ~12 MB just for what you see on the screen.

          You can change the depth of your root window by specifying DefaultDepth 16 in the Screen section of your xorg.conf. With a compositor, however, apps are allowed to use whatever depth they want; it will all just end up as 16 bpp when it's composited onto the root window. Still, it will save some memory, at least for the root window and back buffer.
          I'll create an xorg.conf with just the bit depth in it then, if that's possible. Anyway, you are saying 12 MiB for the 'plain' empty background, which still leaves me with about 52 MiB of VRAM. Is there an easy way to check whether it's a lack of VRAM? Do you think a plain GNOME 3 shell uses that much RAM?



          • #15
            Originally posted by oliver
            I'll create an xorg.conf with just the bit depth in it then, if that's possible. Anyway, you are saying 12 MiB for the 'plain' empty background, which still leaves me with about 52 MiB of VRAM. Is there an easy way to check whether it's a lack of VRAM? Do you think a plain GNOME 3 shell uses that much RAM?
            Each app uses offscreen VRAM. That's how a compositor works: the offscreen buffers are composited onto the front buffer. That's what allows you to have neat transparency effects and expose effects. For example, if you have 5 full-screen, single-buffered apps running, plus your desktop image, each one uses ~6 MB, so that's ~36 MB in addition to the front and back buffers; ~48 MB right there. Add a little for alignment and fragmentation and the kernel fbdev buffer (another 6 MB); then various apps may also store ancillary pixmaps in memory, and right there you are over 64 MB. So yes, a modern composited desktop can easily use hundreds of megabytes of memory for pixmaps, textures, etc.
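The tally above can be written out as a small Python sketch. The ~6 MB per buffer and the list of consumers come from the post; the dictionary layout and variable names are illustrative only, and the alignment, fragmentation and pixmap overheads are left out because the post gives no numbers for them.

```python
BUFFER_MB = 6   # approximate size of one screen-sized 32 bpp buffer (from the post)
VRAM_MB = 64    # the card discussed in this thread

# Number of screen-sized buffers per consumer, following the example above.
screen_sized_buffers = {
    "front buffer": 1,
    "back buffer": 1,
    "desktop image": 1,
    "full-screen single-buffered apps": 5,
    "kernel fbdev buffer": 1,
}

total_mb = sum(count * BUFFER_MB for count in screen_sized_buffers.values())
remaining_mb = VRAM_MB - total_mb

for name, count in screen_sized_buffers.items():
    print(f"{name:35s} {count * BUFFER_MB:3d} MB")
print(f"{'total':35s} {total_mb:3d} MB, leaving {remaining_mb} MB of {VRAM_MB} MB")
```

With these inputs only about 10 MB remains for alignment padding, fragmentation losses and ancillary pixmaps, which is why the example ends up over budget.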



            • #16
              Originally posted by oliver
              A whopping 64 MiB of VRAM. The display is 1440 x 1050 too!

              Can I test VRAM pressure with 16 bpp or less output? If so, how would I change that?
              Try changing your display resolution to see if there is any setting that suddenly makes everything run smoothly.



              • #17
                Also, as noted previously in this thread, it's probably worth adjusting the CPU governor you are using to see if that helps. See https://bugs.freedesktop.org/show_bug.cgi?id=51787#c6 for an idea of how CPU performance can affect GPU performance.



                • #18
                  I had turned on the performance governor yesterday, and that didn't really make a difference.

                  I set the resolution to 1024x768, and wow, what a difference that makes. It actually seems to be pretty smooth. Not perfect, but pretty good. I can't say I notice a difference at 800x600, but that resolution is really hard to work with. 1280x960 appears to be choppy-ish again. Not hugely bad, but quite noticeable.

                  So yeah, the problem seems to be in that direction. But can it be solved? I guess using gnome-fallback is out of the question. So are any and all compositors out of the question? I suppose E17 can still work nicely on old hardware.

                  How much VRAM is one expected to have? "Any modern video card" doesn't really tell you anything ...
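The resolution experiment fits a VRAM-capacity explanation, although it does not prove one. A minimal Python sketch, assuming unpadded 32 bpp buffers and ignoring everything else that lives in VRAM, shows how many screen-sized buffers 64 MiB could hold at each resolution mentioned in the thread. `buffers_that_fit` is a hypothetical helper.

```python
MIB = 1024 * 1024
VRAM_BYTES = 64 * MIB

# Resolutions mentioned in the thread.
RESOLUTIONS = [(800, 600), (1024, 768), (1280, 960), (1440, 1050)]


def buffers_that_fit(width, height, vram_bytes=VRAM_BYTES, bytes_per_pixel=4):
    """Whole screen-sized buffers that fit in VRAM (assumption: no padding)."""
    return vram_bytes // (width * height * bytes_per_pixel)


for width, height in RESOLUTIONS:
    size_mib = width * height * 4 / MIB
    print(f"{width}x{height}: {size_mib:.2f} MiB per buffer, "
          f"{buffers_that_fit(width, height)} buffers in 64 MiB")
```

Going from 1440x1050 to 1024x768 roughly doubles the number of full-screen buffers that fit, which matches the direction of the improvement reported above.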



                  • #19
                    We probably need a good VRAM defragmentation tool, perhaps combined with some texture prediction engine; that would solve your problem. Basically, every window in a compositor is a texture, and every texture has a different block size. If VRAM is limited, allocating and deallocating textures leaves "holes" in VRAM, similar to fragmentation on a hard disk, and these holes prevent larger textures from fitting in video memory. Those textures are then allocated in system memory instead, which reduces performance.

                    This is just like comparing a machine whose software fits in RAM with a low-RAM machine that is constantly swapping to the hard drive.
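The "holes" effect described above can be illustrated with a toy model in Python. This is only a sketch of the general idea: it is not how TTM or the radeon driver manages memory, and `free_holes`, `fits` and `compact` are hypothetical helpers working on abstract units rather than real buffers.

```python
VRAM_UNITS = 64  # abstract units, not bytes


def free_holes(allocations, total=VRAM_UNITS):
    """Return the (start, length) gaps between (start, length) allocations."""
    holes, cursor = [], 0
    for start, length in sorted(allocations):
        if start > cursor:
            holes.append((cursor, start - cursor))
        cursor = start + length
    if cursor < total:
        holes.append((cursor, total - cursor))
    return holes


def fits(allocations, size):
    """True if a single contiguous hole can hold `size` units."""
    return any(length >= size for _, length in free_holes(allocations))


def compact(allocations):
    """Naive defragmenter: slide every allocation down to close the gaps."""
    packed, cursor = [], 0
    for _, length in sorted(allocations):
        packed.append((cursor, length))
        cursor += length
    return packed


# Eight 8-unit textures fill VRAM, then every other one is freed.
live = [(start, 8) for start in range(0, VRAM_UNITS, 8)][1::2]

free_total = sum(length for _, length in free_holes(live))
print("free units:", free_total)
print("16-unit texture fits before compaction:", fits(live, 16))
print("16-unit texture fits after compaction:", fits(compact(live), 16))
```

Half of the toy VRAM is free, yet a 16-unit texture cannot be placed until the live allocations are compacted. A real defragmenter would also have to pay for copying the buffers it moves.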



                    • #20
                      Well, I also noticed that AGP mode was forced to 1. (Yes, AGP, remember?) This bug report mentions it and fixes it: https://bugs.launchpad.net/ubuntu/+s...ux/+bug/544988

                      Before the AGP change, this is what I found in the Xorg log:
                      [ 11.498] (II) RADEON(0): mem size init: gart size :fdff000 vram size: s:4000000 visible:3a1c000
                      That does look like 64 MiB of VRAM?
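One way to check is to decode the numbers in that log line. The Python sketch below assumes the three values are hexadecimal byte counts, which the round value 4000000 (hex) strongly suggests; the regular expression and the `parse_mem_sizes` helper are written for this one line format and are illustrative only.

```python
import re

MIB = 1024 * 1024
LINE = ("[ 11.498] (II) RADEON(0): mem size init: "
        "gart size :fdff000 vram size: s:4000000 visible:3a1c000")


def parse_mem_sizes(line):
    """Extract the sizes from a 'mem size init' line (assumption: hex bytes)."""
    match = re.search(
        r"gart size :([0-9a-f]+) vram size: s:([0-9a-f]+) visible:([0-9a-f]+)",
        line,
    )
    if match is None:
        raise ValueError("not a 'mem size init' line")
    gart, vram, visible = (int(value, 16) for value in match.groups())
    return {"gart": gart, "vram": vram, "visible": visible}


sizes = parse_mem_sizes(LINE)
for name, value in sizes.items():
    print(f"{name}: {value} bytes = {value / MIB:.2f} MiB")
```

Read this way, the VRAM size is exactly 64 MiB, the GART size is just under 254 MiB, and the visible portion is about 58.1 MiB.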

                      [ 11.499] (II) RADEON(0): [DRI2] DRI driver: r300
                      [ 11.499] (II) RADEON(0): [DRI2] VDPAU driver: r300
                      Why not r300g? Or is this just a naming thing?

                      [ 11.499] (II) RADEON(0): Front buffer size: 5808K
                      [ 11.499] (II) RADEON(0): VRAM usage limit set to 48297K

                      and later:
                      [ 42.599] (II) RADEON(0): VRAM usage limit set to 50760K


                      After a reboot, by the way, those limits remain. So why is the limit lower than the available VRAM?
