How to tell if a driver is gallium or just mesa? (Slow rendering with radeon)


  • #61
    Still, I need to remove everything that depends on mesa... I can do the same with "pacman -Rc <package>", so it is not really a distro-specific question, just a big inconvenience, because the extra layer only came to exist for later mesa versions...

    Now I see there is also a "-Rdd" switch, which supposedly lets me remove packages even when that breaks dependencies. That might help too, but it is still a bit inconvenient, simply because I did not expect this...
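
    A minimal sketch of the two removal strategies mentioned above, assuming an Arch-like system (the downgrade package filename is a made-up example). By default it only prints the commands instead of running them:

```shell
#!/bin/sh
# Two ways to get mesa out of the way on an Arch-like system (sketch).
# DRY_RUN=1 (the default) only prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# -Rc: cascade removal - removes mesa plus everything that depends on it.
run pacman -Rc mesa

# -Rdd: skip dependency checks entirely, so mesa can be swapped for an
# older build without removing its dependents (risky: they may break).
run pacman -Rdd mesa
run pacman -U mesa-17.2.8-1-i686.pkg.tar.xz   # hypothetical package file
```

    Unset DRY_RUN only after double-checking the list of packages pacman would remove.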

    Comment


    • #62
      To be honest, I only find it clean if I really recursively remove everything that builds on mesa and reinstall everything from around the period of the old mesa version... This is a real burden, but it seems that packages compiled against the new mesa basically don't work with the old libgl.so and similar libraries in place.

      It is messy in every way to play with this and keep the system intact... :-)

      Comment


      • #63
        Progress is made!!!

        Finally I have downgraded all the graphics packages while running the latest kernel with default kernel settings. This was also only a partial system downgrade (mesa, xorg and some - actually many - of the packages that rely on them).

        Now I get a previously unheard-of minimum of 28 fps in Extreme Tux Racer and 379 fps in glxgears. I will look at what the perf output shows and will also try Urban Terror.

        To me this means that the source of the problem is in the new mesa and/or xorg sources. It is currently running well with mesa 13, and it still says gallium 0.4 on ATI...

        Comment


        • #64
          I have made perf output logs available here:

          ballmerpeak.web.elte.hu/perf_report_arch_oldmesa.txt
          ballmerpeak.web.elte.hu/perf_arch_oldmesa.data

          Sorry - the mesa version in this test is 11.x, not 13.x. That is still pretty fast, while 19.x is slow. I am trying to update. xf86-video-ati is 1:7.8.99.r23 and the xorg server is version 1.19.1-1 from 2016, when things are fast...

          I am trying a binary-search-like approach to corner where things went wrong, but the arch32 and Arch archives are slow and sometimes 404 for some files, so it is not easy to jump to various timestamps. The hardest part is when there are API or package-structure incompatibilities, so it takes time...
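
          For reference, a sketch of how jumping to a timestamp works with the Arch Linux Archive (the URL layout below follows the archive's repos/YYYY/MM/DD scheme; double-check it against the archive before use):

```shell
#!/bin/sh
# Build the mirror line that points pacman at a dated snapshot of the
# Arch Linux Archive (URL scheme assumed: repos/YYYY/MM/DD).
snapshot_mirror() {
    date=$1   # e.g. 2016/12/22
    echo "Server=https://archive.archlinux.org/repos/${date}/\$repo/os/\$arch"
}

# Print the line that would go into /etc/pacman.d/mirrorlist; after that,
# "pacman -Syyuu" would downgrade the whole system to that date.
snapshot_mirror 2016/12/22
```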

          Comment


          • #65
            When running strace on glxgears, this is what I get:
            Code:
              ...
              ioctl(6, DRM_IOCTL_RADEON_GEM_CREATE, 0xbfafd880) = 0 <0.000068>
               > [vdso]() [0x891]
              ioctl(6, DRM_IOCTL_RADEON_CS, 0xafe2404c) = 0 <0.000102>
               > [vdso]() [0x891]
              ioctl(6, DRM_IOCTL_RADEON_GEM_WAIT_IDLE, 0xbfafd9c4) = 0 <0.000030>
               > [vdso]() [0x891]
              ioctl(6, DRM_IOCTL_GEM_CLOSE, 0xbfafd99c) = 0 <0.000043>
               > [vdso]() [0x891]
              ioctl(6, DRM_IOCTL_RADEON_GEM_CREATE, 0xbfafd880) = 0 <0.000070>
               > [vdso]() [0x891]
              ioctl(6, DRM_IOCTL_RADEON_CS, 0xafe380e0) = 0 <0.000088>
               > [vdso]() [0x891]
              ioctl(6, DRM_IOCTL_RADEON_GEM_WAIT_IDLE, 0xbfafd9c4) = 0 <0.000029>
               > [vdso]() [0x891]
              ioctl(6, DRM_IOCTL_GEM_CLOSE, 0xbfafd99c) = 0 <0.000047>
               > [vdso]() [0x891]
              ...
            These continuously appear in the mesa+xorg version combination that is slow.

            I have no idea whether the same happens on the old 11.x version, where speed is still good,
            and sadly I could not yet try a mesa version between 11.x and 19.x, but I guess some of
            them still work fast.

            I have tried reverting the following changes manually in my mesa.git version:

            https://github.com/mesa3d/mesa/commi...f3d547c486bf0a
            https://github.com/mesa3d/mesa/commi...153167ab8e13e8
            https://github.com/mesa3d/mesa/commi...c086b30f11befe

            Actually, after reverting the latter, things got 1 FPS slower than with unchanged
            19.x, but the first two helped a very small amount.

            In my journey to find the slowdown I started to look around "radeon_create_bo"
            and "radeon_winsys_bo_create", which is why I tried to revert exactly
            these changes: a git blame lookup suggested they might be relevant.

            If anyone knows where to look further or has any idea about my problem (maybe
            on the level of X, or other parts of mesa creating a lot of "bo"s?), please tell me :-)

            What is "bo" in this sense btw? Is there any documentation I should read to
            understand these acronyms in the code? It takes considerable amount of time
            to understand that cs is some kind of "command stream" (still unsure) and bo
            is some kind of "buffer object" while I have no idea about "pb_" and a lot
            of other things and not even knowing if a bo is a general buffer for anything
            like vertex buffers, constant buffers, backbuffers, zbuffers, who-knows-what
            or just for one specific thing here.

            Is there a recommended reading list before a noob like me starts touching mesa code?

            PS: What is a "slab buffer"??? It really bugs me not to know...
            Last edited by prenex; 29 May 2019, 06:32 AM.

            Comment


            • #66
              Originally posted by prenex View Post
              bo is Buffer Object. There are a few different types, implemented in mesa to adhere to the GL specs.


              If you are making changes, it is a -really- good idea to follow these guidelines so you don't have many problems getting your changes accepted.


              This link is unrelated to mesa itself; however, mesa is an -implementation- of OpenGL, and the link describes how OpenGL handles buffers. Check it out, it will help you understand.

              Slab (or SLUB) is the Linux kernel's memory allocator.

              Comment


              • #67
                Thanks for the links so far! I will dig into them. It is especially good to know about the coding-style link, and that I should look for OpenGL resources on Khronos - I kind of know OpenGL 2-3 and GLES 2, but would not have read that page unless directed to it, because it seemed to be another abstraction level.

                In the meantime I continued trying to get old mesa versions running, and I could bisect the issue between a later version and the current one.

                Still fast with mesa 17.2.8 and X.Org X Server 1.19.5.

                The problem is somewhere between 17.x and 19.x mesa versions (and corresponding xorg).

                Also, I have made an strace on one older system where things are good, and the number of CREATE and CLOSE ioctl calls (and of CS ioctl calls) is an order of magnitude smaller than with 19.x!

                For example, 10-20 seconds of glxgears leads to 9-10 calls to DRM_IOCTL_RADEON_GEM_CREATE on mesa 17.2.8, while the same time period leads to 708 (!!!) such calls on mesa 19.x! That is surely quite a big difference!

                The equivalent pattern in 17.x almost never creates a new GEM object:

                Code:
                  ...
                  ioctl(6, DRM_IOCTL_RADEON_GEM_WAIT_IDLE, 0xbfcf9f04) = 0 <0.000055>
                  ioctl(6, DRM_IOCTL_RADEON_GEM_BUSY, 0xbfcf9d44) = 0 <0.000022>
                  ioctl(6, DRM_IOCTL_RADEON_CS, 0xb307d03c) = 0 <0.000089>
                  ioctl(6, DRM_IOCTL_RADEON_GEM_WAIT_IDLE, 0xbfcf9f04) = 0 <0.000053>
                  ioctl(6, DRM_IOCTL_RADEON_GEM_BUSY, 0xbfcf9d44) = 0 <0.000023>
                  ioctl(6, DRM_IOCTL_RADEON_CS, 0xb30910d0) = 0 <0.000095>
                  ioctl(6, DRM_IOCTL_RADEON_GEM_WAIT_IDLE, 0xbfcf9f04) = 0 <0.000054>
                  ioctl(6, DRM_IOCTL_RADEON_GEM_BUSY, 0xbfcf9d44) = 0 <0.000023>
                  ioctl(6, DRM_IOCTL_RADEON_CS, 0xb307d03c) = 0 <0.000090>
                  ...
                Sometimes when the *_BUSY ioctl call returns -1, it issues a CREATE, but otherwise not.

                I think GEM is some kind of memory handler for the GPU (just like "ttm" in the perf output), and I think some mesa update between 17.x and 19.x messed up the memory-handling scheme for the Mobility Radeon 200M (r300).

                I will try to bisect to a closer version, as 17.2.8 is from 2017-12-22...

                I am pretty sure that the difference between only 9 GEM_CREATE ioctl calls and around 700+ in the new version is the key to the problem - but I have already wrongly suspected so many things that of course I am not certain. It is a really big difference, though.
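
                The counting above can be reproduced with a small pipeline over an strace log; this is a sketch (the log capture command and the sample log lines are illustrative):

```shell
#!/bin/sh
# Tally ioctl request names in an strace log. A log in this shape can be
# captured with something like:
#   strace -T -e trace=ioctl -o gears.log timeout 10 glxgears
count_ioctls() {
    # extract the request name (second ioctl argument) and count each one
    sed -n 's/.*ioctl([0-9]*, \([A-Z0-9_]*\).*/\1/p' "$1" | sort | uniq -c | sort -rn
}

# demo on a tiny sample log (addresses and timings are illustrative)
cat > /tmp/sample.log <<'EOF'
ioctl(6, DRM_IOCTL_RADEON_GEM_CREATE, 0xbfafd880) = 0 <0.000068>
ioctl(6, DRM_IOCTL_RADEON_CS, 0xafe2404c) = 0 <0.000102>
ioctl(6, DRM_IOCTL_RADEON_GEM_CREATE, 0xbfafd880) = 0 <0.000070>
EOF
count_ioctls /tmp/sample.log
```

                strace's own `-c` flag gives a similar per-syscall summary, but grouping by the ioctl request name needs a pipeline like this.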

                Comment


                • #68
                  Originally posted by prenex View Post
                  TTM and GEM are used in combination. They are both GPU memory managers, but I think TTM is used for its infrastructure and GEM for its performance characteristics. I'm really not sure about the details, but I don't think TTM or GEM is ever used alone; I think TTM is used for "set up" and GEM for everything else. It may be a good idea to get on IRC and ask the devs about it. The Intel OSS devs initially designed GEM and the Mesa OSS devs initially designed TTM, and since then they seem to cooperate with each other. Those guys would be able to give you specific guidelines.

                  Comment


                  • #69

                    I have quickly found that 18.0.0 is still bad when I build mesa myself:

                    Fast mesa version: 17.2.8 (2017 December 22)
                    Slow mesa version: 18.0.0 (git: 44c7d1aa2ea)

                    Lately I have found that some 17.4.x version is maybe bad too (but I was unsure, as I went on with git bisect and did not write down every version as bad/good).

                    I still haven't bisected the problem to a single commit, because I ran into issues where I had to change the code due to LLVM dependencies and other things. It is also hard when the build system changed, or a whole build system was swapped out in this or that revision (meson vs scons vs autoconf)...

                    Talking over email with a guy who knows mesa better than I do, it seems TTM is now the low-level memory-management layer, and GEM is indeed from the Intel people, but other drivers have a GEM layer on top of their own stuff, so radeon's TTM has a high-level GEM-compatible facade. Looking at the call stacks in the early perf outputs, this is clearly visible, actually...

                    So sorry for not being around for a while; I will try to bisect the issue to a single commit. While bisecting, I am trying to keep everything else at the latest version: latest LLVM, Xorg, everything... This way it will also become clear whether it is really mesa or not (I think it is 95% likely to be mesa).

                    Comment


                    • #70
                      Hi!!!

                      Finally I got to a point in the bisect where I can answer "git bisect good"!!!

                      This is a really awesome feeling, and it confirms that the problem really lies in the mesa codebase (not in the X server or things like that).

                      Also the first "good" point came right when I had to first build using autoconf or scons instead of meson so I might just measure how things look when I make a debug build just to ensure it does not affect things the same way and rule out it is some configuration problem in the build system. I just want to be extra sure in everything after analysing this so long...

                      Also, instead of copying files around with my homebrew script, I am now using this:
                      Code:
                      export LIBGL_DRIVERS_PATH=lib
                      export LD_LIBRARY_PATH=lib
                      export EGL_DRIVERS_PATH=lib/egl
                      After the build, it was really easy to compare the performance of the "bad" and "good" versions side by side, running them from different xterms next to each other.
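
                      For convenience, the three exports can be wrapped in a small helper so any app can be launched against the locally built driver; this is just a sketch, and the lib/ path is whatever your build tree uses:

```shell
#!/bin/sh
# Run a command against a locally built mesa without installing it.
# MESA_BUILD is an assumed path; point it at the lib/ dir of your build.
MESA_BUILD=${MESA_BUILD:-$HOME/mesa/lib}

run_with_mesa() {
    env LIBGL_DRIVERS_PATH="$MESA_BUILD" \
        LD_LIBRARY_PATH="$MESA_BUILD" \
        EGL_DRIVERS_PATH="$MESA_BUILD/egl" \
        "$@"
}

# e.g. run_with_mesa glxgears   (here we just show the variable is set)
run_with_mesa sh -c 'echo "LIBGL_DRIVERS_PATH=$LIBGL_DRIVERS_PATH"'
```

                      Using `env` keeps the variables scoped to the launched command, so the rest of the shell session keeps the system driver.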

                      Some exact measurements:

                      10 seconds of glxgears on my default 19.0.4 (bad):

                      -> 12466 (any) ioctl calls
                      -> 3111 DRM_IOCTL_RADEON_GEM_CREATE
                      -> 3112 DRM_IOCTL_GEM_CLOSE

                      10 seconds of glxgears on 17.2.0-devel (git-33236a306d) (good):

                      -> 1783 (any) ioctl calls
                      -> 7 DRM_IOCTL_RADEON_GEM_CREATE
                      -> 8 DRM_IOCTL_GEM_CLOSE

                      Current bisect state:

                      HEAD detached at 33236a306d1
                      You are currently bisecting, started from branch '44c7d1aa2ea'

                      (I had to restart the bisect at this point because of a technical mistake, but '44c..' still seems to be bad for the right end.)

                      I have gone back with git, bisecting on the mesa code only. Xorg is still the latest, the kernel source tree is still the latest, and LLVM is still the latest (I had to manually change the old code to work with the latest, so I know), so the problem is now surely and officially in the mesa source tree. Only further bisecting will show the exact commit that made things slow, but it is now 100% certain that the issue is not in xorg or in some library that I was always downgrading together with mesa in my earlier tests.

                      It seems to be only a matter of time until I find the source of the issue - happy news so far ;-)

                      Sadly, it is still 11 steps from here. The earlier "7 steps" was a misconception of mine, because I did not properly understand how to pair mesa version numbers with commit revisions, so I chose too small a time period. Sorry for that - it was simply unclear how they map, but it does not matter much anymore.

                      EDIT: I still don't fully understand how to map mesa version numbers to git revisions, but I saw there are git branches for major revisions like 17.x, 18.x and so on. Because I knew 17.2.x was still good, I simply selected that branch's revision as "git bisect good" and an 18.x one as "git bisect bad", which seemed to be a good approximation. Earlier I tried to rely on dates and on commit messages mentioning 17.2.8, but that was a bad direction.
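
                      The bisect workflow described here can be sketched on a throwaway repository; real mesa branch and commit details are only mimicked, and "git bisect run" automates the good/bad answers with a test command:

```shell
#!/bin/sh
# Toy bisect: 8 commits, the regression lands at "change 5", and
# "git bisect run" finds it automatically with a test command.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name bisector
git commit -q --allow-empty -m init

for i in 1 2 3 4 5 6 7 8; do
    # commits 5..8 are "slow", mirroring a regression that persists
    if [ "$i" -ge 5 ]; then echo "slow $i" > state; else echo "fast $i" > state; fi
    git add state
    git commit -q -m "change $i"
done

git bisect start HEAD HEAD~8 >/dev/null 2>&1   # bad = HEAD, good = init
git bisect run sh -c 'grep -q fast state' >/dev/null 2>&1
git show -s --format=%s refs/bisect/bad        # prints: change 5
```

                      For the real case, the test command would build mesa at the checked-out revision and run a scripted glxgears FPS check instead of the `grep`.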
                      Last edited by prenex; 01 June 2019, 05:00 PM.

                      Comment
