Marek Cleans-Up & Refactors R600g Driver


  • #11
    Originally posted by jrch2k8:
    Hi Marek, I have a stupid idea (that maybe can be good). Since I don't have time to learn Gallium well enough to implement the missing OpenGL bits, I was thinking I could learn just enough of it to help optimize the command stream. Is there a tool that would let me, say, run piglit and capture the CS from fglrx so I could compare it against r600g? Maybe I could pick off some low-hanging fruit there.

    Another thing I could spend my time on is comparing performance against a reference baseline (fglrx). Do you know of any utility that captures every GL call and shows how long each one took to execute (not sure whether something like apitrace can do that), e.g. GL_ARB_Extension 200ms? Of course I don't expect any great discoveries this way, but it could provide a good starting point for tackling some low-hanging bottlenecks.

    I do code optimization all day at work and I have lots of patience for it (Qt/C++ though), but unlike with CPU code I'm not sure the GPU tools exist. So if the tooling I need exists (I'd love it if you could give me the names so I can get them) and you can give me some starting points, I'll do my best to read the code.
    You can use apitrace, "glretrace -p ..." for profiling. I used it some time ago to compare Mesa to the Windows Catalyst driver. There was not much difference, except that SwapBuffers took something like ten times longer with mesa/r600g, if I remember correctly. But then my test code didn't do anything extraordinary, mostly just setting textures and issuing draw calls.
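    If you just want a quick number for one suspect call, you can also time it CPU-side yourself. A minimal sketch, assuming an existing GLX application that provides the dpy and win handles; the glFinish makes the number include completion of queued GPU work, drop it to time only the driver-side cost of the call:

        #include <stdio.h>
        #include <time.h>
        #include <GL/glx.h>

        static double now_ms(void)
        {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
        }

        /* Wrap any call you suspect, e.g. the buffer swap. */
        void timed_swap(Display *dpy, GLXDrawable win)
        {
            double t0 = now_ms();
            glXSwapBuffers(dpy, win);
            glFinish(); /* wait for queued GPU work before reading the clock */
            printf("SwapBuffers: %.3f ms\n", now_ms() - t0);
        }

    Averaging over many frames gives more trustworthy numbers than timing a single call.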



  • #12
    Ah, Marek fixes the world again.
    Kudos for the code cleanups; they make things smoother.
    It's always good, for any kind of work, if somebody with some distance (or you yourself, after some time has passed) goes over it and adds some polish.



  • #13
    Originally posted by log0:
    You can use apitrace, "glretrace -p ..." for profiling. I used it some time ago to compare Mesa to the Windows Catalyst driver. There was not much difference, except that SwapBuffers took something like ten times longer with mesa/r600g, if I remember correctly. But then my test code didn't do anything extraordinary, mostly just setting textures and issuing draw calls.
    Very useful, thanks for your post. I think I'll try it with a common game like Xonotic, hoping to reach some bottlenecks in a real-world scenario. I suspect some of the bottlenecks are in shaders and texture downloading, but with the tooling you posted I think I can get some solid evidence and go from there. Again, many thanks.



  • #14
    Originally posted by jrch2k8:
    Hi Marek, I have a stupid idea (that maybe can be good). Since I don't have time to learn Gallium well enough to implement the missing OpenGL bits, I was thinking I could learn just enough of it to help optimize the command stream. Is there a tool that would let me, say, run piglit and capture the CS from fglrx so I could compare it against r600g? Maybe I could pick off some low-hanging fruit there.
    The tools exist, but Jérôme has been doing that already and it doesn't seem to have borne much fruit (compared to the invested time). I was doing that too while working on r300g.

    Originally posted by jrch2k8:
    Another thing I could spend my time on is comparing performance against a reference baseline (fglrx). Do you know of any utility that captures every GL call and shows how long each one took to execute (not sure whether something like apitrace can do that), e.g. GL_ARB_Extension 200ms? Of course I don't expect any great discoveries this way, but it could provide a good starting point for tackling some low-hanging bottlenecks.
    Apitrace can capture GL calls, but the hardware executes commands in parallel, so you won't be able to tell how much time each command takes on the GPU. Besides that, Apitrace itself has huge CPU overhead and replaying commands with it is actually a lot slower, making it useless for benchmarking.
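    If you need real GPU times for a specific stretch of commands, timer queries can do that. A minimal sketch, assuming GL entry points are loaded (e.g. via GLEW) and a current context that exposes ARB_timer_query (core since OpenGL 3.3); the draw call and vertex_count are placeholders:

        #include <stdio.h>
        #include <GL/glew.h>

        void measure_draw_gpu_time(GLsizei vertex_count)
        {
            GLuint q;
            GLuint64 ns = 0;

            glGenQueries(1, &q);
            glBeginQuery(GL_TIME_ELAPSED, q);
            glDrawArrays(GL_TRIANGLES, 0, vertex_count); /* the work to measure */
            glEndQuery(GL_TIME_ELAPSED);

            /* Blocks until the GPU result is ready; fine for profiling runs. */
            glGetQueryObjectui64v(q, GL_QUERY_RESULT, &ns);
            printf("GPU time: %.3f ms\n", ns / 1e6);
            glDeleteQueries(1, &q);
        }

    Summing such queries over a frame gives a rough per-stage breakdown without Apitrace's replay overhead.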

    Originally posted by Adarion:
    Kudos for the code cleanups; they make things smoother.
    It's always good, for any kind of work, if somebody with some distance (or you yourself, after some time has passed) goes over it and adds some polish.
    The cleanups mainly ensure that the code doesn't turn into an unmaintainable mess after a couple of years of development (and they make supporting new features easier).



  • #15
    Originally posted by marek:
    The cleanups mainly ensure that the code doesn't turn into an unmaintainable mess after a couple of years of development (and they make supporting new features easier).

    Hey Marek, just curious: is there anything big you WISH you could do with the R600g codebase? I've been following the posts about the Nouveau crowd rewriting the DRM bits, Intel rewriting how they do KMS, etc., and I was wondering: if you had infinite time, and would get paid to do it, what's the one thing you'd do to the R600g driver? Are there any serious core design flaws that need fixing, but that nobody wants to, or has the time to, fix?

    Edit: You can screw backwards compatibility with the existing ABI/API if you have to.
    Last edited by Ericg; 11 September 2012, 01:55 PM.



  • #16
    Originally posted by marek:
    The tools exist, but Jérôme has been doing that already and it doesn't seem to have borne much fruit (compared to the invested time). I was doing that too while working on r300g.

    Apitrace can capture GL calls, but the hardware executes commands in parallel, so you won't be able to tell how much time each command takes on the GPU. Besides that, Apitrace itself has huge CPU overhead and replaying commands with it is actually a lot slower, making it useless for benchmarking.

    The cleanups mainly ensure that the code doesn't turn into an unmaintainable mess after a couple of years of development (and they make supporting new features easier).
    OK, makes sense. But since I've got nights and weekends I guess I can try anyway, and maybe read a lot of code to see whether I can find loops that can be un-branched or vectorized/threaded in that struct fest (I'm more of a C++ fan, but I can get used to it).
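    As a toy illustration of the kind of "un-branching" I mean (generic C, not actual r600g code), turning a per-element branch into a form auto-vectorizers handle well:

        #include <stddef.h>

        /* Branchy version: the if inside the loop can inhibit
         * auto-vectorization on some compilers. */
        void clamp_branchy(float *v, size_t n, float limit)
        {
            for (size_t i = 0; i < n; i++) {
                if (v[i] > limit)
                    v[i] = limit;
            }
        }

        /* Branchless version: the ternary typically compiles to a select/min
         * instruction. (Modern compilers often if-convert the version above
         * too; the explicit form just makes the intent unmissable.) */
        void clamp_branchless(float *v, size_t n, float limit)
        {
            for (size_t i = 0; i < n; i++)
                v[i] = (v[i] > limit) ? limit : v[i];
        }

    Whether a given loop in the driver is actually hot is something the profiling discussed above has to establish first.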



  • #17
    Originally posted by jrch2k8:
    OK, makes sense. But since I've got nights and weekends I guess I can try anyway, and maybe read a lot of code to see whether I can find loops that can be un-branched or vectorized/threaded in that struct fest (I'm more of a C++ fan, but I can get used to it).
    I think this is the profiling work Marek mentioned; it might be interesting for you too.



  • #18
    Originally posted by Ericg:
    Hey Marek, just curious: is there anything big you WISH you could do with the R600g codebase? I've been following the posts about the Nouveau crowd rewriting the DRM bits, Intel rewriting how they do KMS, etc., and I was wondering: if you had infinite time, and would get paid to do it, what's the one thing you'd do to the R600g driver? Are there any serious core design flaws that need fixing, but that nobody wants to, or has the time to, fix?
    Nothing big comes to mind right now. R600g certainly needs a good optimizing compiler; that is the weakest spot of the driver. Most of the design flaws have either been fixed already or are in the process of being fixed.



  • #19
    Originally posted by marek:
    or are in the process of being fixed.
    Anything big coming up that you know about? And that's good to hear that there isn't anything major outstanding other than the compiler. Intel is my "go-to" right now just because my laptop is Sandy Bridge, but my home theater box is ATI-based (Fedora 17), so I still keep up with R600g development.

    I know LLVM is being used for some parts of radeon, but can it be used for that compiler too?



  • #20
    Jérôme said, more than a year ago, that the kernel interface is quite bad and is (or will become) a bottleneck. But it's really hard to modify it heavily.
