Docs for the hardware/perf counters?

  • Docs for the hardware/perf counters?

    GDebugger news quotes:
    A new ATI (AMD) performance metrics integration was added, supporting the new ATI driver performance metrics infrastructure. This integration enables users to view ATI performance metrics such as hardware utilization, vertex wait for pixel, pixel wait for vertex, overdraw and more.
    I assume at least some of these are hardware counters. Are these documented, and if so, in which doc?

  • #2
    Aha, found a link confirming they are hw counters: http://developer.amd.com/Membership/...eloper.amd.com

    Hardware Counters

    You get information, in the form of bar charts and line graphs, on every D3D call executed per frame. In addition, you're able to track any of the following hardware counters:
    % Hardware Utilization
    % Vertex Wait for Pixel
    % Pixel Wait for Vertex
    Pre-clip Primitives
    Post-clip Primitives
    % Blended Pixels
    ALU to Texture Instruction Ratio
    % Pixels Passed Z-test
    Overdraw


    • #3
      Oh, there's even a linux library for getting them: http://developer.amd.com/tools/GPUPe...s/default.aspx

      No doubt it's fglrx only though. But with a lib there's something to RE, if there's no docs


      • #4
        Airlied responded on IRC that these aren't documented yet. But if they are simple counters, there is neither much work nor much IP in them.


        • #5
          I'd really like to be able to profile my AMD GPU just like I can profile my AMD CPU, using open source.

          Anyone who feels the same please post +1 or something. Let's show there's at least some interest in having these documented.


          • #6
            It's more of a resource issue. Also, in order to use them properly you need a fairly large amount of driver infrastructure. It's something I'd like to release, but spending time on it right now will take away from other projects (new ASIC support, 3D improvements, etc.). Also, there are TONS of perf counters, so we'd probably need to pare down the list to a subset that is actually generally useful and doesn't reveal too many low-level hw design details. And while it would be cool to get fine-grained measurements, I think there is a lot more low-hanging fruit to harvest in Mesa before these really become important.


            • #7
              The "GPU utilization %" at least should be usable for dynpm, being more accurate than counting fences?


              • #8
                Originally posted by curaga
                The "GPU utilization %" at least should be usable for dynpm, being more accurate than counting fences?
                There's no such thing as a GPU utilization % per se. Perf counters are much lower level (e.g., the number of quads passing through some interface, or the hit rate of some cache). Fences are actually a better metric, since they give you an idea of how much work is scheduled to execute rather than a snapshot of what various blocks are doing. Using perf counters also adds some overhead, so you don't really want to have them on all the time.

                If you want to provide a basic GPU busy percentage, you can already calculate it to a certain extent using the GRBM_STATUS* and CP_STAT* registers which are documented. They will tell you what blocks are busy or not at any given time. So over a certain time period, you can sample these registers and create an overall busy percentage or per block busy percentages.
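
A minimal sketch of the sampling idea described above, in Python. It is not driver code: how the register value is actually read is left out, and `read_status` is a hypothetical callable standing in for whatever interface exposes it. The mask values are placeholders, not real bit positions; the actual busy bits would have to come from the GRBM_STATUS* and CP_STAT* register documentation.

```python
import time
from typing import Callable, Dict, Iterable, List

# Placeholder masks (assumption): NOT real bit positions.
# Look up the per-block busy bits in the register documentation.
EXAMPLE_BUSY_MASKS: Dict[str, int] = {
    "block_a": 1 << 0,
    "block_b": 1 << 1,
}


def collect_samples(read_status: Callable[[], int],
                    count: int,
                    interval_s: float = 0.0) -> List[int]:
    """Call read_status() `count` times, sleeping interval_s between reads."""
    samples = []
    for _ in range(count):
        samples.append(read_status())
        time.sleep(interval_s)
    return samples


def busy_percentages(samples: Iterable[int],
                     masks: Dict[str, int]) -> Dict[str, float]:
    """Return, per mask, the percentage of samples in which that bit was set."""
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample")
    result = {}
    for name, mask in masks.items():
        busy = sum(1 for value in samples if value & mask)
        result[name] = 100.0 * busy / len(samples)
    return result
```

With more samples over a longer window, the per-mask figures approach a per-block busy percentage; a mask covering an "anything active" bit would give the overall figure.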
                Last edited by agd5f; 05 July 2012, 01:22 PM.


                • #9
                  Well, that's the name of one of the numbers exposed in the AMD gpu analyzer

                  Originally posted by agd5f
                  If you want to provide a basic GPU busy percentage, you can already calculate it to a certain extent using the GRBM_STATUS* and CP_STAT* registers which are documented. They will tell you what blocks are busy or not at any given time. So over a certain time period, you can sample these registers and create an overall busy percentage or per block busy percentages.
                  Thanks, I just might do that. How would I go about that, sending ioctls? Or would it need kernel hacking?


                  • #10
                    Originally posted by curaga
                    Well, that's the name of one of the numbers exposed in the AMD gpu analyzer



                    Thanks, I just might do that. How would I go about that, sending ioctls? Or would it need kernel hacking?
                    You'd need to hack the kernel and add whatever interface you want to use to expose that information.
