Radeon Gardenshed DRM + Gallium3D Benchmarks

  • #1

    Phoronix: Radeon Gardenshed DRM + Gallium3D Benchmarks

    It has been about a month since we last delivered ATI/AMD Radeon Linux benchmarks comparing the performance of the open-source driver against the high-performance proprietary driver. Since then there have been various improvements to the Mesa/Gallium3D driver, and the latest Radeon DRM code has been merged for the next kernel, which will likely be called Linux 3.0 but was referred to in the DRM pull request as Gardenshed. Here are these benchmarks on several different Radeon graphics cards.

    http://www.phoronix.com/vr.php?view=16055

  • #2
    Just a thought, but wouldn't a comparison of Gallium3D r600 at one-month intervals (i.e. compare today's Gallium with last month's, March's, ...) be far more interesting than this "oh, one month on, we still haven't closed that order-of-magnitude gap"?

    David



    • #3
      ^^ Or how about a graph of geometric means over time to get an overall feel of how open source and proprietary graphics drivers are progressing?
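      The suggestion above can be sketched in a few lines of Python; the fps numbers below are purely hypothetical, just to show the shape of the calculation:

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical fps results for each driver across several benchmarks:
open_source_fps = [51.0, 33.0, 90.0]
proprietary_fps = [240.0, 180.0, 310.0]

# One overall number per driver; plotting this ratio month by month would
# show how the open-source/proprietary gap evolves over time.
ratio = geometric_mean(open_source_fps) / geometric_mean(proprietary_fps)
```

      The geometric mean is the usual choice for summarizing benchmark suites because it weights each test's relative change equally, regardless of its absolute fps.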



      • #4
        Looks like there are some serious bottlenecks... fps hardly drops between a low and a high resolution... but Catalyst reacts as expected: there's a real difference when you raise the resolution.
        Does anybody have any clues about the features needed to unleash the power of those cards? I mean, even if the performance were 50% lower than Catalyst's and you could observe the fps drop between resolutions, you could say that everything is in order and that some optimizations are missing, but these benchmarks feel strange...

        Anyway, I really appreciate all the work around r300g, r600g and nouveau! Keep up the good work, guys!



        • #5
          Here's an idea of what to test next (if it was already done, then please ignore):

          Open Source Radeon vs Nouveau performance compared to their blob counterparts. Radeon, which is supported as Open Source by AMD and has access to hardware specs, vs Nouveau, which is not supported by NVidia and needs to reverse-engineer a black box. Did OSS Radeon do a better job at coming close to 100% performance while having access to specs? Did Nouveau do a better job? And if Nouveau is the winner, what's the conclusion? Not having specs and support leads to better drivers? If both are on par, what does that mean? Specs and support don't help? They got wasted? If Radeon wins, does that mean that Nouveau is probably never going to become a true replacement for the blob?
          Last edited by RealNC; 05-27-2011, 09:21 AM.



          • #6
            IANAMD (Mesa Developer), but IIRC the major bottlenecks are in the shader compiler and the handling of state changes.

            Both bits are being worked on by the dedicated volunteers who keep our X stack moving, but neither is what you could call low-hanging fruit, and both need to be killed by a thousand paper cuts, if you get what I mean.

            David



            • #7
              Originally posted by Welsh Dwarf View Post
              IANAMD (Mesa Developer), but IIRC the major bottlenecks are in the shader compiler and the handling of state changes.

              Both bits are being worked on by the dedicated volunteers who keep our X stack moving, but neither is what you could call low-hanging fruit, and both need to be killed by a thousand paper cuts, if you get what I mean.

              David
              You couldn't be clearer! Tough work all along the road!



              • #8
                YOU HAVE TO DISABLE VSYNC. My low-end RV610 does 90 fps in OpenArena with an old 2 GHz single-core CPU; how can a 4830 do 51 fps?
                ## VGA ##
                AMD: X1950XTX, HD3870, HD5870
                Intel: GMA45, HD3000 (Core i5 2500K)
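                For Mesa's open-source DRI drivers, vsync can typically be disabled per process with the `vblank_mode` driconf environment variable. A minimal Python sketch of the idea (launching `glxgears` is just an illustrative example, not part of Michael's test setup):

```python
import os
import subprocess

# vblank_mode=0 asks Mesa's DRI drivers to skip waiting for vertical sync,
# so fps is no longer capped at the monitor's refresh rate.
env = dict(os.environ, vblank_mode="0")

# Launch a GL app with vsync disabled (uncomment to actually run it):
# subprocess.run(["glxgears"], env=env)
```

                The same effect is usually achievable system-wide via driconf's per-driver options file.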



                • #9
                  Originally posted by RealNC View Post
                  Open Source Radeon vs Nouveau performance compared to their blob counterparts. Radeon, which is supported as Open Source by AMD and has access to hardware specs, vs Nouveau which is not supported by NVidia and needs to reverse-engineer a black box. Did OSS Radeon do a better job at coming close to 100% performance while having access to specs? Did Nouveau do a better job? And if Nouveau is the winner, what's the conclusion? Not having specs and support leads to better drivers?
                  Similar studies have been done before and the conclusion is always the same. Having a higher percentage of the developers dress in black leads to better drivers.

                  Given the relatively high performance of the r300g stack it seems like better questions would be :

                  - what performance-related features are enabled in r600g relative to r300g at this time ?
                  - are there differences in hardware architecture between the pre-r600 and 600+ generations which could require design changes between r300g and r600g, were those changes made, and were they correct in hindsight ?
                  - what other design changes were made between r300g and r600g ?

                  My 10,000 foot impression of the answers is :

                  - programming some of the performance-related features is a lot trickier in r600+ hardware than in earlier hardware and as a consequence a number of those features are enabled in r300g but not yet enabled in r600g

                  - r600+ hardware has a larger number of registers and different grouping of registers so a different approach to state management was taken in r600g relative to r300g; in hindsight the mapping of registers to state changes is more complex than first expected so more work on state management is probably needed

                  - the shader compiler for r600+ started off as more of a 1:1 IR-to-hardware translator than a real compiler, although recent work may have improved that a lot (haven't had time to look)
                  Last edited by bridgman; 05-27-2011, 09:43 AM.
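                  The "1:1 translator vs real compiler" distinction above can be illustrated with a toy example; the IR format, opcodes and register names here are invented for the sketch and have nothing to do with r600g's actual IR:

```python
# Toy IR: (op, dst, src_a, src_b) tuples; opcodes and registers are invented.
def translate_1to1(ir):
    """1:1 translation: emit one hardware instruction per IR instruction."""
    return [f"{op} {dst}, {a}, {b}" for op, dst, a, b in ir]

def translate_folding(ir):
    """Same translation plus one trivial optimization: constant folding."""
    out, consts = [], {}
    for op, dst, a, b in ir:
        a, b = consts.get(a, a), consts.get(b, b)  # substitute known constants
        if op == "ADD" and isinstance(a, int) and isinstance(b, int):
            consts[dst] = a + b  # fold the constant; emit no instruction
        else:
            out.append(f"{op} {dst}, {a}, {b}")
    return out

ir = [("ADD", "r0", 1, 2), ("MUL", "r1", "r0", "t")]
```

                  The 1:1 translator emits two instructions; the folding version emits one. A real shader compiler layers many such passes (folding, dead-code elimination, scheduling, register allocation), which is why growing one out of a plain translator is substantial work.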



                  • #10
                    Wasn't Catalyst faster than r600 with drawing off?



                    • #11
                      There is definitely a bottleneck somewhere!
                      It is related to screen resolution, as the open-source driver's fps does not drop with resolution the way the closed-source driver's does.

                      Maybe some huge data copy function that is unoptimized or unaccelerated? Maybe the closed-source driver copies data in a single SIMD operation every frame, while the open-source one copies on demand?

                      Do AMD developers have profiling software, some tool to monitor CPU load, memory bus, PCIe bus, GPU load, GPU memory controller load, disk I/O, and software I/O (like kernel ring 0/1 transitions per (milli)second, and where the CPU spends its time: in the kernel, the driver, elsewhere, or idle)?



                      • #12
                        Originally posted by bridgman View Post
                        - what performance-related features are enabled in r600g relative to r300g at this time ?
                        - what different design decisions were made between r300g and r600g ?
                        - are there differences in hardware architecture between the pre-r600 and 600+ generations which could require different design decisions, and were those decisions made ?
                        Well, that's what I meant (sort of)... maybe I should dress in black to see if my questions would be sharper.

                        I hope you didn't take my post as an offense of any kind, but I'm really interested in these low-level subjects; I just don't have the skills to help.

                        Be sure, I'm thankful for your work and your insightful posts on this forum.



                        • #13
                          Originally posted by crazycheese View Post
                          Do AMD developers have profiling software, some tool to monitor CPU load, memory bus, PCIe bus, GPU load, GPU memory controller load, disk I/O, and software I/O (like kernel ring 0/1 transitions per (milli)second, and where the CPU spends its time: in the kernel, the driver, elsewhere, or idle)?
                          There are some tools but they don't generally help as much as you would expect. GPU driver programming involves very long pipelines with many things going on at once, so things that run slow are frequently related to code that ran some time earlier. Performance work usually ends up more like :

                          - stare at all the things you mentioned
                          - get an idea
                          - rewrite a bunch of code and see what happens
                          - repeat until you have to work on something else

                          That said, I believe the main work right now is finishing the enablement of "known" performance-related features, ie the ones which are enabled in r300g but not enabled in r600g. In most cases I think code exists and works on many configurations but not enough to enable by default yet.
                          Last edited by bridgman; 05-27-2011, 10:07 AM.



                          • #14
                            @Bridgman,
                            Ah, I remember that 1:1 compiler post on Phoronix a long time ago that made the r600 actually work. But that was an ugly hack, right? And this hasn't been fixed?! Isn't this top-priority work? =o



                            • #15
                              Originally posted by bridgman View Post
                              There are some tools but they don't generally help as much as you would expect. GPU driver programming involves very long pipelines with many things going on at once, so things that run slow are frequently related to code that ran some time earlier. Performance work usually ends up more like :

                              - stare at all the things you mentioned
                              - get an idea
                              - rewrite a bunch of code and see what happens
                              - repeat until you have to work on something else

                              That said, I believe the main work right now is finishing the enablement of "known" performance-related features, ie the ones which are enabled in r300g but not enabled in r600g. In most cases I think code exists and works on many configurations but not enough to enable by default yet. [emphasis mine]
                              You mean like HyperZ? What about rendering using both GPUs of a dual-GPU card? Crossfire? These things would probably help a lot, but only if we eliminate the most severe of the existing CPU bottlenecks. The biggest issue is that the pipeline stalls for a very, very long time; it's not that the GPU has any problem handling the requests it does get.

                              Another problem I see is that certain workloads (some 3D apps) constantly throw errors inside the DMAR/DRHD subsystem, several times per frame. The kernel is protecting itself from data corruption and potentially system-crashing issues by detecting DMA remapping faults, so that part of the kernel is doing its job. But clearly DRM is not doing its job, or the faults wouldn't occur in the first place.

                              While the fault prevention of DMAR/DRHD is great, the downside is that each fault is very expensive. It ends up generating an interrupt each time. If this is happening dozens of times per second, then no wonder we're getting crap FPS.

                              I wouldn't be surprised if several of the programs Michael tested in this article brought out this behavior. It seems to occur only with Mesa 7.11-dev, which he's using. Revert to Mesa stable and, although you lose a lot of features, Mesa doesn't use libdrm in a way that triggers these constant faults, so the stack is much less preoccupied with handling a constant stream of invalid DMAR requests, and fps wins. This appears to be closely related to the IOMMU.

                              In fact I remember FPS being much more competitive in past articles Michael has written pitting r600g against Catalyst. I would be surprised if this particular problem isn't to blame for several of the tests Michael used.

                              Edit: (this editing thing is cool!) -- Then again, maybe Michael doesn't have the same problem. You can clearly see whether you have the problem by looking for this in dmesg while rendering (the numbers are irrelevant as this is just an example):

                              Originally posted by Linux Kernel
                              DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
                              DMAR:[fault reason 05] PTE Write access is not set
                              DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
                              DMAR:[fault reason 05] PTE Write access is not set
                              From my (limited) understanding here, the IOMMU actually resides on the motherboard chipset, so it is not directly controlled by either the CPU or GPU manufacturer. Therefore maybe this is an isolated problem that is only buggering up on my specific motherboard chipset. That's entirely possible; I have an early first-generation Intel X58 chipset (ASUS P6T Deluxe v1 is the specific make/model). It was the first enthusiast / desktop Nehalem Architecture motherboard to market. With an Intel CPU and an AMD GPU, who knows if I've just got bad luck and the IOMMU hardware on the mobo doesn't perform to spec?
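                              A quick way to check for this symptom is to count DMAR fault lines in the kernel log. A small sketch; the sample lines just mirror the quote above (in practice you would feed it the real output of `dmesg`):

```python
def count_dmar_faults(dmesg_lines):
    """Count DMAR fault reports in kernel log output."""
    return sum(1 for line in dmesg_lines
               if "DMAR" in line and "fault" in line.lower())

# Illustrative sample lines; only the first two are DMAR fault reports.
sample = [
    "DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000",
    "DMAR:[fault reason 05] PTE Write access is not set",
    "radeon 0000:01:00.0: some unrelated driver message",
]
```

                              If that count climbs steadily while a 3D app is rendering, you are hitting the fault storm described above.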

                              Regardless of whether or not that's true, I must insist that this is an issue that can be handled in software. Otherwise they would probably recall the motherboard, and the Catalyst drivers wouldn't work properly for me on Windows or Linux. As it stands, I can play all the big AAA titles on Windows just fine, so maybe the Catalyst team already discovered and squashed this bug.

                              Or I'm way off the mark and the hardware is fine but there's just a software bug in DRM. Sorry for over-speculating.
                              Last edited by allquixotic; 05-27-2011, 11:06 AM.

