AMD's R300 Gallium3D Driver Is Looking Good For 2011


  • #61
    Originally posted by marek
    Because the current GLSL compiler in Mesa rocks and really produces optimized code. There's ongoing work to pass every shader (ARB assembly ones and fixed-function ones) through the GLSL compiler to optimize them a bit (and mainly to simplify things for hw drivers), but it's way harder to optimize low-level code than high-level one.
    Then how come ARB shaders work much faster using r300classic? Can't the developers use the assembly compiler of r300c in r300g?



    • #62
      Originally posted by bridgman View Post
      The open source drivers are not multi-threaded AFAIK, so single-thread CPU power probably makes a big difference in the performance results. It would be great if the same CPU could be used across a series of benchmarks so that the driver/hardware differences could be isolated.
      "The open source drivers are not multi-threaded AFAIK" — which is very odd today if that's really the case, as NPTL (Native POSIX Thread Library) has been available since the 2.6 kernel: http://en.wikipedia.org/wiki/Native_...Thread_Library
      "NPTL has been part of Red Hat Enterprise Linux since version 3, and in the Linux kernel since version 2.6. It is now a fully integrated part of the GNU C Library. There exists a tracing tool for NPTL, called POSIX Thread Trace Tool (PTT). And an Open POSIX Test Suite (OPTS) was written for testing the NPTL library against the POSIX standard."
      Not to mention there are several other optimized third-party threading libraries suitable for inclusion in such a driver.
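For what it's worth, using NPTL just means calling the standard pthreads API. A minimal, hypothetical sketch of spawning and joining one POSIX thread (illustration only, not driver code):

```c
#include <pthread.h>
#include <stdint.h>

/* Worker entry point: sums the integers 1..n passed in via arg. */
static void *sum_worker(void *arg)
{
    intptr_t n = (intptr_t)arg;
    intptr_t total = 0;
    for (intptr_t i = 1; i <= n; i++)
        total += i;
    return (void *)total;
}

/* Spawn one NPTL thread, wait for it, and return its result. */
intptr_t sum_in_thread(intptr_t n)
{
    pthread_t tid;
    void *result = NULL;
    if (pthread_create(&tid, NULL, sum_worker, (void *)n) != 0)
        return -1;
    pthread_join(tid, &result);
    return (intptr_t)result;
}
```

Spawning the thread is the easy part; as the replies below point out, the hard part is deciding what work a driver could safely hand to it.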



      • #63
        It's not about the lack of a threading library. It's that drivers don't make use of threading. It's not odd; it's the norm.



        • #64
          Originally posted by glxextxexlg View Post
          Then how come ARB shaders work much faster using r300classic?
          That's a real mystery to me.

          Originally posted by glxextxexlg View Post
          Can't the developers use the assembly compiler of r300c in r300g?
          Both drivers have shared the same compiler backend from the beginning.

          To Michael and the others: the article only benchmarks r300g and st/mesa. The changes made to the r300 compiler are not visible in the graphs, because all the new compiler optimizations apply to both r300g and r300c. Benchmarking both drivers with several different versions of Mesa would better show the overall improvement. Yes, r300c is getting faster as well through the compiler work.



          • #65
            Originally posted by popper View Post
            "The open source drivers are not multi-threaded AFAIK" — which is very odd today if that's really the case, as NPTL (Native POSIX Thread Library) has been available since the 2.6 kernel
            ???

            I don't think anyone is saying that the lack of a threading library is the issue. Someone actually has to do the design, implementation and testing work to make it happen, and once the work is done all of the developers and testers have to live with the additional debugging challenges. There are other tasks which need to be done first.



            • #66
              Originally posted by bridgman View Post
              ???

              I don't think anyone is saying that the lack of a threading library is the issue. Someone actually has to do the design, implementation and testing work to make it happen, and once the work is done all of the developers and testers have to live with the additional debugging challenges. There are other tasks which need to be done first.
              Your implication was that the open drivers were somehow at a disadvantage compared to closed drivers that are threaded, given today's generally multi-threaded CPU market. Perhaps it was a misunderstanding and you didn't actually mean that, but that's how it came across on first reading. Fair enough.



              • #67
                I think he _is_ implying that the closed-source drivers are multithreaded, and as such the open-source drivers are at a disadvantage.

                I don't have any clue what you're confused about, or how that confusion could lead to your post about NPTL being around for years.



                • #68
                  I did mean that to a certain extent, but mostly relating to making benchmarks more meaningful for Phoronix readers.

                  The open drivers probably are at a disadvantage compared to the closed drivers as a consequence of being single threaded, but implementing multithreading is not a trivial task and the developer time is probably better spent on the development work that is being done today. Six months from now, who knows ?

                  Completing the move to the Gallium3D architecture needs to happen first, along with phasing out UMS other than for legacy distros, so that only one code base needs to be reworked. In the short term, adding support for the performance-related hardware features like tiling provides more end-user benefit relative to developer effort than multithreading, at least for typical end-user systems.

                  The important thing is that in a single threaded implementation the single-core CPU performance is a significant contributor to benchmark results, and comparing results across different benchmark articles is hard to do unless the same CPU/memory subsystem is used in both cases.



                  • #69
                    Originally posted by mattst88 View Post
                    It's not about the lack of a threading library. It's that drivers don't make use of threading. It's not odd, it's the norm.
                    Don't you find that fact odd though, given virtually all devs and end users today will be using at least a dual-core x86 CPU in 2010/11? But no matter, moving on...



                    • #70
                      Originally posted by popper View Post
                      Don't you find that fact odd though, given virtually all devs and end users today will be using at least a dual-core x86 CPU in 2010/11? But no matter, moving on...
                      No, it's not odd or unexpected at all.

                      Most programs are single threaded because in most cases, it's really hard to figure out how to split work efficiently across multiple CPUs. It's a huge outstanding research problem at the moment and probably will be for some time to come.
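A toy illustration of the coordination that even a trivial parallel split requires (hypothetical sketch, nothing to do with the actual driver code): summing an array across two threads already needs explicit partitioning, per-thread state, and joins.

```c
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 2

/* Per-thread work description and result slot. */
struct slice {
    const int *data;
    size_t len;
    long partial;
};

static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    s->partial = 0;
    for (size_t i = 0; i < s->len; i++)
        s->partial += s->data[i];
    return NULL;
}

/* Parallel sum: split the array, run one thread per slice, combine. */
long parallel_sum(const int *data, size_t len)
{
    pthread_t tids[NTHREADS];
    struct slice slices[NTHREADS];
    size_t chunk = len / NTHREADS;

    for (int t = 0; t < NTHREADS; t++) {
        slices[t].data = data + t * chunk;
        /* last thread takes the remainder */
        slices[t].len = (t == NTHREADS - 1) ? len - t * chunk : chunk;
        pthread_create(&tids[t], NULL, sum_slice, &slices[t]);
    }

    long total = 0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tids[t], NULL);
        total += slices[t].partial;
    }
    return total;
}
```

Even here the split only pays off for large arrays; for real workloads with shared mutable state, the partitioning question is far harder, which is the point being made above.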



                      • #71
                        It's no more odd than any other tasks that will get done eventually as time permits. The developers are working on things based on (a) what delivers the most benefit to users and (b) the inherent dependencies that go along with any rearchitecture work (first you pillage, *then* you burn).

                        Popper, are you suggesting that multithreading should have been implemented before other tasks ? If so, which tasks do you feel should have been delayed in order to free up time for multithreading ?



                        • #72
                          Originally posted by bridgman View Post
                          It's no more odd than any other tasks that will get done eventually as time permits. The developers are working on things based on (a) what delivers the most benefit to users and (b) the inherent dependencies that go along with any rearchitecture work (first you pillage, *then* you burn).

                          Popper, are you suggesting that multithreading should have been implemented before other tasks ? If so, which tasks do you feel should have been delayed in order to free up time for multithreading ?
                          As it happens, I agree with what you say: basically, make a development plan and follow it; it's the only way to progress in a timely manner.

                          But (and there's always a but) if your plan does include threading at some point, then allowing for that at the core of your plan is probably a good thing to consider, so you don't have to rip out lots of new code later that just doesn't work well with threading.

                          If there is a suggestion, that is it, nothing more, nothing less.

                          Of course, given we are talking open source here and not closed, what harm is there in picking an existing well-optimized external threading library, using its supplied API, and trying to build your plan with that alongside your other tests as you progress? Slightly more work, sure, but mattst88 and the others are coding for fun and to learn/try new things, not for pay, I assume!



                          • #73
                            Originally posted by bridgman View Post
                            Sure, but that's not what you said. First you were attacking the 300G project and developers, then you were saying that :
                            - we were holding back "secret sauce" that would presumably let 3D run faster
                            In your words there is some spec info left unreleased, and you said something about 5%, meaning the open-source r600 driver can have up to 95% of the speed.


                            "OK, now I'm confused too. You're talking about 300g, which supports only the older GPUs that don't have OpenCL-capable hardware or video decode hardware (other than a legacy MPEG-2 IDCT block). "


                            For the people other than bridgman: the problem with OpenCL is that OpenCL is a 1:1 copy of CUDA, and the reference card for CUDA is the GeForce 8800. That card has two Nvidia-specific caches; AMD added one of them in the HD 4000 cards and the other in the HD 5000 cards...

                            @bridgman: but on the video decode side you imply there is only MPEG-2, yet the X1950 has shader-based H.264 decode...



                            • #74
                              Originally posted by popper View Post
                              As it happens, I agree with what you say: basically, make a development plan and follow it; it's the only way to progress in a timely manner.
                              Yep. I don't *think* there is anything being implemented today that will have problems with simple multithreading in the future, but it's also a pretty safe bet that there will be a pile of "oh crap" issues anyways.

                              I imagine that multithreading would be implemented with whatever the standard OS threading mechanism is at the time, and AFAIK that is NPTL today.

                              Modern games are starting to make more use of multithreading so there's not a big win from using a whole lot of driver threads -- the biggest advantage seems to come from using a single worker thread and letting some of the driver work run in parallel with a single-threaded game.

                              Anyways, multithreading does get discussed from time to time (hopefully enough to avoid anything that would preclude multithreading in the future) but right now the devs are focusing on things that can give bigger and more immediate gains.
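The single-worker-thread approach described above amounts to a command queue: the application thread enqueues work and one driver thread drains it in parallel. A hedged, hypothetical sketch of such a queue with pthreads (not actual Mesa code):

```c
#include <pthread.h>

#define QUEUE_CAP 64

/* A tiny blocking command queue: the "game" thread pushes commands,
 * the driver worker thread pops and executes them. */
struct cmd_queue {
    int cmds[QUEUE_CAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
};

void queue_init(struct cmd_queue *q)
{
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Producer side (app thread). Assumes the queue never fills. */
void queue_push(struct cmd_queue *q, int cmd)
{
    pthread_mutex_lock(&q->lock);
    q->cmds[q->tail] = cmd;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side (worker thread): blocks until a command is available. */
int queue_pop(struct cmd_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int cmd = q->cmds[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return cmd;
}
```

A real driver queue would also bound the producer (a not_full condition) and batch commands to amortize locking; this sketch deliberately assumes the queue never fills.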



                              • #75
                                Originally posted by Qaridarium View Post
                                in your words there are some spec infos left and you say something about 5% means the OS driver r600 can have up to 95% of the speed
                                Sure, but I also said that the last few bits of hardware info were not things where hardware support could easily be added to the open driver... if we ever get to the point where the open driver has a large team of developers doing application-specific optimizations *then* the additional HW info could make a difference. Not 100% sure, but I believe the 5% difference quote was from the 5xx days, and we have actually released some of that info since then (for both 5xx and 6xx+). I believe I was thinking of CMASK info and a few other bits when I mentioned the 5% number.

                                I'm not sure if the CMASK info turned out to be useful... there was some hope of using it for faster clears but don't remember if that actually worked out.

                                Originally posted by Qaridarium View Post
                                @bridgman:but in the video decode side you imagine thats there are only mpeg2 but the x1950 have shader based h264 decode...
                                I was talking about video decoding *hardware*, i.e. not counting general-purpose shaders. On 5xx and earlier that is MPEG2-only (other than the rv550).

