AMD Is Trying To Optimize Their Gallium3D Driver Even Further With Lower Overhead


    Phoronix: AMD Is Trying To Optimize Their Gallium3D Driver Even Further With Lower Overhead

    While the RadeonSI Gallium3D open-source OpenGL driver for Linux systems is very well received, generally outperforms the proprietary AMD OpenGL driver on Linux/Windows, and performs very strongly against NVIDIA's proprietary OpenGL driver too, it's not game over for this older graphics API: AMD is still working to lower the CPU overhead even further for this open-source code...

  • #2
    Could AMD use Mesa's OpenGL in their Windows driver to replace their own? OpenGL performance has been one of Radeon's biggest complaints since long before AMD bought ATI.


    • #3
      That's great to hear. Most of my games on Linux use OpenGL, so any extra optimization is always welcome. And I'm sure many people, like me, don't play the latest games and prefer older ones that are more likely to use OpenGL. I'm also excited for ACO support in RadeonSI. It will be interesting to see whether RadeonSI with ACO starts consistently outperforming Nvidia's OpenGL, because even with LLVM it already beats Nvidia in some cases.


      • #4
        Until Vulkan becomes ubiquitous, I am happy to see AMD hammering away at OpenGL. This has knock-on effects for more than just games. Plus, knowledge gained from continuous optimization of OpenGL can only help with optimizing Vulkan and OpenGL-to-Vulkan wrappers.


        • #5
          Marek really is an exceptional developer;
          once he is completely finished with OpenGL, hopefully he can work his voodoo magic on refining RADV + ACO as well - without AMD holding him back because of NIH...

          Meanwhile, this is what You as mere mortals can do to both improve performance & latency:
          Code:
          module_blacklist=acpi_cpufreq intel_pstate=disable
          (The Intel option only applies to Intel, obviously!)

          By adding this to Your kernel command-line, Linux will disable taking control of Your CPU, which means the on-chip hardware logic for controlling the P- & C-states takes over and does the same job a lot more efficiently!

          Try it out & report back please, especially AMD Ryzen users...
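
          For anyone reporting back, it helps to include which drivers actually ended up active after the reboot. A minimal Python sketch, assuming the usual sysfs locations (/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver and /sys/devices/system/cpu/cpuidle/current_driver); the helper names read_sysfs, blacklisted_modules and scaling_state are made up for illustration:

```python
from pathlib import Path


def read_sysfs(path):
    """Return the stripped file contents, or None if the file is missing."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def blacklisted_modules(cmdline):
    """Collect module names listed in module_blacklist= on a kernel command line."""
    mods = set()
    for param in cmdline.split():
        key, sep, value = param.partition("=")
        if sep and key == "module_blacklist":
            # Treat '-' and '_' as equivalent in module names.
            mods.update(m.replace("-", "_") for m in value.split(",") if m)
    return mods


def scaling_state(cmdline, sysfs_root="/sys/devices/system/cpu"):
    """Summarise the relevant command-line options and the active drivers."""
    root = Path(sysfs_root)
    return {
        "acpi_cpufreq_blacklisted": "acpi_cpufreq" in blacklisted_modules(cmdline),
        "intel_pstate_disabled": "intel_pstate=disable" in cmdline.split(),
        # None here means no cpufreq driver is bound to CPU 0.
        "cpufreq_driver": read_sysfs(root / "cpu0" / "cpufreq" / "scaling_driver"),
        "cpuidle_driver": read_sysfs(root / "cpuidle" / "current_driver"),
    }


if __name__ == "__main__":
    cmdline = read_sysfs("/proc/cmdline") or ""
    for key, value in scaling_state(cmdline).items():
        print(f"{key}: {value}")
```

          A cpufreq_driver of None together with a non-empty cpuidle_driver would mean frequency scaling is out of the kernel's hands while idle-state handling is still in place.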


          • #6
            Originally posted by Linuxxx
            Meanwhile, this is what You as mere mortals can do to both improve performance & latency:
            Code:
            module_blacklist=acpi_cpufreq intel_pstate=disable
            (The Intel option only applies to Intel, obviously!)

            By adding this to Your kernel command-line, Linux will disable taking control of Your CPU, which means the on-chip hardware logic for controlling the P- & C-states takes over and does the same job a lot more efficiently!

            Try it out & report back please, especially AMD Ryzen users...
            What the hell are you smoking?

            No, AMDs don't have on-chip hardware logic for controlling the P- & C-states. Intels do, but HWP is not on by default, and guess what, you suggest to disable the exact code that sets it up. No cpufreq = P0, efficiency goes out of the window.

            Note: this will still retain some efficiency, because in your own incompetence you forgot about cpuidle. Next try disabling THAT and tell me how it goes for C-states...
            Last edited by intelfx; 24 March 2021, 10:28 PM.


            • #7
              Originally posted by intelfx

              What the hell are you smoking?

              No, AMDs don't have on-chip hardware logic for controlling the P- & C-states. Intels do, but HWP is not on by default, and guess what, you suggest to disable the exact code that sets it up. No cpufreq = P0, efficiency goes out of the window.

              Note: this will still retain some efficiency, because in your own incompetence you forgot about cpuidle. Next try disabling THAT and tell me how it goes for C-states...
              Notice how I was asking specifically for AMD Ryzen input there, since I don't currently have access to such a machine; therefore, I didn't want to complicate things further by mentioning Intel's idle driver - and of course I do not advocate disabling it, else I would have already included that in my previous post.

              Still, my point stands, and unlike you, I can also back it up with some hard facts instead of imagining what SHOULD happen:

              Just for you, I booted up my ageing notebook with an Intel Core i5-3210M that most definitely doesn't support HWP.

              Here, have a look for yourself to be sure:
              Code:
              cat /proc/cpuinfo
              
              processor : 0
              vendor_id : GenuineIntel
              cpu family : 6
              model : 58
              model name : Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
              stepping : 9
              microcode : 0x21
              cpu MHz : 2494.307
              cache size : 3072 KB
              physical id : 0
              siblings : 4
              core id : 0
              cpu cores : 2
              apicid : 0
              initial apicid : 0
              fpu : yes
              fpu_exception : yes
              cpuid level : 13
              wp : yes
              flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
              vmx flags : vnmi preemption_timer invvpid ept_x_only flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest
              bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
              bogomips : 4988.61
              clflush size : 64
              cache_alignment : 64
              address sizes : 36 bits physical, 48 bits virtual
              power management:
              Convinced?
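
              If you'd rather not scan that wall of flags by eye, here is a minimal Python sketch that does the same check. It assumes the kernel reports HWP support as an "hwp" entry on the "flags" line of /proc/cpuinfo; the function names cpu_flags and has_hwp are just illustrative, and the sample string is an abbreviated copy of the flags above:

```python
def cpu_flags(cpuinfo_text):
    """Return the set of entries on the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match 'flags' exactly so the separate 'vmx flags' line is ignored.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_hwp(cpuinfo_text):
    """True if the CPU advertises Hardware P-states (HWP)."""
    return "hwp" in cpu_flags(cpuinfo_text)


# Abbreviated sample taken from the i5-3210M output above.
SAMPLE = """\
model name : Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
flags : fpu vme de pse tsc msr est tm2 ssse3 epb dtherm ida arat pln pts
vmx flags : vnmi preemption_timer invvpid
"""

if __name__ == "__main__":
    print("HWP advertised:", has_hwp(SAMPLE))
    print("EST advertised:", "est" in cpu_flags(SAMPLE))
```

              On a live system you would pass it the real file instead, e.g. has_hwp(open("/proc/cpuinfo").read()).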
              Okay then, with the kernel command-line parameters from my previous post applied, "sudo cpupower frequency-info" gives the following output (note that I have run this through Google Translate, just so that you can follow along):
              CPU 0 is analyzed:
              No or an unknown cpufreq driver is active on this CPU
              CPUs that run with the same hardware frequency: Not Available
              CPUs that have to coordinate their frequency with software: Not Available
              Maximum duration of a clock frequency change: Cannot determine or is not supported. Not Available
              Available cpufreq controllers: Not Available
              Unable to determine current policy
              current CPU frequency: Unable to call hardware
              current CPU frequency: Unable to call to kernel
              boost state support:
              Supported: yes
              Active: yes
              2900 MHz max turbo 4 active cores
              2900 MHz max turbo 3 active cores
              2900 MHz max turbo 2 active cores
              3100 MHz max turbo 1 active cores
              Now, being the genius that you clearly are, you may notice that even though Linux now basically knows as much as you do about the state of the operating CPU (i.e. NOTHING!), it still at least manages to get the different boost states absolutely correct and on point! Isn't this just wonderful?
              Linux 1 - "intelfx" (isn't the name kinda ironic here?) 0

              So, you may now wonder: "But what about the output of »sudo cpupower monitor«?"
              You asked for it - you got it (sorry about the slightly messed up formatting, but I just can't be bothered to fix that for you too):
                  | Nehalem                   || SandyBridge        || Mperf              || Idle_Stats
               CPU| C3   | C6   | PC3  | PC6  || C7   | PC2  | PC7  || C0   | Cx   | Freq || POLL | C1   | C1E  | C3   | C6   | C7
                 0|  0,05|  0,00|  0,03|  1,74|| 98,96|  0,77| 91,88||  0,47| 99,53|  2044||  0,00|  0,00|  0,11|  0,06|  0,00| 99,37
                 1|  0,05|  0,00|  0,03|  1,74|| 98,96|  0,77| 91,88||  0,19| 99,81|  2131||  0,01|  0,11|  0,02|  0,01|  0,00| 99,67
                 2|  0,14|  0,00|  0,03|  1,74|| 97,98|  0,77| 91,88||  0,71| 99,29|  2095||  0,00|  0,01|  0,00|  0,03|  0,00| 99,27
                 3|  0,14|  0,00|  0,03|  1,74|| 97,98|  0,77| 91,88||  0,84| 99,16|  1963||  0,00|  0,03|  0,00|  0,12|  0,00| 98,99
              As anyone can clearly see now: NO, you dumb pump, the CPU is NOT operating at its highest frequency all the time like you seem to be hallucinating! (Talk about smoking strange stuff here...)
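
              To put a number on that, here is a rough Python sketch that pulls the Mperf "Freq" column out of the four data rows above and compares it with the 2.50 GHz nominal clock from the model name and the 3100 MHz single-core turbo. The function name mperf_freqs and the column index are assumptions that only fit this particular column layout, and keep in mind that Mperf reports the average clock while the core is in C0, which here is under 1% of the time:

```python
# Data rows copied from the cpupower monitor output above.
ROWS = """\
0| 0,05| 0,00| 0,03| 1,74|| 98,96| 0,77| 91,88|| 0,47| 99,53| 2044|| 0,00| 0,00| 0,11| 0,06| 0,00| 99,37
1| 0,05| 0,00| 0,03| 1,74|| 98,96| 0,77| 91,88|| 0,19| 99,81| 2131|| 0,01| 0,11| 0,02| 0,01| 0,00| 99,67
2| 0,14| 0,00| 0,03| 1,74|| 97,98| 0,77| 91,88|| 0,71| 99,29| 2095|| 0,00| 0,01| 0,00| 0,03| 0,00| 99,27
3| 0,14| 0,00| 0,03| 1,74|| 97,98| 0,77| 91,88|| 0,84| 99,16| 1963|| 0,00| 0,03| 0,00| 0,12| 0,00| 98,99
"""

NOMINAL_MHZ = 2500    # from "i5-3210M CPU @ 2.50GHz"
MAX_TURBO_MHZ = 3100  # from the boost table in cpupower frequency-info


def mperf_freqs(rows, freq_column=10):
    """Extract the Mperf Freq value (MHz) from each cpupower monitor data row."""
    freqs = []
    for line in rows.strip().splitlines():
        # Group separators are '||'; flatten them so every cell is split on '|'.
        cells = [cell.strip() for cell in line.replace("||", "|").split("|")]
        freqs.append(int(cells[freq_column]))
    return freqs


if __name__ == "__main__":
    freqs = mperf_freqs(ROWS)
    mean = sum(freqs) / len(freqs)
    print("per-CPU MHz:", freqs)
    print(f"mean: {mean:.2f} MHz")
    print("all below nominal:", all(f < NOMINAL_MHZ for f in freqs))
    print("all below max turbo:", all(f < MAX_TURBO_MHZ for f in freqs))
```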

              Anyway, before I let you go, here is some more food for thought from an AMD developer who, believe it or not, is way more competent than you - trust me on that one!
              agd5f
              X.Org ATI Driver Developer

              This seems a bit like second guessing the hardware. The power management controllers on the CPU already monitor all of this with much better latency and accuracy than the OS could. I thought the whole point of CPPC was more to give the CPU a hint as to what the target performance range is so the power management unit can better tune its dynamic power control, not as a way to override what the CPU is trying to do.
              Very much looking forward to your answer now, which I'm sure is going to enlighten us all into oblivion!!


              • #8
                Time to fork Mesa into classic-mesa and mesa; with glvnd these days, and all currently-maintained drivers being Gallium-based, there's no real big reason not to.

                It'd also be cool over time to factor the OpenGL-supporting functionalities of the hardware-specific GL drivers into Vulkan extensions, then slowly improve Zink to use all of them well enough to eclipse those drivers.
                Last edited by microcode; 25 March 2021, 09:44 AM.


                • #9
                  Originally posted by fafreeman
                  Could AMD use Mesa's OpenGL in their Windows driver to replace their own? OpenGL performance has been one of Radeon's biggest complaints since long before AMD bought ATI.
                  They're getting there bit by bit; though I think if they were to do that, they'd want to also have full support in their Mesa for the D3D{8..12} thing (I forget what they call the intermediate layer on Windows for these). And no, they have no current concrete plan to do this; but there are people, mostly in the broader community/public, who are getting Mesa there bit by bit whether AMD cares to or not.


                  • #10
                    Originally posted by microcode

                    They're getting there bit by bit; though I think if they were to do that, they'd want to also have full support in their Mesa for the D3D{8..12} thing (I forget what they call the intermediate layer on Windows for these). And no, they have no current concrete plan to do this; but there are people, mostly in the broader community/public, who are getting Mesa there bit by bit whether AMD cares to or not.
                    Either that, or port both drivers to the same Windows driver interface so that they can run on it at the same time. Then you might have Mesa implementing OpenGL, Vulkan, DX7-10, etc., and the proprietary driver implementing DX11-DX12, etc.
