Intel's Open-Source Compute Runtime Performing Increasingly Well Against NVIDIA's Proprietary Linux Driver


    Phoronix: Intel's Open-Source Compute Runtime Performing Increasingly Well Against NVIDIA's Proprietary Linux Driver

    Given the recent launch of the Intel Arc Graphics A580 for under $200, I've been working on a fresh round of Intel / AMD Radeon / NVIDIA GeForce Linux gaming/graphics and compute benchmark results. Next week that fresh arsenal of Linux graphics benchmarks on the very latest drivers will be published, but for today here is a look at the most surprising aspect: the OpenCL-focused GPU compute benchmarks.


  • #2
    How about we compare it against an old Voodoo Banshee to make it really look good :P


    • #3
      Originally posted by FireBurn View Post
      How about we compare it against an old Voodoo Banshee to make it really look good :P
      The RTX 4060 is the most recent graphics card released by Nvidia, so it seems fair


      • #4
        Originally posted by FireBurn View Post
        How about we compare it against an old Voodoo Banshee to make it really look good :P
        Yeah, let's compare it against gfx cards costing 5-10x as much.

        Do you have the same requirements for car reviewers?
        Oh yes, compare everything against a Lamborghini or Bentley. The new VW Golf is very nice, but its 0-60 mph is still way slower than a Lamborghini's. Pro: size / Con: performance


        • #5
          Wow, good job Intel!
          I wonder WTF is AMD doing in this area?


          • #6
            Funny how the 4060 is worse than the 3060 in some tests


            • #7
              Intel is performing/functioning well in the AI space too.

              If there is a 24GB+ Battlemage card, it will almost certainly be the replacement for my 3090. AMD is an unlikely choice in their current software state, and Nvidia is outrageously priced and a pain to deal with on Linux.
              Last edited by brucethemoose; 26 October 2023, 05:15 PM.


              • #8
                Good article, Michael Larabel, thanks for making it!

                It's interesting to see for someone with an interest in both the Arc and NVIDIA card capability / performance worlds!
                Running the nice PTS benchmarking test suite was part of my initial test / burn-in / benchmark runs on my recent Arc / Ryzen system build.

                Although it has been widely discussed in passing, it might be interesting for Arc (and NVIDIA, AMD) users if future benchmarks also showed the power consumption
                achieved by the GPU (and, for that matter, the whole system as well?) while just sitting idle at the graphical desktop; idle with the screen blanked but the system
                running; with a single 1080p / 60 Hz monitor attached to the GPU; a single 4K / 60 Hz monitor; dual monitors; and with one or both monitors turned soft-off.
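For anyone wanting to spot-check these idle numbers themselves, here is a best-effort sketch that sweeps whatever power sensors the GPU driver exposes through hwmon sysfs, assuming a Linux system. Note the file names are driver-specific assumptions: amdgpu reports power1_average in microwatts, while other drivers may only expose power1_input or energy counters, so treat this as a diagnostic sweep rather than a vendor API.

```shell
# Best-effort sweep of hwmon power sensors (values are in microwatts).
found=0
for f in /sys/class/hwmon/hwmon*/power1_average /sys/class/hwmon/hwmon*/power1_input; do
    [ -r "$f" ] || continue
    uw=$(cat "$f" 2>/dev/null) || continue
    # skip anything non-numeric
    case "$uw" in ''|*[!0-9]*) continue ;; esac
    # the driver name (e.g. "amdgpu") lives next to the sensor file
    name=$(cat "$(dirname "$f")/name" 2>/dev/null || echo unknown)
    # convert microwatts to watts with one decimal place
    printf '%s: %d.%01d W\n' "$name" $((uw / 1000000)) $(((uw % 1000000) / 100000))
    found=1
done
[ "$found" -eq 1 ] || echo "no hwmon power sensor found"
```

Sampling this in a loop while the desktop sits idle (screen on, blanked, different monitor counts) would give roughly the per-state figures described above, minus whatever the sensor's own averaging window hides.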

                Perhaps the lower "Watts" bars in the GPU power consumption charts for the in-use tests already, to some extent for some tests, capture the lowest achieved
                levels, if the lowest power state was briefly reached just before or just after the high-activity parts of the test. The A770 seemed to have
                a low bar near (rough graphical estimate) 35 W while the 3070 got down to ~20 W in the FluidX3D 2.9 test, for instance, though I'm surprised I didn't see any of the NV cards drop further than that; the test wasn't intended to measure that aspect, of course.

                Arc cards in the 7x series tend to have very bad idle power consumption absent the Intel-recommended ASPM settings, which many cannot seem to apply due to missing BIOS enable options or system-level incompatibility. And even if those are enabled, exceeding a modest resolution x VSYNC rate on
                one or more monitors, or running more than one monitor, will likely keep it from dropping to a low power consumption level in any case.
                E.g. I've never seen my A770-16-LE take less than 41 W at an idle Linux desktop in my X570 / Ryzen 5900X / Ubuntu / Manjaro tests.

                I imagine the results will vary per motherboard / BIOS ASPM capabilities / settings, and whatever Linux kernel settings may support or override those. At one point I even thought there was some kind of "force" ASPM enable flag that might be relevant to Arc idle, but I'll have to research it; I think the implication was that modern Linux should do that automatically (most relevant initially for laptop use cases), in which case maybe there's nothing to adjust.
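For what it's worth, the "force" flag does exist as a kernel command-line parameter (pcie_aspm=force), alongside a runtime policy knob. A minimal sketch for checking the active policy, assuming a Linux kernel that exposes the pcie_aspm module parameters; whether forcing it actually helps a given Arc board is exactly the open question.

```shell
# Report the kernel's active ASPM policy, if the knob is exposed.
if [ -r /sys/module/pcie_aspm/parameters/policy ]; then
    aspm_msg="ASPM policy: $(cat /sys/module/pcie_aspm/parameters/policy)"
else
    aspm_msg="pcie_aspm module parameters not exposed"
fi
echo "$aspm_msg"

# Per-device link state (needs pciutils, root for full output):
#   sudo lspci -vvv | grep -E 'VGA|ASPM'
# To force-enable ASPM at boot despite the BIOS not advertising it,
# append to the kernel command line, e.g. in GRUB (use with care):
#   pcie_aspm=force pcie_aspm.policy=powersupersave
```

The policy file normally reads something like `default performance [powersave] powersupersave`, with the active policy in brackets; echoing another value into it switches policy at runtime (as root).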

                I've lost track of what's included in other benchmarks, but with the new-ish additions of NV's TensorRT acceleration improvements for some of the Stable Diffusion UI configurations / versions / use cases, as well as Intel's corresponding OpenVINO accelerations and their better DG2 support for PyTorch and TF in GPU mode, I know there are cases where both NV and Arc are performing a lot better for those SD UI workflows. So that (if not already in the benchmark mix / reports for some classes of reports) might be interesting for many to follow.

                On a totally tangential front, one thing I've not seen good data on for consumer GPUs is the extent to which their RAM has integrity and their calculations are correct. The enterprise cards can have the equivalent of ECC in some GDDR variants, but I don't know that any consumer GPUs actually enable it or have it, and the consumer cards tend to be clocked more aggressively than the industrial ones. So for those doing GPU compute, the question is whether, in computing for an hour, day, week, or month, the RAM / computation is likely corrupted anyway on these consumer GPUs; a similar case holds for consumer non-ECC motherboards with DDR4/5 UDIMM-type RAM.
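That integrity question can at least be probed from user space with a pattern soak test: fill a large buffer with a known pattern, read it back repeatedly, and count mismatching bytes. A minimal sketch of the logic, with a bytearray standing in for VRAM; on a real GPU the buffer would be allocated on the device and moved via OpenCL/CUDA transfers (e.g. clEnqueueWriteBuffer / clEnqueueReadBuffer) and the check looped for hours.

```python
def write_pattern(buf: bytearray, pattern: int = 0xA5) -> None:
    """Stands in for uploading a known test pattern into VRAM."""
    for i in range(len(buf)):
        buf[i] = pattern

def count_mismatches(buf: bytearray, pattern: int = 0xA5) -> int:
    """Stands in for downloading the buffer and verifying every byte."""
    return sum(1 for b in buf if b != pattern)

vram = bytearray(64 * 1024)        # pretend 64 KiB of "VRAM"
write_pattern(vram)
print(count_mismatches(vram))      # 0 on healthy memory

vram[1234] ^= 0x01                 # simulate a single flipped bit
print(count_mismatches(vram))      # 1: the kind of error ECC would catch
```

A real soak test would also alternate patterns (0x00, 0xFF, walking ones) and mix in arithmetic kernels with known answers, since a bad bit can hide from any single pattern; this only illustrates the shape of the check.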

                Just mentioning some things that could be of general interest, as ideas if you're ever looking for directions to measure / publish that could be derived from your existing setups / data etc. but that would present information on more areas of use / concern.


                • #9
                  Intel can do the numbers as well as anyone else. Those results are probably why the A580 is priced where it is. It appears Intel is positioning its dGPUs as an inexpensive compute alternative to Nvidia more than as competition in the already sewn-up and highly unforgiving gaming market. It's bold, and it may pay off. I think they may be overestimating the number of people who only want a low-end compute system and don't play games, or only casually so, but perhaps not.

                  Edit to add: If they can get a handle on the idle power burn with a firmware update, or in subsequent generations, they've probably got a winner for the low-end compute niche, which they'll be able to use later on to muscle in on the higher-margin server compute market. Familiarity with a product and technology has its own kind of loyalty. Similar to the way x86 eventually muscled out more capable, but far more expensive and less available, "big iron" and proprietary Unix in the aughts.
                  Last edited by stormcrow; 26 October 2023, 08:00 PM.


                  • #10
                    Wth, another vendor offering a compute stack nicely integrated with the gfx driver? Shouldn't it be a completely separate implementation that users have to hunt down and install themselves? I thought that was the modern, open approach to supporting compute :P