Skylake & Newer Could Still See Faster Linux Graphics Performance

  • Skylake & Newer Could Still See Faster Linux Graphics Performance

    Phoronix: Skylake & Newer Could Still See Faster Linux Graphics Performance

    With my recent tests of Intel Kabylake graphics on Linux 4.13 showing no change in performance, readers asked whether the Intel Linux graphics driver has plateaued and already reached its maximum performance. It hasn't...

  • #2
    Unless they're testing something like glxgears at 720p, I'm not really sure what bandwidth has to do with this. Even modern mid-range AMD and Nvidia GPUs play modern games just fine on x4 PCIe lanes, and those are a lot more powerful than Intel's hardware.

    But if Intel insists there is room for improvement, I'd be glad to be proven wrong.



    • #3
      Originally posted by schmidtbag View Post
      Unless they're testing something like glxgears at 720p, I'm not really sure what bandwidth has to do with this.
      FYI, one of the main bottlenecks for iGPUs is insufficient RAM-to-iGPU bandwidth (they are using RAM that wasn't designed for GPUs in the first place).
      Overclocking RAM gives up to 20% better iGPU performance on APUs (where there is actually an iGPU able to play something), and that's just from the dumb increase in RAM bandwidth.

      Dedicated cards have their own (special) RAM on a wider memory bus for a reason.

      Anything that increases bandwidth for the iGPU is very welcome, any day.
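
      To put rough numbers on that overclocking point, here is a back-of-the-envelope sketch (my own illustration, not from the article) of theoretical peak DDR bandwidth per speed grade and channel count; the DDR4 grades are example values:

      Code:
def ddr_peak_gbs(mt_per_s, channels, bus_bits=64):
    # Peak bandwidth = transfers per second * bytes per transfer * channels.
    # Theoretical peak only; sustained throughput is lower, and the CPU and
    # iGPU share whatever is actually available.
    return mt_per_s * 1e6 * (bus_bits / 8) * channels / 1e9

print(ddr_peak_gbs(2133, 2))  # DDR4-2133 dual channel      -> ~34.1 GB/s
print(ddr_peak_gbs(3000, 2))  # DDR4-3000 dual channel      -> ~48.0 GB/s ("overclocked")
print(ddr_peak_gbs(3000, 1))  # same sticks, single channel -> ~24.0 GB/s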



      • #4
        Originally posted by starshipeleven View Post
        FYI, one of the main bottlenecks for iGPUs is insufficient RAM-to-iGPU bandwidth (they are using RAM that wasn't designed for GPUs in the first place).
        Overclocking RAM gives up to 20% better iGPU performance on APUs (where there is actually an iGPU able to play something), and that's just from the dumb increase in RAM bandwidth.

        Dedicated cards have their own (special) RAM on a wider memory bus for a reason.

        Anything that increases bandwidth for the iGPU is very welcome, any day.
        I'm aware RAM is usually the main bottleneck, though that shouldn't affect its performance versus Windows. I do see how the PCIe bus could potentially be the cause of the performance loss, since that is something the driver affects. But my point is that these IGPs are so slow that I don't see how it's possible they could be starving for bandwidth.

        And sure, there's absolutely nothing wrong with increasing bandwidth, but I don't think it's going to help much beyond synthetic benchmarks, or at least it won't close the performance gap against Windows.



        • #5
          Newer Intel CPUs require more and more firmware blobs, so even that no longer speaks in favor of Intel graphics. I will stick with the older generations, where gaming performance is fairly unimportant to me anyway.



          • #6
            Originally posted by schmidtbag View Post
            I'm aware RAM is usually the main bottleneck, though that shouldn't affect its performance versus Windows.
            It depends on how the Windows driver deals with it versus how the Linux driver does. If the Windows driver deals with it more efficiently, then it runs better.

            I do see how the PCIe bus could potentially be the cause of the performance loss, since that is something the driver affects. But my point is that these IGPs are so slow that I don't see how it's possible they could be starving for bandwidth.
            iGPUs aren't on PCIe; they sit on an internal die interconnect that is ridiculously fast (it's used by all the on-die blocks to talk to each other: CPU, iGPU, RAM controller, PCIe controller, and the chipset link that has to carry SATA and other PCIe traffic in that same silicon).
            It really makes no sense whatsoever to place an on-die component behind the PCIe controller; that would just add latency for no reason.

            That's also one of the reasons iGPUs can't be passed through to virtual machines without dedicated iGPU virtualization infrastructure like the one Intel is building. IOMMU lets you pass through PCIe devices, but the iGPU isn't on PCIe.

            So no, when they talk about "bandwidth" it's not PCIe, but raw iGPU-RAM bandwidth.

            You know one of the tricks they used to make Iris Pro so great? They added an on-package L4 cache of a generous size (128 MB of eDRAM on the Crystal Well parts), and that is obviously on a FAR faster bus than RAM.
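
            If you want a feel for that raw iGPU-RAM bandwidth, a quick CPU-side measurement through the same memory controller gives a rough idea. This is just a sketch using numpy (the array size is my own choice), not an iGPU benchmark:

            Code:
import time
import numpy as np

# Arrays large enough to blow past every cache level (256 MB each).
N = 32 * 1024 * 1024
a = np.zeros(N)
b = np.random.rand(N)

start = time.perf_counter()
np.copyto(a, b)                 # streaming copy: read b once, write a once
elapsed = time.perf_counter() - start

# Roughly 2 * N * 8 bytes of traffic (write-allocate can add a bit more).
print(f"~{2 * N * 8 / elapsed / 1e9:.1f} GB/s sustained copy bandwidth")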
            Last edited by starshipeleven; 22 July 2017, 12:07 PM.



            • #7
              Originally posted by starshipeleven View Post
              FYI, one of the main bottlenecks for iGPUs is insufficient RAM-to-iGPU bandwidth (they are using RAM that wasn't designed for GPUs in the first place).
              Overclocking RAM gives up to 20% better iGPU performance on APUs (where there is actually an iGPU able to play something), and that's just from the dumb increase in RAM bandwidth.
              Right. On the other hand, the GeForce GT 1030 has 48 GB/s of memory bandwidth and the GTX 750 has 80 GB/s, while a dual-channel DDR4-3200 system has 51 GB/s. System memory bandwidth isn't that bad. You also need to consider the cache systems.

              Dedicated cards have their own (special) RAM on a wider memory bus for a reason.
              Only the high-end models have sufficient memory bandwidth. The GTX 1060 and the other "budget" models have artificially limited 64- to 192-bit interfaces. They even use DDR instead of GDDR.
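
              Those figures follow from the usual bus-width times data-rate arithmetic; the sketch below just reproduces them (the bus widths and effective data rates are the commonly quoted reference specs, so treat them as assumptions):

              Code:
def vram_peak_gbs(data_rate_gtps, bus_bits):
    # Peak bandwidth = effective transfers per second * bus width in bytes.
    return data_rate_gtps * (bus_bits / 8)

print(vram_peak_gbs(6.0, 64))   # GT 1030: GDDR5 @ 6 GT/s, 64-bit      -> 48 GB/s
print(vram_peak_gbs(5.0, 128))  # GTX 750: GDDR5 @ 5 GT/s, 128-bit     -> 80 GB/s
print(vram_peak_gbs(3.2, 128))  # DDR4-3200 dual channel (2 x 64-bit)  -> ~51 GB/s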



              • #8
                Originally posted by starshipeleven View Post
                iGPUs aren't on PCIe; they sit on an internal die interconnect that is ridiculously fast (it's used by all the on-die blocks to talk to each other: CPU, iGPU, RAM controller, PCIe controller, and the chipset link that has to carry SATA and other PCIe traffic in that same silicon).
                It really makes no sense whatsoever to place an on-die component behind the PCIe controller; that would just add latency for no reason.

                That's also one of the reasons iGPUs can't be passed through to virtual machines without dedicated iGPU virtualization infrastructure like the one Intel is building. IOMMU lets you pass through PCIe devices, but the iGPU isn't on PCIe.

                So no, when they talk about "bandwidth" it's not PCIe, but raw iGPU-RAM bandwidth.

                You know one of the tricks they used to make Iris Pro so great? They added an on-package L4 cache of a generous size (128 MB of eDRAM on the Crystal Well parts), and that is obviously on a FAR faster bus than RAM.
                AFAIK the iGPU sits on the SoC interconnect (as you say, much like any core) for cache coherency, attached over the GTI bus; I think the GTI is just a glue bus. The iGPU has its own L1 and L2, it shares the L3 with the CPU cores (at least its memory accesses pass through it), and on Iris Pro there is a large eDRAM L4. The L4 is usable by the CPU cores too, since there really isn't any difference in the cache-hierarchy view, so Iris Pro CPUs get better throughput on larger data sets.

                As for the eDRAM itself, I think it offers on the order of two to three times the bandwidth of external DRAM. On the CPU side it's a victim cache for the L3; I don't know whether the GPU uses its slice with a different policy, but I doubt it, since the eDRAM is on its own memory channel, so the easy thing to do is just use it as an eviction dump.
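
                As a toy model of why that L4 matters (my own illustration, using the "two to three times external DRAM" estimate above as an assumption), the effective bandwidth the GPU sees is a harmonic mix of the eDRAM and DRAM rates, weighted by how much traffic the L4 absorbs:

                Code:
DRAM_GBS = 25.6             # dual-channel DDR3-1600, typical for those parts
EDRAM_GBS = 2.5 * DRAM_GBS  # "two to three times" external DRAM (assumption)

def effective_gbs(l4_hit_rate):
    # Time to move one GB splits between the fast eDRAM and slow DRAM paths.
    time_per_gb = l4_hit_rate / EDRAM_GBS + (1 - l4_hit_rate) / DRAM_GBS
    return 1 / time_per_gb

for hit in (0.0, 0.25, 0.5, 0.75):
    print(f"L4 absorbs {hit:.0%} of traffic -> ~{effective_gbs(hit):.0f} GB/s effective")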



                • #9
                  Originally posted by caligula View Post
                  Right. On the other hand, the GeForce GT 1030 has 48 GB/s of memory bandwidth and the GTX 750 has 80 GB/s, while a dual-channel DDR4-3200 system has 51 GB/s. System memory bandwidth isn't that bad. You also need to consider the cache systems.
                  Dual-channel DDR4-3200 is also in use by the CPU at the same time, so that's not really 51 GB/s for the iGPU.
                  That's also assuming the system is actually running in dual channel at all. I've seen plenty of laptops with only one channel populated (even if they have 2 DIMM slots).

                  Only the high-end models have sufficient memory bandwidth. The GTX 1060 and the other "budget" models have artificially limited 64- to 192-bit interfaces. They even use DDR instead of GDDR.
                  The GTX 1060 isn't a "budget" model in any way, shape or form; it's a midrange card, and it has a 192-bit interface. Not even (current) APUs can go anywhere near that. Let's see what they can pull off with Ryzen.

                  For actual "budget" cards, yes, the GT 1030 does have a 64-bit interface and is usually on DDR. And while even the 1030 pwns Intel's iGPUs (excluding the usual Iris Pro), the (current) desktop APUs are still ahead of it.

                  EDIT: looks like I was confusing the 1030 with the 730, lol. They're on GDDR5.
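
                  On the single-channel point: if you want to check what a given machine is actually running, dmidecode lists the populated memory devices. A rough sketch (locator naming varies by vendor, so this is only a heuristic):

                  Code:
import subprocess

# Needs root; "dmidecode -t memory" prints one "Memory Device" block per slot.
out = subprocess.run(["dmidecode", "-t", "memory"],
                     capture_output=True, text=True, check=True).stdout

populated = []
for block in out.split("\n\n"):
    if "Memory Device" in block and "No Module Installed" not in block:
        locator = next((line.split(":", 1)[1].strip()
                        for line in block.splitlines()
                        if line.strip().startswith("Locator:")), "?")
        populated.append(locator)

print(f"{len(populated)} DIMM(s) populated: {populated}")
print("Check whether the locators span more than one channel or bank.")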
                  Last edited by starshipeleven; 23 July 2017, 02:06 PM.



                  • #10
                    All GeForce 10-series cards use GDDR5, with some using GDDR5X. The DDR used in past budget cards was the older type (DDR2/DDR3), which is no longer an option, as manufacturers have transitioned to making DDR4 for PCs and GDDR5/5X for video cards.

