
PCI Express 4.0 Is Ready, PCI Express 5.0 In 2019


  • #61
    Originally posted by jabl View Post
    You seem to be assuming that bitcoin mining and/or whatever you have benchmarked is representative of all GPGPU applications. If this were true, why, for instance, did Nvidia go through all the trouble and expense of creating NVLink for their high-end Teslas instead of just using a 1x PCI-E connector?
    Exactly - some compute apps have relatively light bandwidth requirements relative to the amount of processing done on each chunk of data, but for every one of those apps there is another that runs transfer-limited, and hence bus-limited, even after heavy optimization.
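    As a rough back-of-the-envelope illustration of that split (a sketch with made-up numbers, not a benchmark): a kernel is bus-limited roughly when moving its data over PCIe takes longer than the GPU spends computing on it.

    ```python
    # Rough sketch only: all figures below are assumptions, not measurements.
    # PCIe 3.0 x16 usable bandwidth ~15.75 GB/s; GPU throughput ~10 TFLOP/s.

    def is_bus_limited(bytes_moved, flops, pcie_bytes_per_s=15.75e9, gpu_flops_per_s=10e12):
        """True when PCIe transfer time exceeds GPU compute time for one chunk."""
        transfer_s = bytes_moved / pcie_bytes_per_s
        compute_s = flops / gpu_flops_per_s
        return transfer_s > compute_s

    # Hash-style workload: tiny input, heavy compute -> not bus-limited.
    print(is_bus_limited(bytes_moved=80, flops=1e7))        # False
    # Streaming filter: ~2 GB moved, light math -> bus-limited.
    print(is_bus_limited(bytes_moved=2e9, flops=5e9))       # True
    ```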

    Comment


    • #62
      Originally posted by jabl View Post
      You seem to be assuming that bitcoin mining and/or whatever you have benchmarked is representative of all GPGPU applications. If this were true, why, for instance, did Nvidia go through all the trouble and expense of creating NVLink for their high-end Teslas instead of just using a 1x PCI-E connector?
      Yes, you're right; I stand corrected. In the HPC world, workloads can certainly vary wildly in terms of compute vs. I/O requirements. When I was building DEC Alpha clusters in the late '90s and early '00s with Quadrics and Myrinet interconnects, we saw this firsthand. In some workloads the interconnect latency and throughput made almost zero difference, and in others it made an order of magnitude difference.

      Comment


      • #63
        Originally posted by torsionbar28 View Post
        A single PCI-E 1x slot is fine for Xeon Phi or GPU compute. Google for "bitcoin rig" to see real-world examples. A GPU in a 1x PCI-E slot performs no differently than in a 16x PCI-E slot for compute applications. I've benchmarked it myself. That's the entire point - the bandwidth-intensive stuff happens *inside* the card, with its own dedicated processor and memory, so the only thing traversing the PCI-E bus is the initial data set getting loaded in and the results coming out. These faster 4.0 and 5.0 PCI-E revisions do nothing to benefit GPU compute applications.
        My current GPU compute program uses about 33% of my PCIe 3.0 x16 bandwidth (so around 5 GB/s), just over x4 speed. When GPUs come with more memory and I can get some more CPU cores, I'll probably be sending more data than that, possibly around 4x as much, at which point PCIe 3.0 x16 would be maxed out.
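        To put that figure in context, here is a quick sketch of the narrowest link width that could sustain roughly 5 GB/s at each PCIe generation; the per-lane numbers are approximate usable bandwidth per direction after encoding overhead, taken as rough assumptions rather than measurements.

        ```python
        # Approximate usable GB/s per lane, per direction (8b/10b encoding for
        # gen 1/2, 128b/130b from gen 3 onward). Rough figures only.
        GBPS_PER_LANE = {"1.0": 0.25, "2.0": 0.5, "3.0": 0.985, "4.0": 1.969, "5.0": 3.938}

        def min_width(required_gbps, gen):
            """Smallest standard link width that can carry required_gbps GB/s."""
            for width in (1, 2, 4, 8, 16):
                if width * GBPS_PER_LANE[gen] >= required_gbps:
                    return f"x{width}"
            return "not achievable"

        for gen in GBPS_PER_LANE:
            print(f"PCIe {gen}: {min_width(5.0, gen)}")
        # -> 1.0: not achievable, 2.0: x16, 3.0: x8, 4.0: x4, 5.0: x2
        ```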

        Comment


        • #64
          Originally posted by GI_Jack View Post

          Why do we need more PCI-e lanes then? Why aren't 22 lanes enough? Do we run more than 22 devices? I think not.

          In the future, devices that are currently x16 or x8 can run on x4 and x2, and perhaps even x1.

          Ideally, we could have one device per lane, with only special exceptions. It'd make the wiring far simpler.
          22 lanes is not enough for me; I'm eyeing those nice big AMD chips with 64-128 lanes. More lanes are quite helpful if you're adding a lot of cards that need the bandwidth. I am curious, though, whether I can use a PCIe 2.0 x4 card, such as a USB 3.1 controller, in an x1 slot.
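          For what it's worth, a PCIe link trains down to the lowest common generation and narrowest common width of the card and the slot, so the answer is mostly min() arithmetic; a rough sketch with approximate per-lane figures (and a hypothetical USB 3.1 Gen 2 card) follows.

          ```python
          # Sketch: a link negotiates min(generation) and min(width) of card and slot.
          # Per-lane GB/s figures are approximate (per direction, after encoding overhead).
          GBPS_PER_LANE = {1: 0.25, 2: 0.5, 3: 0.985, 4: 1.969}

          def negotiated_gbps(card_gen, card_width, slot_gen, slot_width):
              return min(card_width, slot_width) * GBPS_PER_LANE[min(card_gen, slot_gen)]

          # Hypothetical PCIe 2.0 x4 USB 3.1 card in a 3.0 x1 slot (open-ended or via riser):
          link = negotiated_gbps(card_gen=2, card_width=4, slot_gen=3, slot_width=1)
          usb31_gen2 = 10 / 8  # 10 Gbit/s signalling ~ 1.25 GB/s before protocol overhead
          print(link, usb31_gen2)  # ~0.5 GB/s link vs ~1.25 GB/s USB -> the slot bottlenecks it
          ```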

          Comment


          • #65
            Originally posted by commodore256 View Post
            Oh, I'm going to love PCIe 5.0 because I'm crazy.

            I hope one day I could get a laptop that supports IOMMU groups via external PCIe connections like Thunderbolt or OCuLink, so I can get a dockable legacy peripheral solution. I would give it the kitchen sink: dual PCIe 1.0 x16 slots so I can run Windows 9x and XP in VMs at the same time with native GPU passthrough (there are early PCIe cards that have Win 9x drivers), and enough PCIe lanes for a plethora of legacy cards like a floppy controller, a SCSI card, IDE, SATA, FireWire, LPT, COM, the best Sound Blaster with 9x/XP drivers, an Ageia PhysX card, multiple capture cards for different things, etc. Three PCIe 5.0 lanes would have plenty of bandwidth for all of that.

            Like I said, I'm crazy :P I also hate USB because the bandwidth is never consistent.
            So rather than a desktop where everything is neatly contained in a box, you want a tangled spaghetti octopus of cables, devices, and power bricks. Whatever makes you happy.

            FYI, DirectX 9.0c with Shader Model 3.0 and OpenGL 2.1 are already supported in VMs today, without any hardware passthrough. Any modern GPU is capable of delivering triple-digit frame rates to multiple VMs concurrently.

            Comment


            • #66
              Originally posted by polarathene View Post

              22 lanes is not enough for me; I'm eyeing those nice big AMD chips with 64-128 lanes. More lanes are quite helpful if you're adding a lot of cards that need the bandwidth. I am curious, though, whether I can use a PCIe 2.0 x4 card, such as a USB 3.1 controller, in an x1 slot.
              Are you going to add 64-128 devices?

              Faster lanes are helpful for devices that for whatever reason need to be one lane or four lanes, such as NVMe drives, network cards, etc. They also make the wiring a lot easier and the switching a lot simpler.

              Which means you need more bandwidth, not more lanes. Consider that not everything is a video card, not everything gets plugged into an x16 slot, and not all video cards run with all sixteen lanes at once.

              Perhaps in the future with PCIe v5 we can run video cards in x4 or even x1 slots, and there will be no special video card slots anymore. That would be great, actually: no need for complex wiring, and it also makes things easier for other high-bandwidth devices.
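              A quick arithmetic sanity check on that idea, using approximate per-lane figures (per direction, after encoding overhead):

              ```python
              # Does a narrow PCIe 5.0 link match the x16 slots video cards use today?
              pcie3_x16 = 16 * 0.985   # ~15.8 GB/s, today's standard GPU slot
              pcie5_x4  = 4 * 3.938    # ~15.8 GB/s, same ballpark on a quarter of the lanes
              pcie5_x1  = 1 * 3.938    # ~3.9 GB/s, roughly a PCIe 3.0 x4 link
              print(pcie3_x16, pcie5_x4, pcie5_x1)
              ```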

              Comment


              • #67
                Originally posted by KellyClowers View Post
                No. Far better to have an internal but easily swappable card than a stupid external device on a cord.
                For what, exactly? The only thing I can think of that this would be useful for is if you absolutely, positively must use the latest Wi-Fi standard.

                For anything wired you would already need wires anyway. And anything thin enough to be useful in most laptops today would be too thin for most sockets, meaning you would end up with an external part anyway.

                Comment


                • #68
                  Originally posted by GI_Jack View Post

                  Are you going to add 64-128 devices?

                  Faster lanes are helpful for devices that for whatever reason need to be one lane or four lanes, such as NVMe drives, network cards, etc. They also make the wiring a lot easier and the switching a lot simpler.

                  Which means you need more bandwidth, not more lanes. Consider that not everything is a video card, not everything gets plugged into an x16 slot, and not all video cards run with all sixteen lanes at once.

                  Perhaps in the future with PCIe v5 we can run video cards in x4 or even x1 slots, and there will be no special video card slots anymore. That would be great, actually: no need for complex wiring, and it also makes things easier for other high-bandwidth devices.
                  64 lanes can satisfy 4 GPUs at x16 (provided the motherboard supports it with x16 slots, or I think splitters/risers?). With 4.0 I probably won't need as many lanes. I have several x4 cards I'd like to use, plus multiple GPUs. My workload can utilize them for para-virtualized VMs, where the extra USB/disk/network controllers are helpful; higher bandwidth with 4.0 means more controllers can fit on a single expansion card, hopefully with good IOMMU groups so they can each go to different VMs. For the GPUs I do compute workloads like photogrammetry and deep learning, and these workloads can take advantage of the lanes and bandwidth far better than games can.

                  I need more lanes/slots for my next system, or PCIe 4.0 might reduce that need once products that take advantage of it are available. Currently I only have a single dGPU, with another x16 slot that would drop both to x8 if used (I think hampering my GPU performance?) and an x4 slot for the NVMe drive; that uses up my CPU lanes, and the motherboard only provides three x1 slots. I've not used risers yet, but apparently those would let me plug x4 devices into an x1 slot. I didn't know as much about this stuff when I built this machine, so unfortunately I think I have to wait until I upgrade.
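                  For illustration, a hypothetical lane budget for the kind of build described above; the device list and link widths are assumptions, not an actual parts list.

                  ```python
                  # Hypothetical lane budget; every device and width here is an assumption.
                  devices = {
                      "GPU 0": 16, "GPU 1": 16, "GPU 2": 16, "GPU 3": 16,       # compute cards at x16
                      "NVMe SSD": 4, "USB controller card": 4, "10GbE NIC": 4,  # x4 add-in cards
                  }
                  print(sum(devices.values()), "lanes at full width")  # 76 (64 for the GPUs alone)

                  # On PCIe 4.0, running each GPU at x8 still gives roughly PCIe 3.0 x16 bandwidth:
                  halved = {k: (w // 2 if w == 16 else w) for k, w in devices.items()}
                  print(sum(halved.values()), "lanes with GPUs at x8 on PCIe 4.0")  # 44
                  ```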

                  Comment


                  • #69
                    Originally posted by quaz0r View Post
                    I know you guys need to foam at the mouth over what I said, but let's at least respond to what I'm actually saying: I'm not saying PCI is bad or that we should get rid of it. I'm saying let's think of a better approach for GPUs.
                    There used to be a dedicated port for GPUs, the Accelerated Graphics Port (AGP). PCIe replaced it.

                    Comment


                    • #70
                      I/O on the CPU die doesn't shrink as much as logic does, so getting more bandwidth per lane means more devices can be attached directly to the CPU.

                      For example, the current best SSDs are already bottlenecked by PCI-E 3.0 x4, and going to x8 is just not feasible. It has already been shown that an SSD connected directly to the CPU offers some latency improvement over a connection through the chipset hub. Consider a PCI-E 5.0 SSD offering 10 GB/s sequential performance, attached directly to the CPU.

                      Essentially, like the current AMD Ryzen, the CPU becomes the SoC or the hub itself.

                      We used to be bottlenecked by I/O; now it is the CPU that needs to keep up (hence the move to many cores).
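                      The arithmetic behind that, using approximate per-direction figures for an x4 link and treating the 10 GB/s sequential number above as a hypothetical target:

                      ```python
                      # Approximate ceiling of an x4 link per PCIe generation (per direction,
                      # after encoding overhead) vs a hypothetical 10 GB/s sequential SSD.
                      x4_ceiling = {"3.0": 4 * 0.985, "4.0": 4 * 1.969, "5.0": 4 * 3.938}
                      ssd_seq = 10.0  # GB/s, the figure mentioned above

                      for gen, gbps in x4_ceiling.items():
                          verdict = "enough" if gbps >= ssd_seq else "bottleneck"
                          print(f"PCIe {gen} x4: {gbps:.1f} GB/s -> {verdict}")
                      # -> 3.0 x4 (~3.9) and 4.0 x4 (~7.9) bottleneck it; 5.0 x4 (~15.8) does not.
                      ```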

                      Comment
