NVIDIA Announces "Pascal" Next-Gen GPU Family


  • #11
    Originally posted by Luke_Wolf View Post
    Unless the situation has really changed, I don't really see the point of NVLink. When PCIe 3.0 came out, Tom's Hardware did some tests and found that there was minimal benefit to it over PCIe 2.0; at that point those lanes weren't really the bottleneck.
    Link to the article you're referring to? In any case, depending on what the benchmark measures, there are things for which you will not see a difference between PCIe 3.0, PCIe 2.0, PCIe 1.0, or even good old PCI. For instance, address translation latency, which comes from the IOMMU, has been pretty constant over time, because there is nothing to be done to optimize it besides a larger IOMMU TLB cache. PCIe latency is kind of bad, especially if you are doing a lot of random memory accesses (which compute workloads sometimes do, but not all of them).

    If the system memory on the platform running the benchmark cannot do more than 8 GB/s, then you cannot even see the difference between the two. If the benchmark you use does not trigger frequent system memory accesses, then again you will not see any advantage from PCIe 3.0.
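
    To put rough numbers on the 8 GB/s point, here is a minimal Python sketch. It assumes the theoretical per-direction rates for a x16 slot (packet overhead ignored) and assumes the slower of the link and system memory sets the ceiling. The function names are made up for illustration.

```python
# Per-lane transfer rate in GT/s and line-encoding efficiency for each generation.
# PCIe 1.0/2.0 use 8b/10b encoding, PCIe 3.0 uses 128b/130b.
PCIE_GENERATIONS = {
    "1.0": (2.5, 8 / 10),
    "2.0": (5.0, 8 / 10),
    "3.0": (8.0, 128 / 130),
}


def link_bandwidth_gbs(gen, lanes=16):
    """Theoretical one-direction link bandwidth in GB/s (no packet overhead)."""
    gt_per_s, efficiency = PCIE_GENERATIONS[gen]
    return gt_per_s * efficiency * lanes / 8  # divide by 8: bits to bytes


def effective_bandwidth_gbs(gen, memory_gbs, lanes=16):
    """Assumption: the slower of the link and system memory is the ceiling."""
    return min(link_bandwidth_gbs(gen, lanes), memory_gbs)


if __name__ == "__main__":
    for gen in ("2.0", "3.0"):
        print(gen, round(link_bandwidth_gbs(gen), 2),
              round(effective_bandwidth_gbs(gen, memory_gbs=8.0), 2))
```

    With memory capped at 8 GB/s, PCIe 2.0 x16 and PCIe 3.0 x16 come out at the same effective rate under this model, even though the 3.0 link itself is nearly twice as fast.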

    The list goes on and on.

    NVIDIA surely did not work on NVLink for no reason. There are many bottlenecks in PCIe when it comes to accessing system memory (address translation, IOMMU TLB misses, PCIe packet overhead, ...).



    • #12
      Originally posted by glisse View Post
      Link to the article you're referring to? In any case, depending on what the benchmark measures, there are things for which you will not see a difference between PCIe 3.0, PCIe 2.0, PCIe 1.0, or even good old PCI. For instance, address translation latency, which comes from the IOMMU, has been pretty constant over time, because there is nothing to be done to optimize it besides a larger IOMMU TLB cache. PCIe latency is kind of bad, especially if you are doing a lot of random memory accesses (which compute workloads sometimes do, but not all of them).

      If the system memory on the platform running the benchmark cannot do more than 8 GB/s, then you cannot even see the difference between the two. If the benchmark you use does not trigger frequent system memory accesses, then again you will not see any advantage from PCIe 3.0.

      The list goes on and on.

      NVIDIA surely did not work on NVLink for no reason. There are many bottlenecks in PCIe when it comes to accessing system memory (address translation, IOMMU TLB misses, PCIe packet overhead, ...).
      Tom's Hardware does gaming-related tests, and in gaming it's true that the difference is small.
      However, NVLink isn't for gaming but for NVIDIA's Tesla series, and for certain computations
      it will probably make a huge difference.



      • #13


        Turns out Volta remains on the roadmap, but it comes after Pascal and will evidently include more extensive changes to Nvidia's core GPU architecture.

        Nvidia has inserted Pascal into its plans in order to take advantage of stacked memory and other innovations sooner. (I'm not sure we can say that Volta has been delayed, since the firm never pinned down that GPU's projected release date.) That makes Pascal intriguing even though its SM will be based on a modified version of the one from Maxwell. Memory bandwidth has long been one of the primary constraints for GPU performance, and bringing DRAM onto the same substrate opens up the possibility of substantial performance gains.
        Volta is still coming and was not renamed to Pascal.



        • #14
          Originally posted by sunweb View Post
          I also don't really understand the talk about PCIe being the bottleneck when that was never the case, neither for OpenGL nor for OpenCL/CUDA performance. In fact, AMD said goodbye to CrossFire bridges for the same reason: PCIe is enough for all needs. I thought NVIDIA would do the same, but somehow they haven't.

          I liked the part where he said that all GPUs will give linear performance once you connect them with NVLink (renamed SLI, I presume). I really hope he was talking about graphics, because for compute tasks that was always the case.
          GPU compute is very PCIe limited. There is an entire class of problems that don't perform well on GPUs because of PCIe latency and bandwidth. Seriously, it is incredibly hard to find an OpenCL or CUDA research paper that doesn't talk about PCIe bottlenecks.

          Gaming doesn't show a major performance difference because games are designed around the bottleneck. A game sends texture data and such early, and you pray to god that you can leave it on the GPU. If not, then when it gets swapped out the game literally just freezes.
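
          A rough way to see why latency hurts compute but not games: a minimal Python sketch of a simple "fixed latency plus bytes over bandwidth" transfer model. The 1 µs latency and 8 GB/s bandwidth are placeholder assumptions, not measurements, and the function names are hypothetical.

```python
def transfer_time_us(num_bytes, latency_us, bandwidth_gbs):
    """Model one transfer as a fixed latency plus size / bandwidth.

    1 GB/s equals 1e3 bytes per microsecond, hence the 1e3 factor.
    """
    return latency_us + num_bytes / (bandwidth_gbs * 1e3)


def latency_share(num_bytes, latency_us, bandwidth_gbs):
    """Fraction of the total transfer time spent on the fixed latency."""
    return latency_us / transfer_time_us(num_bytes, latency_us, bandwidth_gbs)


if __name__ == "__main__":
    ASSUMED_LATENCY_US = 1.0     # placeholder value, not a measurement
    ASSUMED_BANDWIDTH_GBS = 8.0  # placeholder value, not a measurement
    for size in (64, 4096, 64 * 1024 * 1024):
        share = latency_share(size, ASSUMED_LATENCY_US, ASSUMED_BANDWIDTH_GBS)
        print(f"{size:>10} bytes: {share:.1%} of the time is latency")
```

          Under these assumptions, small scattered accesses are almost entirely latency, while one big upfront upload (the texture case) is almost entirely bandwidth.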



          • #15
            Originally posted by GT220 View Post
            Volta is still coming and was not renamed to Pascal.
            Maybe they'll make Volta into a codename that just means "GPU family Next".



            • #16
              For computing tasks that take advantage of unified memory, NVLink is a big plus if it reduces latency compared to PCIe. But I'd hope to see it standardized, not kept as an NVIDIA-only solution like G-Sync or CUDA.
