
PCI Peer-To-Peer Memory Support Queued Ahead Of Linux 4.20~5.0


  • PCI Peer-To-Peer Memory Support Queued Ahead Of Linux 4.20~5.0

    Phoronix: PCI Peer-To-Peer Memory Support Queued Ahead Of Linux 4.20~5.0

    With the upcoming Linux 4.20 kernel cycle (which, given past comments by Linus Torvalds, might be renamed to Linux 5.0), a new PCI feature queued ahead of the merge window is peer-to-peer memory support...


  • #2
    I guess this could be useful for multiple graphics cards talking to each other directly (and other PCIe devices like modern storage, FPGAs, etc.)? I was under the impression that AMD graphics cards could already do this since GCN 1.1. Could this be useful for HMM and HSA too, or do these concepts have their own infrastructure for reducing memory copies to the device? Conceptually, I suppose PCI P2P is narrower in scope but could also be leveraged there.
    Last edited by ms178; 18 October 2018, 04:32 AM. Reason: clarification



    • #3
      Sorry, my use of the term was meant more broadly, to encompass mGPU as well. But you are right, they are not the same thing.



      • #4
        As the article mentions, the primary user of this tech is going to be NVMe storage exposed over fast networks (100G Ethernet & InfiniBand). Right now data needs to be copied from the NVMe device into system memory, involving the CPU, and then from system memory to the NIC, involving the CPU again. The whole path can be optimized by letting devices talk to each other directly, using the CPU only to set up that communication and then bypassing it as much as possible. This makes it possible for storage vendors to use cheap, low-power CPUs for high-throughput flash storage appliances, saving on cost and power.
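
        For those who want to poke at the code, the interface that landed is a small provider/client API in drivers/pci/p2pdma.c. Below is a rough sketch of the provider and client sides as I understand the patch set - the BAR number and the example_* helper names are made up for illustration, and exact signatures can shift between kernel versions, so check include/linux/pci-p2pdma.h on your tree.

        #include <linux/pci.h>
        #include <linux/pci-p2pdma.h>

        /* Provider side: expose one of our BARs as peer-to-peer DMA memory. */
        static int example_publish_p2pmem(struct pci_dev *pdev)
        {
                int rc;

                /* Hand all of BAR 4 to the p2pdma allocator (BAR choice is made up). */
                rc = pci_p2pdma_add_resource(pdev, 4, pci_resource_len(pdev, 4), 0);
                if (rc)
                        return rc;

                /* Advertise the memory so client drivers can find and use it. */
                pci_p2pmem_publish(pdev, true);
                return 0;
        }

        /* Client side: allocate a buffer that lives in the provider's BAR, so a
         * DMA between, say, an NVMe SSD and an RDMA NIC never touches system RAM. */
        static void *example_alloc_p2p_buffer(struct pci_dev *provider, size_t len)
        {
                return pci_alloc_p2pmem(provider, len);
        }

        If I read the series right, the first in-tree user is the NVMe-over-Fabrics target, which stages RDMA transfers in the NVMe device's CMB instead of in host memory when the PCI topology allows it.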



        • #5
          Originally posted by pegasus View Post
          As the article mentions, the primary user of this tech is going to be NVMe storage exposed over fast networks (100G Ethernet & InfiniBand). Right now data needs to be copied from the NVMe device into system memory, involving the CPU, and then from system memory to the NIC, involving the CPU again. The whole path can be optimized by letting devices talk to each other directly, using the CPU only to set up that communication and then bypassing it as much as possible. This makes it possible for storage vendors to use cheap, low-power CPUs for high-throughput flash storage appliances, saving on cost and power.
          Thank you, I was referring to this discussion on the mailing list, but as a non-expert on these questions I couldn't follow all the details mentioned there: https://lists.01.org/pipermail/linux...ry/008395.html

          There are also use cases which would involve the GPU (such as peer-to-peer access between a network card and a GPU).

          The current data path between GPU and CPU through system memory is also a major bottleneck for general-purpose GPU computing.



          • #6
            Where is AMD's (Christian König's / @deathsimple) effort mentioned?



            • #7
              Christian has been working on getting dma-buf support for device memory upstream for a while now. See:



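              For anyone not familiar with that work: dma-buf is the kernel's generic mechanism for sharing buffers between drivers, and the P2P angle is about letting the exporter hand out device (VRAM) pages directly instead of migrating everything to system memory first. Here is a minimal importer-side sketch, with the teardown path omitted and the example_import helper name made up; see include/linux/dma-buf.h for the real API.

              #include <linux/dma-buf.h>
              #include <linux/dma-mapping.h>
              #include <linux/err.h>

              /* Import a dma-buf (e.g. obtained from a PRIME fd via dma_buf_get())
               * and map it for DMA by "dev". Returns the sg_table mapping. */
              static struct sg_table *example_import(struct dma_buf *buf,
                                                     struct device *dev,
                                                     struct dma_buf_attachment **att_out)
              {
                      struct dma_buf_attachment *att;
                      struct sg_table *sgt;

                      /* Tell the exporter which device wants access. */
                      att = dma_buf_attach(buf, dev);
                      if (IS_ERR(att))
                              return ERR_CAST(att);

                      /* With a P2P-aware exporter this mapping can point straight at
                       * the other device's memory rather than at system RAM pages. */
                      sgt = dma_buf_map_attachment(att, DMA_BIDIRECTIONAL);
                      if (IS_ERR(sgt)) {
                              dma_buf_detach(buf, att);
                              return ERR_CAST(sgt);
                      }

                      *att_out = att;
                      return sgt;
              }

              If I understand the series correctly, the importer view above stays the same; the interesting changes are on the exporter side, where amdgpu gets to hand out VRAM through this path instead of forcing a copy to system memory.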


              • #8
                Originally posted by ms178 View Post
                I was under the impression that AMD graphics cards could already do this since GCN 1.1. Could this be useful for HMM and HSA too, or do these concepts have their own infrastructure for reducing memory copies to the device?
                Our GPUs have been able to do this for a long time - this adds some standardized support for it to the drm driver framework.

                Until now the driver support has either been vendor-specific (DirectGMA, which exposes physical bus addresses; Crossfire in FGLRX) or outside the drm framework (KFD/ROCR has full P2P support) - this provides a P2P capability within the drm framework.



                • #9
                  Originally posted by bridgman View Post

                  Our GPUs have been able to do this for a long time - this adds some standardized support for it to the drm driver framework.

                  Until now the driver support has either been vendor-specific (DirectGMA, which exposes physical bus addresses; Crossfire in FGLRX) or outside the drm framework (KFD/ROCR has full P2P support) - this provides a P2P capability within the drm framework.
                  Thanks for enlightening me! As part of the peanut gallery I don't understand all the low-level technical details, but I have been following these developments on the mailing list for some years now, and that piece of information helps my understanding quite a bit. I had missed the recent developments on dri-devel though; thanks agd5f for the pointers!



                  • #10
                    This sounds like yet another mechanism enabling nefarious operations that the host OS has no way to monitor. Is there any proof that this standard and its implementations are secure?

