Google Posts Experimental Linux Code For "Device Memory TCP" - Network To/From Accelerator RAM


  • Google Posts Experimental Linux Code For "Device Memory TCP" - Network To/From Accelerator RAM

    Phoronix: Google Posts Experimental Linux Code For "Device Memory TCP" - Network To/From Accelerator RAM

    Google engineers have published early code around "Device Memory TCP" (a.k.a. Devmem TCP) as a proposal for transferring data to/from device memory efficiently by avoiding the need to copy the data to a host memory buffer...


  • #2
    Isn't NVMe over TCP already doing this for storage? I don't understand that use case, or why it needs a facility in DRI for it either.

    - Distributed raw block storage applications transfer large amounts of data with remote SSDs, much of this data does not require host processing.

    But it's obvious that so much of ML is ripe for optimization; there's probably at least an order of magnitude of performance gains coming just from driving the transistors optimally.



    • #3
      Originally posted by fitzie
      Isn't NVMe over TCP already doing this for storage? I don't understand that use case, or why it needs a facility in DRI for it either.

      But it's obvious that so much of ML is ripe for optimization; there's probably at least an order of magnitude of performance gains coming just from driving the transistors optimally.
      Ah, I can tell you we have a use for that in very high bandwidth signal processing.
      We are already doing this, but having it standardized at least in part will reduce our development burden.



      • #4
        Originally posted by fitzie
        Isn't NVMe over TCP already doing this for storage? I don't understand that use case, or why it needs a facility in DRI for it either.
        I can't comment on the first part, other than to say that maybe they're not using NVMe over TCP, for some reason.

        Regarding the second part, what they're saying is they want the NIC to write incoming data directly into the GPU's memory, rather than first having it go to host memory and then having to take a second trip from host to GPU memory. Makes sense to me.
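
The difference between the two paths described above can be sketched as a simple byte count. This is a simplified model, not anything taken from the posted patches: the names `TransferCost`, `bounce_path`, and `direct_path` are hypothetical, and it assumes the whole payload lands in device memory and ignores packet headers and protocol overhead.

```python
from dataclasses import dataclass


@dataclass
class TransferCost:
    """Hypothetical accounting for one received payload."""
    pcie_bytes: int       # total bytes moved across the PCIe bus
    host_dram_bytes: int  # bytes written to plus bytes read from host DRAM


def bounce_path(payload_bytes: int) -> TransferCost:
    # Assumed model: the NIC DMA-writes the payload into host memory,
    # then the payload is copied from host memory to the GPU.
    nic_to_host = payload_bytes
    host_to_gpu = payload_bytes
    return TransferCost(
        pcie_bytes=nic_to_host + host_to_gpu,
        host_dram_bytes=nic_to_host + host_to_gpu,  # one write, one read
    )


def direct_path(payload_bytes: int) -> TransferCost:
    # Assumed model: the NIC DMA-writes the payload straight into GPU memory,
    # so host DRAM never holds the payload.
    return TransferCost(pcie_bytes=payload_bytes, host_dram_bytes=0)


if __name__ == "__main__":
    one_gib = 1 << 30
    print("bounce:", bounce_path(one_gib))
    print("direct:", direct_path(one_gib))
```

Under these assumptions the bounced path moves every payload byte across the bus twice and touches host DRAM twice, while the direct path moves it once and leaves host DRAM alone.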



        • #5
          Originally posted by coder
          Regarding the second part, what they're saying is they want the NIC to write incoming data directly into the GPU's memory, rather than first having it go to host memory and then having to take a second trip from host to GPU memory. Makes sense to me.
          It does make sense, but isn't that something InfiniBand and Ethernet with RoCE already do, via their DMA/RDMA capabilities?

          /puzzled



          • #6
            We already have RDMA ... why bother with the additional complexity of TCP? To keep CPU cores busy?



            • #7
              Originally posted by pegasus
              We already have RDMA ... why bother with the additional complexity of TCP?
              First, I think RDMA only handles directly writing to userspace memory - not device memory.

              I don't know why they're using TCP.

              Originally posted by pegasus
              To keep CPU cores busy?
              For this to make any sense, the NIC would have to implement 100% TCP offload. Otherwise, you couldn't avoid a pass through the host CPU, which is the entire point.



              • #8
                Originally posted by coder
                First, I think RDMA only handles directly writing to userspace memory - not device memory.
                RDMA definitely supports reading and writing to device memory: for example, you can write data from the NIC directly into a GPU memory buffer.



                • #9
                  Isn't the point of this for enterprise CXL? They want to be able to outfit a compute node with access to memory located somewhere on the network rather than being limited to the 2-4TB on board, so they make a PCI Memory board with more memory and have multiple machines able to use it.



                  • #10
                    Originally posted by fwyzard
                    RDMA definitely supports reading and writing to device memory: for example, you can write data from the NIC directly into a GPU memory buffer.
                    Okay, then I guess we can conclude that they were not able (or allowed) to use RDMA, for whatever reason.

