Arm Joins The Compute Express Link Bandwagon (CXL)

  • Arm Joins The Compute Express Link Bandwagon (CXL)

    Phoronix: Arm Joins The Compute Express Link Bandwagon (CXL)

    Arm has now joined Intel, HP Enterprise, Google, Microsoft, Dell EMC, and others in backing the new Compute Express Link (CXL) effort as the interconnect for future accelerators...

  • #2
    AMD is on board, too. Even CCIX still stands.

    • #3
      Yes, now with both AMD and ARM on board, one wonders whether NVLink, CCIX, and Gen-Z are now dead in the water.

      • #4
        Originally posted by jabl View Post
        Yes, now with both AMD and ARM on board, one wonders whether NVLink, CCIX, and Gen-Z are now dead in the water.
        Those were exactly my thoughts, too.

        • #5
          So what is the difference between NVLink, CCIX, Gen-Z, Intel QuickPath Interconnect (QPI), HyperTransport, OpenCAPI, etc.?

          Maybe it would be nice to have RISC-V support CXL or something...

          But with too many different interconnects, it's hard to interconnect anything. It's like standards where everyone has their own standard.

          • #6
            Originally posted by uid313 View Post
            So what is the difference between NVLink, CCIX, Gen-Z, Intel QuickPath Interconnect (QPI), HyperTransport, OpenCAPI, etc.?
            Protocol, and provider. The standard doesn't deliver (PCIe 4.0 isn't as fast as NVLink), so they're forced to make their own.

            • #7
              Originally posted by tildearrow View Post

              Protocol, and provider. The standard doesn't deliver (PCIe 4.0 isn't as fast as NVLink), so they're forced to make their own.
              Terminology. All these standards use the same SerDes; they are lane-speed equivalents.
              Essentially, NVLink is PCIe and vice versa.

              The difference is protocol, not raw bandwidth.
              For example, PCIe and the older Serial RapidIO are based on the same SerDes, but turnaround for memory transactions and coherency is far faster in RapidIO.
              NVLink provides much the same capabilities as PCIe, but trunking and routing are different.
              All of these have come about because PCIe does not cater much to the needs of low latency, trunking, and routing.
              There have been some attempts for PCIe, but they are mostly dead in the water, because the only driving force for PCIe is raw bandwidth as an I/O memory-map expander (mostly PC GPUs' memory-map bandwidth requirements and expectations).
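
              A back-of-the-envelope way to see the "lane-speed equivalents" point is to compare per-lane throughput after line-encoding overhead. A minimal sketch in Python: the PCIe rates and 128b/130b encoding are from the published specs, while the ~25 GT/s NVLink 2.0 lane rate and its encoding overhead are rough figures assumed here purely for illustration.

              # Rough per-lane throughput after line-encoding overhead; illustrative figures only.
              def effective_gbps(gt_per_s: float, payload_bits: int, total_bits: int) -> float:
                  """Signalling rate scaled by encoding efficiency, in Gbit/s per lane."""
                  return gt_per_s * payload_bits / total_bits

              links = {
                  "PCIe 3.0 (8 GT/s, 128b/130b)": effective_gbps(8.0, 128, 130),
                  "PCIe 4.0 (16 GT/s, 128b/130b)": effective_gbps(16.0, 128, 130),
                  # NVLink 2.0 lane rate and encoding are assumptions here, not spec values.
                  "NVLink 2.0 (~25 GT/s, assumed)": effective_gbps(25.0, 128, 130),
              }

              for name, gbps in links.items():
                  print(f"{name:32s} ~{gbps:4.1f} Gbit/s per lane")

              The per-lane numbers land within a small factor of each other, which is the point above: the real differentiator is the transaction and coherency protocol running on top of the SerDes, not the SerDes itself.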

              • #8
                Originally posted by milkylainen View Post

                Terminology. All these standards use the same SerDes; they are lane-speed equivalents.
                Essentially, NVLink is PCIe and vice versa.

                The difference is protocol, not raw bandwidth.
                For example, PCIe and the older Serial RapidIO are based on the same SerDes, but turnaround for memory transactions and coherency is far faster in RapidIO.
                NVLink provides much the same capabilities as PCIe, but trunking and routing are different.
                All of these have come about because PCIe does not cater much to the needs of low latency, trunking, and routing.
                There have been some attempts for PCIe, but they are mostly dead in the water, because the only driving force for PCIe is raw bandwidth as an I/O memory-map expander (mostly PC GPUs' memory-map bandwidth requirements and expectations).
                OK, thanks for explaining.
                Is there any reason why?

                • #9
                  Originally posted by tildearrow View Post

                  OK, thanks for explaining.
                  Is there any reason why?
                  You mean why PCIe does not cater to low-latency and other types of needs?

                  Because bus development was mainly PC-driven, where you primarily need bandwidth.
                  Multiprocessor routing and low-latency transactions just aren't on a gamer's need list.
                  GPUs have local caches and RAM. They execute separately from the context of the main system CPU.

                  I.e. the GPU does not share high-throughput coherency state with the CPU.
                  CPUs in an SMP/NUMA system exchange cache and memory state many millions of times a second.
                  GPUs and I/O cards on PCIe are mainly "upload/download a chunk of data to an I/O map somewhere."
                  I.e. a host-to-peripheral bus topology and protocol.

                  AI, military I/O, and ultra-high-speed sensor I/O need even higher bandwidth combined with fast state exchange.
                  As accelerator cards on buses are rapidly being deployed, they need more and faster exchange with the main CPU.
                  They are increasingly seen as "normal" (albeit fast at special tasks) execution units behind a unified accelerator language and interfaces.

                  With RapidIO, for example, you can turn a transaction around on error before it completes; RapidIO has in-transaction control symbols.
                  This facilitates the implementation of MOESI or similar coherency protocols on top of RapidIO.
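
                  As a minimal sketch of what such a coherency protocol has to track, here is a toy MESI-style state machine for a single cache line in Python (simplified MESI rather than full MOESI, with the actual interconnect messages left out; the names and structure are illustrative, not taken from any spec).

                  from enum import Enum

                  class LineState(Enum):
                      MODIFIED = "M"   # dirty, the only copy
                      EXCLUSIVE = "E"  # clean, the only copy
                      SHARED = "S"     # clean, other caches may hold it
                      INVALID = "I"    # not present or stale

                  def local_read(state: LineState, others_have_it: bool) -> LineState:
                      """The local CPU reads the line; a miss fetches it over the interconnect."""
                      if state is LineState.INVALID:
                          return LineState.SHARED if others_have_it else LineState.EXCLUSIVE
                      return state

                  def local_write(state: LineState) -> LineState:
                      """The local CPU writes the line; remote copies must be invalidated first."""
                      return LineState.MODIFIED

                  def remote_read(state: LineState) -> LineState:
                      """Another agent reads our line (snooped on the interconnect)."""
                      if state in (LineState.MODIFIED, LineState.EXCLUSIVE):
                          return LineState.SHARED  # supply the data, downgrade to Shared
                      return state

                  def remote_write(state: LineState) -> LineState:
                      """Another agent writes the line; our copy becomes stale."""
                      return LineState.INVALID

                  # Example: an idle line is read locally, then written by another agent.
                  line = local_read(LineState.INVALID, others_have_it=False)  # -> EXCLUSIVE
                  line = remote_write(line)                                   # -> INVALID

                  Every one of those transitions implies a request and a response on the interconnect, which is why turnaround latency (and features like in-transaction control symbols) matters far more for coherency traffic than raw bandwidth does.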

                  • #10
                    Originally posted by jabl View Post
                    Yes, now with both AMD and ARM on board, one wonders whether NVLink, CCIX, and Gen-Z are now dead in the water.
                    ARM claims that CXL doesn't cover its use case of chip-to-chip interconnect in a system-on-package, so it will continue to push CCIX for that.

                    Also, I'm not sure if CXL covers the NVLink use case of device-to-device, as it seems primarily a CPU-to-device protocol.

                    And you forgot to mention AMD's Infinity Fabric: in the original EPYC, it famously layered cache coherency over PCIe, which is exactly what CXL does. However, as I understand it, Infinity Fabric is not tied to PCIe, whereas I think CXL is.
                    Last edited by coder; 14 September 2019, 06:29 PM.
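
                    For what it's worth, CXL 1.x is specified directly on top of the PCIe 5.0 physical layer and multiplexes three sub-protocols over it. A toy summary of that stack in Python: the protocol names and roles follow the public CXL material, while the dictionary layout and wording are mine.

                    # Toy summary of how CXL 1.x layers on PCIe; structure is illustrative only.
                    cxl_stack = {
                        "physical layer": "PCIe 5.0 electricals (32 GT/s lanes)",
                        "sub-protocols": {
                            "CXL.io": "PCIe-style discovery, configuration, DMA and interrupts",
                            "CXL.cache": "the device coherently caches host memory",
                            "CXL.mem": "the host gets load/store access to device-attached memory",
                        },
                    }

                    for proto, role in cxl_stack["sub-protocols"].items():
                        print(f"{proto:10s} {role}")

                    That PCIe coupling is the contrast drawn above with Infinity Fabric.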
