Intel Details New Data Streaming Accelerator For Future CPUs - Linux Support Started


  • #11
    Originally posted by wizard69 View Post

    I’m not so sure; it sounds more like an intelligent I/O processor than anything else. Hopefully more info will come soon.
    Separate or integrated into the CPU die?

    Comment


    • #12
      Originally posted by timofonic View Post
      Separate or integrated into the CPU die?
      It should be on-die.

      If it were larger, I'd just say "in-package". However, something like this is probably much too small to put on its own die. Maybe as part of an "I/O die", if Intel follows AMD's Zen2 approach...

      Comment


      • #13
        Originally posted by coder View Post
        Wow, I figured Hyper Threading killed DMA
        What do you mean here? How are those related?

        Comment


        • #14
          Originally posted by mrugiero View Post
          What do you mean here? How are those related?
          Oh, quite simply. If you only have one CPU with one hardware thread, then the idea of tying it up with data movement is very unpalatable. However, if your CPU has 8 cores with 16 hardware threads, and one of them is tied up doing data movement across PCIe to a slow device, then you almost don't notice or much care - especially since that thread might be paired with a compute-heavy thread that keeps most of the core's functional units busy, anyhow.

          So, the value proposition of a dedicated DMA engine is much lower. Not to speak of a 28-core CPU with 56 threads, or a 64-core CPU with 128 threads.

          Comment


          • #15
            Originally posted by coder View Post
            Oh, quite simply. If you only have one CPU with one hardware thread, then the idea of tying it up with data movement is very unpalatable. However, if your CPU has 8 cores with 16 hardware threads, and one of them is tied up doing data movement across PCIe to a slow device, then you almost don't notice or much care - especially since that thread might be paired with a compute-heavy thread that keeps most of the core's functional units busy, anyhow.

            So, the value proposition of a dedicated DMA engine is much lower. Not to speak of a 28-core CPU with 56 threads, or a 64-core CPU with 128 threads.
            Oh, that makes sense. I thought you meant something like DMA not working properly or being slowed down by hyper threading, so I was confused.
            Further, big.LITTLE in the ARM world can be seen that way, maybe even more so than HT.
            There are use cases, though. For example, deep packet processing at line rate on high-speed interfaces requires saturating all cores, and not everyone has many cores either, especially in the developing world.
            For example, I live in Argentina, and 2-4 threads are still common, even in retail computers, and that's also still very common in cellphones AFAIK.
            But yeah, I see your point.
            I have no idea if ARM does DMA, though.

            Comment


            • #16
              Originally posted by mrugiero View Post
              There are use cases, though.
              Yeah, I think one thing they might be targeting is routing traffic between CPUs in a mesh, or something like that. Anyway, there was that reference to clustering, which made me think of Nvidia's GPU interconnect technology, NVLink.

              Originally posted by mrugiero View Post
              For example, deep packet processing at line rate on high speed interfaces requires saturating all cores, and not everyone has many cores either,
              Good points. I think datacenter networking is starting to embrace 400 Gbps(!). Also, toward the lower end of core counts, there may be some embedded use cases, where power efficiency could benefit from using simpler, lower-clocked cores for data movement.

              Comment


              • #17
                Originally posted by mrugiero View Post
                I have no idea if ARM does DMA tho.
                DMA is there if the protocol requires it: PCIe and SATA/SAS devices can do DMA, while USB 3.0 and earlier versions cannot.

                DMA is also very much there in any SoC as all processors in the SoC (CPU, GPU, modems, hardware decoding for media, and more) are sharing the same RAM.

                One of the reasons projects like Purism's phone have the modem on a USB bus (electrical USB interface) rather than integrated into the SoC is exactly that: the modem has its own RAM and its own peripherals, and gets no access to the "app processor" (the main CPU running the OS) side of the system.

                Comment


                • #18
                  Sapphire Rapids does have a DSA, according to recent slide leaks.

                  There is also an Oct 2020 detailed spec available at this link

                  Comment


                  • #19
                    Originally posted by jayN View Post
                    Sapphire Rapids does have a DSA, according to recent slide leaks.

                    There is also an Oct 2020 detailed spec available at this link
                    https://software.intel.com/content/w...ification.html
                    Cool. Thanks for sharing!

                    I have to wonder how much of that can just be handled by a few CPU threads. With CPUs having SMT and so many cores, we don't need things like DMA engines anymore. Sure, it's a bit of a waste to burn a big CPU core on that stuff, but it's a win for programmability.

                    If I had to choose between an Intel CPU with those engines but fewer cores, or an AMD/ARM CPU with more cores for the same or less money, my choice wouldn't be the Intel CPU.

                    Comment


                    • #20
                      Originally posted by coder View Post
                      With CPUs having SMT and so many cores, we don't need things like DMA engines anymore.
                      A couple of interesting features in there... it handles Optane, has an operation for flushing caches, and can create and apply delta records.

                      Intel also added CXL on Sapphire Rapids. It has biased cache coherency, but there may need to be some DMA transfers between processor cache and accelerator memory when the bias is flipped. I wonder if they plan to use the DSA to do those transfers.

                      The operations for creating and applying delta records are interesting, too. Perhaps they can be used to minimize writes to NVM.

                      Comment
