**geerge** · 09 May 2024, 02:58 PM

What Intel CPUs have TPH support? Some Xilinx stuff appears to have TPH support, so maybe this is a fringe benefit of the merge.

**Adarion** · 09 May 2024, 03:32 PM

Did I read that right that PCIe hardware can write via DMA directly to the L2 of some CPU? Well... are there any security concerns?

**mlau** · 09 May 2024, 03:56 PM

Originally posted by Adarion

Did I read that right that PCIe hardware can write via DMA directly to the L2 of some CPU? Well... are there any security concerns?

It doesn't do DMA, but I guess the cache controller is snooping all those PCIe-based memory writes, looks at the steering tag, evicts a few invalid entries and refills them with the snooped data, i.e. it does a prefetch based not on a program hint but on a PCIe hint. As for security, I don't immediately see anything negative here. In a "normal" system, when network data is being processed, the CPU has to fetch it from main memory into the caches as well; here I think the benefit is that the data is already cache-hot at the correct CPU when it's about to be touched.

**intelfx** · 09 May 2024, 06:27 PM

Originally posted by Adarion

Did I read that right that PCIe hardware can write via DMA directly to the L2 of some CPU? Well... are there any security concerns?

As opposed to writing via the same DMA to the memory?

**dragorth** · 09 May 2024, 07:28 PM

I am surprised AMD didn't work on an implementation in the AMD driver as the example use case. Considering AI bandwidth concerns, this could have been an immediate win and advertisement for their GPUs.

It's a win either way.

**intelfx** · 09 May 2024, 08:51 PM

Originally posted by dragorth

I am surprised AMD didn't work on an implementation in the AMD driver as the example use case. Considering AI bandwidth concerns, this could have been an immediate win and advertisement for their GPUs.

Uh. No. This is completely inapplicable to GPUs and AI.

If you use the GPU in such a way that you have to stream(!) data from the GPU to the main memory, and in such a way that L2 cache latency plays a significant role(!), then you're already completely fucked (i.e. this alone implies existence of a performance bottleneck several orders of magnitude more restrictive than the performance of your GPU).

This isn't how GPUs are supposed to be used at all.

**dragorth** · 10 May 2024, 12:53 AM

Originally posted by intelfx

Uh. No. This is completely inapplicable to GPUs and AI.

If you use the GPU in such a way that you have to stream(!) data from the GPU to the main memory, and in such a way that L2 cache latency plays a significant role(!), then you're already completely fucked (i.e. this alone implies existence of a performance bottleneck several orders of magnitude more restrictive than the performance of your GPU).

This isn't how GPUs are supposed to be used at all.

So, the GPU talks to the CPU all the time, especially in OpenGL and Vulkan. It also talks to the CPU in AI workloads, and hinting where best to place the image/text output so it lands at the correct CPU could absolutely help with latency and bandwidth. Not to mention GPU transcoding and streaming.

I have no idea why you would think this wouldn't be useful at all. The GPU is not a black box that data goes into and that never talks back to the machine. It has to synchronize data and rendering, not to mention staying in lockstep with game engines and hitting timing targets for render performance, especially in this age of 200+ fps.

**ms178** · 10 May 2024, 05:23 AM

Originally posted by geerge

What Intel CPUs have TPH support? Some Xilinx stuff appears to have TPH support, so maybe this is a fringe benefit of the merge.

I have an entry for that in my BIOS, but cannot say if it actually does anything. That's on X99 with an Intel Haswell-EP Xeon.

**RZSN** · 10 May 2024, 07:45 AM

Writing to cache memory from PCIe was already a thing back in the socket 2011 era; Intel called it DDIO (Data Direct I/O Technology):

https://www.intel.com/content/www/us/en/io/data-direct-i-o-technology.html

https://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/data-direct-i-o-technology-brief.pdf

I wonder why vendors make these things so overcomplicated. Just make a generic mechanism for userspace-dynamic CAR (cache as RAM) that allows pinning a section of a given cache level to a range of physical addresses, and that would work transparently for code, data and DMA.

AMD Preparing PCIe TPH Support For Upcoming CPUs