Announcement

**bridgman** · 06 January 2018, 01:19 PM

Originally posted by darkbasic

One way or another AMD cards CANNOT be used for OpenCL computing, because you'll be limited to one card only.

That's an extraordinary mis-generalization. If you were to say "one way or another AMD cards using ROCm stack can not be used for OpenCL computing with more than one GPU on a motherboard which only has one PCIE connector capable of supporting PCIE atomics" that would be accurate...

... but it's not a lot more restrictive than saying "AMD cards can not be used for OpenCL computing with more than one GPU if you only have one GPU".

AFAICS most motherboards either split the CPU-connected lanes across multiple PCIE connectors (e.g. 8 lanes to each of 2 connectors on consumer CPUs) or use a PCIE switch to feed the PCIE connectors, again fed from a consumer CPU. Motherboards with one x16 connector wired to CPU and a second x16 connector wired to chipset PCIE lanes (which is the issue with the user in this case) appear to be relatively uncommon - normally you either get just one x16 connector or you get direct-connect lanes split between 2 x16 connectors, with 8 lanes each when both connectors are being used. Those should work fine with ROCm and multiple GPUs. Also note that on any GPU before Vega the AMDGPU-PRO driver uses different driver paths for OpenCL which do not depend on having atomics.

Of course once you get into higher end CPUs the number of PCIE lanes from the CPU goes up significantly, to 40 on Intel and 64 on TR IIRC.

**darkbasic** · 08 January 2018, 08:10 AM

Originally posted by bridgman

Motherboards with one x16 connector wired to CPU and a second x16 connector wired to chipset PCIE lanes (which is the issue with the user in this case) appear to be relatively uncommon - normally you either get just one x16 connector or you get direct-connect lanes split between 2 x16 connectors, with 8 lanes each when both connectors are being used.

Are you sure? I haven't been able to do comprehensive research, but from what I've found they are far from uncommon.

**bridgman** · 08 January 2018, 10:31 AM

It's tough to be sure (documentation mostly focuses on how wonderful each motherboard is) but my understanding is that you can get a good idea by looking at what happens to the first x16 PCIE connector when the second x16 is in use.

If the first connector goes from x16 to x8 when the second x16 is used that tells me that the CPU-connected lanes are being used for both connectors.
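The rule of thumb above can be written down as a small check. This is only a sketch of the heuristic as described here, not a way to query real hardware: the function name `likely_lane_source` and its return labels are made up for illustration, and the link widths are assumed to come from the motherboard's spec sheet.

```python
def likely_lane_source(first_slot_alone: int, first_slot_shared: int) -> str:
    """Guess where the second x16 connector gets its lanes from.

    first_slot_alone:  link width of the first connector when used by itself
    first_slot_shared: link width of the first connector when the second
                       connector is also populated
    Both values are assumed to be read from the board documentation.
    """
    if first_slot_shared < first_slot_alone:
        # The first connector gave up lanes, so both connectors are
        # most likely sharing the CPU-connected lanes.
        return "cpu-split"
    # The first connector kept its full width, so the second connector
    # is probably fed from chipset lanes (or sits behind a PCIE switch).
    return "chipset-or-switch"


print(likely_lane_source(16, 8))
print(likely_lane_source(16, 16))
```

A board that drops from x16 to x8 falls into the first case; a board whose first connector stays at x16 falls into the second and needs a closer look at the manual.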

There are probably cases where the mobo has three x16 connectors - two using CPU lanes and one using chipset lanes - but in that case the board would be able to support two GPUs with PCIE atomics at least. I'm going to try to hunt down details on the specific board that prompted this discussion and make sure it fits the theory.

EDIT - looks like that mobo uses a B250 chipset, which is specified for a single x16 only.

There are some Z250 and H370 mobos which offer a second x16 connector wired to chipset lanes, intended to support an x16 NVMe card, and it is possible to use that for a second GPU in certain cases, but even Intel describes those chipsets as "single graphics only". The Z270 chipset supports up to three GPUs by splitting the 16 CPU-connect lanes into 8/8 or 8/4/4.

Bottom line - the original complaint seems to be "I can only use one AMD GPU running ROCm on a motherboard designed for a single GPU". There is limited functionality for >1 GPU using chipset lanes but I don't think anyone recommends it outside of the mobo marketing team responsible for a specific motherboard (whose job is to make that mobo seem like The Best Motherboard In The World) and specialized applications like mining (where you sometimes see 18 x1 connectors wired to B250 chipset lanes).

Anyways, we are working on providing an OpenCL solution which does not depend on atomics, and everything before Vega supports OpenCL without atomics today via the -PRO driver.

**cde1** · 25 January 2018, 05:49 PM

Just curious bridgman, what are PCIe used for in the context of OpenCL in ROCm?

**bridgman** · 26 January 2018, 09:07 AM

Guessing your question was intended to be "what are PCIE atomics used for in context of..." ?

The ROC stack uses work queues in application userspace for submitting work to the GPU. Queues can be shared, both between threads in a process and between CPU/GPUs, so atomic operations are used to ensure that multiple threads sharing a queue do not interfere with each other.

The work queues (like a PM4 kernel queue but using a cross-vendor "AQL" packet format) are maintained in system memory so that the GPU can perform atomic operations over the PCIE bus against queue contents.
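To illustrate why atomics matter for a shared queue, here is a toy sketch in Python. It is not the ROCm implementation and does not model the AQL packet format: `AtomicCounter`, `SharedQueue` and `run_producers` are hypothetical names, and a lock stands in for the hardware atomic operation that the CPU or GPU would actually issue. The point is only that each producer reserves its slot with an atomic fetch-and-add on the write index, so two producers sharing the queue can never be handed the same slot.

```python
import threading


class AtomicCounter:
    """Stand-in for a hardware atomic integer; a lock emulates atomicity."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, n: int = 1) -> int:
        """Atomically add n and return the value held before the add."""
        with self._lock:
            old = self._value
            self._value += n
            return old


class SharedQueue:
    """Toy ring buffer shared by several producers (hypothetical, not AQL)."""

    def __init__(self, size: int) -> None:
        self.slots = [None] * size
        self.write_index = AtomicCounter()

    def submit(self, packet) -> int:
        # Reserve a unique slot first, then fill it. fetch_add never
        # returns the same index to two different callers.
        index = self.write_index.fetch_add(1)
        self.slots[index % len(self.slots)] = packet
        return index


def run_producers(num_threads: int, packets_per_thread: int) -> SharedQueue:
    """Have several threads submit packets into one shared queue."""
    queue = SharedQueue(num_threads * packets_per_thread)

    def producer(tid: int) -> None:
        for i in range(packets_per_thread):
            queue.submit((tid, i))

    threads = [threading.Thread(target=producer, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return queue


queue = run_producers(num_threads=4, packets_per_thread=100)
# Every slot was written exactly once, with no packet overwritten.
print(len(set(queue.slots)) == len(queue.slots))
```

The sketch leaves out the consumer side and any wrap-around handling. If the index update were a plain read followed by a write instead of one atomic step, two producers could read the same index and overwrite each other's packet, which is the interference the atomic operations are there to prevent.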

AMDGPU-PRO OpenCL Compiler Hacked Into Mesa's Clover
