Intel AMX Support Appears Ready For Linux 5.16

coder replied

27 October 2021, 11:29 PM
Originally posted by sophisticles

While Intel and AMD have a cross licensing agreement that allows one company to use technology the other company developed, there is a waiting period before either company can implement said competitor's technology.

Source?

Originally posted by sophisticles

Look at AVX-512, Intel released it on Xeons and HEDT CPUs years ago, Intel now has it on laptop and desktop CPUs and AMD still hasn't released a single CPU with said instruction set.

Because AMD was smart and decided to wait until the silicon technology was ready.

Originally posted by sophisticles

Hell, look how long it took for AMD to release a CPU with AVX2 and the first ones weren't that good anyway.

They adopted some technologies quicker than others. IIRC, they adopted SSE very quickly, but were slower on SSE2.

Originally posted by sophisticles

Intel will have a laptop CPU with AMX long before AMD even thinks about implementing it in its high end offerings.

Could be, but I guess you didn't hear that Intel's new flagship desktop/laptop CPU will have no AVX-512, at all.

https://www.anandtech.com/show/16881...rchitectures/5

coder replied

27 October 2021, 11:21 PM
Originally posted by trueblue

I know that Intel developed Advanced Matrix Extensions for AI work loads, but I wonder if it will also be useful for other scientific/engineering applications such as CFD.

No. The functionality being included in the upcoming Sapphire Rapids CPU supports operations only on 8-bit integer and BFloat16 data. Most scientific, engineering, and financial applications require double-precision arithmetic. At this precision it's not even capable of half-decent audio processing!

The x86 Advanced Matrix Extension (AMX) Brings Matrix Operations; To Debut with Sapphire Rapids

https://fuse.wikichip.org/news/3600/the-x86-advanced-matrix-extension-amx-brings-matrix-operations-to-debut-with-sapphire-rapids/

Intel publishes details of its upcoming Advanced Matrix Extension (AMX), an x86 extension set to debut with Sapphire Rapids that introduces a new matrix register file and accompanying matrix operations.

smitty3268 replied

27 October 2021, 08:14 PM
Originally posted by sophisticles

While Intel and AMD have a cross licensing agreement that allows one company to use technology the other company developed, there is a waiting period before either company can implement said competitor's technology.

I don't believe there are any such restrictions. If you have a source, I'd love to see it.

The waiting period is just due to the fact that the design of a CPU takes place 2 years before it's released as a product available to buy, so once Intel announces a new instruction it's likely going to be at least 2 years before you'll see it in an AMD product, even if they pick it up right away. And in the case of AVX-512, there were reasons (cost, yields, and power usage) to consider before AMD really wanted to even commit to including it in their products. Sometimes you don't want to add a new technology the moment you can, and it makes sense to wait for it to mature a bit first. Especially when you're an underdog with limited financial resources.

sophisticles replied

27 October 2021, 11:47 AM
Originally posted by uid313

Intel is only implementing AMX on Xeon, it would be interesting if AMD implemented it on consumer processors like Ryzen for laptop and desktop.

While Intel and AMD have a cross licensing agreement that allows one company to use technology the other company developed, there is a waiting period before either company can implement said competitor's technology.

Look at AVX-512, Intel released it on Xeons and HEDT CPUs years ago, Intel now has it on laptop and desktop CPUs and AMD still hasn't released a single CPU with said instruction set. Hell, look how long it took for AMD to release a CPU with AVX2 and the first ones weren't that good anyway.

Intel will have a laptop CPU with AMX long before AMD even thinks about implementing it in its high end offerings.

uid313 replied

27 October 2021, 04:41 AM
Intel is only implementing AMX on Xeon, it would be interesting if AMD implemented it on consumer processors like Ryzen for laptop and desktop.

Setif replied

27 October 2021, 02:56 AM
Originally posted by trueblue

I know that Intel developed Advanced Matrix Extensions for AI work loads, but I wonder if it will also be useful for other scientific/engineering applications such as CFD. Would code need to be re-written to take advantage of it, or would it be done by compiler switches?

Intel AMX is for low-precision computing (bf16/int8), while scientific/engineering applications need at least fp64/int32, and some even need int64 or int128. I think IBM recently added hardware fp128.
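
To give a feel for how coarse bf16 is, here is a minimal C sketch that squeezes a float32 into bfloat16 and back. bfloat16 keeps the sign, the 8 exponent bits, and only the top 7 mantissa bits of a float32. The function names are made up for illustration, and the conversion truncates, whereas real hardware typically rounds to nearest.

```c
#include <stdint.h>
#include <string.h>

/* Convert float32 to bfloat16 by truncation: keep the upper 16 bits
   (1 sign bit, 8 exponent bits, 7 mantissa bits). Illustration only;
   hardware conversions usually round instead of truncating. */
uint16_t float_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret without aliasing UB */
    return (uint16_t)(bits >> 16);
}

/* Widen bfloat16 back to float32 by zero-filling the low 16 bits. */
float bf16_to_float(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* What survives a trip through bfloat16. */
float bf16_round_trip(float f) {
    return bf16_to_float(float_to_bf16(f));
}
```

With this sketch, `bf16_round_trip(1.001f)` comes back as exactly `1.0f`, and pi comes back as `3.140625f`: two to three significant decimal digits, which is fine for neural-network weights but nowhere near what a CFD solver expects.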

pipe13 replied

26 October 2021, 09:35 PM
Thanks Bridgman. With AVX-512 there's also a (relatively) minor matter of memory alignment, i.e. that the target vectors/arrays must start on 64-byte boundaries (a 512-bit vector is 64 bytes wide). This isn't a huge deal, but does impact one's source code. I worked on a robotics project a few years ago where our workspace evaluation was a limiting step and our Principal Engineer asked me to see if AVX-512 might help. After some discussion (I didn't think the project was ready yet for such optimization) I agreed, and saw to it.

We were using Eigen3 C++ template libraries at the time for our linear algebra, and all the computation was done on 4x4 rotation/translation matrices. So we both recognized the chances for substantial speedup were not large, and I was happy to eke out five or ten percent.

The Eigen3 documentation clearly explains what needs to be done, and it isn't hard. But it wasn't worth it. That portion of the code was still under active development by at least one other engineer, who couldn't efficiently implement her own work without unduly stumbling over the new alignment issues. Speed is speed, and the speed of developing good robust working code took priority over a few percent runtime improvement. So we quickly backed out the AVX-512.

Doesn't mean AVX-512 isn't worthwhile for larger matrices in more stable code. Michael has shown benchmark examples where it very clearly is. Perhaps future compilers can alleviate this, but three years ago at least, AVX-512 was a little bit more than just a new compiler switch and you're done.
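
For readers wondering what the alignment requirement looks like in source code, here is a minimal sketch in plain C11. It is not Eigen's API: `alloc_aligned_floats` and `is_aligned` are hypothetical helper names. It assumes 64-byte alignment, since a 512-bit vector is 64 bytes wide and the aligned AVX-512 load/store forms expect addresses on that boundary.

```c
#include <stdint.h>
#include <stdlib.h>

#define AVX512_ALIGN 64  /* a 512-bit vector register is 64 bytes wide */

/* Allocate room for n floats starting on a 64-byte boundary.
   C11 aligned_alloc expects the size to be a multiple of the alignment,
   so the byte count is rounded up. Release the result with free(). */
float *alloc_aligned_floats(size_t n) {
    size_t bytes = n * sizeof(float);
    size_t padded = (bytes + AVX512_ALIGN - 1) / AVX512_ALIGN * AVX512_ALIGN;
    if (padded == 0)
        padded = AVX512_ALIGN;
    return aligned_alloc(AVX512_ALIGN, padded);
}

/* Check whether a pointer sits on the given boundary. */
int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}
```

A 4x4 float matrix is 16 floats, i.e. exactly one 64-byte vector, so `alloc_aligned_floats(16)` covers that case. The friction described above comes from every allocation site, struct member, and container holding such data having to honor the same rule.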

bridgman replied

26 October 2021, 08:21 PM
Originally posted by sophisticles

Can someone explain this to me? AVX-512, just like all SIMD instruction sets, including SSE/SSE2/SSSE3/AVX/AVX2/3DNOW!/AltiVec, is used via compiler intrinsics or hand-crafted assembler, and the kernel, be it the Windows, Linux, Unix, or MacOS kernel, has to support it.

How is this any different from AMX?

AFAIK the difference is not "how you use it" but "what you need to do before using it".

With AVX et al you need to make sure HW support is present before using it, but with AMX you also have to make an OS call to say "I'm going to be using <feature> so enable some additional state save/restore functionality" or "I want to use <feature> so make sure nobody else is already using it" (I forget which). Something like that anyways.
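
A rough sketch of those two steps on x86-64 Linux, in C. It assumes the interface described in the kernel's xstate documentation for the 5.16 series: an `arch_prctl` call with `ARCH_REQ_XCOMP_PERM` and the tile-data feature number 18. As far as I understand it, this is the first of the two options: the process asks permission to use the large AMX tile state (8 KiB) so the kernel can size its state save area accordingly. The constants are defined locally in case the installed headers predate them, and the function names are made up for illustration.

```c
#define _GNU_SOURCE
#include <cpuid.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values taken from the Linux x86 xstate documentation (assumption:
   unchanged in your kernel); defined here for older userspace headers. */
#ifndef ARCH_REQ_XCOMP_PERM
#define ARCH_REQ_XCOMP_PERM 0x1023
#endif
#define XFEATURE_XTILEDATA 18

/* Step 1 (same idea as for AVX): does the CPU advertise AMX tiles?
   CPUID leaf 7, subleaf 0, EDX bit 24 is AMX-TILE. */
int cpu_has_amx_tile(void) {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx >> 24) & 1;
}

/* Step 2 (new with AMX): ask the kernel for permission to use tile data.
   Returns 0 on success or a negative errno value, e.g. on kernels or
   CPUs without AMX support. */
int request_amx_permission(void) {
    if (syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA) != 0)
        return -errno;
    return 0;
}

/* Only touch AMX instructions when both steps succeed. */
int amx_ready(void) {
    return cpu_has_amx_tile() && request_amx_permission() == 0;
}
```

A program would call `amx_ready()` once at startup and fall back to an AVX or scalar path when it returns 0. Running AMX instructions without the permission request is expected to end in a signal rather than a silent fallback.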

sophisticles replied

26 October 2021, 07:53 PM
Unlike AVX-512 and earlier, user-space applications actually need to request the support from the kernel to be able to use Advanced Matrix Extensions functionality.

Can someone explain this to me? AVX-512, just like all SIMD instruction sets, including SSE/SSE2/SSSE3/AVX/AVX2/3DNOW!/AltiVec, is used via compiler intrinsics or hand-crafted assembler, and the kernel, be it the Windows, Linux, Unix, or MacOS kernel, has to support it.

How is this any different from AMX?

trueblue replied

26 October 2021, 04:54 PM
I know that Intel developed Advanced Matrix Extensions for AI work loads, but I wonder if it will also be useful for other scientific/engineering applications such as CFD. Would code need to be re-written to take advantage of it, or would it be done by compiler switches?