**Timon&Pumba** · 12 February 2019, 08:36 AM

If the task placement was already optimal, how did they achieve a more optimal task placement? I guess the task placement has only improved...

**ms178** · 12 February 2019, 09:14 AM

AVX-512 has been talked about since 2013, hence I am a bit curious why it took Intel so long to get these kinds of ISA-specific improvements and optimizations into crucial parts of the software ecosystem. I would have guessed that Intel would feed early hardware to its own software teams and other ISVs to start this kind of work sooner rather than later.

**schmidtbag** · 12 February 2019, 09:52 AM

Originally posted by ms178

AVX-512 has been talked about since 2013, hence I am a bit curious why it took Intel so long to get these kinds of ISA-specific improvements and optimizations into crucial parts of the software ecosystem. I would have guessed that Intel would feed early hardware to its own software teams and other ISVs to start this kind of work sooner rather than later.

Good question. The only reason I can come up with is "we have no reason to push development on this", due to Intel having practically no competition from anyone at the time (I don't think Nvidia was even much of a threat in the server market back then). From what I can tell, Intel deliberately held it back to give themselves some performance leverage at the last minute. Now that Intel is being attacked from multiple angles, AVX-512 is kinda their "secret weapon", since Intel's AVX performance is currently better than AMD's.

**skeevy420** · 12 February 2019, 10:11 AM

Originally posted by schmidtbag

Good question. The only reason I can come up with is "we have no reason to push development on this", due to Intel having practically no competition from anyone at the time (I don't think Nvidia was even much of a threat in the server market back then). From what I can tell, Intel deliberately held it back to give themselves some performance leverage at the last minute. Now that Intel is being attacked from multiple angles, AVX-512 is kinda their "secret weapon", since Intel's AVX performance is currently better than AMD's.

That wouldn't at all surprise me. On the business side, it makes sense to hold something back if you're already in the lead. If anyone starts to close in on that lead, you essentially have a magic bullet waiting for them. That's true of any business or industry. It sucks for consumers.

I have to imagine that Intel being out of magic bullets is why they're getting into the discrete GPU market -- if AMD starts taking CPU numbers, take GPU numbers from Nvidia.

**feydun** · 12 February 2019, 11:42 AM

Originally posted by ms178

AVX-512 has been talked about since 2013, hence I am a bit curious why it took Intel so long to get these kinds of ISA-specific improvements and optimizations into crucial parts of the software ecosystem. I would have guessed that Intel would feed early hardware to its own software teams and other ISVs to start this kind of work sooner rather than later.

It's not about using the vectorised instructions, which have been there for ages, just about tuning core usage, which is a small detail in the optimisation with some additional code complexity.

**Spacefish** · 12 February 2019, 01:25 PM

Originally posted by feydun

It's not about using the vectorised instructions, which have been there for ages, just about tuning core usage, which is a small detail in the optimisation with some additional code complexity.

AVX512 is only present on consumer parts in the latest generation. Before that it was restricted to some high-end Xeons and Xeon Phi.

**dispat0r** · 12 February 2019, 05:23 PM

Originally posted by Spacefish

AVX512 is only present on consumer parts in the latest generation. Before that it was restricted to some high-end Xeons and Xeon Phi.

It's only on Skylake-X and Xeons; there's no support on normal consumer CPUs.
I have a 7900X but I didn't get any nice improvements with AVX512. x265 runs a bit better, but that's with low AVX offsets so the CPU clocks higher than normal.

**ssokolow** · 12 February 2019, 05:25 PM

Originally posted by Timon&Pumba

If the task placement was already optimal, how did they achieve a more optimal task placement? I guess the task placement has only improved...

If I understand it correctly, it's about allowing userland software with a better understanding of the performance characteristics of the task it's performing to have access to the information it needs to do its own tuning of CPU affinities.
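To make that idea concrete, here is a rough sketch of what such userland tuning could look like. It assumes the tracking data ends up exposed as an `AVX512_elapsed_ms` field in `/proc/<pid>/arch_status`, which is how the mainline kernel later exposed it, not something stated in this thread. The helper names, the threshold, and the CPU sets are all hypothetical, and the placement policy is just one possible choice.

```python
from pathlib import Path
from typing import Optional, Set


def parse_avx512_elapsed_ms(status_text: str) -> Optional[int]:
    """Return ms since the task last used AVX-512, or None if never/unknown.

    Assumes lines of the form "AVX512_elapsed_ms:  <number>", where a
    negative value means no AVX-512 usage was recorded.
    """
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "AVX512_elapsed_ms":
            try:
                elapsed = int(value.strip())
            except ValueError:
                return None
            return elapsed if elapsed >= 0 else None
    return None


def read_avx512_elapsed_ms(pid: int) -> Optional[int]:
    """Read the field for a given PID; None if the file is missing or unreadable."""
    try:
        return parse_avx512_elapsed_ms(Path(f"/proc/{pid}/arch_status").read_text())
    except OSError:
        return None


def pick_cpus(elapsed_ms: Optional[int], recent_ms: int,
              avx_cpus: Set[int], other_cpus: Set[int]) -> Set[int]:
    """Hypothetical policy: keep recent AVX-512 users on a dedicated CPU set
    so their lower clocks do not drag down unrelated tasks."""
    if elapsed_ms is not None and elapsed_ms <= recent_ms:
        return set(avx_cpus)
    return set(other_cpus)
```

A scheduler daemon could then poll `read_avx512_elapsed_ms(pid)` for the tasks it manages and pass the result of `pick_cpus(...)` to `os.sched_setaffinity(pid, cpus)`. Whether a one-second or ten-second "recent" window makes sense would depend on the workload.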

**_Alex_** · 13 February 2019, 01:23 PM

What I get from the patch is that they want to maintain higher clock speeds when AVX-512 context switches aren't needed (properly written AVX-512 code) but were speculatively done because it wasn't known whether the program had cleared the AVX-512 registers or not. From the wording that says something like "real world loads like linpack", I wouldn't be surprised if it's done to boost benchmark scores.

Queued Linux Patches To Better Track AVX-512, Allowing For More Optimal Task Placement
