Looking At The Linux Performance Two Years After Spectre / Meltdown Mitigations


  • #31
    Originally posted by F.Ultra

    HW is not magic: if you have to clear a certain cache in a particular way as part of a mitigation, it does not matter much whether that clear command is issued from the CPU microcode or from SW. Yes, it could be done slightly faster in some cases in HW, but one advantage SW has over HW is that SW can choose when and where to apply the mitigation, while HW has to apply it everywhere. So we can, e.g., implement certain things only where they matter (inside the kernel) and skip them in, say, userspace where they don't (depending on the mitigation, of course).

    A new architecture built from scratch to avoid this family of vulnerabilities will take several years to develop, and even then it's very likely that those CPUs will be slower than today's CPUs clock for clock (and thus perhaps never even released). There might simply be no way to perform speculative execution safely, and if you disable it completely, performance will go down the drain.
    Wow. This is a great explanation.
    Yes, if a cache really has to be cleared, that means a huge impact on performance.
    Makes sense.
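    As an aside, that "apply it only where it matters" flexibility is visible from userspace: Linux reports its per-vulnerability decisions under /sys/devices/system/cpu/vulnerabilities/, and the mitigations= boot parameter lets you trade protection for speed globally. A minimal C sketch that just dumps those status files (the directory exists on recent kernels; the exact entries vary by kernel version and CPU):

    /* Print the kernel's view of each CPU vulnerability and its mitigation. */
    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *dir = "/sys/devices/system/cpu/vulnerabilities";
        DIR *d = opendir(dir);
        if (!d) {
            perror(dir);
            return 1;
        }

        struct dirent *e;
        while ((e = readdir(d)) != NULL) {
            if (e->d_name[0] == '.')
                continue;                         /* skip "." and ".." */

            char path[512], line[256];
            snprintf(path, sizeof path, "%s/%s", dir, e->d_name);

            FILE *f = fopen(path, "r");
            if (f && fgets(line, sizeof line, f)) {
                line[strcspn(line, "\n")] = '\0';
                printf("%-28s %s\n", e->d_name, line);   /* e.g. "spectre_v2  Mitigation: ..." */
            }
            if (f)
                fclose(f);
        }
        closedir(d);
        return 0;
    }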



    • #32
      Originally posted by nomadewolf

      Couldn't Intel be just doing software mitigations via firmware on the new chips?
      Yeah, I'm sure they must be.



      • #33
        Glad to see that it's not entirely a dumpster fire for Intel (especially servers). Still, the fact that this happened and was known for nearly a decade really sours me on Intel. New devices I am buying will have AMD or ARM CPUs in them. I know AMD is affected to some degree; I'm not sure if ARM-based chips were hit. Hopefully this issue was a shot across the bow of future manufacturers.

        Probably not, though. We'll get the same trash with a different package and a promise that things are all better now (read: different problem, same scale).



        • #34
          Originally posted by ndegruchy
          I know AMD is affected to some degree; I'm not sure if ARM-based chips were hit.
          Out-of-order ARM chips (A72, A75, A76, etc.) were hit about the same as AMD. In-order ones (A53, A55, etc.) aren't affected, but are obviously much slower. Right now the best way to go looks like ARM for low-power stuff and AMD for everything else.



          • #35
            Originally posted by Azrael5

            The best fix is to punish Intel by switching to AMD, as I have done. I know AMD has its own flaws, but it has been less guilty and less involved than Intel.
            Keep on dreaming that dream, boy, but AMD is starting to become guiltier as well now: https://liliputing.com/2020/03/take-...rocessors.html





              • #37
                Originally posted by Vistaus

                Keep on dreaming that dream, boy, but AMD is starting to become guiltier as well now: https://liliputing.com/2020/03/take-...rocessors.html
                Just note that it isn't a question of being guiltier. It's impossible to make a processor without these vulnerabilities unless every single operation is always constant time, as it was in old-style microprocessors like the 8080, Z80, etc. As soon as we have a pipeline that can reorder instructions depending on the availability of read results, plus several layers of cache with different access times, we get these vulnerabilities.

                Way before we got Meltdown etc., security specialists knew they needed to write constant-time code. You can't reject a user at the first incorrect character of a password; you must evaluate every single character and only at the end decide whether the user failed. Otherwise it's possible to figure out that the first three characters were OK, and then just loop on the fourth character...
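                For illustration, a minimal constant-time comparison in C (a generic sketch; real code would reach for an existing helper such as OpenSSL's CRYPTO_memcmp or libsodium's sodium_memcmp):

                /* Compare two equal-length buffers without exiting early, so the
                 * running time does not reveal where the first mismatch occurs. */
                #include <stddef.h>

                int constant_time_equal(const unsigned char *a, const unsigned char *b, size_t len)
                {
                    unsigned char diff = 0;

                    for (size_t i = 0; i < len; i++)
                        diff |= a[i] ^ b[i];      /* accumulate differences, never break */

                    return diff == 0;             /* 1 if equal, 0 otherwise */
                }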

                Think about a situation where one path has the data available and can run in one clock cycle, concurrently with one or more other instructions, while another path needs 10 or 100 clock cycles for the data to arrive. The speed difference is huge, which is why it's possible to measure it. So how are you going to hide this difference, unless you make the processor pretend it always needs to wait?
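                A rough sketch of why that difference is observable, assuming x86 and GCC/Clang (__rdtscp and _mm_clflush are the relevant intrinsics; the printed cycle counts are only illustrative):

                /* Time a load from a cache line while it is warm, then again after
                 * flushing it; the gap is exactly what cache side channels measure. */
                #include <stdio.h>
                #include <stdint.h>
                #include <x86intrin.h>

                static volatile char probe[64];

                static uint64_t time_read(const volatile char *p)
                {
                    unsigned aux;
                    uint64_t start = __rdtscp(&aux);   /* timestamp before the access */
                    (void)*p;                          /* the memory access being timed */
                    uint64_t end = __rdtscp(&aux);     /* timestamp after the access */
                    return end - start;
                }

                int main(void)
                {
                    probe[0] = 1;                                  /* warm the line */
                    printf("cached   : %llu cycles\n", (unsigned long long)time_read(probe));

                    _mm_clflush((const void *)probe);              /* evict it from the caches */
                    printf("uncached : %llu cycles\n", (unsigned long long)time_read(probe));
                    return 0;
                }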

                In the end, we need a solution where you can slow-load all cryptographic material into a fixed RAM before you start processing, so all of the data can be accessed at a fixed latency; where you can turn a slow/secure mode on or off for specific code blocks; and, for some specific instructions, a dual-evaluation feature where the instruction always computes both the true and the false alternative. Then it's up to developers to write protected black boxes that perform the magic in constant time before handing a result back to the normal code.
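                That dual-evaluation idea is what constant-time software already approximates with branchless selects: compute both alternatives unconditionally and pick one with a mask, so there is no secret-dependent branch for the pipeline (or an attacker) to observe. A small sketch in C:

                #include <stdint.h>

                /* Returns if_true when condition is non-zero, if_false otherwise,
                 * evaluating both inputs and never branching on the condition. */
                uint32_t ct_select(uint32_t condition, uint32_t if_true, uint32_t if_false)
                {
                    uint32_t mask = (uint32_t)0 - (condition != 0);  /* all ones or all zeros */
                    return (if_true & mask) | (if_false & ~mask);
                }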

                That's basically the only way you can get your normal work to run at full speed with the best pipelines the chip manufacturers can design, while a thread takes a "stop-and-go penalty" whenever a secure operation needs to be handed over to a fixed-speed subprocessor.

