**campbell** · 03 November 2018, 02:51 PM

Maybe this has been asked and answered before, but why not allow multiple threads within a process to use both virtual cores on the same physical core, but disallow it between processes? The threads within a process already have access to the same address space so it would seem like there's no additional security benefit to preventing them from running on the same core.

**ermo** · 03 November 2018, 03:30 PM

> Originally posted by campbell:
>
> Maybe this has been asked and answered before, but why not allow multiple threads within a process to use both virtual cores on the same physical core, but disallow it between processes? The threads within a process already have access to the same address space so it would seem like there's no additional security benefit to preventing them from running on the same core.

Microsoft implemented the HyperClear mitigation for their Hyper-V core scheduler in Windows Server 2016, which does exactly what you suggest.

I believe there has been some talk lately about adopting similar scheduling approaches in Linux, but I can't remember where I read it.
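To make the idea concrete, here is a minimal sketch of the pairing rule campbell describes: sibling hyperthreads are only ever filled with threads from the same process. The `Thread` type and `assign_cores` function are made up for illustration and assume 2-way SMT; this is not how Hyper-V or any Linux scheduler is actually implemented.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Thread:
    pid: int  # owning process, i.e. the shared address space
    tid: int  # thread id within the system


def assign_cores(runnable: List[Thread]) -> List[Tuple[Thread, Optional[Thread]]]:
    """Pair threads onto 2-way SMT cores without mixing processes on a core."""
    by_pid = defaultdict(list)
    for t in runnable:
        by_pid[t.pid].append(t)

    cores = []
    for threads in by_pid.values():
        for i in range(0, len(threads), 2):
            first = threads[i]
            # An odd thread out gets a core to itself; its sibling stays idle.
            second = threads[i + 1] if i + 1 < len(threads) else None
            cores.append((first, second))
    return cores
```

The cost of the rule shows up in the `None` slots: a process with an odd number of runnable threads leaves one hyperthread idle rather than lending it to another process.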

**Luke** · 03 November 2018, 06:25 PM

Interesting that the proof-of-concept code at https://github.com/bbbrumley/portsmash requires frequency scaling to be turned off to work. That implies that keeping CPU frequency scaling active should make this attack more difficult, assuming an attacker's process had to complete entirely while the CPU was at a single speed. The counter to that would be to have the attack wait until the CPU was under 100% load and then strike.

**Luke** · 03 November 2018, 06:58 PM

I think the real fix for this would be to limit all SMT execution to trusted and non-networked code. SMT for rendering local video with no network usage yes, SMT for the browser no. Any networked process or process run by another untrusted user could be blocked from running on the same physical core as any (other) cipher process.

Here's a real-world example: suppose I had a setup where an application had to be added to a list controlled by root to be permitted to use hyperthreading and the only thing I put on that list is Kdenlive. No other program can use it, so for any two threads to coexist on one core, both of them must be Kdenlive threads. All execution of anything else is suspended while either side of any core is running an HT job. Any attacker wanting to use ANY of the various SMT vulnerabilities must either get me to install a modified version of Kdenlive (which is not networked so can't just be accessed remotely for an exploit), or else remotely attack whatever kernel feature is enforcing that list. An attacker needs root to just plain modify the list, and anyone who has broken root no longer needs these exploits anyway.
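A rough sketch of the check such a list would imply, assuming the list is keyed by executable path. The path `/usr/bin/kdenlive`, the `SMT_ALLOWLIST` name, and the `may_use_sibling` function are all hypothetical; no existing kernel feature is being described here.

```python
from typing import FrozenSet, Optional

# Hypothetical root-controlled allowlist; the path is made up for illustration.
SMT_ALLOWLIST: FrozenSet[str] = frozenset({"/usr/bin/kdenlive"})


def may_use_sibling(
    running_exe: Optional[str],
    candidate_exe: str,
    allowlist: FrozenSet[str] = SMT_ALLOWLIST,
) -> bool:
    """Decide whether candidate_exe may start on a core's second hyperthread.

    running_exe is the executable already on the other hyperthread,
    or None if the whole core is idle.
    """
    if running_exe is None:
        return True  # nothing on the sibling to leak to or from
    # Both sides must be the same program, and that program must be listed.
    return running_exe == candidate_exe and candidate_exe in allowlist
```

This only models the per-core decision. The stricter part of the proposal, suspending everything else while any core is running an HT job, would need a system-wide check on top of it.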

As for attackers with local access, in my security case a machine is not considered trustable by two or more mutually opposing users, as possession = root, and keyloggers etc. can be installed in an initramfs, in the BIOS, or as physical hardware to attack disk encryption or anything else.

**pcxmac** · 03 November 2018, 08:28 PM

> Originally posted by TemplarGR:
>
> As I said, the SMT benefit depends on the workload, but even the absolute best-of-the-best scenario won't give you more than 50-60% extra performance. And these scenarios are mostly achieved in synthetic benchmarks.
>
> In the real world, especially on a DESKTOP COMPUTER, you won't see nearly that much of an improvement. In the real world, desktops don't run heavily multithreaded applications that have I/O bottlenecks... If your applications use fewer heavy threads than the number of physical cores, you probably won't see improved performance at all. For example, if you have an 8-core Ryzen on your desktop, the vast majority of your software and games will never use anything more than 4-6, so you effectively have a 0% performance boost from SMT.
>
> SMT made sense back in the Pentium 4 days when dual cores weren't even a thing yet... When Intel re-introduced it with Nehalem for the i7 lines, most normal people realized they were better off buying i5s anyway, since the benefit of SMT was really small (even for quads) compared to the price premium it demanded.
>
> Now that AMD has opened the road for 6-, 8-, and in the very near future 12-, 16-, and 32-core CPUs for mainstream consumer desktops, SMT is more of a hindrance than a benefit. In a regular desktop you will NEVER see a +1% difference from SMT in such a multi-core system, and it can probably lower performance if it drives threads to logical cores instead of physical ones by accident.
>
> As I said, for a desktop system it is best to just disable SMT and boost your CPU clocks. By driving less activity through a single core and letting parts of it cool down more often, it will have more headroom for overclocking, and single-threaded performance trumps faux-core (SMT) performance anytime, anywhere, especially on the desktop.
>
> I don't see why people get so defensive about SMT, I really don't. It is not like the things I am writing in this thread are new to people. They have been known for more than a decade.

As someone who runs a lot of containers on some of my machines, I prefer to have low-latency context switching. I am not trying to brute-force a password, but I do have a lot of little things related to disk I/O going on. Yes, it does depend on the use case, but so do all things computing, like file systems and how much RAM you need.

**discordian** · 03 November 2018, 11:08 PM

> Originally posted by Luke:
>
> Interesting that the proof-of-concept code at https://github.com/bbbrumley/portsmash requires frequency scaling to be turned off to work. That implies that keeping CPU frequency scaling active should make this attack more difficult, assuming an attacker's process had to complete entirely while the CPU was at a single speed. The counter to that would be to have the attack wait until the CPU was under 100% load and then strike.

Well... I said when Spectre and Meltdown were released that it's only a question of time until all high-precision performance counters require elevated privileges. Without them, all side-channel attacks will fail or be extremely slow and unreliable.
To keep current software running, some barely-good-enough replacement is needed that's, e.g., not accurate below 20 ms (randomly skew the clock rate up and down, but keep the total error bounded, potentially making the skew bigger the more measurements are requested in a small timespan). Outside of micro-benchmarks and profiling, higher precision should not be needed.
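A minimal sketch of such a degraded clock, under stated assumptions: the `FuzzyClock` name, the 1 ms `burst_window`, and the choice to give sparse callers a quarter of the skew are all made up for illustration. Only the 20 ms bound comes from the suggestion above. It adds bounded random jitter to each reading, widens the jitter when the clock is polled rapidly, and never runs backwards.

```python
import random
import time


class FuzzyClock:
    """Clock whose readings stay within +/- max_error of the real time."""

    def __init__(self, max_error=0.020, burst_window=0.001,
                 source=time.monotonic, rng=None):
        self.max_error = max_error        # hard bound on the error, in seconds
        self.burst_window = burst_window  # calls closer together than this count as rapid
        self.source = source              # underlying precise, non-decreasing clock
        self.rng = rng or random.Random()
        self._prev_true = float("-inf")   # real time of the previous call
        self._last = float("-inf")        # last value handed out

    def now(self):
        true = self.source()
        # Rapid polling gets the full skew; sparse callers get a quarter of it.
        rapid = (true - self._prev_true) < self.burst_window
        amplitude = self.max_error if rapid else self.max_error / 4
        self._prev_true = true
        skewed = true + self.rng.uniform(-amplitude, amplitude)
        # Never go backwards. The previous reading was at most max_error ahead
        # of an earlier real time, so the bound still holds after clamping.
        self._last = max(self._last, skewed)
        return self._last
```

Jitter alone does not stop an attacker from averaging many samples, so a scheme like this slows timing attacks down rather than ruling them out.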

**gmturner** · 04 November 2018, 03:32 AM

If someone wanted AMD to get some side-channel egg on their face (and clearly some do) they might take a look at Bulldozer, which distributes four FPUs amongst its eight ALUs, each of which advertises itself to the OS as a full core.

Then again maybe FPUs don't get much security-sensitive work (or any work at all, I suppose, hence AMD being able to get away with it in the first place -- or, at least, think they could, seeing as Bulldozer is not what most observers would describe as an industry success story).

PortSmash: A New Side-Channel Vulnerability Affecting SMT/HT Processors (CVE-2018-5407)
