Honestly, laptops already have those "embedded controllers", which run super shitty closed-source firmware tuned only for Windows; I see this as an attempt to pull that into the main SoC. Overall, the level of closed-sourcedness and anti-consumer bullshit is roughly preserved, and there is a chance this will be more stable on Linux. It all depends on AMD's qualification process, though.
AMD Prepares PMF Linux Driver For "Smart PC Solutions Builder"
Originally posted by stormcrow View Post
No, I can see why it would be controversial. We're long past the point where it should be obvious that signed firmware doesn't stop any determined hacker from infiltrating a system. There are numerous examples where this is the case, my favorite being the recent Switch hack presentation from 3C. The only thing it stops is legitimate customers from exercising control over repairs, including fixing hardware configuration bugs that occur far too often to bother listing.
In the past, if there were a sub-optimal or even buggy ACPI table for a piece of hardware, you could "simply" substitute out the problematic entries for good ones. Of course, the opposite can be true as well: you can royally screw up a system by naively altering ACPI values. OpenBSD, for example, has its own ACPI tables that aren't provided by Intel (and it's known to cause problems on some systems - I have an old laptop that immediately shuts down after OpenBSD's kernel boots due to a bad ACPI table, but works fine with Linux). But the point is, it's impossible to fix anything without legal access to the configuration settings, which this system effectively locks away from the supposed owners of the hardware for no good reason other than to enforce OEM secrecy over things that probably shouldn't be secret to begin with.
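One concrete way "naively altering ACPI values" breaks a system: every ACPI system description table carries a checksum byte (at offset 9 of the header) chosen so that all bytes of the table sum to zero mod 256, and hand-editing a value without recomputing it leaves the table invalid. A minimal sketch of the recompute step, using a toy table rather than a real dump:

```python
# The ACPI spec requires every byte of a system description table
# (header included) to sum to 0 mod 256; the checksum byte lives at
# offset 9 of the 36-byte header. The table below is a toy stand-in,
# not a real DSDT.

def fix_acpi_checksum(table: bytearray) -> bytearray:
    CHECKSUM_OFFSET = 9
    table[CHECKSUM_OFFSET] = 0
    table[CHECKSUM_OFFSET] = (-sum(table)) & 0xFF  # make total sum 0 mod 256
    return table

def acpi_checksum_ok(table: bytes) -> bool:
    return sum(table) & 0xFF == 0

# Build a toy 36-byte header with a zeroed checksum, then repair it.
toy = bytearray(b"DSDT") + bytearray(32)
toy = fix_acpi_checksum(toy)
```

Tools like `iasl` do this for you when recompiling a patched table, but any byte-level edit has to account for it.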
The controversy isn't that it's 'necessary to mainstream Linux-based laptops'. The controversy is that the locked-down signing is not necessary at all to have a functional system. It's an unnecessary external limitation of control, especially since the push is towards having more open and auditable hardware, firmware, and software stacks, so people can at least be aware of any risks they're taking if those risks can't be easily fixed. Let's face it, sometimes there are no feasible repairs possible, only remediation/mitigation/isolation, yet better the devil you know...
Edit to add: While signed firmware and executables outside of the end user's control won't stop a determined attacker, code signing in general (with user consent and control) may stop less-skilled or less-determined hackers, so it shouldn't be totally discarded as a security layer. What makes it actually useful is the ability to revoke and reissue signatures in the event of compromise. AMD having the sole signing authority to do this means this isn't about security (because you can't depend on AMD to properly revoke and reissue keys into the indefinite future) but about usurping user control.
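The revocation point above can be made concrete with a toy sketch. Real firmware signing uses asymmetric keys (AMD's PSP verifies RSA signatures, for instance), not HMAC, and the key IDs here are invented; the sketch only illustrates why a valid signature from a burned key must still be rejected:

```python
import hmac
import hashlib

# Hypothetical revocation list; in a real scheme this would be a
# signed, updatable artifact (e.g. efivar dbx-style), not a constant.
REVOKED_KEY_IDS = {"oem-key-2021"}

def verify_blob(blob: bytes, sig: bytes, key_id: str, key: bytes) -> bool:
    """Accept a firmware blob only if its key is unrevoked AND the
    signature matches. A revoked key fails even with a valid signature."""
    if key_id in REVOKED_KEY_IDS:
        return False
    expected = hmac.new(key, blob, hashlib.sha256).digest()
    return hmac.compare_digest(expected, sig)

key = b"secret"
blob = b"firmware image"
sig = hmac.new(key, blob, hashlib.sha256).digest()
```

If the sole party able to update `REVOKED_KEY_IDS` stops doing so, the check silently degrades from a security layer into a pure access-control gate, which is the complaint above.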
Originally posted by edwaleni View Post
AMD is simply providing exactly what the OEMs want: a way to differentiate themselves without having to rely on a custom SKU just for them. No different from what many of the Nvidia OEMs were doing with the reference designs. Unlike Intel, AMD doesn't have the capacity to create and fab several hundred permutations of their reference Zen designs, so they will simply let the OEMs control the feature set.
These days it's a different story, particularly in mobile products. GPUs (and presumably CPUs) can draw much (MUCH) more power than the cooling system can dissipate - even getting heat out of the package is a problem - so getting the best performance while not letting the smoke out is much more complicated and multi-dimensional. Even thermal limiting with dozens of sensors is too slow to avoid local meltdowns, so power management also uses distributed current sensors to anticipate thermal rise and start limiting clocks & voltages.
The key question is who takes responsibility for not letting the smoke out and killing the system. With desktop systems the worst case on the CPU side is usually that you blow up a socketed CPU, but with mobile systems the cost of damage is usually much higher, typically over 50% of the system cost. Right now HW vendors keep fine-grained power management under their control and if something goes wrong they can be pretty sure it was "their fault" rather than something the user did. There are grey areas like dust buildup in cooling solutions but that tends to happen sufficiently slowly that the firmware can adapt to it and still keep the smoke in.
As edwaleni suggested, this is really an alternative to creating custom SKUs or at least custom SMU firmware images.
I haven't had a chance to go through the patches to see if there is a way to disable the mechanism in the event of problems, but even that is complicated because (for example) an OEM might be using it to support a SKU with a tiny cooling solution that is not able to run at default power levels. I would rather see that specific case handled with a user-visible knob or hard fuse, so maybe that's a bad example, but the key point is that there is a line between "stuff the vendor has to control in order to be able to warranty the product" and "stuff that is pretty safe for users to manage, that might crash the system but not permanently damage it".
We try to make anything in the second bucket visible to users and developers, but we also try to lock down anything in the first bucket.
Would you be OK with a mechanism that lets you say "I am taking responsibility for fine-grained power management and do not expect warranty coverage if something goes wrong", then records the answer permanently in the hardware? My impression from customer support is that the general answer seems to be "no". The problem is not individual developers experimenting with PM software and letting the smoke out, but thousands or millions of users running power management software written by someone they never met.
All that said, there may be a better idea we (and the other vendors) have missed so far. Nobody likes proliferation of blobs, but AFAICS the alternative is having a lot of different SMU firmware images with no way to make sure that firmware images are properly paired with hardware implementations.

Last edited by bridgman; 24 September 2023, 07:34 PM.
Originally posted by edwaleni View Post
AMD is simply providing exactly what the OEMs want: a way to differentiate themselves without having to rely on a custom SKU just for them. No different from what many of the Nvidia OEMs were doing with the reference designs. Unlike Intel, AMD doesn't have the capacity to create and fab several hundred permutations of their reference Zen designs, so they will simply let the OEMs control the feature set.
Basically, the whole purpose of this system is to make laptops with mediocre cooling more performant and/or to sell a laptop with a "higher spec" CPU than the cooling system can support. Notice how they used "lid status" as one of the proposed inputs? Why would the SMU need to know if the lid is closed? Shouldn't the only thing that matters when you close the lid be whether you have the OS set to turn the display off, or put the system to sleep? Well, when your ultra-thin-whatever relies on the whole keyboard surface being a radiator, closing the lid might cost you 6-7 watts worth of power dissipation. Rather than have the whole thing bump up against the 100°C thermal limit and start throttling when you decide to have the system do a render overnight with the lid closed, the SMU will just drastically cut CPU power.
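The lid-status policy described above is easy to sketch. On Linux the lid switch is typically exposed at `/proc/acpi/button/lid/LID0/state` (the `LID0` name varies by machine), with contents like `state:      open`; the wattage numbers below are invented for illustration, not AMD's actual policy:

```python
# Sketch of a lid-aware power policy, assuming the usual Linux proc
# path for the ACPI lid switch. Parsing is split out so it can be
# exercised without real hardware.

LID_STATE_PATH = "/proc/acpi/button/lid/LID0/state"  # name varies per machine

def parse_lid_state(text: str) -> bool:
    """Return True if the lid is open, given the proc file contents,
    e.g. 'state:      open'."""
    return text.split(":")[1].strip() == "open"

def power_limit_w(lid_open: bool) -> int:
    # Hypothetical policy: drop the sustained limit when the keyboard
    # deck can no longer radiate (the ~6-7 W of lost dissipation).
    return 28 if lid_open else 15

def current_power_limit_w() -> int:
    with open(LID_STATE_PATH) as f:
        return power_limit_w(parse_lid_state(f.read()))
```

The objection in the post stands either way: whether this policy is reasonable depends on whether the user can see and override it, and under PMF it lives inside a signed blob.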
Personally, I really don't like where the industry has gone, and is continuing to go, with this whole configurable and dynamic TDP nonsense. It basically makes CPU SKUs completely meaningless. The same CPU model can perform dramatically differently in two different laptops just based on which one has more thermal headroom.
I miss the days where TDP was TDP, and manufacturers had to design their cooling systems to support a particular TDP. Not change a processor's TDP to fit their cooling solution.
Originally posted by bridgman View Post
All that said, there may be a better idea we (and the other vendors) have missed so far. Nobody likes proliferation of blobs but AFAICS the alternative is having a lot of different SMU firmware images with no way to make sure that firmware images are properly paired with hardware implementations.
I speak from experience.
- Version 1.0 of my AMD PC didn't have adequate cooling, so my RX 580 kept overheating. I blamed MSI.
- Version 2.0 got a Noctua fan placed into every free opening. No more overheating.
- Version 3.0 has a heat sink for the CPU so large that the internal side fan is now external. I forgot to account for the fan when doing the space calculations.
Originally posted by AmericanLocomotive View Post
I miss the days where TDP was TDP, and manufacturers had to design their cooling systems to support a particular TDP. Not change a processor's TDP to fit their cooling solution.
The other challenge is simply getting heat out of the package fast enough to avoid thermal runaway, even if you have what amounts to an infinite heat sink, e.g. a block of solid copper measuring a parsec on each edge.

Last edited by bridgman; 24 September 2023, 09:12 PM.
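The point above follows from the basic thermal relation T_junction = T_sink + P · R_jc: even with a perfect sink pinned at ambient, junction temperature is set by power times the junction-to-case thermal resistance. A back-of-envelope sketch, with made-up numbers rather than AMD specs:

```python
# Illustrative only: R_jc and the power figures are invented, not
# datasheet values for any real part.

def junction_temp_c(p_watts: float, r_jc_c_per_w: float,
                    sink_temp_c: float = 25.0) -> float:
    """Steady-state junction temperature for a heat sink held at
    sink_temp_c, given package power and junction-to-case resistance."""
    return sink_temp_c + p_watts * r_jc_c_per_w

# With a hypothetical R_jc of 0.2 degC/W and a 25 degC infinite sink,
# 300 W lands at 85 degC, but 500 W blows past a 100 degC cap no
# matter how big the copper block is.
```

So past a certain power density, no external cooling solution helps; the limit is inside the package.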
If it helps, I don't think the PMF blobs are needed for generic systems, just for OEM-customized systems. My guess is primarily laptops and AIOs, but I don't know that for sure (I'm on the GPU side, so guessing a bit re: CPU support).
The other blobs are primarily hardware microcode, so more hardware than software.