Originally posted by k1e0x
I stay well clear of talking about FreeBSD jails; the above post covers why. It covers the history: the core design of FreeBSD jails does not come from a trustworthy coder, and it also performs badly. The performance problems are caused by a lot of excess isolation.
Solaris Zones and Linux cgroups/namespaces suffer from basically the same set of problems.
1) For the first type of problem I can point to a recent example in Linux.
This is an example where full isolation can rapidly turn into a hindrance. It was one of the cgroup goof-ups, from a particular class of goof-up that results in extra memory usage. Solaris Zones have many goof-ups like this in their implementation.
Goof-ups of this type pretty much all look the same: you isolate processes and, for some reason, end up incorrectly duplicating memory. This duplicated memory means your memory management has to work harder to defragment memory. Not being able to get large contiguous allocations of memory starts affecting IO performance. At this point your performance disappears into hell.
k1e0x, this is really the board game I talked about. For perfect security you are going to duplicate the memory so that some missing memory protection flag cannot allow a cross-breach, but doing this undermines stability and performance. So that deduplication fix in the Linux kernel for slabs is incorrect for perfect security. Maybe we want configuration here; why this may not be the answer is point 3.
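To put rough numbers on the duplication cost, here is a toy model in Python. It is only a sketch under assumptions of mine: a 4096-byte page, fixed-size objects, and the hypothetical helper name `pages_needed`. It is not how the kernel slab allocator is actually written; it only counts pages when every isolated group gets its own private cache versus when all groups share one.

```python
import math

PAGE_SIZE = 4096  # bytes; assumed page size for this toy model


def pages_needed(object_counts, object_size, shared):
    """Count pages needed to hold fixed-size objects for several groups.

    object_counts: number of live objects in each isolated group.
    shared=False:  every group gets private pages (full isolation).
    shared=True:   all groups pack their objects into common pages.
    """
    per_page = PAGE_SIZE // object_size
    if shared:
        return math.ceil(sum(object_counts) / per_page)
    # Private caches: each group rounds up to whole pages on its own.
    return sum(math.ceil(n / per_page) for n in object_counts)


if __name__ == "__main__":
    counts = [3] * 100  # 100 groups holding 3 objects of 256 bytes each
    print(pages_needed(counts, 256, shared=False))  # 100
    print(pages_needed(counts, 256, shared=True))   # 19
```

In this made-up case isolation costs about five times the pages, and each of those mostly empty pages is one more thing in the way of a large contiguous allocation. Sharing gets the memory back, at the price of objects from different groups sitting in the same page.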
2) Then you have the Linux PID/network/... namespace or Solaris Non-Global Zones problem. This is like the first problem with an extra side of hell. The applications in these namespaces/zones have to be presented with information that looks like a full system even though they are only seeing part of the information. This is mandatory memory duplication that may come back and bite. This duplicated information also has to be kept in sync in many cases, and the syncing takes extra CPU time.
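If you want to see which of these views a process is actually sitting in, Linux exposes them as symlinks under `/proc/self/ns`. A minimal Python sketch follows; the helper name `current_namespaces` is mine, and it simply returns nothing on systems without that directory.

```python
import os


def current_namespaces(ns_dir="/proc/self/ns"):
    """Return {namespace_name: identifier} for the calling process.

    On Linux each entry in /proc/self/ns is a symlink whose target
    looks like 'pid:[4026531836]'. Returns an empty dict if the
    directory is missing or unreadable (for example on non-Linux).
    """
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result
    for name in names:
        try:
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Entry is not a readable symlink; skip it.
            continue
    return result


if __name__ == "__main__":
    for name, ident in current_namespaces().items():
        print(f"{name:20s} {ident}")
```

Two processes that print the same identifier for `pid` or `net` share that view of the system; different identifiers mean the kernel is maintaining a separate view for each.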
With the second one, if your workload is hitting it, it can be faster to run your workload in KVM instead. Again, maybe we want configuration here so we can avoid using these things when they make no security sense.
So it is really hard to work out how to do cgroups and zones exactly right. Get it wrong and you can have massive performance hits that appear absolutely random.
The Linux kernel did start over with cgroups once already, which is why we have cgroups v1 and cgroups v2. Cgroups v2 is way better designed than the first one. Cgroups v1 broke the zone design apart way too far by allowing multiple trees. But letting users skip the namespaces they do not require, which cgroups allow, gives Linux a performance advantage over zones.
A container on Linux is a theoretical construct built on top of cgroups and namespaces.
Basically, k1e0x, there is no single absolutely right answer for every use case. So for this stuff we need to get a stack of settings right.
3) Welcome to the third nightmare. As you add options for configuring the system, you possibly add CPU overhead to the complete system, because you need to check which options apply. This is something Solaris managed to do, leading to it being nicknamed Slowaris.
Basically this is one very hard game to win. Every path you can think of as a solution to the zones or cgroups/namespaces problem can in fact end up killing performance or security or both, with an extra side of sometimes completely screwed-up stability.
The problem is that a perfect implementation of cgroups/namespaces and zones for security will be slow. Redox OS cannot avoid this. So you need to make an imperfect solution to get performance; the problem is how to achieve imperfection for performance without reducing security too much. Basically we do not want to do an Intel with speculative execution.