Jailhouse Guest Support Queued For Linux 4.16


  • Jailhouse Guest Support Queued For Linux 4.16

    Phoronix: Jailhouse Guest Support Queued For Linux 4.16

    Yet more functionality coming with the upcoming Linux 4.16 kernel: the first bits of Jailhouse hypervisor functionality are being mainlined...


  • #2
    All of these chroot and VM systems are starting to confuse me now.

    Why not just use the standard Linux kernel to apply all these restrictions on a process-by-process basis? Then if the process happens to be a KVM VM, so be it.

    It seems bizarre that we have process boundaries, VM boundaries, and container boundaries, all of which are basically variants of the same thing but with different configuration options and marketing...
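    The per-process knobs the kernel already exposes can be listed from procfs. A minimal sketch, assuming a Linux host with /proc mounted:

```python
import os

# Every Linux process carries a set of namespace references; the symlinks
# under /proc/<pid>/ns name the isolation domains the kernel applies to it.
def namespaces(pid="self"):
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}

if __name__ == "__main__":
    for name, target in namespaces().items():
        print(f"{name}: {target}")
```

    Two processes sharing a namespace show identical `kind:[inode]` targets; at this level, containers are just processes whose namespace links differ from the host's.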

    • #3
      Originally posted by OneTimeShot View Post
      All of these chroot and VM systems are starting to confuse me now.

      Why not just use the standard Linux kernel to apply all these restrictions on a process-by-process basis? Then if the process happens to be a KVM VM, so be it.

      It seems bizarre that we have process boundaries, VM boundaries, and container boundaries, all of which are basically variants of the same thing but with different configuration options and marketing...
      Nothing is as straightforward as it seems. Jailhouse is related to an old project, lguest. Both are designed to be insanely lightweight and do not care about running operating systems built for generic hardware. So this is one form of VM, using hardware acceleration.

      Next is something like KVM, which is in fact designed to run normal operating systems as well as operating systems that know about it. This is a second form. Having both forms makes sense.

      With virtual machines you get other fun things: https://software.intel.com/en-us/blo...ory-encryption

      Memory encryption. You may want different processes in different VMs so they cannot snoop on each other's memory, because each VM's memory is encrypted and isolated.

      So the idea that these are basically variants on chroot is wrong. With chroot you are still operating in the same memory address space tables. With a VM you get hardware/software assistance to have your own memory address space tables.

      VMs are all basically variations on hypervisors. There is more than one way to implement a hypervisor, with different trade-offs in OS support, performance and security.

      A term that helps in understanding what is going on is Multi-Level Security. With Multi-Level Security you end up with layers, like an onion. To understand what is going on you need to lay everything out in layers according to what each layer owns.

      VMs/hypervisors are in one group, doing truly separated memory spaces.

      chroot/cgroups/namespaces... are all methods of doing process separation inside a shared memory space. Process separation does not mean a completely independent memory space between processes in most operating systems.
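      The single-kernel point can be shown directly: however a process is namespaced or chrooted, kernel-global facts such as /proc/version are the same for everyone. A small sketch, assuming a Linux host:

```python
import multiprocessing

# chroot/cgroups/namespaces separate processes, but they all still run on
# one kernel: kernel-global data like /proc/version is shared by every
# process on the machine.
def kernel_version():
    with open("/proc/version") as f:
        return f.read().strip()

def child(q):
    q.put(kernel_version())

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=child, args=(q,))
    p.start()
    p.join()
    assert q.get() == kernel_version()  # parent and child see one kernel
```

      A VM guest, by contrast, reports its own kernel here, because the hypervisor gives it one.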

      • #4
        Sounds a lot like LXC (container tech used by things like Docker). Is there a significant difference here?

        • #5
          Originally posted by zman0900 View Post
          Sounds a lot like LXC (container tech used by things like Docker). Is there a significant difference here?
          LXC container technology does not use hypervisor instructions and does not use hypervisor-level memory separation. Docker can also manage hypervisors.

          With Jailhouse, each hypervisor guest has its own kernel-space memory as well, so there is no shared kernel-space memory as with LXC.

          Hypervisor technology and containers can sound a lot alike, but there is a big difference in how much is separated.

          Jailhouse and KVM are in the same class; Jailhouse is just a lot lighter than KVM.

          • #6
            Originally posted by oiaohm View Post
            With virtual machines you get other fun things: https://software.intel.com/en-us/blo...ory-encryption

            Memory encryption. You may want different processes in different VMs so they cannot snoop on each other's memory, because each VM's memory is encrypted and isolated.
            Why would I not want this between different processes? Maybe two users should encrypt their process space in exactly the same way. This is therefore not a feature specifically for VMs...

            Originally posted by oiaohm View Post
            So the idea that these are basically variants on chroot is wrong. With chroot you are still operating in the same memory address space tables. With a VM you get hardware/software assistance to have your own memory address space tables.
            Actually there is not much difference in memory space here - the Ring -1/-2 hypervisors are just extensions to the standard MMU address space. There is nothing special about the address tables beyond two extra MMU page table indirections. In that respect it *is* the same memory address space table (just a slightly deeper one).

            Originally posted by oiaohm View Post

            VMs/hypervisors are in one group, doing truly separated memory spaces.

            chroot/cgroups/namespaces... are all methods of doing process separation inside a shared memory space. Process separation does not mean a completely independent memory space between processes in most operating systems.
            I don't think that the difference is particularly great. A running process can't access anything that the kernel doesn't let it. Two processes on a single kernel should/can have the same security properties as two processes on two different Linux VMs on the same hardware. All the Hypervisor is doing is providing some extra configuration options to help you separate processes.


            IMO, therefore: chroot jails allow for the same security separation properties as two VMs, and a KVM process is no different to any other Linux process (albeit it has a funny syscall API). So I think that VMs are just processes with fancy names and different access control models.


            This is a pretty theoretical model of course. Configuring a chroot to isolate what a VM does would be a real pain in the neck.

            • #7
              Originally posted by OneTimeShot View Post

              Why would I not want this between different processes? Maybe two users should encrypt their process space in exactly the same way. This is therefore not a feature specifically for VMs...
              There is a performance reason not to do that. Kernel space and userspace encrypted with the same key can be very efficient when transferring data between user space and kernel space. The same goes for transfers from process to process. Decrypting with one key and re-encrypting with another is just painful.

              Originally posted by OneTimeShot View Post
              Actually there is not much difference in memory space here - the Ring -1/-2 hypervisors are just extensions to the standard MMU address space. There is nothing special about the address tables beyond two extra MMU page table indirections. In that respect it *is* the same memory address space table (just a slightly deeper one).
              A secure Ring -1/-2 hypervisor will be doing something like kernel page-table isolation. Done securely, there are not just two extra MMU page-table indirections, and it does not technically get deeper: each VM has its own page tables and allotted memory. If you look at how page-table isolation is done, it is not the same memory address space table. The reason it is not the same address space table is so you cannot probe the MMU and have it hand over information if it has flaws, as every Intel processor since and including the 286 has had.
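              Whether page-table isolation is actually active on a given machine is visible from sysfs. A small check, assuming an x86 Linux kernel new enough (4.15+) to publish the vulnerabilities directory:

```python
import os

# On x86 Linux 4.15+ the kernel reports whether page-table isolation (PTI)
# is mitigating Meltdown; the sysfs file is absent on other architectures
# or older kernels, so return None rather than guess.
def meltdown_status():
    path = "/sys/devices/system/cpu/vulnerabilities/meltdown"
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return f.read().strip()

if __name__ == "__main__":
    print("Meltdown:", meltdown_status())
```

              A mitigated host typically reports `Mitigation: PTI`; the same file inside a container reports the host kernel's status, since there is only one kernel.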

              Originally posted by OneTimeShot View Post
              I don't think that the difference is particularly great. A running process can't access anything that the kernel doesn't let it. Two processes on a single kernel should/can have the same security properties as two processes on two different Linux VMs on the same hardware. All the Hypervisor is doing is providing some extra configuration options to help you separate processes.
              Two kernel spaces means that unless the hypervisor allows sharing, there will not be sharing.

              What you skip straight over is side channels, like monitoring dmesg. Compare two VMs with two kernels and a process running in each against two processes running on one kernel, split by cgroups/namespaces.

              In the cgroups/namespaces case, where you only have one kernel, if one of those processes can monitor the kernel's dmesg it could be performing a side-channel attack against all the processes running on the system.


              With the two VMs and two kernels, monitoring the dmesg of one kernel only gives away information about the processes running on that kernel; the other kernel's processes cannot be got at by that side-channel attack.

              The information the kernel gives to a process can at times be more than is safe, giving away what other processes are doing on that kernel.

              The big thing you overlooked is that at times a process needs to be allowed information from the kernel that is not always 100% safe against being used for a side-channel attack on anything else running on the same kernel.

              The reason for using a hypervisor over a cgroup/namespace is stronger resistance to side-channel attacks, and if a guest kernel does have a flaw, hopefully it stays isolated inside the hypervisor.

              So your idea that they are the same is bogus.
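              The dmesg example has a concrete single-kernel mitigation: the kernel.dmesg_restrict sysctl. A small check, assuming a Linux kernel new enough (2.6.37+) to have the knob:

```python
# kernel.dmesg_restrict == 1 blocks unprivileged processes from reading
# the kernel log, closing the dmesg side channel at the cost of losing an
# unprivileged debugging aid.
def dmesg_restricted():
    try:
        with open("/proc/sys/kernel/dmesg_restrict") as f:
            return f.read().strip() == "1"
    except OSError:
        return None  # sysctl not exposed on this kernel/container

if __name__ == "__main__":
    print("dmesg restricted to privileged users:", dmesg_restricted())
```

              This is exactly the debugging-versus-isolation trade-off: the knob exists because the same interface is useful for diagnosis and dangerous as a side channel.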

              • #8
                Originally posted by oiaohm View Post

                So your idea that they are the same is bogus.

                Fair point on the side-channel attack and increased surface, I guess... I'd count two processes running in different security contexts being able to attack each other as a problem that needs resolution in itself, though.

                • #9
                  Originally posted by OneTimeShot View Post
                  Fair point on the side-channel attack and increased surface, I guess... I'd count two processes running in different security contexts being able to attack each other as a problem that needs resolution in itself, though.
                  You are missing something basic. A lot of the side channels that exist when you are running a single kernel are there for debugging why something is not working.

                  Remember, even hypervisors have weaknesses caused by debugging features as well.

                  Highest to lowest in security
                  Bare metal with 1 application per system
                  Hyper-visors with 1 application per VM
                  cgroups/namespace/zones/jails with individually contained applications in userspace.
                  Bare metal running multi applications without containment.

                  Cost vs. performance vs. security: at best you can only ever perfect two of those three.

                  This is the problem with thinking all this stuff is the same: each option is a different level of compromise.

                  The fact that the Jailhouse hypervisor does not attempt to support unmodified operating systems the way KVM does means it can have a smaller attack surface: it does not emulate as much hardware and does not need as much on the debugging-functionality side. So there will be valid times to choose the Jailhouse hypervisor over KVM.

                  The one thing Docker containers provide is a means to run under most of these different options without having to redo everything. But it is still very important to understand the security difference between Docker containers deployed one per system, one per VM, one per cgroup/namespace/zone, and all stuffed into one system with the containment mostly turned off in a chroot.

                  The confusion comes from the fact that Docker lets you develop one thing and use it in many different ways, with different levels of security and performance.

                  • #10
                    Originally posted by oiaohm View Post

                    Highest to lowest in security
                    Bare metal with 1 application per system
                    Hyper-visors with 1 application per VM
                    cgroups/namespace/zones/jails with individually contained applications in userspace.
                    Bare metal running multi applications without containment.

                    The confusion comes from the fact that Docker lets you develop one thing and use it in many different ways, with different levels of security and performance.
                    You can also extend that list with "multiple Cores/CPUs" and "multiple hosts in a NUMA cluster", and it all becomes very complicated for somewhat diminishing returns (and that's before the chances that a Spectre-like security bug blows the whole stack away)...

                    IMO the future is a kernel scheduler that can select the optimum deployments for my running systems automatically and switch between them as required. So (e.g.) JailHouse can just kick in and do things with the processes that I am running without me having to explicitly configure it.
