OverlayFS Adding Support For IDMAPPED Layers For Various Benefits

  • OverlayFS Adding Support For IDMAPPED Layers For Various Benefits

    Phoronix: OverlayFS Adding Support For IDMAPPED Layers For Various Benefits

    Sent in this morning for the Linux 5.19 merge window were the OverlayFS updates of which the main feature addition this cycle is support for IDMAPPED layers...

    https://www.phoronix.com/scan.php?pa...DMAPPED-Layers

  • #2
    Yeah the biggest usage is of course the default Docker storage engine!



    • #3
      Originally posted by ernstp View Post
      Yeah the biggest usage is of course the default Docker storage engine!
      Nah. There's millions if not billions of openwrt devices using OverlayFS.



      • #4
        Originally posted by c117152 View Post

        Nah. There's millions if not billions of openwrt devices using OverlayFS.
        Do those actually have much need for this feature though?

        If you've developed within container-based ecosystems, this feature is actually really handy for not having folders/content written to the host as root when you don't want that.

        IIRC rootless containers aren't at parity with root ones (unrelated to this IDMAPPED feature support), but this feature I think was quite useful for rootless Podman (I remember a PR discussion on Github about it with Podman maintainers and the developer that was implementing support for filesystems).



        • #5
          Originally posted by polarathene View Post

          Do those actually have much need for this feature though?
          No, OpenWrt doesn't use it usually, but if you run OpenWrt itself in a container it could be useful.

          If you've developed within container-based ecosystems, this feature is actually really handy for not having folders/content written to the host as root when you don't want that.
          You can also mount those as ro into the container, but yes, it's an additional barrier when files belong to nobody.

          IIRC rootless containers aren't at parity with root ones (unrelated to this IDMAPPED feature support), but this feature I think was quite useful for rootless Podman (I remember a PR discussion on Github about it with Podman maintainers and the developer that was implementing support for filesystems).
          YMMV, but they seem pretty full-featured to me. There are some caveats, like ksmbd not working or the inability to set the region of WiFi interfaces. Of course, this requires setting up the container properly, because you can't mount anything or create new device nodes, but if you do, there are very few things that don't work in my experience. Personally, I'm using LXC (not LXD).



          • #6
            Originally posted by binarybanana View Post
            No, OpenWrt doesn't use it usually, but if you run OpenWrt itself in a container it could be useful.
            Um the factory reset feature is based on this. You obviously have no idea.



            • #7
              Originally posted by binarybanana View Post
              You can also mount those as ro into the container, but yes, it's an additional barrier when files belong to nobody.
              ? If you're providing volume mounts where state is persisted, as in the container writes to the location, read-only access is obviously not an option. And that's the common problem you'll see people complain about with Docker in the past.

              If a process runs as root inside the container and writes to that location, it's written locally with root ownership, which sometimes is not desirable. You can try to work around it: there is the `--user` option to set the UID/GID that the main container user runs as, but if that user internally requires root permissions to access content within the container (which I think was the case/expectation with some official images I've used in the past), then you run into failures and that workaround isn't viable.

              Other workarounds were either platform dependent IIRC or required admin/root access to adjust Docker Daemon settings. On Windows/macOS you're probably running Docker within a Linux environment (VM or WSL) and would be able to leverage this IDMAPPED feature, I think, whereas I'm not sure whether the other options were as well supported or had other issues due to host filesystem layering/sync/abstractions.

              It's been a while since I looked into IDMAPPED mounts as I was waiting on broader support, and was particularly interested in OverlayFS getting it. So my memory is a bit foggy on it, I do have notes I could look up. I just remember it would be the ideal solution for me in the past where it's been an issue.


              Originally posted by binarybanana View Post
              YMMV, but they seem pretty full featured to me.

              ..but if you do there are very few things that don't work from my experience. Personally, I'm using LXC (not LXD).
              I recall some issues that can be worked around, such as binding ports below 1024; other limitations are noted in the Docker docs.

              Rootless support with Podman differs a bit at least, I'm aware of one difference for rootless support when it comes to networking. In a project I help maintain we had someone contribute some docs which I reviewed and helped revise that revealed the difference in supporting Fail2Ban for rootless container runtimes.



              • #8
                Originally posted by polarathene View Post

                Do those actually have much need for this feature though?
                OpenWrt doesn't use IDMAPPED at the moment. But they use overlayfs extensively, so adopting union mounts and namespaces as an extended security model, as opposed to mandatory access control systems that are not suitable for embedded, is quite viable.



                • #9
                  Originally posted by caligula View Post
                  Um the factory reset feature is based on this. You obviously have no idea.
                  It uses overlayfs, but I'm pretty sure it does NOT use idmapped mounts.

                  Originally posted by polarathene View Post

                  ? If you're providing volume mounts where state is persisted, as in the container writes to the location, read-only access is obviously not an option. And that's the common problem you'll see people complain about with Docker in the past.
                  Of course, if you want state, which implies writing, then you can't mount that read-only (unless you add overlayfs on top), but for shared resources you can and probably should.

                  If a process runs as root inside the container and writes to that location, it's written locally with root ownership which sometimes is not desirable. You can try work around it, there is the `--user` option to set the UID/GID to run the main container user as, but if that user internally requires root permissions to access content within the container (which I think was the case/expectation with some official images I've used in the past), then you run into failures and that workaround isn't viable.
                  You mean you don't even want unprivileged root, but you need unprivileged root permissions inside the container for some reason? Yes, that's annoying, but it can be handled by dropping all capabilities except those necessary and hoping CAP_SYS_ADMIN isn't one of them. If it is, things get hairy with seccomp or SELinux/AppArmor, yeah. The other way around (adding capabilities to a user) can also work, unless the program checks whether it's root and bails if not. Or filecaps, but I haven't used them yet. Not sure if you're aware of these, just saying.

                  The reason stock images don't or didn't work was that previous systemd versions expected to be able to create dev nodes, but can't inside an unprivileged container, IIRC. SELinux/AppArmor/seccomp can add additional restrictions, and maybe Docker itself also.

                  Other workarounds were either platform dependent IIRC or required admin/root access to adjust Docker Daemon settings. On Windows/macOS you're probably running Docker within a Linux environment (VM or WSL) and would be able to leverage this IDMAPPED feature, I think, whereas I'm not sure whether the other options were as well supported or had other issues due to host filesystem layering/sync/abstractions.
                  I have never used Docker, and certainly not on Windows. I'm surprised you can even mount things into containers on Windows, actually.

                  It's been a while since I looked into IDMAPPED mounts as I was waiting on broader support, and was particularly interested in OverlayFS getting it. So my memory is a bit foggy on it, I do have notes I could look up. I just remember it would be the ideal solution for me in the past where it's been an issue.

                  I recall some issues that can be worked around, such as binding ports below 1024; other limitations are noted in the Docker docs.
                  Ah, yes, but that's a limitation on the host-side, because even host-side users can't bind to those ports. Docker networking mostly uses NAT and port forwarding, but on LXC it's more common to use bridging and routing, so this doesn't apply.
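
For reference, a rough Python sketch of the host-side rule being described: on Linux, binding a port below the value of the `net.ipv4.ip_unprivileged_port_start` sysctl (1024 by default) needs `CAP_NET_BIND_SERVICE`. The helper names are hypothetical, and the fallback of 1024 is an assumption for systems where the sysctl file can't be read.

```python
from pathlib import Path

# Linux sysctl holding the first port an unprivileged process may bind.
SYSCTL_PATH = Path("/proc/sys/net/ipv4/ip_unprivileged_port_start")


def unprivileged_port_start(default: int = 1024) -> int:
    """Read the threshold from procfs, falling back to the assumed default."""
    try:
        return int(SYSCTL_PATH.read_text().strip())
    except (OSError, ValueError):
        return default


def needs_privilege(port: int, threshold: int) -> bool:
    """True if binding this port requires CAP_NET_BIND_SERVICE.

    Port 0 asks the kernel for an ephemeral port, so it is never privileged.
    """
    return 0 < port < threshold


if __name__ == "__main__":
    threshold = unprivileged_port_start()
    for port in (80, 443, 8080):
        print(port, needs_privilege(port, threshold))
```

With NAT and port forwarding the check applies to whoever binds the host port; with bridging and routing the container binds inside its own network namespace, which is why the limitation doesn't show up there.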

                  Could also mean I'm mixing something up, but I think rootless containers are the same as unprivileged containers, right? If they're containers with no root (unprivileged root) at all, a lot of stuff I wrote doesn't make sense.

                  Also, another point that might be confusing is that you can set owner IDs on files outside the valid range. So if you map 100000 on the host to 0 in the container, you can just chown 100000 $file on the host and it will map to 0 in the guest. At least you can do that with plain bind mounts (also works for single files!). Maybe that works for those Docker-managed things, too. Again, I don't know if you're aware of this or not, just throwing it out there.
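
To make the mapping arithmetic concrete, here is a minimal Python sketch. It assumes a single map entry in the same three-field form the kernel uses for `/proc/<pid>/uid_map` (ID inside the namespace, ID outside, length). The `IdMapEntry` type and both function names are made up for illustration, and the 65536 length is just an assumed range size.

```python
from typing import NamedTuple, Optional, Sequence


class IdMapEntry(NamedTuple):
    inside_start: int   # first ID as seen inside the container
    outside_start: int  # first ID as seen on the host
    count: int          # number of consecutive IDs covered by this entry


def host_to_container(host_id: int, id_map: Sequence[IdMapEntry]) -> Optional[int]:
    """Translate a host ID to the container's view, or None if unmapped."""
    for entry in id_map:
        offset = host_id - entry.outside_start
        if 0 <= offset < entry.count:
            return entry.inside_start + offset
    return None


def container_to_host(container_id: int, id_map: Sequence[IdMapEntry]) -> Optional[int]:
    """Translate a container ID back to the host ID, or None if unmapped."""
    for entry in id_map:
        offset = container_id - entry.inside_start
        if 0 <= offset < entry.count:
            return entry.outside_start + offset
    return None


if __name__ == "__main__":
    # Host 100000 maps to container 0, as in the example above.
    id_map = [IdMapEntry(inside_start=0, outside_start=100000, count=65536)]
    print(host_to_container(100000, id_map))
    print(container_to_host(0, id_map))
    print(host_to_container(1000, id_map))
```

A host ID outside every entry has no counterpart in the container, which is why the sketch returns `None` for it.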



                  • #10
                    Originally posted by binarybanana View Post
                    You mean you don't even want unprivileged root, but you need unprivileged root permissions inside the container for some reason?
                    Generally in my experience I'm using a third-party container and there's some mounted local folders for persisting state to, while making it easy to access to me or others (rather than a data volume).

                    I think it may have been an annoyance for some containers running on a server for some online community, where a non-root account was in the docker group and managed the containers but anything like database state was being written as root from the container.

                    Only other time I recall was spinning up some local containers for dev, writing files as my user, but the container would install packages for example as the default root user and if it wrote anything else, I couldn't easily edit that in VSCode or move/delete data afterwards when I didn't need it anymore.

                    For me the major gripe was just unwanted root ownership on the volume mounts, so IDMAPPED mounts would be pretty nice for me. I haven't looked into how rootless containers would avoid that otherwise, I've assumed that they don't or it's unclear how they map ownership between files on host vs container like IDMAPPED resolves. Those use-cases probably would be fine to use rootless containers too, I never got around to it personally (volunteer work at least for the community server).

                    I mean it doesn't have to be root in particular, if the container writes with any UID/GID that your user doesn't have permission to manipulate, it'd be the same problem AFAIK?
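
A quick way to see how much of a bind-mounted folder is affected is to list everything not owned by a given UID. This is only a sketch: `foreign_owned` is a hypothetical helper name, and it reports ownership only, without checking group or ACL permissions.

```python
import os
from typing import Iterator, Tuple


def foreign_owned(root: str, uid: int) -> Iterator[Tuple[str, int]]:
    """Yield (path, owner_uid) for every entry under root not owned by uid."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so symlinks are reported by their own owner, not the target's.
            owner = os.lstat(path).st_uid
            if owner != uid:
                yield path, owner


if __name__ == "__main__":
    # Report entries in the current directory that the calling user does not own.
    for path, owner in foreign_owned(".", os.getuid()):
        print(f"{path} is owned by UID {owner}")
```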

                    Originally posted by binarybanana View Post
                    I have never used Docker, and certainly not on Windows. I'm surprised you can even mount things into containers on Windows, actually.
                    I rarely use Windows myself. Last I knew they had Windows based docker images, which would be built from a windows base image and only work on windows or something.. while separately supporting the usual Linux based Docker images via a VM and mounting to NTFS was using some filesync to a VM filesystem that Docker would actually bind mount to.

                    These days Windows users use Docker through WSL, I think, which is meant to have fewer drawbacks/issues. macOS, I think, also made some bigger changes to its Docker support in the past year; it has probably had the most problems with file sync for bind mounts and networking perf issues. I can't comment much on either as I'm pretty much always on Linux.

                    Originally posted by binarybanana View Post
                    Could also mean I'm mixing something up, but I think rootless containers are the same as unprivileged containers, right? If they're containers with no root (unprivileged root) at all, a lot of stuff I wrote doesn't make sense.
                    I've barely looked into the feature myself, but my understanding is they run the Docker daemon as your user (or some other non-root user). The containers can still run with a root user internally, but as you say it's unprivileged and meant to limit the damage if the container was compromised and the attacker managed to escape out of it.

                    Originally posted by binarybanana View Post
                    At least you can do that with plain bind mounts (also works for single files!). Maybe that works for those docker managed things, too. Again, I don't know if you're aware of this or not, just throwing it out there.
                    I don't quite follow, presently if you use a volume bind mount, there is no mapping done with UID:GID between host and guest(container). There is the ability to run the container and change the UID:GID via a run-time option, but it's not file specific, preventing the user from having the expected permissions in the container to operate correctly. There is config support for the Docker Daemon to do namespace mapping IIRC, it's been a while since I read about that but I think it was container wide, it supported providing a range of IDs to map.

                    What you're suggesting is presumably how IDMAPPED would work?
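
For what it's worth, the ranges used for that kind of namespace mapping are typically listed in `/etc/subuid` and `/etc/subgid`, one `name:start:count` entry per line. A minimal Python sketch of reading such entries follows. `parse_subid` and the `alice` user are made up for illustration, and skipping malformed lines is an assumption rather than documented behaviour.

```python
from typing import List, NamedTuple


class SubIdRange(NamedTuple):
    name: str    # user (or numeric ID) the range is delegated to
    start: int   # first host ID in the range
    count: int   # number of consecutive IDs in the range


def parse_subid(text: str) -> List[SubIdRange]:
    """Parse the contents of a subuid/subgid style file into ranges."""
    ranges = []
    for line in text.splitlines():
        fields = line.strip().split(":")
        if len(fields) != 3:
            continue  # assumption: silently skip blank or malformed lines
        name, start, count = fields
        if not (start.isdigit() and count.isdigit()):
            continue
        ranges.append(SubIdRange(name, int(start), int(count)))
    return ranges


if __name__ == "__main__":
    sample = "alice:100000:65536\n"  # hypothetical entry
    for entry in parse_subid(sample):
        last = entry.start + entry.count - 1
        print(f"{entry.name}: host IDs {entry.start}..{last}")
```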

