* Thoughts on tightening up user namespace creation
@ 2016-03-08 5:15 Andy Lutomirski
[not found] ` <CALCETrU4+zTKABz1foEA=an3XYbe_UXxn_w9=1GjVzMe5DXXPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2016-03-08 5:15 UTC (permalink / raw)
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Eric W. Biederman, Linux Containers, Alexander Larsson,
Colin Walters, Serge Hallyn, Stephane Graber, Kees Cook,
Seth Forshee
Hi all-

There are several users and distros that are nervous about user
namespaces from an attack surface point of view.

- RHEL and Arch have userns disabled.

- Ubuntu requires CAP_SYS_ADMIN

- Kees periodically proposes to upstream some sysctl to control
  userns creation.

I think there are three main types of concerns. First, there might be
some as-yet-unknown semantic issues that would allow privilege
escalation by users who create user namespaces and then confuse
something else in the system. Second, enabling user namespaces
exposes a lot of attack surface to unprivileged users. Third,
allowing tasks to create user namespaces exposes the kernel to various
resource exhaustion attacks that wouldn't be possible otherwise.

Since I doubt we'll ever fully address the attack surface issue at
least, would it make sense to try to come up with an upstreamable way
to limit who can create new user namespaces and/or do various
dangerous things with them?

I'll divide the rest of the email into the "what" and the "who".

+++ What does the privilege of creating a user namespace entail? +++

This could be an all-or-nothing thing. It would certainly be possible
for appropriately privileged tasks to be able to unshare namespaces
and use their facilities exactly like any task can in a current
user-ns-enabled kernel and for other tasks to be unable to unshare
anything.

Finer gradations are, in principle, possible. For example, it could
be possible for a given task to unshare its userns but to have limited
caps inside or to be unable to unshare certain other namespaces. For
example, maybe a task could unshare userns and mount ns but not net
ns. I don't think this would be particularly useful.

It might be more interesting to allow a task to unshare all
namespaces, hold all capabilities in them, but to still be unable to
use certain privileged facilities. For example, maybe denying
administrative control over iptables, creation of exotic network
interface types, or similar would make sense. I don't know how we'd
specify this type of constraint.

+++ Who can create user namespaces (possibly with restrictions)? +++

I can think of a few formulations.

A simpler approach would be to add a per-namespace setting listing
users and/or groups that can unshare their userns. A userns starts
out allowing everyone to unshare userns, and anyone with CAP_SYS_ADMIN
can change the setting.
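
To make the rule in the simpler approach concrete, here is a rough
model of it in Python. This is only a sketch, not kernel code: the
class name UsernsPolicy, its fields, and the restrict() helper are all
hypothetical, and how the setting would actually be exposed is left
open.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class UsernsPolicy:
    """Hypothetical per-namespace setting. None means 'no restriction'."""
    allowed_uids: Optional[Set[int]] = None
    allowed_gids: Optional[Set[int]] = None

    def may_unshare(self, uid: int, gids: Set[int]) -> bool:
        # A fresh userns starts out allowing everyone to unshare.
        if self.allowed_uids is None and self.allowed_gids is None:
            return True
        if self.allowed_uids is not None and uid in self.allowed_uids:
            return True
        if self.allowed_gids is not None and gids & self.allowed_gids:
            return True
        return False

    def restrict(self, has_cap_sys_admin: bool,
                 uids: Iterable[int] = (), gids: Iterable[int] = ()) -> None:
        # Only a task holding CAP_SYS_ADMIN in this namespace may change it.
        if not has_cap_sys_admin:
            raise PermissionError("CAP_SYS_ADMIN required")
        self.allowed_uids = set(uids)
        self.allowed_gids = set(gids)
```

With this model, an administrator could call restrict(True, uids=[1000])
and afterwards only uid 1000 would pass may_unshare().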

A fancier approach would be to have an fd that represents the right to
unshare your userns. Some privilege broker could give out those fds
to apps that need them and meet whatever criteria are set. If you try
to unshare your userns without the fd, it falls back to some simpler
policy.

I think I prefer the simpler one. It's simple, and I haven't come up
with a concrete problem with it yet.

Thoughts?

^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CALCETrU4+zTKABz1foEA=an3XYbe_UXxn_w9=1GjVzMe5DXXPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: Thoughts on tightening up user namespace creation
[not found] ` <CALCETrU4+zTKABz1foEA=an3XYbe_UXxn_w9=1GjVzMe5DXXPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-03-08 6:06 ` Serge E. Hallyn
[not found] ` <20160308060657.GA3565-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2016-03-08 10:05 ` Alexander Larsson
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Serge E. Hallyn @ 2016-03-08 6:06 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Kees Cook, Colin Walters, Linux Containers, Serge Hallyn,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Seth Forshee,
Stephane Graber, Eric W. Biederman

On Mon, Mar 07, 2016 at 09:15:25PM -0800, Andy Lutomirski wrote:
> Hi all-
>
> There are several users and distros that are nervous about user
> namespaces from an attack surface point of view.
>
> - RHEL and Arch have userns disabled.
>
> - Ubuntu requires CAP_SYS_ADMIN

No, it does not. It has temporarily re-added a sysctl which can enable
that behavior, but it's not set by default. The reason for providing it
is not a distrust of user namespaces in general, but because we're enabling
some bleeding edge patches which haven't been accepted upstream yet. Once
they're accepted upstream I expect that patch to be dropped again, unless
it has gone upstream.

Debian does afaik still have a version of a patch I'd originally written
before user namespaces were upstream which defaulted unprivileged userns
cloning to off. Did you mean Debian here?

> - Kees periodically proposes to upstream some sysctl to control
> userns creation.
>
> I think there are three main types of concerns. First, there might be
> some as-yet-unknown semantic issues that would allow privilege
> escalation by users who create user namespaces and then confuse
> something else in the system. Second, enabling user namespaces
> exposes a lot of attack surface to unprivileged users. Third,
> allowing tasks to create user namespaces exposes the kernel to various
> resource exhaustion attacks that wouldn't be possible otherwise.
>
> Since I doubt we'll ever fully address the attack surface issue at
> least, would it make sense to try to come up with an upstreamable way
> to limit who can create new user namespaces and/or do various
> dangerous things with them?
>
> I'll divide the rest of the email into the "what" and the "who".
>
> +++ What does the privilege of creating a user namespace entail? +++
>
> This could be an all-or-nothing thing. It would certainly be possible
> for appropriately privileged tasks to be able to unshare namespaces
> and use their facilities exactly like any task can in a current
> user-ns-enabled kernel and for other tasks to be unable to unshare
> anything.
>
> Finer gradations are, in principle, possible. For example, it could
> be possible for a given task to unshare its userns but to have limited
> caps inside or to be unable to unshare certain other namespaces. For
> example, maybe a task could unshare userns and mount ns but not net
> ns. I don't think this would be particularly useful.
>
> It might be more interesting to allow a task to unshare all
> namespaces, hold all capabilities in them, but to still be unable to
> use certain privileged facilities. For example, maybe denying
> administrative control over iptables, creation of exotic network
> interface types, or similar would make sense. I don't know how we'd
> specify this type of constraint.
>
> +++ Who can create user namespaces (possibly with restrictions)? +++
>
> I can think of a few formulations.
>
> A simpler approach would be to add a per-namespace setting listing
> users and/or groups that can unshare their userns. A userns starts
> out allowing everyone to unshare userns, and anyone with CAP_SYS_ADMIN
> can change the setting.
>
> A fancier approach would be to have an fd that represents the right to
> unshare your userns. Some privilege broker could give out those fds
> to apps that need them and meet whatever criteria are set. If you try
> to unshare your userns without the fd, it falls back to some simpler
> policy.
>
> I think I prefer the simpler one. It's simple, and I haven't come up
> with a concrete problem with it yet.
>
> Thoughts?

^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <20160308060657.GA3565-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>]
* Re: Thoughts on tightening up user namespace creation
[not found] ` <20160308060657.GA3565-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2016-03-08 18:31 ` Andy Lutomirski
[not found] ` <CALCETrUx_O7Uiyxjs8H++bAR34dSvWny+HsVzXasCVE9wHFGFA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2016-03-08 18:31 UTC (permalink / raw)
To: Serge Hallyn
Cc: Kees Cook, Colin Walters, Linux Containers,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Seth Forshee,
Eric W. Biederman, Stephane Graber

On Mar 7, 2016 10:06 PM, "Serge E. Hallyn" <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> wrote:
>
> On Mon, Mar 07, 2016 at 09:15:25PM -0800, Andy Lutomirski wrote:
> > - Ubuntu requires CAP_SYS_ADMIN
>
> No, it does not. It has temporarily re-added a sysctl which can enable
> that behavior, but it's not set by default. The reason for providing it
> is not a distrust of user namespaces in general, but because we're enabling
> some bleeding edge patches which haven't been accepted upstream yet. Once
> they're accepted upstream I expect that patch to be dropped again, unless
> it has gone upstream.
>
> Debian does afaik still have a version of a patch I'd originally written
> before user namespaces were upstream which defaulted unprivileged userns
> cloning to off. Did you mean Debian here?

I meant Ubuntu 14.04, which I tested, possibly poorly.

^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CALCETrUx_O7Uiyxjs8H++bAR34dSvWny+HsVzXasCVE9wHFGFA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: Thoughts on tightening up user namespace creation
[not found] ` <CALCETrUx_O7Uiyxjs8H++bAR34dSvWny+HsVzXasCVE9wHFGFA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-03-08 22:41 ` Serge E. Hallyn
0 siblings, 0 replies; 13+ messages in thread
From: Serge E. Hallyn @ 2016-03-08 22:41 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Kees Cook, Colin Walters, Linux Containers, Serge Hallyn,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Seth Forshee,
Eric W. Biederman, Stephane Graber

Quoting Andy Lutomirski (luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org):
> On Mar 7, 2016 10:06 PM, "Serge E. Hallyn" <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> wrote:
> >
> > On Mon, Mar 07, 2016 at 09:15:25PM -0800, Andy Lutomirski wrote:
> > > - Ubuntu requires CAP_SYS_ADMIN
> >
> > No, it does not. It has temporarily re-added a sysctl which can enable
> > that behavior, but it's not set by default. The reason for providing it
> > is not a distrust of user namespaces in general, but because we're enabling
> > some bleeding edge patches which haven't been accepted upstream yet. Once
> > they're accepted upstream I expect that patch to be dropped again, unless
> > it has gone upstream.
> >
> > Debian does afaik still have a version of a patch I'd originally written
> > before user namespaces were upstream which defaulted unprivileged userns
> > cloning to off. Did you mean Debian here?
>
> I meant Ubuntu 14.04, which I tested, possibly poorly.

Weird, 14.04 with the default kernel (3.13.0-79-generic #123-Ubuntu)
doesn't have the sysctl at all.

-serge

^ permalink raw reply [flat|nested] 13+ messages in thread
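
Whether a given kernel carries the distro sysctl discussed above can be
checked from userspace. The sketch below assumes the knob is exposed as
kernel.unprivileged_userns_clone (the name used by the Debian-derived
patch, as far as I know; the thread itself does not name it) and that
1 means unprivileged creation is allowed. The helper name is made up
for illustration.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the distro-specific knob. Kernels without the
# patch (including the 14.04 kernel Serge mentions) have no such file.
SYSCTL_PATH = "/proc/sys/kernel/unprivileged_userns_clone"


def unprivileged_userns_setting(path: str = SYSCTL_PATH) -> Optional[int]:
    """Return the sysctl value, or None if this kernel has no such knob."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    value = unprivileged_userns_setting()
    if value is None:
        print("sysctl not present on this kernel")
    else:
        print(f"unprivileged_userns_clone = {value}")
```

A missing file only says the patch is absent; it says nothing about
whether CONFIG_USER_NS itself is enabled.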
* Re: Thoughts on tightening up user namespace creation
[not found] ` <CALCETrU4+zTKABz1foEA=an3XYbe_UXxn_w9=1GjVzMe5DXXPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-08 6:06 ` Serge E. Hallyn
@ 2016-03-08 10:05 ` Alexander Larsson
2016-03-08 16:31 ` Eric W. Biederman
2016-03-09 18:14 ` Kees Cook
3 siblings, 0 replies; 13+ messages in thread
From: Alexander Larsson @ 2016-03-08 10:05 UTC (permalink / raw)
To: Andy Lutomirski, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Eric W. Biederman, Linux Containers, Colin Walters, Serge Hallyn,
Stephane Graber, Kees Cook, Seth Forshee

On Mon, 2016-03-07 at 21:15 -0800, Andy Lutomirski wrote:
> Hi all-
>
> I think there are three main types of concerns. First, there might be
> some as-yet-unknown semantic issues that would allow privilege
> escalation by users who create user namespaces and then confuse
> something else in the system. Second, enabling user namespaces
> exposes a lot of attack surface to unprivileged users. Third,
> allowing tasks to create user namespaces exposes the kernel to various
> resource exhaustion attacks that wouldn't be possible otherwise.

In my work on xdg-app I've seen some issues that I'd ideally like to
see a solution to. They are not necessarily security vulnerabilities,
but still problems:

devpts is only mountable in a user namespace if the root user is
mapped. Possible to work around, but ugly.

There is no way to recursively apply mount flags. For example, I often
want to recursively bind mount some directory from the host but with
MS_RDONLY|MS_NODEV. I cannot apply the flags in the MS_BIND|MS_REC
mount, so instead I have to first bind mount and then remount. However,
the remount is not recursive, so I have to manually parse
/proc/self/mountinfo and figure out all the submounts that were added.
Also, I have to manually avoid trying to remount covered mounts,
because I can't reach those, and for each remount I have to parse out
its current flags so I don't accidentally unset some set flag, causing
EPERM.

Mount flags are not applied on propagated mounts. Even if I do all the
stuff above, if I get a *new* mount propagated into my namespace, or if
a parent unmount is propagated uncovering a mount in my namespace, then
this new mountpoint is not read-only. This has no workaround that I'm
currently aware of.

Abstract unix domain sockets are tied to the network namespace. I
understand where this comes from, socket syscalls are "networkish".
However, the non-abstract unix domain sockets are under the control of
the filesystem namespace, and I can fully control them when setting up
the sandbox. But, as long as the sandbox shares the network namespace
with the host (which is likely for desktop apps) it will have full
access to all services listening on abstract sockets on the host. This
is particularly problematic because 1) abstract sockets have no file
permissions, so any X server running on the host is wide open, 2)
whether a connect call uses abstract sockets is not detectable via
seccomp, so we can't filter it in any other way. I don't know how
severe this is, as it depends on how trustworthy the individual
services are, but at least on my system "grep @ /proc/net/unix" lists
session dbus instances, the X server, and some iSCSI thing.

/proc (even the limited pid namespace one) contains a lot of old cruft
that at a minimum leaks hardware info to the sandbox, and could
potentially do worse (/proc/sysrq-trigger anyone?). I'd like to be able
to mount a "clean" /proc that has only the process-related stuff.

> +++ What does the privilege of creating a user namespace entail? +++
>
> It might be more interesting to allow a task to unshare all
> namespaces, hold all capabilities in them, but to still be unable to
> use certain privileged facilities. For example, maybe denying
> administrative control over iptables, creation of exotic network
> interface types, or similar would make sense.
> I don't know how we'd specify this type of constraint.

I think this particular issue is the main problem here. Unless we add
some very coarse bit-flags that specify the constraints it is going to
be a very complex API to set up such constraints. Adding coarse
bit-flags essentially means adding new capabilities (maybe subsetting
existing ones). Given how hard it is to understand how all the current
capabilities interact and how they can be exploited I'm not sure this
is a great idea.

Maybe we can use the LSM framework to model the constraints? For
instance, the user could be allowed to create user namespaces, but the
processes in it automatically get some selinux context applied. Then
that selinux context could be configured to limit access to certain
operations.

> +++ Who can create user namespaces (possibly with restrictions)? +++
>
> I can think of a few formulations.
>
> A simpler approach would be to add a per-namespace setting listing
> users and/or groups that can unshare their userns. A userns starts
> out allowing everyone to unshare userns, and anyone with
> CAP_SYS_ADMIN can change the setting.

This sounds like a cgroup controller to me. It makes sense for my
usecase (i.e. sandboxed desktop apps). You want to give all processes
in the user's login session access to user namespaces, but not
necessarily to e.g. a service or background process or a cron job
running as that user.

> A fancier approach would be to have an fd that represents the right
> to unshare your userns. Some privilege broker could give out those fds
> to apps that need them and meet whatever criteria are set. If you
> try to unshare your userns without the fd, it falls back to some
> simpler policy.

In practice though, how would the privilege broker know and apply the
criteria? It hasn't even got the information the kernel has (such as
race-free access to the peer cgroup).

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson                                            Red Hat, Inc
alexl@redhat.com            alexander.larsson@gmail.com
He's an ungodly devious paramedic on his last day in the job. She's a
sharp-shooting cigar-chomping archaeologist married to the Mob. They
fight crime!

^ permalink raw reply [flat|nested] 13+ messages in thread
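
The mountinfo step of the remount workaround Alexander describes
(finding every submount under a freshly bind-mounted tree, together
with its current per-mount options) might look roughly like the sketch
below. It assumes the field layout documented in proc(5), where the
fifth field is the mount point and the sixth the per-mount options. The
function names are made up, and the sketch does not try to detect
covered mounts, which a real tool would also have to skip.

```python
from typing import List, Tuple


def _unescape(field: str) -> str:
    # mountinfo writes space, tab, newline and backslash as octal escapes.
    # The backslash escape is decoded last so it cannot create new escapes.
    for octal, char in (("\\040", " "), ("\\011", "\t"),
                        ("\\012", "\n"), ("\\134", "\\")):
        field = field.replace(octal, char)
    return field


def submounts(mountinfo: str, top: str) -> List[Tuple[str, List[str]]]:
    """Return (mount point, per-mount options) for `top` and all mounts below it."""
    top = top.rstrip("/") or "/"
    prefix = top if top == "/" else top + "/"
    found = []
    for line in mountinfo.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # not a well-formed mountinfo line
        mount_point = _unescape(fields[4])
        if mount_point == top or mount_point.startswith(prefix):
            found.append((mount_point, fields[5].split(",")))
    return found


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        for point, options in submounts(f.read(), "/"):
            print(point, options)
```

Each returned option list is what a caller would have to carry over
(nosuid, nodev, noexec and so on) when remounting that entry read-only,
so that no already-set flag is dropped by accident.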
* Re: Thoughts on tightening up user namespace creation
[not found] ` <CALCETrU4+zTKABz1foEA=an3XYbe_UXxn_w9=1GjVzMe5DXXPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-08 6:06 ` Serge E. Hallyn
2016-03-08 10:05 ` Alexander Larsson
@ 2016-03-08 16:31 ` Eric W. Biederman
2016-03-09 18:14 ` Kees Cook
3 siblings, 0 replies; 13+ messages in thread
From: Eric W. Biederman @ 2016-03-08 16:31 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Kees Cook, Colin Walters, Linux Containers, Serge Hallyn,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Seth Forshee,
Stephane Graber

Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> writes:

> Hi all-

[Snip strange things distros do]

Distros do strange things from other people's perspectives. Sometimes
we can help with that, sometimes we can't.

In general producing kernel code that is reliable and well maintained
is what we can do. Distro folks can decide what they are comfortable
with beyond that.

Frankly I find it heartening that not all distros enable everything all
of the time, and are showing some modicum of restraint and judgement.
If folks don't think a feature like user namespaces is ready and they
don't need that feature I am quite happy for them not to enable that
feature in their kernel.

> Since I doubt we'll ever fully address the attack surface issue at
> least, would it make sense to try to come up with an upstreamable way
> to limit who can create new user namespaces and/or do various
> dangerous things with them?

Even without user namespaces the kernel has attack surface issues. The
kernel is big and bugs happen. That surface is only bigger when you
are root in a user namespace so the probability of finding an
exploitable bug goes up.

> I'll divide the rest of the email into the "what" and the "who".
>
> +++ What does the privilege of creating a user namespace entail? +++
>
> This could be an all-or-nothing thing. It would certainly be possible
> for appropriately privileged tasks to be able to unshare namespaces
> and use their facilities exactly like any task can in a current
> user-ns-enabled kernel and for other tasks to be unable to unshare
> anything.
>
> Finer gradations are, in principle, possible. For example, it could
> be possible for a given task to unshare its userns but to have limited
> caps inside or to be unable to unshare certain other namespaces. For
> example, maybe a task could unshare userns and mount ns but not net
> ns. I don't think this would be particularly useful.

I am actually inclined to think just the opposite. There was a period
where we would have been much less susceptible to problems if just
unprivileged creation of the mount namespace could have been
implemented.

When I look at this from a resource consumption point of view I
definitely see arguments for limiting things by resource type. It can
be very easy to know that I need no more than X of some specific
resource type without knowing how much memory that will take.

> It might be more interesting to allow a task to unshare all
> namespaces, hold all capabilities in them, but to still be unable to
> use certain privileged facilities. For example, maybe denying
> administrative control over iptables, creation of exotic network
> interface types, or similar would make sense. I don't know how we'd
> specify this type of constraint.

That does seem to start approaching lsm territory. And there is a funny
balance between reducing attack surface and adding attack surface to
reduce attack surface.

> +++ Who can create user namespaces (possibly with restrictions)? +++
>
> I can think of a few formulations.
>
> A simpler approach would be to add a per-namespace setting listing
> users and/or groups that can unshare their userns. A userns starts
> out allowing everyone to unshare userns, and anyone with CAP_SYS_ADMIN
> can change the setting.
>
> A fancier approach would be to have an fd that represents the right to
> unshare your userns. Some privilege broker could give out those fds
> to apps that need them and meet whatever criteria are set. If you try
> to unshare your userns without the fd, it falls back to some simpler
> policy.
>
> I think I prefer the simpler one. It's simple, and I haven't come up
> with a concrete problem with it yet.

Agreed. Your simple scheme is roughly what I was proposing earlier: a
per-user limit on the number of user namespaces a user can create.

I am a little partial to having it be a resource limit as that covers
more use cases with less code.

That said the really important case to cover is the case where some
subset of applications are denied access to resources (for sandboxing)
and another subset is allowed.

Eric

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Thoughts on tightening up user namespace creation
[not found] ` <CALCETrU4+zTKABz1foEA=an3XYbe_UXxn_w9=1GjVzMe5DXXPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
` (2 preceding siblings ...)
2016-03-08 16:31 ` Eric W. Biederman
@ 2016-03-09 18:14 ` Kees Cook
[not found] ` <CAGXu5jLB5==RAs9YrsPi4m6ZBPn3UtbCzagu_+gr-rtSgKzB1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
[not found] ` <1457549467.650797.544465346.49653120@webmail.messagingengine.com>
3 siblings, 2 replies; 13+ messages in thread
From: Kees Cook @ 2016-03-09 18:14 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Colin Walters, Linux Containers, Serge Hallyn,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Seth Forshee,
Stephane Graber, Eric W. Biederman

On Mon, Mar 7, 2016 at 9:15 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> Hi all-
>
> There are several users and distros that are nervous about user
> namespaces from an attack surface point of view.
>
> - RHEL and Arch have userns disabled.
>
> - Ubuntu requires CAP_SYS_ADMIN
>
> - Kees periodically proposes to upstream some sysctl to control
> userns creation.

And here's another ring0 escalation flaw, made available to
unprivileged users because of userns:

https://code.google.com/p/google-security-research/issues/detail?id=758

> I think there are three main types of concerns. First, there might be
> some as-yet-unknown semantic issues that would allow privilege
> escalation by users who create user namespaces and then confuse
> something else in the system. Second, enabling user namespaces
> exposes a lot of attack surface to unprivileged users. Third,
> allowing tasks to create user namespaces exposes the kernel to various
> resource exhaustion attacks that wouldn't be possible otherwise.
>
> Since I doubt we'll ever fully address the attack surface issue at
> least, would it make sense to try to come up with an upstreamable way
> to limit who can create new user namespaces and/or do various
> dangerous things with them?

The change in attack surface is _substantial_. We must have a way to
globally disable userns.

-Kees

--
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CAGXu5jLB5==RAs9YrsPi4m6ZBPn3UtbCzagu_+gr-rtSgKzB1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: Thoughts on tightening up user namespace creation
[not found] ` <CAGXu5jLB5==RAs9YrsPi4m6ZBPn3UtbCzagu_+gr-rtSgKzB1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-03-09 18:51 ` Colin Walters
2016-03-09 19:07 ` Serge E. Hallyn
1 sibling, 0 replies; 13+ messages in thread
From: Colin Walters @ 2016-03-09 18:51 UTC (permalink / raw)
To: Kees Cook, Andy Lutomirski
Cc: Linux Containers, Serge Hallyn,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Seth Forshee, Stephane Graber,
Eric W. Biederman

On Wed, Mar 9, 2016, at 01:14 PM, Kees Cook wrote:
> On Mon, Mar 7, 2016 at 9:15 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> > Hi all-
> >
> > There are several users and distros that are nervous about user
> > namespaces from an attack surface point of view.
> >
> > - RHEL and Arch have userns disabled.
> >
> > - Ubuntu requires CAP_SYS_ADMIN
> >
> > - Kees periodically proposes to upstream some sysctl to control
> > userns creation.
>
> And here's another ring0 escalation flaw, made available to
> unprivileged users because of userns:
>
> https://code.google.com/p/google-security-research/issues/detail?id=758

Looks like Andy won't have to eat his hat ;)

> The change in attack surface is _substantial_. We must have a way to
> globally disable userns.

No one would object if it was enabled but only accessible to
CAP_SYS_ADMIN though, right? This could be useful for
writing setuid binaries that expose some of the features, but e.g. not
CAP_NET_ADMIN.

Andy's suggestion of having this be a per-namespace setting makes
sense to me. Currently some container tools that do use userns
are by default denying it to be recursive (Sandstorm.io and Docker 1.10 at least)
by using a seccomp filter on clone(). If we had this setting that
filter wouldn't be necessary, and would solve the issue that seccomp filters
aren't robust against the kernel adding new API, e.g. a new CLONE_NEWUSER_NONEWPRIVS
which might enable chroot() but not CAP_NET_ADMIN.
^ permalink raw reply [flat|nested] 13+ messages in thread
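
Colin's point about flag-based seccomp filters can be modelled in a few
lines. The sketch below is not a real seccomp program; it only mimics
the decision such a filter makes on the flags argument. CLONE_NEWUSER
and CLONE_NEWNS carry their values from <linux/sched.h>. Covering
unshare() as well as clone() is an assumption of the sketch, and
HYPOTHETICAL_NEW_FLAG stands in for the imaginary
CLONE_NEWUSER_NONEWPRIVS, which does not exist.

```python
CLONE_NEWNS = 0x00020000    # from <linux/sched.h>
CLONE_NEWUSER = 0x10000000  # from <linux/sched.h>

# Made-up bit for an API the kernel might add later. No such flag exists.
HYPOTHETICAL_NEW_FLAG = 1 << 40


def filter_allows(syscall: str, flags: int) -> bool:
    """Model of a filter that denies nested user namespace creation."""
    if syscall in ("clone", "unshare") and flags & CLONE_NEWUSER:
        return False
    return True


if __name__ == "__main__":
    print(filter_allows("clone", CLONE_NEWUSER | CLONE_NEWNS))  # denied
    print(filter_allows("clone", CLONE_NEWNS))                  # allowed
    # A flag the filter author never heard of is let through unchanged.
    print(filter_allows("clone", HYPOTHETICAL_NEW_FLAG))
```

The last call is the weakness in question: the filter only knows the
bits that existed when it was written, whereas a per-namespace setting
enforced by the kernel would cover any new entry point as well.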
* Re: Thoughts on tightening up user namespace creation
[not found] ` <CAGXu5jLB5==RAs9YrsPi4m6ZBPn3UtbCzagu_+gr-rtSgKzB1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-09 18:51 ` Colin Walters
@ 2016-03-09 19:07 ` Serge E. Hallyn
[not found] ` <20160309190725.GA2218-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
1 sibling, 1 reply; 13+ messages in thread
From: Serge E. Hallyn @ 2016-03-09 19:07 UTC (permalink / raw)
To: Kees Cook
Cc: Colin Walters, Linux Containers, Serge Hallyn,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Andy Lutomirski,
Seth Forshee, Eric W. Biederman, Stephane Graber

Quoting Kees Cook (keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org):
> On Mon, Mar 7, 2016 at 9:15 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> > Hi all-
> >
> > There are several users and distros that are nervous about user
> > namespaces from an attack surface point of view.
> >
> > - RHEL and Arch have userns disabled.
> >
> > - Ubuntu requires CAP_SYS_ADMIN
> >
> > - Kees periodically proposes to upstream some sysctl to control
> > userns creation.
>
> And here's another ring0 escalation flaw, made available to
> unprivileged users because of userns:
>
> https://code.google.com/p/google-security-research/issues/detail?id=758

Kees, I think you think this makes your point, but all it does is make
me want to argue with you and start flinging back cves against kvm,
af_unix, sctp, etc.

> > I think there are three main types of concerns. First, there might be
> > some as-yet-unknown semantic issues that would allow privilege
> > escalation by users who create user namespaces and then confuse
> > something else in the system. Second, enabling user namespaces
> > exposes a lot of attack surface to unprivileged users. Third,
> > allowing tasks to create user namespaces exposes the kernel to various
> > resource exhaustion attacks that wouldn't be possible otherwise.
> >
> > Since I doubt we'll ever fully address the attack surface issue at
> > least, would it make sense to try to come up with an upstreamable way
> > to limit who can create new user namespaces and/or do various
> > dangerous things with them?
>
> The change in attack surface is _substantial_. We must have a way to
> globally disable userns.

I'm confused. Didn't we agree a few months ago, somewhat reluctantly,
on a sysctl?

^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <20160309190725.GA2218-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>]
* Re: Thoughts on tightening up user namespace creation
[not found] ` <20160309190725.GA2218-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2016-03-09 19:12 ` Kees Cook
0 siblings, 0 replies; 13+ messages in thread
From: Kees Cook @ 2016-03-09 19:12 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Colin Walters, Linux Containers, Serge Hallyn,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Andy Lutomirski,
Seth Forshee, Eric W. Biederman, Stephane Graber

On Wed, Mar 9, 2016 at 11:07 AM, Serge E. Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote:
> Quoting Kees Cook (keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org):
>> On Mon, Mar 7, 2016 at 9:15 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>> > Hi all-
>> >
>> > There are several users and distros that are nervous about user
>> > namespaces from an attack surface point of view.
>> >
>> > - RHEL and Arch have userns disabled.
>> >
>> > - Ubuntu requires CAP_SYS_ADMIN
>> >
>> > - Kees periodically proposes to upstream some sysctl to control
>> > userns creation.
>>
>> And here's another ring0 escalation flaw, made available to
>> unprivileged users because of userns:
>>
>> https://code.google.com/p/google-security-research/issues/detail?id=758
>
> Kees, I think you think this makes your point, but all it does is make
> me want to argue with you and start flinging back cves against kvm,
> af_unix, sctp, etc.

I can run a distro kernel without kvm and sctp, because I can leave
their modules unloaded. There is no such option for userns. The last
af_unix CVEs I see were 2 from 2013, and before that, 2010. There's no
comparison here on frequency.

>> > I think there are three main types of concerns. First, there might be
>> > some as-yet-unknown semantic issues that would allow privilege
>> > escalation by users who create user namespaces and then confuse
>> > something else in the system. Second, enabling user namespaces
>> > exposes a lot of attack surface to unprivileged users. Third,
>> > allowing tasks to create user namespaces exposes the kernel to various
>> > resource exhaustion attacks that wouldn't be possible otherwise.
>> >
>> > Since I doubt we'll ever fully address the attack surface issue at
>> > least, would it make sense to try to come up with an upstreamable way
>> > to limit who can create new user namespaces and/or do various
>> > dangerous things with them?
>>
>> The change in attack surface is _substantial_. We must have a way to
>> globally disable userns.
>
> I'm confused. Didn't we agree a few months ago, somewhat reluctantly,
> on a sysctl?

No, Eric refused it and wanted finer-grained controls.

-Kees

--
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <1457549467.650797.544465346.49653120@webmail.messagingengine.com>]
[parent not found: <1457549467.650797.544465346.49653120-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>]
* Re: Thoughts on tightening up user namespace creation
[not found] ` <1457549467.650797.544465346.49653120-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
@ 2016-03-09 19:04 ` Austin S. Hemmelgarn
2016-03-09 19:21 ` Serge E. Hallyn
1 sibling, 0 replies; 13+ messages in thread
From: Austin S. Hemmelgarn @ 2016-03-09 19:04 UTC (permalink / raw)
To: Colin Walters, Kees Cook, Andy Lutomirski
Cc: Linux Containers, Serge Hallyn, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
Seth Forshee, Stephane Graber, Eric W. Biederman

On 2016-03-09 13:51, Colin Walters wrote:
> On Wed, Mar 9, 2016, at 01:14 PM, Kees Cook wrote:
>> On Mon, Mar 7, 2016 at 9:15 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>>> Hi all-
>>>
>>> There are several users and distros that are nervous about user
>>> namespaces from an attack surface point of view.
>>>
>>> - RHEL and Arch have userns disabled.
>>>
>>> - Ubuntu requires CAP_SYS_ADMIN
>>>
>>> - Kees periodically proposes to upstream some sysctl to control
>>> userns creation.
>>
>> And here's another ring0 escalation flaw, made available to
>> unprivileged users because of userns:
>>
>> https://code.google.com/p/google-security-research/issues/detail?id=758
>
> Looks like Andy won't have to eat his hat ;)
>
>> The change in attack surface is _substantial_. We must have a way to
>> globally disable userns.
>
> No one would object if it was enabled but only accessible to
> CAP_SYS_ADMIN though, right? This could be useful for
> writing setuid binaries that expose some of the features, but e.g. not
> CAP_NET_ADMIN.

At least Google Chrome (and probably Chromium) is using user namespaces
without CAP_SYS_ADMIN (although AFAIUI, it's because they can't use the
other namespace types effectively as a regular user).

>
> Andy's suggestion of having this be a per-namespace setting makes
> sense to me. Currently some container tools that do use userns
> are by default denying it to be recursive (Sandstorm.io and Docker 1.10 at least)
> by using a seccomp filter on clone(). If we had this setting that
> filter wouldn't be necessary, and would solve the issue that seccomp filters
> aren't robust against the kernel adding new API, e.g. a new CLONE_NEWUSER_NONEWPRIVS
> which might enable chroot() but not CAP_NET_ADMIN.
>

Personally, I like the suggestion from Alexander Larsson to make a
cgroup controller. Container tools obviously want some degree of
hierarchical control (even if it's just saying that the hierarchy ends
here), and it would simplify the possibility of running more than one
container stack on the same host (I know at least a couple people who
would love to be able to safely use Docker on the same host as LXC or
lmctfy).

^ permalink raw reply [flat|nested] 13+ messages in thread
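Colin's point about seccomp filters not being robust against new kernel API can be shown with a toy model of the check such a filter performs on the clone() flags argument. This is only a sketch of the decision logic, not the filter Sandstorm.io or Docker actually ships. The `CLONE_*` values are the ones in `<linux/sched.h>`; the bit chosen for `CLONE_NEWUSER_NONEWPRIVS` is invented here, since that flag is purely hypothetical, and `denylist_allows` is a made-up name.

```python
# Flag values as defined in <linux/sched.h>.
CLONE_NEWUSER = 0x10000000
CLONE_NEWNET = 0x40000000

# Hypothetical: this flag does not exist. The bit value is invented purely to
# stand in for "some future way of getting a user namespace".
CLONE_NEWUSER_NONEWPRIVS = 1 << 40


def denylist_allows(clone_flags: int) -> bool:
    """Model a filter that rejects clone() only when CLONE_NEWUSER is set."""
    return (clone_flags & CLONE_NEWUSER) == 0


# The filter blocks what it was written for...
print(denylist_allows(CLONE_NEWUSER | CLONE_NEWNET))  # False
# ...but a flag added after the filter was written passes straight through.
print(denylist_allows(CLONE_NEWUSER_NONEWPRIVS))      # True
```

A filter that enumerates forbidden bits has to be updated every time the kernel grows a new entry point, which is the argument for having the kernel enforce the policy itself.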
* Re: Thoughts on tightening up user namespace creation
[not found] ` <1457549467.650797.544465346.49653120-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
2016-03-09 19:04 ` Austin S. Hemmelgarn
@ 2016-03-09 19:21 ` Serge E. Hallyn
[not found] ` <20160309192103.GA2523-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
1 sibling, 1 reply; 13+ messages in thread
From: Serge E. Hallyn @ 2016-03-09 19:21 UTC (permalink / raw)
To: Colin Walters
Cc: Kees Cook, Linux Containers, Serge Hallyn,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andy Lutomirski, Seth Forshee,
Eric W. Biederman, Stephane Graber

Quoting Colin Walters (walters-gPq2gbYjIk8dnm+yROfE0A@public.gmane.org):
> On Wed, Mar 9, 2016, at 01:14 PM, Kees Cook wrote:
> > On Mon, Mar 7, 2016 at 9:15 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> > > Hi all-
> > >
> > > There are several users and distros that are nervous about user
> > > namespaces from an attack surface point of view.
> > >
> > > - RHEL and Arch have userns disabled.
> > >
> > > - Ubuntu requires CAP_SYS_ADMIN
> > >
> > > - Kees periodically proposes to upstream some sysctl to control
> > > userns creation.
> >
> > And here's another ring0 escalation flaw, made available to
> > unprivileged users because of userns:
> >
> > https://code.google.com/p/google-security-research/issues/detail?id=758
>
> Looks like Andy won't have to eat his hat ;)
>
> > The change in attack surface is _substantial_. We must have a way to
> > globally disable userns.
>
> No one would object if it was enabled but only accessible to
> CAP_SYS_ADMIN though, right? This could be useful for

I think that would be terrible. I'd have to expose all of CAP_SYS_ADMIN
to allow use of CLONE_NEWUSER. I'd be more interested in a new CAP_NEWUSER
capability. Then systems wanting to support unprivileged users doing user
namespaces could set a pam module giving certain users that cap in pI, and
set it on fI on their container managers. Userspace has to give access to
mapped uids through /etc/subuid too, so it's not *so* huge added hurdle.
Well that's not quite true - with empty subuid, users can create a userns
with no mapped userids which in itself is useful for sandboxing.

The biggest problem with a CAP_NEWUSER would be that it's more inherently
permanent than a new sysctl. The increase in attack surface is real, but
over time I'd like to think that we will have dealt with it and should be
able to make CLONE_NEWUSER unprivileged. Because what we have is an
implementation issue (not in user namespaces), not a design issue.

And I do agree the issue is real.

-serge

^ permalink raw reply [flat|nested] 13+ messages in thread
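The pI/fI arrangement described above relies on how capabilities are recomputed across execve(). The sketch below applies the permitted-set rule from capabilities(7), P'(permitted) = (P(inheritable) & F(inheritable)) | (F(permitted) & P(bounding)), and deliberately ignores ambient capabilities. `CAP_NEWUSER` is the capability proposed in this thread, not an existing one, and the function name and the two-entry bounding set are illustrative assumptions.

```python
def permitted_after_exec(p_inheritable, f_inheritable, f_permitted, p_bounding):
    """P'(permitted) per capabilities(7), ignoring ambient capabilities."""
    return (p_inheritable & f_inheritable) | (f_permitted & p_bounding)


# Hypothetical capability: proposed in the thread, not present in any kernel.
CAP_NEWUSER = "CAP_NEWUSER"
# Simplified stand-in for the full bounding set.
BOUNDING = frozenset({CAP_NEWUSER, "CAP_SYS_ADMIN"})

# A user whose login session was given the cap in pI (e.g. by a pam module)
# runs a container manager that carries the cap in fI: the cap is gained.
granted_user = permitted_after_exec(
    p_inheritable=frozenset({CAP_NEWUSER}),
    f_inheritable=frozenset({CAP_NEWUSER}),
    f_permitted=frozenset(),
    p_bounding=BOUNDING,
)

# A user without the cap in pI runs the same binary and gains nothing.
other_user = permitted_after_exec(
    p_inheritable=frozenset(),
    f_inheritable=frozenset({CAP_NEWUSER}),
    f_permitted=frozenset(),
    p_bounding=BOUNDING,
)

print(sorted(granted_user))  # ['CAP_NEWUSER']
print(sorted(other_user))    # []
```

Because both the process and the file have to opt in, neither the pam configuration nor the file capability alone hands out the privilege, and nothing here requires granting CAP_SYS_ADMIN.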
[parent not found: <20160309192103.GA2523-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>]
* Re: Thoughts on tightening up user namespace creation
[not found] ` <20160309192103.GA2523-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2016-03-09 19:25 ` Kees Cook
0 siblings, 0 replies; 13+ messages in thread
From: Kees Cook @ 2016-03-09 19:25 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Colin Walters, Linux Containers, Serge Hallyn, LKML, Andy Lutomirski,
Seth Forshee, Eric W. Biederman, Stephane Graber

On Wed, Mar 9, 2016 at 11:21 AM, Serge E. Hallyn <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> wrote:
> Quoting Colin Walters (walters-gPq2gbYjIk8dnm+yROfE0A@public.gmane.org):
>> On Wed, Mar 9, 2016, at 01:14 PM, Kees Cook wrote:
>> > On Mon, Mar 7, 2016 at 9:15 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>> > > Hi all-
>> > >
>> > > There are several users and distros that are nervous about user
>> > > namespaces from an attack surface point of view.
>> > >
>> > > - RHEL and Arch have userns disabled.
>> > >
>> > > - Ubuntu requires CAP_SYS_ADMIN
>> > >
>> > > - Kees periodically proposes to upstream some sysctl to control
>> > > userns creation.
>> >
>> > And here's another ring0 escalation flaw, made available to
>> > unprivileged users because of userns:
>> >
>> > https://code.google.com/p/google-security-research/issues/detail?id=758
>>
>> Looks like Andy won't have to eat his hat ;)
>>
>> > The change in attack surface is _substantial_. We must have a way to
>> > globally disable userns.
>>
>> No one would object if it was enabled but only accessible to
>> CAP_SYS_ADMIN though, right? This could be useful for
>
> I think that would be terrible. I'd have to expose all of CAP_SYS_ADMIN
> to allow use of CLONE_NEWUSER. I'd be more interested in a new CAP_NEWUSER
> capability. Then systems wanting to support unprivileged users doing user
> namespaces could set a pam module giving certain users that cap in pI, and
> set it on fI on their container managers. Userspace has to give access to
> mapped uids through /etc/subuid too, so it's not *so* huge added hurdle.
> Well that's not quite true - with empty subuid, users can create a userns
> with no mapped userids which in itself is useful for sandboxing.
>
> The biggest problem with a CAP_NEWUSER would be that it's more inherently
> permanent than a new sysctl. The increase in attack surface is real, but
> over time I'd like to think that we will have dealt with it and should be
> able to make CLONE_NEWUSER unprivileged. Because what we have is an
> implementation issue (not in user namespaces), not a design issue.

Andy suggested a capability back in October. But I agree with you, we
don't want a new capability.

https://lkml.org/lkml/2015/10/17/94

> And I do agree the issue is real.

And I fully expect for the issue to improve over time: it's not that I
don't want userns, I just want to have the _option_ to disable it at
runtime for the systems that don't need it until the newly exposed
interfaces look like they've had the bulk of their issues resolved.

-Kees

--
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2016-03-09 19:25 UTC | newest]
Thread overview: 13+ messages
2016-03-08 5:15 Thoughts on tightening up user namespace creation Andy Lutomirski
[not found] ` <CALCETrU4+zTKABz1foEA=an3XYbe_UXxn_w9=1GjVzMe5DXXPw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-08 6:06 ` Serge E. Hallyn
[not found] ` <20160308060657.GA3565-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2016-03-08 18:31 ` Andy Lutomirski
[not found] ` <CALCETrUx_O7Uiyxjs8H++bAR34dSvWny+HsVzXasCVE9wHFGFA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-08 22:41 ` Serge E. Hallyn
2016-03-08 10:05 ` Alexander Larsson
2016-03-08 16:31 ` Eric W. Biederman
2016-03-09 18:14 ` Kees Cook
[not found] ` <CAGXu5jLB5==RAs9YrsPi4m6ZBPn3UtbCzagu_+gr-rtSgKzB1Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-03-09 18:51 ` Colin Walters
2016-03-09 19:07 ` Serge E. Hallyn
[not found] ` <20160309190725.GA2218-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2016-03-09 19:12 ` Kees Cook
[not found] ` <1457549467.650797.544465346.49653120@webmail.messagingengine.com>
[not found] ` <1457549467.650797.544465346.49653120-2RFepEojUI2N1INw9kWLP6GC3tUn3ZHUQQ4Iyu8u01E@public.gmane.org>
2016-03-09 19:04 ` Austin S. Hemmelgarn
2016-03-09 19:21 ` Serge E. Hallyn
[not found] ` <20160309192103.GA2523-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2016-03-09 19:25 ` Kees Cook