[PATCH] userns/capability: Add user namespace capability

linux-api.vger.kernel.org archive mirror

* [PATCH] userns/capability: Add user namespace capability
@ 2015-10-17 15:58 Tobias Markus
       [not found] ` <5622700C.9090107-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Tobias Markus @ 2015-10-17 15:58 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA

Add capability CAP_SYS_USER_NS.
Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
when calling clone or unshare with CLONE_NEWUSER.

Rationale:

Linux 3.8 saw the introduction of unprivileged user namespaces,
allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
inside a separate user namespace. Before that, any namespace creation
required CAP_SYS_ADMIN (or, in practice, the user had to be root).
Unfortunately, there have been some security-relevant bugs in the
meantime. Because of the fairly complex nature of user namespaces, it is
reasonable to say that future vulnerabilities cannot be excluded. Some
distributions even wholly disable user namespaces because of this.

Both options, user namespaces with and without CAP_SYS_ADMIN, can be
said to represent the extreme end of the spectrum. In practice, there is
no reason for every process to have the ability to create user
namespaces. Indeed, only very few and specialized programs require user
namespaces. This seems to be a perfect fit for the (file) capability
system: Privileged users could manually allow only a certain executable
to be able to create user namespaces by setting a certain capability,
I'd suggest the name CAP_SYS_USER_NS. Executables completely unrelated
to user namespaces should not and cannot create them.

The capability should only be required in the "root" user namespace (the
user namespace with level 0) though, to allow nested user namespaces to
work as intended. If a user namespace has a level greater than 0, the
original process must have had CAP_SYS_USER_NS, so it is "trusted" anyway.

One question remains though: Does this break userspace executables that
expect to be able to create user namespaces without privilege? Since
creating user namespaces without CAP_SYS_ADMIN was not possible before
Linux 3.8, programs should already expect a potential EPERM upon calling
clone. Since creating a user namespace without CAP_SYS_USER_NS would
also cause EPERM, we should be on the safe side.
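
To make the EPERM argument concrete, here is a minimal sketch of how a
program might probe for user namespace support before relying on it. The
function name probe_user_ns() is made up for illustration. The probe runs
in a forked child so the caller's own namespace stays untouched, and it
issues the unshare system call directly instead of going through the libc
wrapper.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/sched.h>   /* CLONE_NEWUSER */
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Try to create a user namespace in a throwaway child process.
 * Returns 0 if creation succeeded, the errno value reported by the
 * kernel if it failed (EPERM when creation is not permitted), or -1 if
 * the probe itself could not be run.
 */
int probe_user_ns(void)
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: report the outcome through the exit status. */
		_exit(syscall(SYS_unshare, CLONE_NEWUSER) == 0 ? 0 : errno);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A caller would treat a return value of EPERM as "no user namespaces for
this process" and carry on without them; any other nonzero value is best
handled the same way.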

Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: Eric W. Biederman <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Cc: Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: Christoph Lameter <cl-vYTEC60ixJUAvxtiuMwx3w@public.gmane.org>
Cc: Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Tobias Markus <tobias-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
---
 include/uapi/linux/capability.h     | 5 ++++-
 kernel/cred.c                       | 7 ++++++-
 security/selinux/include/classmap.h | 2 +-
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 12c37a1..d83540f 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -351,8 +351,11 @@ struct vfs_cap_data {
 
 #define CAP_AUDIT_READ		37
 
+/* Allow creating user namespaces (CLONE_NEWUSER) using clone() and unshare() */
 
-#define CAP_LAST_CAP         CAP_AUDIT_READ
+#define CAP_SYS_USER_NS      38
+
+#define CAP_LAST_CAP         CAP_SYS_USER_NS
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/kernel/cred.c b/kernel/cred.c
index 71179a0..847d499 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -345,7 +345,12 @@ int copy_creds(struct task_struct *p, unsigned long clone_flags)
 		return -ENOMEM;
 
 	if (clone_flags & CLONE_NEWUSER) {
-		ret = create_user_ns(new);
+		if (new->user_ns->level == 0 &&
+		    !has_capability(p, CAP_SYS_USER_NS)) {
+			ret = -EPERM;
+		} else {
+			ret = create_user_ns(new);
+		}
 		if (ret < 0)
 			goto error_put;
 	}
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 5a4eef5..07cec76 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -39,7 +39,7 @@ struct security_class_mapping secclass_map[] = {
 	    "linux_immutable", "net_bind_service", "net_broadcast",
 	    "net_admin", "net_raw", "ipc_lock", "ipc_owner", "sys_module",
 	    "sys_rawio", "sys_chroot", "sys_ptrace", "sys_pacct", "sys_admin",
-	    "sys_boot", "sys_nice", "sys_resource", "sys_time",
+	    "sys_boot", "sys_nice", "sys_resource", "sys_time", "sys_user_ns",
 	    "sys_tty_config", "mknod", "lease", "audit_write",
 	    "audit_control", "setfcap", NULL } },
 	{ "filesystem",
-- 
2.6.1


* Re: [PATCH] userns/capability: Add user namespace capability
       [not found] ` <5622700C.9090107-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
@ 2015-10-17 20:17   ` Richard Weinberger
  2015-10-18 20:13     ` Tobias Markus
  2015-10-17 21:55   ` Serge E. Hallyn
  2015-10-19 14:24   ` Austin S Hemmelgarn
  2 siblings, 1 reply; 21+ messages in thread
From: Richard Weinberger @ 2015-10-17 20:17 UTC (permalink / raw)
  To: Tobias Markus
  Cc: LKML, Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	LSM, open list:ABI/API, linux-man

On Sat, Oct 17, 2015 at 5:58 PM, Tobias Markus <tobias-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org> wrote:
> One question remains though: Does this break userspace executables that
> expect to be able to create user namespaces without privilege? Since
> creating user namespaces without CAP_SYS_ADMIN was not possible before
> Linux 3.8, programs should already expect a potential EPERM upon calling
> clone. Since creating a user namespace without CAP_SYS_USER_NS would
> also cause EPERM, we should be on the safe side.

In case of doubt, yes it will break existing software.
Hiding user namespaces behind CAP_SYS_USER_NS will not magically
make them secure.

-- 
Thanks,
//richard
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH] userns/capability: Add user namespace capability
       [not found] ` <5622700C.9090107-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
  2015-10-17 20:17   ` Richard Weinberger
@ 2015-10-17 21:55   ` Serge E. Hallyn
       [not found]     ` <20151017215501.GA22900-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
  2015-10-19 14:24   ` Austin S Hemmelgarn
  2 siblings, 1 reply; 21+ messages in thread
From: Serge E. Hallyn @ 2015-10-17 21:55 UTC (permalink / raw)
  To: Tobias Markus
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman, Al Viro,
	Serge Hallyn, Andrew Morton, Andy Lutomirski, Christoph Lameter,
	Michael Kerrisk (man-pages),
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA

On Sat, Oct 17, 2015 at 05:58:04PM +0200, Tobias Markus wrote:
> Add capability CAP_SYS_USER_NS.
> Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
> when calling clone or unshare with CLONE_NEWUSER.
> 
> Rationale:
> 
> Linux 3.8 saw the introduction of unprivileged user namespaces,
> allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
> inside a separate user namespace. Before that, any namespace creation
> required CAP_SYS_ADMIN (or, in practice, the user had to be root).
> Unfortunately, there have been some security-relevant bugs in the
> meantime. Because of the fairly complex nature of user namespaces, it is
> reasonable to say that future vulnerabilities cannot be excluded. Some
> distributions even wholly disable user namespaces because of this.

Fwiw I'm not in favor of this.  Debian has a patch (I believe the one
I originally wrote for Ubuntu but which Ubuntu dropped long ago) adding a
sysctl, off by default, for enabling user namespaces.
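
For reference, a sketch of how userspace can detect such a switch. As far
as I know, Debian's patch exposes it as kernel.unprivileged_userns_clone;
treat the path below as an assumption about Debian-derived kernels, since
mainline has no such file. read_sysctl_int() is a helper invented for this
example.

```c
#include <stdio.h>

/* Assumed Debian-specific knob; absent on mainline kernels. */
#define USERNS_SYSCTL "/proc/sys/kernel/unprivileged_userns_clone"

/*
 * Read a non-negative integer sysctl from procfs.
 * Returns the value, or -1 if the file is missing or cannot be parsed.
 */
int read_sysctl_int(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

Calling read_sysctl_int(USERNS_SYSCTL) would then give 0 when the switch is
off, 1 when it is on, and -1 on kernels that do not carry the patch.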

Posix capabilities are intended for privileged actions, not for
actions which explicitly should not require privilege, but which
we feel are in development.

In general, the feeling is that putting a feature like this behind a
wall will only slow down the finding of any bugs, so I think the goal
itself is questionable.  But the chosen means for achieving your goal
are definitely wrong.

> Both options, user namespaces with and without CAP_SYS_ADMIN, can be
> said to represent the extreme end of the spectrum. In practice, there is
> no reason for every process to have the ability to create user
> namespaces. Indeed, only very few and specialized programs require user

There is.  One of Eric's primary motivations for user namespaces was to
finally allow unprivileged users to safely do things like manipulate
their mounts tree without the risk of privileged (setuid) programs
being tricked.

-serge

* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]     ` <20151017215501.GA22900-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2015-10-18 20:13       ` Tobias Markus
       [not found]         ` <5623FD82.4030902-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Tobias Markus @ 2015-10-18 20:13 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman, Al Viro,
	Serge Hallyn, Andrew Morton, Andy Lutomirski, Christoph Lameter,
	Michael Kerrisk (man-pages),
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Richard Weinberger

On 17.10.2015 23:55, Serge E. Hallyn wrote:
> On Sat, Oct 17, 2015 at 05:58:04PM +0200, Tobias Markus wrote:
>> Add capability CAP_SYS_USER_NS.
>> Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
>> when calling clone or unshare with CLONE_NEWUSER.
>>
>> Rationale:
>>
>> Linux 3.8 saw the introduction of unprivileged user namespaces,
>> allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
>> inside a separate user namespace. Before that, any namespace creation
>> required CAP_SYS_ADMIN (or, in practice, the user had to be root).
>> Unfortunately, there have been some security-relevant bugs in the
>> meantime. Because of the fairly complex nature of user namespaces, it is
>> reasonable to say that future vulnerabilities cannot be excluded. Some
>> distributions even wholly disable user namespaces because of this.
> 
> Fwiw I'm not in favor of this.  Debian has a patch (I believe the one
> I originally wrote for Ubuntu but which Ubuntu dropped long ago) adding a
> sysctl, off by default, for enabling user namespaces.

While it certainly works, enabling a feature like this at runtime
doesn't seem like a long term solution.

The fact that Debian added this patch in the first place already
demonstrates that there is demand for a way to limit unprivileged user
namespace creation. Please, don't get me wrong: I would *really like* to
see widespread adoption and continued development of user namespaces!
But the status quo remains: Distributions outright disabling user
namespaces (e.g. Arch Linux) won't make it easier.

> 
> Posix capabilities are intended for privileged actions, not for
> actions which explicitly should not require privilege, but which
> we feel are in development.
> 

Certainly, in an ideal world, user namespaces will never lead to any
kernel-level exploits. But reality is different: There *have been*
serious kernel vulnerabilities due to user namespaces, and there *will
be* serious kernel vulnerabilities due to user namespaces.

Now, those are the alternatives imho:

* Status quo: Some distributions will disable user namespaces by default
in some way or another. Users wishing to use user namespaces will have to
use a custom kernel or enable a sysctl flag that was patched in by the
downstream developers. On distributions that enable user namespaces by
default, even users that don't wish to use them in the first place will
be affected by vulnerabilities.

* Adding a capability: First of all, there would be no need for any
downstream patches or custom kernels. Users that wish to use user
namespaces would only have to enable the capability on the affected
executables, if that hasn't been done by the package maintainers
already. Users that might not even know of user namespaces have their peace.

> In general, the feeling is that putting a feature like this behind a
> wall will only slow down the finding of any bugs, so I think the goal
> itself is questionable.  But the chosen means for achieving your goal
> are definitely wrong.

I'm not talking about removing user namespaces altogether or making them
impossible to use - as I said above, users wouldn't notice anything in
the best case. Replacing setuid binaries with capability-based ones has
been done for quite some time now and I don't think anyone complained.

I honestly don't see why adding a new capability would slow down finding
bugs. Not every program magically profits from user namespaces. Why
would, say, GCC, date or vim improve by using user namespaces? My point
is that use cases for user namespaces won't magically rain down from
Heaven just because it is possible to use them without privilege. And it
is hardly difficult to add the capability to those applications that
use user namespaces, is it? setcap cap_sys_user_ns+ep $binary doesn't
sound very complicated to me.
I would actually say not adding this capability would slow down finding
bugs since users are less inclined to enable the feature if they can't
limit its security impact.
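
As an illustration of what a capability-aware program could do before
attempting the operation, here is a sketch that asks the kernel whether the
calling thread holds a given capability in its effective set. Since
CAP_SYS_USER_NS exists only in this proposal, the example is meant to be
called with a capability that does exist, such as CAP_SYS_ADMIN. The helper
name has_effective_cap() is invented here; the capget system call and the
structures come from <linux/capability.h>.

```c
#define _GNU_SOURCE
#include <linux/capability.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Return 1 if the calling thread has `cap` in its effective set, 0 if
 * it does not, and -1 if capget() fails or `cap` is out of range for
 * the version 3 capability ABI (64 bits spread over two words).
 */
int has_effective_cap(int cap)
{
	struct __user_cap_header_struct hdr;
	struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

	if (cap < 0 || cap >= 32 * _LINUX_CAPABILITY_U32S_3)
		return -1;

	memset(&hdr, 0, sizeof(hdr));
	memset(data, 0, sizeof(data));
	hdr.version = _LINUX_CAPABILITY_VERSION_3;
	hdr.pid = 0; /* 0 selects the calling thread */

	if (syscall(SYS_capget, &hdr, data) != 0)
		return -1;

	/* Capabilities 0..31 live in data[0], 32..63 in data[1]. */
	return (int)((data[cap / 32].effective >> (cap % 32)) & 1U);
}
```

A binary that had been given a file capability with setcap would see 1 for
that capability here, and 0 when started without it.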

Furthermore, saying "let's enable this complex security-relevant feature
by default and make it impossible to limit it to certain files so users
will find more bugs" is a fundamentally wrong approach to security imho.
First, you aren't likely to get more bug reports because distributions
aren't that risky. Second, even if you get more bug reports, _the damage
is already done_. Sysadmins won't be that happy and will very likely
disable the very feature that caused the damage in the first place.

> 
>> Both options, user namespaces with and without CAP_SYS_ADMIN, can be
>> said to represent the extreme end of the spectrum. In practice, there is
>> no reason for every process to have the ability to create user
>> namespaces. Indeed, only very few and specialized programs require user
> 
> There is.  One of Eric's primary motivations for user namespaces was to
> finally allow unprivileged users to safely do things like manipulate
> their mounts tree without the risk of privileged (setuid) programs
> being tricked.
> 
> -serge


* Re: [PATCH] userns/capability: Add user namespace capability
  2015-10-17 20:17   ` Richard Weinberger
@ 2015-10-18 20:13     ` Tobias Markus
       [not found]       ` <5623FD86.2030609-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Tobias Markus @ 2015-10-18 20:13 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: LKML, Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	LSM, open list:ABI/API, linux-man

On 17.10.2015 22:17, Richard Weinberger wrote:
> On Sat, Oct 17, 2015 at 5:58 PM, Tobias Markus <tobias@miglix.eu> wrote:
>> One question remains though: Does this break userspace executables that
>> expect to be able to create user namespaces without privilege? Since
>> creating user namespaces without CAP_SYS_ADMIN was not possible before
>> Linux 3.8, programs should already expect a potential EPERM upon calling
>> clone. Since creating a user namespace without CAP_SYS_USER_NS would
>> also cause EPERM, we should be on the safe side.
> 
> In case of doubt, yes it will break existing software.
> Hiding user namespaces behind CAP_SYS_USER_NS will not magically
> make them secure.
> 
The goal is not to make user namespaces secure, but to limit access to
them somewhat in order to reduce the potential attack surface.
Please see my reply to Serge for further details.


* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]       ` <5623FD86.2030609-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
@ 2015-10-18 20:21         ` Richard Weinberger
       [not found]           ` <5623FF36.8080800-/L3Ra7n9ekc@public.gmane.org>
  2015-10-19  0:28         ` Mike Frysinger
  1 sibling, 1 reply; 21+ messages in thread
From: Richard Weinberger @ 2015-10-18 20:21 UTC (permalink / raw)
  To: Tobias Markus
  Cc: LKML, Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	LSM, open list:ABI/API, linux-man

Am 18.10.2015 um 22:13 schrieb Tobias Markus:
> On 17.10.2015 22:17, Richard Weinberger wrote:
>> On Sat, Oct 17, 2015 at 5:58 PM, Tobias Markus <tobias-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org> wrote:
>>> One question remains though: Does this break userspace executables that
>>> expect to be able to create user namespaces without privilege? Since
>>> creating user namespaces without CAP_SYS_ADMIN was not possible before
>>> Linux 3.8, programs should already expect a potential EPERM upon calling
>>> clone. Since creating a user namespace without CAP_SYS_USER_NS would
>>> also cause EPERM, we should be on the safe side.
>>
>> In case of doubt, yes it will break existing software.
>> Hiding user namespaces behind CAP_SYS_USER_NS will not magically
>> make them secure.
>>
> The goal is not to make user namespaces secure, but to limit access to
> them somewhat in order to reduce the potential attack surface.

We have already a framework to reduce the attack surface, seccomp.
There is no need to invent new capabilities for every non-trivial
kernel feature.

I can understand that user namespaces seem scary and have had bugs.
But which software didn't?

Are there any unfixed exploitable bugs in user namespaces in recent kernels?

Thanks,
//richard


* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]           ` <5623FF36.8080800-/L3Ra7n9ekc@public.gmane.org>
@ 2015-10-18 20:41             ` Tobias Markus
  2015-10-18 20:48               ` Richard Weinberger
  0 siblings, 1 reply; 21+ messages in thread
From: Tobias Markus @ 2015-10-18 20:41 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: LKML, Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	LSM, open list:ABI/API, linux-man

On 18.10.2015 22:21, Richard Weinberger wrote:
> Am 18.10.2015 um 22:13 schrieb Tobias Markus:
>> On 17.10.2015 22:17, Richard Weinberger wrote:
>>> On Sat, Oct 17, 2015 at 5:58 PM, Tobias Markus <tobias-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org> wrote:
>>>> One question remains though: Does this break userspace executables that
>>>> expect to be able to create user namespaces without privilege? Since
>>>> creating user namespaces without CAP_SYS_ADMIN was not possible before
>>>> Linux 3.8, programs should already expect a potential EPERM upon calling
>>>> clone. Since creating a user namespace without CAP_SYS_USER_NS would
>>>> also cause EPERM, we should be on the safe side.
>>>
>>> In case of doubt, yes it will break existing software.
>>> Hiding user namespaces behind CAP_SYS_USER_NS will not magically
>>> make them secure.
>>>
>> The goal is not to make user namespaces secure, but to limit access to
>> them somewhat in order to reduce the potential attack surface.
> 
> We have already a framework to reduce the attack surface, seccomp.
> There is no need to invent new capabilities for every non-trivial
> kernel feature.
> 
> I can understand that user namespaces seem scary and have had bugs.
> But which software didn't?
> 
> Are there any unfixed exploitable bugs in user namespaces in recent kernels?
> 
> Thanks,
> //richard

Isn't seccomp about setting a per-thread syscall filter? Correct me if
I'm wrong, but I don't know of any way of using seccomp to globally ban
the use of clone or unshare with CLONE_NEWUSER except for a few
whitelisted executables, and that's the idea of this hypothetical capability.
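
For reference, this is roughly what the per-thread approach looks like. It
is only a sketch under some assumptions: a little-endian machine (so that
the low 32 bits of the flags argument sit at offsetof(struct seccomp_data,
args[0])), the native syscall ABI only (a real filter should also check
seccomp_data.arch), and clone() left unfiltered. The names
deny_unshare_newuser() and unshare_errno_under_filter() are invented for
the example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/filter.h>
#include <linux/sched.h>   /* CLONE_NEWUSER */
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Make unshare(CLONE_NEWUSER) fail with EPERM for the calling thread
 * and for everything it later forks or execs.
 * Returns 0 on success, -1 on failure.
 */
int deny_unshare_newuser(void)
{
	struct sock_filter insns[] = {
		/* [0] Load the syscall number. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* [1] Not unshare: skip three instructions, to ALLOW at [5]. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_unshare, 0, 3),
		/* [2] Load the low 32 bits of the flags argument. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, args[0])),
		/* [3] CLONE_NEWUSER clear: skip one instruction, to ALLOW. */
		BPF_JUMP(BPF_JMP | BPF_JSET | BPF_K, CLONE_NEWUSER, 0, 1),
		/* [4] CLONE_NEWUSER set: fail the call with EPERM. */
		BPF_STMT(BPF_RET | BPF_K,
			 SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
		/* [5] Everything else passes through. */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(insns) / sizeof(insns[0]),
		.filter = insns,
	};

	/* Required so that an unprivileged task may install a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;
	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
		return -1;
	return 0;
}

/*
 * Install the filter in a child and try to unshare there. Returns the
 * child's errno from unshare, 0 if unshare unexpectedly succeeded, or
 * 255 if the filter could not be set up.
 */
int unshare_errno_under_filter(void)
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return 255;
	if (pid == 0) {
		if (deny_unshare_newuser() != 0)
			_exit(255);
		_exit(syscall(SYS_unshare, CLONE_NEWUSER) == 0 ? 0 : errno);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return 255;
	return WEXITSTATUS(status);
}
```

The filter is inherited across fork and execve, so it confines one process
tree. It is not a system-wide switch, which is exactly the limitation
described above.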

Sure, there are no known exploitable bugs in recent kernels, but would
you guarantee that for the next 10 years? Every software has bugs, some
of them exploitable, no amount of testing can prevent that. I'm not
paranoid, but on the other hand, why should every Linux user having
CONFIG_USER_NS enabled have to expose more attack surface than he
absolutely has to?

Richard, would you run an Apache HTTP server exposed to the internet on
your own laptop, without any security precautions? According to your
reasoning, Apache is surely scary and has many bugs, but every software
has bugs, right?

I really don't want to introduce a user-facing API change just for the
fun of it - so if there's any better way to do this, please tell me.

* Re: [PATCH] userns/capability: Add user namespace capability
  2015-10-18 20:41             ` Tobias Markus
@ 2015-10-18 20:48               ` Richard Weinberger
       [not found]                 ` <56240599.3050903-/L3Ra7n9ekc@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Richard Weinberger @ 2015-10-18 20:48 UTC (permalink / raw)
  To: Tobias Markus
  Cc: LKML, Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	LSM, open list:ABI/API, linux-man

Am 18.10.2015 um 22:41 schrieb Tobias Markus:
> On 18.10.2015 22:21, Richard Weinberger wrote:
>> Am 18.10.2015 um 22:13 schrieb Tobias Markus:
>>> On 17.10.2015 22:17, Richard Weinberger wrote:
>>>> On Sat, Oct 17, 2015 at 5:58 PM, Tobias Markus <tobias@miglix.eu> wrote:
>>>>> One question remains though: Does this break userspace executables that
>>>>> expect to be able to create user namespaces without privilege? Since
>>>>> creating user namespaces without CAP_SYS_ADMIN was not possible before
>>>>> Linux 3.8, programs should already expect a potential EPERM upon calling
>>>>> clone. Since creating a user namespace without CAP_SYS_USER_NS would
>>>>> also cause EPERM, we should be on the safe side.
>>>>
>>>> In case of doubt, yes it will break existing software.
>>>> Hiding user namespaces behind CAP_SYS_USER_NS will not magically
>>>> make them secure.
>>>>
>>> The goal is not to make user namespaces secure, but to limit access to
>>> them somewhat in order to reduce the potential attack surface.
>>
>> We have already a framework to reduce the attack surface, seccomp.
>> There is no need to invent new capabilities for every non-trivial
>> kernel feature.
>>
>> I can understand that user namespaces seem scary and have had bugs.
>> But which software didn't?
>>
>> Are there any unfixed exploitable bugs in user namespaces in recent kernels?
>>
>> Thanks,
>> //richard
> 
> Isn't seccomp about setting a per-thread syscall filter? Correct me if
> I'm wrong, but I don't know of any way of using seccomp to globally ban
> the use of clone or unshare with CLONE_NEWUSER except for a few
> whitelisted executables, and that's the idea of this hypothetical capability.

This is correct.
If you want it globally you can still use LSM.

> Sure, there are no known exploitable bugs in recent kernels, but would
> you guarantee that for the next 10 years? Every software has bugs, some
> of them exploitable, no amount of testing can prevent that. I'm not
> paranoid, but on the other hand, why should every Linux user having
> CONFIG_USER_NS enabled have to expose more attack surface than he
> absolutely has to?

And what about all the other kernel features?
I really don't get why you choose user namespaces as your enemy.

> Richard, would you run an Apache HTTP server exposed to the internet on
> your own laptop, without any security precautions? According to your
> reasoning, Apache is surely scary and has many bugs, but every software
> has bugs, right?

This argument is bogus and you know that too.

> I really don't want to introduce a user-facing API change just for the
> fun of it - so if there's any better way to do this, please tell me.

As I said, I really don't see why we should treat user namespaces in a special
way. It is a kernel feature like many others are. If you don't trust it, disable it.

Thanks,
//richard


* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]                 ` <56240599.3050903-/L3Ra7n9ekc@public.gmane.org>
@ 2015-10-18 21:49                   ` Tobias Markus
  2015-10-18 22:06                     ` Richard Weinberger
  0 siblings, 1 reply; 21+ messages in thread
From: Tobias Markus @ 2015-10-18 21:49 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: LKML, Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	LSM, open list:ABI/API, linux-man

On 18.10.2015 22:48, Richard Weinberger wrote:
> Am 18.10.2015 um 22:41 schrieb Tobias Markus:
>> On 18.10.2015 22:21, Richard Weinberger wrote:
>>> Am 18.10.2015 um 22:13 schrieb Tobias Markus:
>>>> On 17.10.2015 22:17, Richard Weinberger wrote:
>>>>> On Sat, Oct 17, 2015 at 5:58 PM, Tobias Markus <tobias-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org> wrote:
>>>>>> One question remains though: Does this break userspace executables that
>>>>>> expect to be able to create user namespaces without privilege? Since
>>>>>> creating user namespaces without CAP_SYS_ADMIN was not possible before
>>>>>> Linux 3.8, programs should already expect a potential EPERM upon calling
>>>>>> clone. Since creating a user namespace without CAP_SYS_USER_NS would
>>>>>> also cause EPERM, we should be on the safe side.
>>>>>
>>>>> In case of doubt, yes it will break existing software.
>>>>> Hiding user namespaces behind CAP_SYS_USER_NS will not magically
>>>>> make them secure.
>>>>>
>>>> The goal is not to make user namespaces secure, but to limit access to
>>>> them somewhat in order to reduce the potential attack surface.
>>>
>>> We have already a framework to reduce the attack surface, seccomp.
>>> There is no need to invent new capabilities for every non-trivial
>>> kernel feature.
>>>
>>> I can understand that user namespaces seem scary and have had bugs.
>>> But which software didn't?
>>>
>>> Are there any unfixed exploitable bugs in user namespaces in recent kernels?
>>>
>>> Thanks,
>>> //richard
>>
>> Isn't seccomp about setting a per-thread syscall filter? Correct me if
>> I'm wrong, but I don't know of any way of using seccomp to globally ban
>> the use of clone or unshare with CLONE_NEWUSER except for a few
>> whitelisted executables, and that's the idea of this hypothetical capability.
> 
> This is correct.
> If you want it globally you can still use LSM.

The LSM isn't really the one-size-fits-all solution that distributions
like to ship in their standard kernels...

> 
>> Sure, there are no known exploitable bugs in recent kernels, but would
>> you guarantee that for the next 10 years? Every software has bugs, some
>> of them exploitable, no amount of testing can prevent that. I'm not
>> paranoid, but on the other hand, why should every Linux user having
>> CONFIG_USER_NS enabled have to expose more attack surface than he
>> absolutely has to?
> 
> And what about all the other kernel features?
> I really don't get why you choose user namespaces as your enemy.

I didn't choose user namespaces as my enemy, I chose user namespaces as
the feature that I would really like to have shipped by default by my
and by other distributions, but that's sadly often disabled for security
concerns. Is there any solution that can be safely used by distributions
to have user namespaces enabled by default without worrying about security?

> 
>> Richard, would you run an Apache HTTP server exposed to the internet on
>> your own laptop, without any security precautions? According to your
>> reasoning, Apache is surely scary and has many bugs, but every software
>> has bugs, right?
> 
> This argument is bogus and you know that too.

Sure, it's exaggerated, but still, if it's possible to reduce the
attack surface for every user without great effort, why shouldn't that
be done?

To give an example more closely resembling the matter in hand:
CAP_SYSLOG allows viewing kernel addresses when kptr_restrict is
enabled. But why limit access to the kernel symbols? There is nothing an
attacker can do with them, unless there is a kernel bug.

But before we continue arguing endlessly, I just got an idea: What about
adding a sysctl to enable/disable enforcement of the hypothetical
CAP_SYS_USER_NS, just like with /proc/sys/kernel/kptr_restrict and
CAP_SYSLOG? Would also prevent any potential userspace breakage.
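
The idea can be modelled in userspace as a plain decision function. This is
only a sketch: the mode names, userns_create_check(), and the two-valued
sysctl are all hypothetical and appear in neither the patch nor any kernel.
The exemption for nested namespaces mirrors the level check in the posted
kernel/cred.c change.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical sysctl modes, loosely modelled on kptr_restrict. */
enum userns_restrict {
	USERNS_UNRESTRICTED = 0, /* anyone may create user namespaces */
	USERNS_NEED_CAP = 1,     /* CAP_SYS_USER_NS required at level 0 */
};

/*
 * Userspace model of the proposed policy. `parent_level` is the level
 * of the user namespace the caller lives in (0 is the initial one);
 * `has_cap` says whether the caller holds the hypothetical
 * CAP_SYS_USER_NS. Returns 0 if creation may proceed, -EPERM otherwise.
 */
int userns_create_check(enum userns_restrict mode, int parent_level,
			bool has_cap)
{
	assert(parent_level >= 0);

	/* Enforcement switched off: behave like current kernels. */
	if (mode == USERNS_UNRESTRICTED)
		return 0;
	/* Nested namespaces are exempt, as in the posted patch. */
	if (parent_level > 0)
		return 0;
	return has_cap ? 0 : -EPERM;
}
```

With the sysctl at 0 nothing changes for existing software; only with the
sysctl at 1 does an uncapable task in the initial namespace get -EPERM.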

> 
>> I really don't want to introduce a user-facing API change just for the
>> fun of it - so if there's any better way to do this, please tell me.
> 
> As I said, I really don't see why we should treat user namespaces in a special
> way. It is a kernel feature like many others are. If you don't trust it, disable it.

And there's the problem: It's either 100% (unprivileged user namespaces
without limit) or 0% (disable user namespaces entirely).
> 
> Thanks,
> //richard


* Re: [PATCH] userns/capability: Add user namespace capability
  2015-10-18 21:49                   ` Tobias Markus
@ 2015-10-18 22:06                     ` Richard Weinberger
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Weinberger @ 2015-10-18 22:06 UTC (permalink / raw)
  To: Tobias Markus
  Cc: LKML, Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	LSM, open list:ABI/API, linux-man

Am 18.10.2015 um 23:49 schrieb Tobias Markus:
> But before we continue arguing endlessly, I just got an idea: What about
> adding a sysctl to enable/disable enforcement of the hypothetical
> CAP_SYS_USER_NS, just like with /proc/sys/kernel/kptr_restrict and
> CAP_SYSLOG? Would also prevent any potential userspace breakage.

My argument stands: hiding user namespaces behind whatever
switch and rendering them a second-class citizen does not improve their security.

Especially ad-hoc solutions are not expedient. Reducing the attack surface must not
lead to gazillions of new capabilities, sysctl switches, etc.
User namespaces are neither the first nor the last "critical" kernel feature.

I'm sure RHEL will ship a SELinux boolean to disable user namespaces as they do
for many other OS features. This is fine and exactly why we have LSM.

Thanks,
//richard

* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]       ` <5623FD86.2030609-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
  2015-10-18 20:21         ` Richard Weinberger
@ 2015-10-19  0:28         ` Mike Frysinger
  1 sibling, 0 replies; 21+ messages in thread
From: Mike Frysinger @ 2015-10-19  0:28 UTC (permalink / raw)
  To: Tobias Markus
  Cc: Richard Weinberger, LKML, Eric W. Biederman, Al Viro,
	Serge Hallyn, Andrew Morton, Andy Lutomirski, Christoph Lameter,
	Michael Kerrisk (man-pages), LSM, open list:ABI/API, linux-man

On 18 Oct 2015 22:13, Tobias Markus wrote:
> On 17.10.2015 22:17, Richard Weinberger wrote:
> > On Sat, Oct 17, 2015 at 5:58 PM, Tobias Markus <tobias-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org> wrote:
> >> One question remains though: Does this break userspace executables that
> >> expect being able to create user namespaces without privilege? Since
> >> creating user namespaces without CAP_SYS_ADMIN was not possible before
> >> Linux 3.8, programs should already expect a potential EPERM upon calling
> >> clone. Since creating a user namespace without CAP_SYS_USER_NS would
> >> also cause EPERM, we should be on the safe side.
> > 
> > In case of doubt, yes it will break existing software.
> > Hiding user namespaces behind CAP_SYS_USER_NS will not magically
> > make them secure.
> 
> The goal is not to make user namespaces secure, but to limit access to
> them somewhat in order to reduce the potential attack surface.

the irony is that disallowing non-privileged processes access to userns means
processes cannot jail themselves and thus make themselves more secure.  i've
been adding userns to various projects purely to get access to things like
mount, net, pid, sysv, and ipc namespaces.
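
A rough sketch of what such self-jailing looks like from userspace, assuming a Linux system whose libc exports unshare(2). It calls the function through ctypes in a forked child so the calling process keeps its own namespaces. The helper name `try_unshare` is made up for this example and the flag values are copied from <linux/sched.h>. Whether the call succeeds depends on the kernel configuration and distribution policy discussed in this thread.

```python
import ctypes
import errno
import os

CLONE_NEWUSER = 0x10000000  # value from <linux/sched.h>
CLONE_NEWNET = 0x40000000   # value from <linux/sched.h>


def try_unshare(flags: int) -> int:
    """Call unshare(2) in a forked child.

    Returns 0 on success, the errno value on failure, or 255 if the child
    could not make the call at all (for example, no unshare symbol in libc).
    """
    pid = os.fork()
    if pid == 0:
        # Child: single-threaded after fork, so CLONE_NEWUSER is not
        # rejected with EINVAL because of other threads.
        code = 255
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(flags) == 0:
                code = 0
            else:
                code = ctypes.get_errno()
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else 255


if __name__ == "__main__":
    rc = try_unshare(CLONE_NEWUSER | CLONE_NEWNET)
    if rc == 0:
        print("new user+net namespace created without privilege")
    else:
        print("unshare failed:", errno.errorcode.get(rc, rc))
```

A caller that treats EPERM as "sandbox unavailable" and falls back to something else would keep working whichever way this discussion ends.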

putting this behind a cap also breaks the Chromium sandbox -- they were able
to drop set*id on the sandbox binary and utilize userns instead.
https://chromium.googlesource.com/chromium/src/+/master/docs/linux_sandboxing.md
https://code.google.com/p/chromium/issues/detail?id=312380
-mike

* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]         ` <5623FD82.4030902-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
@ 2015-10-19  1:41           ` Serge E. Hallyn
       [not found]             ` <20151019014112.GA1683-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
  2015-10-22 20:45           ` Eric W. Biederman
  1 sibling, 1 reply; 21+ messages in thread
From: Serge E. Hallyn @ 2015-10-19  1:41 UTC (permalink / raw)
  To: Tobias Markus
  Cc: Serge E. Hallyn, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Richard Weinberger

On Sun, Oct 18, 2015 at 10:13:54PM +0200, Tobias Markus wrote:
> On 17.10.2015 23:55, Serge E. Hallyn wrote:
> > On Sat, Oct 17, 2015 at 05:58:04PM +0200, Tobias Markus wrote:
> >> Add capability CAP_SYS_USER_NS.
> >> Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
> >> when calling clone or unshare with CLONE_NEWUSER.
> >>
> >> Rationale:
> >>
> >> Linux 3.8 saw the introduction of unprivileged user namespaces,
> >> allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
> >> inside a separate user namespace. Before that, any namespace creation
> >> required CAP_SYS_ADMIN (or, in practice, the user had to be root).
> >> Unfortunately, there have been some security-relevant bugs in the
> >> meantime. Because of the fairly complex nature of user namespaces, it is
> >> reasonable to say that future vulnerabilities can not be excluded. Some
> >> distributions even wholly disable user namespaces because of this.
> > 
> > Fwiw I'm not in favor of this.  Debian has a patch (I believe the one
> > I originally wrote for Ubuntu but which Ubuntu dropped long ago) adding a
> > sysctl, off by default, for enabling user namespaces.
> 
> While it certainly works, enabling a feature like this at runtime
> doesn't seem like a long term solution.

We shouldn't need a long-term solution.  Your concern is bugs.  After
some time surely we'll feel that we have achieved a stable solution?

> The fact that Debian added this patch in the first place already
> demonstrates that there is demand for a way to limit unprivileged user

No it does not.  As I said, I wrote that patch originally in the very
early days, when wanting it turned off was much more understandable.
I do not know whether Debian would have written its own patch if I
hadn't.  (They may have)

> namespace creation. Please, don't get me wrong: I would *really like* to
> see widespread adoption and continued development of user namespaces!
> But the status quo remains: Distributions outright disabling user
> namespaces (e.g. Arch Linux) won't make it easier.
> 
> > 
> > Posix capabilities are intended for privileged actions, not for
> > actions which explicitly should not require privilege, but which
> > we feel are in development.
> > 
> 
> Certainly, in an ideal world, user namespaces will never lead to any
> kernel-level exploits. But reality is different: There *have been*
> serious kernel vulnerabilities due to user namespaces, and there *will
> be* serious kernel vulnerabilities due to user namespaces.

As there will be due to sctp and futex.

> Now, those are the alternatives imho:
> 
> * Status quo: Some distributions will disable user namespaces by default
> in some way or another. Users wishing to use user namespaces will have to
> use a custom kernel or enable a sysctl flag that was patched in by the
> downstream developers. On distributions that enable user namespaces by
> default, even users that don't wish to use them in the first place will
> be affected by vulnerabilities.
> 
> * Adding a capability: First of all, there would be no need for any
> downstream patches or custom kernels. Users that wish to use user
> namespaces would only have to enable the capability on the affected
> executables, if that hasn't been done by the package maintainers
> already. Users that might not even know of user namespaces have their peace.
> 
> > In general, the feeling is that putting a feature like this behind a
> > wall will only slow down the finding of any bugs, so I think the goal
> > itself is questionable.  But the chosen means for achieving your goal
> > are definitely wrong.
> 
> I'm not talking about removing user namespaces altogether or making them
> impossible to use - as I said above, users wouldn't notice anything in
> the best case. Replacing setuid binaries with capability-based ones has
> been done for quite some time now and I don't think anyone complained.

That's the opposite - making something easier to use with less privilege,
as opposed to requiring more.

> I honestly don't see why adding a new capability would slow down finding
> bugs. Not every program magically profits from user namespaces. Why
> would, say, GCC, date or vim improve by using user namespaces? My point

<shrug> this is irrelevant, but I could certainly envision value in
something like gcc, which takes arbitrary input, running as kuid -1
(not uid -1).  Or especially ffmpeg.  chromium.

> is that use cases for user namespaces won't magically rain down from
> Heaven just because it is possible to use them without privilege. And it
> is hardly difficult to add the capability to those applications that
> use user namespaces, is it? setcap cap_sys_user_ns+ep $binary doesn't
> sound very complicated to me.

But it requires root privilege, for something designed to not need root
privilege.

> I would actually say not adding this capability would slow down finding
> bugs since users are less inclined to enable the feature if they can't
> limit its security impact.
> 
> Furthermore, saying "let's enable this complex security-relevant feature
> by default and make it impossible to limit it to certain files so users
> will find more bugs" is a fundamentally wrong approach to security imho.
> First, you aren't likely to get more bug reports because distributions
> aren't that risky. Second, even if you get more bug reports, _the damage
> is already done_. Sysadmins won't be that happy and will very likely
> disable the very feature that caused the damage in the first place.

Posix capabilities are privileges.  Like the privileges to create
device nodes, or open another user's files regardless of permission
settings, or the ability to read syslog.
We require privilege for things which affect shared resources or would
otherwise impact other users.  We do not require privilege because
something may have bugs.  The user namespace is designed precisely to
*not* require privilege.

If you feel the userns *design* is inherently buggy, then suggest
ripping it out.  If you feel the implementation is buggy, then compile
it out, or upstreaming the sysctl could make sense.  We can have those
discussions.  But assigning a privilege to it doesn't make sense.

I'm not saying this as someone who championed and acked the userns
patchset, but as the capabilities maintainer.

-serge

* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]             ` <20151019014112.GA1683-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
@ 2015-10-19 12:36               ` Yves-Alexis Perez
       [not found]                 ` <1445258180.4099.18.camel-8fiUuRrzOP0dnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Yves-Alexis Perez @ 2015-10-19 12:36 UTC (permalink / raw)
  To: Serge E. Hallyn, Tobias Markus
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman, Al Viro,
	Serge Hallyn, Andrew Morton, Andy Lutomirski, Christoph Lameter,
	Michael Kerrisk (man-pages),
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Richard Weinberger

On Sun, 2015-10-18 at 20:41 -0500, Serge E. Hallyn wrote:
> We shouldn't need a long-term solution.  Your concern is bugs.  After
> some time surely we'll feel that we have achieved a stable solution?

But this is actually the whole point: we need a long term solution, because
there will always be bugs, whether in user namespaces or in other parts exposed
by user namespaces. It's fine to fix them when we find them, but that still
means they're exploitable even before we know about them. We still find bugs
in code written years ago, it's quite certain there are bugs in current code.

User namespaces are a way to expose more interfaces to unprivileged users,
interfaces which weren't designed to be exposed like that. In a way that's the
opposite of seccomp. That doesn't make it bad, obviously, but that still means
having a way to control it finely could be helpful.

Regards,
-- 

Yves-Alexis

* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]                 ` <1445258180.4099.18.camel-8fiUuRrzOP0dnm+yROfE0A@public.gmane.org>
@ 2015-10-19 12:48                   ` Richard Weinberger
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Weinberger @ 2015-10-19 12:48 UTC (permalink / raw)
  To: Yves-Alexis Perez, Serge E. Hallyn, Tobias Markus
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Eric W. Biederman, Al Viro,
	Serge Hallyn, Andrew Morton, Andy Lutomirski, Christoph Lameter,
	Michael Kerrisk (man-pages),
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Richard Weinberger

On 19.10.2015 at 14:36, Yves-Alexis Perez wrote:
> On Sun, 2015-10-18 at 20:41 -0500, Serge E. Hallyn wrote:
>> We shouldn't need a long-term solution.  Your concern is bugs.  After
>> some time surely we'll feel that we have achieved a stable solution?
> 
> But this is actually the whole point: we need a long term solution, because
> there will always be bugs, whether in user namespaces or in other parts exposed
> by user namespaces. It's fine to fix them when we find them, but that still
> means they're exploitable even before we know about them. We still find bugs
> in code written years ago, it's quite certain there are bugs in current code.

You can replace the term "user namespace" with any other non-trivial kernel subsystem.
There will always be bugs.

Thanks,
//richard

* Re: [PATCH] userns/capability: Add user namespace capability
       [not found] ` <5622700C.9090107-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
  2015-10-17 20:17   ` Richard Weinberger
  2015-10-17 21:55   ` Serge E. Hallyn
@ 2015-10-19 14:24   ` Austin S Hemmelgarn
       [not found]     ` <5624FD3B.2050401-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2 siblings, 1 reply; 21+ messages in thread
From: Austin S Hemmelgarn @ 2015-10-19 14:24 UTC (permalink / raw)
  To: Tobias Markus, linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Eric W. Biederman, Al Viro, Serge Hallyn, Andrew Morton,
	Andy Lutomirski, Christoph Lameter, Michael Kerrisk (man-pages),
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA

On 2015-10-17 11:58, Tobias Markus wrote:
> Add capability CAP_SYS_USER_NS.
> Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
> when calling clone or unshare with CLONE_NEWUSER.
>
> Rationale:
>
> Linux 3.8 saw the introduction of unprivileged user namespaces,
> allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
> inside a separate user namespace. Before that, any namespace creation
> required CAP_SYS_ADMIN (or, in practice, the user had to be root).
> Unfortunately, there have been some security-relevant bugs in the
> meantime. Because of the fairly complex nature of user namespaces, it is
> reasonable to say that future vulnerabilities can not be excluded. Some
> distributions even wholly disable user namespaces because of this.
>
> Both options, user namespaces with and without CAP_SYS_ADMIN, can be
> said to represent the extreme end of the spectrum. In practice, there is
> no reason for every process to have the ability to create user
> namespaces. Indeed, only very few and specialized programs require user
> namespaces. This seems to be a perfect fit for the (file) capability
> system: Privileged users could manually allow only a certain executable
> to be able to create user namespaces by setting a certain capability,
> I'd suggest the name CAP_SYS_USER_NS. Executables completely unrelated
> to user namespaces should and can not create them.
>
> The capability should only be required in the "root" user namespace (the
> user namespace with level 0) though, to allow nested user namespaces to
> work as intended. If a user namespace has a level greater than 0, the
> original process must have had CAP_SYS_USER_NS, so it is "trusted" anyway.
>
> One question remains though: Does this break userspace executables that
> expect being able to create user namespaces without privilege? Since
> creating user namespaces without CAP_SYS_ADMIN was not possible before
> Linux 3.8, programs should already expect a potential EPERM upon calling
> clone. Since creating a user namespace without CAP_SYS_USER_NS would
> also cause EPERM, we should be on the safe side.

Potentially stupid counter proposal:
Make it CAP_SYS_NS, make it allow access to all namespace types for 
non-root/CAP_SYS_ADMIN users, and teach the stuff that's using userns 
just to get to mount/pid/net/ipc namespaces to use those instead when 
it's something that doesn't really need to think it's running as root.

While this would still add a new capability (which is arguably not a 
good thing), the resultant capability would be significantly more useful 
for many of the use cases.

Potentially more flame resistant counter proposal:
Write a simple LSM to allow selective usage of namespaces (IIRC, working 
LSM stacking is in mainline now).  While this is more complicated than 
just adding a capability, it is also a lot more resilient from a long 
term perspective.


* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]     ` <5624FD3B.2050401-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2015-10-21 18:53       ` Andy Lutomirski
       [not found]         ` <CALCETrWfZ9hXvLPtJnZhU-ZdoUbYNo-QSydMPvP6Q7Rp0oCQaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Andy Lutomirski @ 2015-10-21 18:53 UTC (permalink / raw)
  To: Austin S Hemmelgarn
  Cc: Michael Kerrisk, Christoph Lameter, Al Viro, Serge Hallyn,
	LSM List, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Tobias Markus, linux-man, Andrew Morton, Eric W. Biederman,
	Linux API

On Oct 19, 2015 7:25 AM, "Austin S Hemmelgarn" <ahferroin7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> On 2015-10-17 11:58, Tobias Markus wrote:
>>
>> Add capability CAP_SYS_USER_NS.
>> Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
>> when calling clone or unshare with CLONE_NEWUSER.
>>
>> Rationale:
>>
>> Linux 3.8 saw the introduction of unprivileged user namespaces,
>> allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
>> inside a separate user namespace. Before that, any namespace creation
>> required CAP_SYS_ADMIN (or, in practice, the user had to be root).
>> Unfortunately, there have been some security-relevant bugs in the
>> meantime. Because of the fairly complex nature of user namespaces, it is
>> reasonable to say that future vulnerabilities can not be excluded. Some
>> distributions even wholly disable user namespaces because of this.
>>
>> Both options, user namespaces with and without CAP_SYS_ADMIN, can be
>> said to represent the extreme end of the spectrum. In practice, there is
>> no reason for every process to have the ability to create user
>> namespaces. Indeed, only very few and specialized programs require user
>> namespaces. This seems to be a perfect fit for the (file) capability
>> system: Privileged users could manually allow only a certain executable
>> to be able to create user namespaces by setting a certain capability,
>> I'd suggest the name CAP_SYS_USER_NS. Executables completely unrelated
>> to user namespaces should and can not create them.
>>
>> The capability should only be required in the "root" user namespace (the
>> user namespace with level 0) though, to allow nested user namespaces to
>> work as intended. If a user namespace has a level greater than 0, the
>> original process must have had CAP_SYS_USER_NS, so it is "trusted" anyway.
>>
>> One question remains though: Does this break userspace executables that
>> expect being able to create user namespaces without privilege? Since
>> creating user namespaces without CAP_SYS_ADMIN was not possible before
>> Linux 3.8, programs should already expect a potential EPERM upon calling
>> clone. Since creating a user namespace without CAP_SYS_USER_NS would
>> also cause EPERM, we should be on the safe side.
>
>
> Potentially stupid counter proposal:
> Make it CAP_SYS_NS, make it allow access to all namespace types for non-root/CAP_SYS_ADMIN users, and teach the stuff that's using userns just to get to mount/pid/net/ipc namespaces to use those instead when it's something that doesn't really need to think it's running as root.
>
> While this would still add a new capability (which is arguably not a good thing), the resultant capability would be significantly more useful for many of the use cases.

Then you'd have to come up with some argument that it could possibly
be safe.  You'd need *at least* no_new_privs forced on.  You would
also have fun defining the privilege to own such a namespace once
created.
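
For readers unfamiliar with no_new_privs: it is a per-task flag set through prctl(2) that, once on, cannot be cleared, is inherited across fork and execve, and stops execve from granting privileges through setuid bits or file capabilities. A minimal sketch, assuming Linux 3.5 or later and a libc reachable via ctypes. The helper name is invented for this example and the constants come from <linux/prctl.h>.

```python
import ctypes
import os

PR_SET_NO_NEW_PRIVS = 38  # value from <linux/prctl.h>
PR_GET_NO_NEW_PRIVS = 39  # value from <linux/prctl.h>


def no_new_privs_after_set() -> int:
    """Set no_new_privs in a forked child and report the flag's value.

    Returns the value read back by PR_GET_NO_NEW_PRIVS in the child,
    or 255 if the prctl calls could not be made.
    """
    pid = os.fork()
    if pid == 0:
        code = 255
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            zero = ctypes.c_ulong(0)
            # The unused arguments must be zero or the kernel returns EINVAL.
            if libc.prctl(PR_SET_NO_NEW_PRIVS, ctypes.c_ulong(1),
                          zero, zero, zero) == 0:
                code = libc.prctl(PR_GET_NO_NEW_PRIVS, zero, zero, zero, zero)
        finally:
            # Only the low 8 bits survive as the exit status.
            os._exit(code & 0xFF)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else 255


if __name__ == "__main__":
    print("no_new_privs in child:", no_new_privs_after_set())
```

The flag is set in a child only so that the calling process is left unchanged.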

--Andy

* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]         ` <CALCETrWfZ9hXvLPtJnZhU-ZdoUbYNo-QSydMPvP6Q7Rp0oCQaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-21 19:13           ` Austin S Hemmelgarn
  2015-10-22 17:10             ` Andy Lutomirski
  0 siblings, 1 reply; 21+ messages in thread
From: Austin S Hemmelgarn @ 2015-10-21 19:13 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Michael Kerrisk, Christoph Lameter, Al Viro, Serge Hallyn,
	LSM List, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Tobias Markus, linux-man, Andrew Morton, Eric W. Biederman,
	Linux API

On 2015-10-21 14:53, Andy Lutomirski wrote:
> On Oct 19, 2015 7:25 AM, "Austin S Hemmelgarn" <ahferroin7-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>
>> On 2015-10-17 11:58, Tobias Markus wrote:
>>>
>>> Add capability CAP_SYS_USER_NS.
>>> Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
>>> when calling clone or unshare with CLONE_NEWUSER.
>>>
>>> Rationale:
>>>
>>> Linux 3.8 saw the introduction of unprivileged user namespaces,
>>> allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
>>> inside a separate user namespace. Before that, any namespace creation
>>> required CAP_SYS_ADMIN (or, in practice, the user had to be root).
>>> Unfortunately, there have been some security-relevant bugs in the
>>> meantime. Because of the fairly complex nature of user namespaces, it is
>>> reasonable to say that future vulnerabilities can not be excluded. Some
>>> distributions even wholly disable user namespaces because of this.
>>>
>>> Both options, user namespaces with and without CAP_SYS_ADMIN, can be
>>> said to represent the extreme end of the spectrum. In practice, there is
>>> no reason for every process to have the ability to create user
>>> namespaces. Indeed, only very few and specialized programs require user
>>> namespaces. This seems to be a perfect fit for the (file) capability
>>> system: Privileged users could manually allow only a certain executable
>>> to be able to create user namespaces by setting a certain capability,
>>> I'd suggest the name CAP_SYS_USER_NS. Executables completely unrelated
>>> to user namespaces should and can not create them.
>>>
>>> The capability should only be required in the "root" user namespace (the
>>> user namespace with level 0) though, to allow nested user namespaces to
>>> work as intended. If a user namespace has a level greater than 0, the
>>> original process must have had CAP_SYS_USER_NS, so it is "trusted" anyway.
>>>
>>> One question remains though: Does this break userspace executables that
>>> expect being able to create user namespaces without privilege? Since
>>> creating user namespaces without CAP_SYS_ADMIN was not possible before
>>> Linux 3.8, programs should already expect a potential EPERM upon calling
>>> clone. Since creating a user namespace without CAP_SYS_USER_NS would
>>> also cause EPERM, we should be on the safe side.
>>
>>
>> Potentially stupid counter proposal:
>> Make it CAP_SYS_NS, make it allow access to all namespace types for non-root/CAP_SYS_ADMIN users, and teach the stuff that's using userns just to get to mount/pid/net/ipc namespaces to use those instead when it's something that doesn't really need to think it's running as root.
>>
>> While this would still add a new capability (which is arguably not a good thing), the resultant capability would be significantly more useful for many of the use cases.
>
> Then you'd have to come up with some argument that it could possibly
> be safe.  You'd need *at least* no_new_privs forced on.  You would
> also have fun defining the privilege to own such a namespace once
> created.
Excellent point about the privileges, although wouldn't that also apply 
to just using a capability for non-root/CAP_SYS_ADMIN access to userns?


* Re: [PATCH] userns/capability: Add user namespace capability
  2015-10-21 19:13           ` Austin S Hemmelgarn
@ 2015-10-22 17:10             ` Andy Lutomirski
  0 siblings, 0 replies; 21+ messages in thread
From: Andy Lutomirski @ 2015-10-22 17:10 UTC (permalink / raw)
  To: Austin S Hemmelgarn
  Cc: Michael Kerrisk, Christoph Lameter, Al Viro, Serge Hallyn,
	LSM List, linux-kernel@vger.kernel.org, Tobias Markus, linux-man,
	Andrew Morton, Eric W. Biederman, Linux API

On Wed, Oct 21, 2015 at 12:13 PM, Austin S Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2015-10-21 14:53, Andy Lutomirski wrote:
>>
>> On Oct 19, 2015 7:25 AM, "Austin S Hemmelgarn" <ahferroin7@gmail.com>
>> wrote:
>>>
>>>
>>> On 2015-10-17 11:58, Tobias Markus wrote:
>>>>
>>>>
>>>> Add capability CAP_SYS_USER_NS.
>>>> Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
>>>> when calling clone or unshare with CLONE_NEWUSER.
>>>>
>>>> Rationale:
>>>>
>>>> Linux 3.8 saw the introduction of unprivileged user namespaces,
>>>> allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
>>>> inside a separate user namespace. Before that, any namespace creation
>>>> required CAP_SYS_ADMIN (or, in practice, the user had to be root).
>>>> Unfortunately, there have been some security-relevant bugs in the
>>>> meantime. Because of the fairly complex nature of user namespaces, it is
>>>> reasonable to say that future vulnerabilities can not be excluded. Some
>>>> distributions even wholly disable user namespaces because of this.
>>>>
>>>> Both options, user namespaces with and without CAP_SYS_ADMIN, can be
>>>> said to represent the extreme end of the spectrum. In practice, there is
>>>> no reason for every process to have the ability to create user
>>>> namespaces. Indeed, only very few and specialized programs require user
>>>> namespaces. This seems to be a perfect fit for the (file) capability
>>>> system: Privileged users could manually allow only a certain executable
>>>> to be able to create user namespaces by setting a certain capability,
>>>> I'd suggest the name CAP_SYS_USER_NS. Executables completely unrelated
>>>> to user namespaces should and can not create them.
>>>>
>>>> The capability should only be required in the "root" user namespace (the
>>>> user namespace with level 0) though, to allow nested user namespaces to
>>>> work as intended. If a user namespace has a level greater than 0, the
>>>> original process must have had CAP_SYS_USER_NS, so it is "trusted"
>>>> anyway.
>>>>
>>>> One question remains though: Does this break userspace executables that
>>>> expect being able to create user namespaces without privilege? Since
>>>> creating user namespaces without CAP_SYS_ADMIN was not possible before
>>>> Linux 3.8, programs should already expect a potential EPERM upon calling
>>>> clone. Since creating a user namespace without CAP_SYS_USER_NS would
>>>> also cause EPERM, we should be on the safe side.
>>>
>>>
>>>
>>> Potentially stupid counter proposal:
>>> Make it CAP_SYS_NS, make it allow access to all namespace types for
>>> non-root/CAP_SYS_ADMIN users, and teach the stuff that's using userns just
>>> to get to mount/pid/net/ipc namespaces to use those instead when it's
>>> something that doesn't really need to think it's running as root.
>>>
>>> While this would still add a new capability (which is arguably not a good
>>> thing), the resultant capability would be significantly more useful for many
>>> of the use cases.
>>
>>
>> Then you'd have to come up with some argument that it could possibly
>> be safe.  You'd need *at least* no_new_privs forced on.  You would
>> also have fun defining the privilege to own such a namespace once
>> created.
>
> Excellent point about the privileges, although wouldn't that also apply to
> just using a capability for non-root/CAP_SYS_ADMIN access to userns?
>

I'm not sure I understand your question.

Allowing the owner of a userns (or a holder of sufficient privilege in
that namespace) to create other types of namespaces in that userns is
safe, as long as there are no bugs left.  There are plenty of ways for
the creator of a network namespace, mount namespace, or similar to
corrupt the user namespace to which they belong, which is why
unprivileged userns creators can only create new namespaces of other
types within the userns that they control.
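
The rule can be written down as a toy model. This is a simplification under stated assumptions, not the kernel's code: it ignores sysctl limits, LSM policy and chroot restrictions, and the function name is hypothetical. It assumes that creating a non-user namespace needs CAP_SYS_ADMIN in the caller's current user namespace, and that a user namespace requested in the same call is set up first and gives the caller that capability inside it.

```python
USER = "user"
OTHER_TYPES = {"mnt", "net", "pid", "ipc", "uts"}


def may_unshare(requested: set, has_cap_sys_admin: bool) -> bool:
    """Return True if the modelled kernel would allow unsharing `requested`.

    requested         -- namespace types asked for in one unshare/clone call
    has_cap_sys_admin -- CAP_SYS_ADMIN in the caller's *current* user namespace
    """
    unknown = requested - OTHER_TYPES - {USER}
    if unknown:
        raise ValueError(f"unknown namespace types: {sorted(unknown)}")
    if USER in requested:
        # The new user namespace is created first; its creator holds all
        # capabilities inside it, so the other types requested alongside
        # are created within that new userns and are allowed.
        return True
    if requested & OTHER_TYPES:
        # No new userns: the caller needs CAP_SYS_ADMIN where it already is.
        return has_cap_sys_admin
    # Nothing requested, nothing to check.
    return True


if __name__ == "__main__":
    print(may_unshare({"net"}, has_cap_sys_admin=False))
    print(may_unshare({"user", "net"}, has_cap_sys_admin=False))
```

In this model an unprivileged caller only ever gets a network or mount namespace that belongs to a user namespace it created itself.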

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC

* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]         ` <5623FD82.4030902-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
  2015-10-19  1:41           ` Serge E. Hallyn
@ 2015-10-22 20:45           ` Eric W. Biederman
       [not found]             ` <87twpi63ai.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
  1 sibling, 1 reply; 21+ messages in thread
From: Eric W. Biederman @ 2015-10-22 20:45 UTC (permalink / raw)
  To: Tobias Markus
  Cc: Serge E. Hallyn, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Al Viro,
	Serge Hallyn, Andrew Morton, Andy Lutomirski, Christoph Lameter,
	Michael Kerrisk (man-pages),
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Richard Weinberger,
	Yves-Alexis Perez, Austin S Hemmelgarn, Linux Containers

Thank you for a creative solution to a problem that you perceive.  I
appreciate it when people aim to solve problems they see.

Tobias Markus <tobias-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org> writes:

> On 17.10.2015 23:55, Serge E. Hallyn wrote:
>> On Sat, Oct 17, 2015 at 05:58:04PM +0200, Tobias Markus wrote:
>>> Add capability CAP_SYS_USER_NS.
>>> Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
>>> when calling clone or unshare with CLONE_NEWUSER.
>>>
>>> Rationale:
>>>
>>> Linux 3.8 saw the introduction of unprivileged user namespaces,
>>> allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
>>> inside a separate user namespace. Before that, any namespace creation
>>> required CAP_SYS_ADMIN (or, in practice, the user had to be root).
>>> Unfortunately, there have been some security-relevant bugs in the
>>> meantime. Because of the fairly complex nature of user namespaces, it is
>>> reasonable to say that future vulnerabilities can not be excluded. Some
>>> distributions even wholly disable user namespaces because of this.
>> 
>> Fwiw I'm not in favor of this.  Debian has a patch (I believe the one
>> I originally wrote for Ubuntu but which Ubuntu dropped long ago) adding a
>> sysctl, off by default, for enabling user namespaces.
>
> While it certainly works, enabling a feature like this at runtime
> doesn't seem like a long term solution.
>
> The fact that Debian added this patch in the first place already
> demonstrates that there is demand for a way to limit unprivileged user
> namespace creation. Please, don't get me wrong: I would *really like* to
> see widespread adoption and continued development of user namespaces!
> But the status quo remains: Distributions outright disabling user
> namespaces (e.g. Arch Linux) won't make it easier.

Let me say I applaud Arch Linux for not doing what so many distributions
do, which is to enable every feature in the kernel.  I appreciate a
distribution that does not enable interesting kernel features while they
are still having their bugs shaken out of them.

I also think Debian's approach of limiting things while they mature is
wise.

>> Posix capabilities are intended for privileged actions, not for
>> actions which explicitly should not require privilege, but which
>> we feel are in development.
>> 
>
> Certainly, in an ideal world, user namespaces will never lead to any
> kernel-level exploits. But reality is different: There *have been*
> serious kernel vulnerabilities due to user namespaces, and there *will
> be* serious kernel vulnerabilities due to user namespaces.

When you start talking about a future that is not yet real, you have
stopped talking about reality.  That sounds like a pessimist's world view
rather than reality.

The reality is new features are buggy and take time to mature.  It takes
time for understanding to percolate through people's heads.

> Now, those are the alternatives imho:
>
> * Status quo: Some distributions will disable user namespaces by default
> in some way or another. Users wishing to use user namespaces will have to
> use a custom kernel or enable a sysctl flag that was patched in by the
> downstream developers. On distributions that enable user namespaces by
> default, even users that don't wish to use them in the first place will
> be affected by vulnerabilities.

Again I disagree.  I see distributions waiting to enable user namespaces
until they mature and until they are interesting enough.  I do not see
rushing to enable the newest features as wisdom, unless the point
of your distribution is to enable people to play with the latest
features.

I suspect we are quickly coming to a point where user namespaces will be
sufficiently compelling that they will be enabled more widely.  

At this point the most helpful things I can see to be done are:
- Verify all userns-related fixes have made it back into 4.1.x
- Play with and/or audit the userns code to see if more bugs can be
  found.
- Analyze user namespaces and see if they are uniquely worse than
  anything else.

I agree that if user namespaces pose a unique security challenge to
the kernel we should do something about them.  I think it is a healthy
question to ask.  For the conversation to be productive I think we need
numbers and analysis, not just worst-case analysis based on fear.  To
date all I see are teething pains.

My back of the napkin analysis is that there are maybe 3,000 lines of
code executed in user namespaces (mostly from fs/namespace.c) that
are not otherwise reachable from unprivileged users, while there are
perhaps 100,000 - 250,000 lines of code reachable by unprivileged users
(not counting drivers).

At this point I do not expect that removing access to 3 lines out of 100
will significantly reduce the probability that someone will find
exploitable code in the kernel.
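
As a rough check of the ratio above, here is a minimal sketch.  The
line counts are the back-of-the-napkin estimates from this message,
not measurements, and the helper name userns_share is hypothetical.

```python
# Back-of-the-napkin figures quoted above; estimates, not measurements.
USERNS_ONLY_LINES = 3_000
REACHABLE_LINES_LOW = 100_000
REACHABLE_LINES_HIGH = 250_000


def userns_share(userns_lines: int, other_lines: int) -> float:
    """Return userns_lines as a fraction of other_lines."""
    return userns_lines / other_lines


if __name__ == "__main__":
    # The smaller the already-reachable code base, the larger the share.
    upper = userns_share(USERNS_ONLY_LINES, REACHABLE_LINES_LOW)
    lower = userns_share(USERNS_ONLY_LINES, REACHABLE_LINES_HIGH)
    print(f"userns-only code: {lower:.1%} to {upper:.1%} of the rest")
```

With these inputs the share falls between roughly 1 and 3 lines per
100, so "3 lines out of 100" corresponds to the low end of the
reachable-code estimate.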

I do think I goofed and enabled the code in fs/namespace.c before it was
ready to be accessed by unprivileged users.  My apologies to everyone
inconvenienced by that.

Tobias, I do think your analysis of the situation has fallen into a
fault that many other people share: the assumption that by limiting
who can create user namespaces we limit badness by people who are root
in a user namespace.  Very few of the problems I have seen go away if a
user is not able to create a user namespace.  Most problems exist in
some form when an application is root inside a user namespace.

Tobias, your proposal reads to me as enabling a feature only for those
users most likely to exploit it, which honestly seems backwards.

Eric


* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]             ` <87twpi63ai.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
@ 2015-10-22 21:02               ` Andy Lutomirski
       [not found]                 ` <CALCETrWKN+Uzw_TYqVTGatNZ3LT5RbSM1WuYPoXeKQs9Yw_qjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Andy Lutomirski @ 2015-10-22 21:02 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Tobias Markus, Serge E. Hallyn,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Al Viro,
	Serge Hallyn, Andrew Morton, Christoph Lameter,
	Michael Kerrisk (man-pages), LSM List, Linux API, linux-man,
	Richard Weinberger, Yves-Alexis Perez, Austin S Hemmelgarn,
	Linux Containers

On Thu, Oct 22, 2015 at 1:45 PM, Eric W. Biederman
<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> wrote:
>
> Thank you for a creative solution to a problem that you perceive.  I
> appreciate it when people aim to solve problems they see.
>
> Tobias Markus <tobias-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org> writes:
>
>> On 17.10.2015 23:55, Serge E. Hallyn wrote:
>>> On Sat, Oct 17, 2015 at 05:58:04PM +0200, Tobias Markus wrote:
>>>> Add capability CAP_SYS_USER_NS.
>>>> Tasks having CAP_SYS_USER_NS are allowed to create a new user namespace
>>>> when calling clone or unshare with CLONE_NEWUSER.
>>>>
>>>> Rationale:
>>>>
>>>> Linux 3.8 saw the introduction of unprivileged user namespaces,
>>>> allowing unprivileged users (without CAP_SYS_ADMIN) to be a "fake" root
>>>> inside a separate user namespace. Before that, any namespace creation
>>>> required CAP_SYS_ADMIN (or, in practice, the user had to be root).
>>>> Unfortunately, there have been some security-relevant bugs in the
>>>> meantime. Because of the fairly complex nature of user namespaces, it is
>>>> reasonable to say that future vulnerabilities cannot be excluded. Some
>>>> distributions even wholly disable user namespaces because of this.
>>>
>>> Fwiw I'm not in favor of this.  Debian has a patch (I believe the one
>>> I originally wrote for Ubuntu but which Ubuntu dropped long ago) adding a
>>> sysctl, off by default, for enabling user namespaces.
>>
>> While it certainly works, enabling a feature like this at runtime
>> doesn't seem like a long term solution.
>>
>> The fact that Debian added this patch in the first place already
>> demonstrates that there is demand for a way to limit unprivileged user
>> namespace creation. Please, don't get me wrong: I would *really like* to
>> see widespread adoption and continued development of user namespaces!
>> But the status quo remains: Distributions outright disabling user
>> namespaces (e.g. Arch Linux) won't make it easier.
>
> Let me say I applaud Arch Linux for not doing what so many distributions
> do, which is to enable every feature in the kernel.  I appreciate a
> distribution that does not enable interesting kernel features while they
> are still having their bugs shaken out of them.
>
> I also think Debian's approach of limiting things while they mature is
> wise.
>
>>> Posix capabilities are intended for privileged actions, not for
>>> actions which explicitly should not require privilege, but which
>>> we feel are in development.
>>>
>>
>> Certainly, in an ideal world, user namespaces will never lead to any
>> kernel-level exploits. But reality is different: There *have been*
>> serious kernel vulnerabilities due to user namespaces, and there *will
>> be* serious kernel vulnerabilities due to user namespaces.
>
> When you start talking about a future that is not yet real, you have
> stopped talking about reality.  That sounds like a pessimist's world view
> rather than reality.
>
> The reality is new features are buggy and take time to mature.  It takes
> time for understanding to percolate through people's heads.
>
>> Now, those are the alternatives imho:
>>
>> * Status quo: Some distributions will disable user namespaces by default
>> in some way or another. Users wishing to use user namespaces will have to
>> use a custom kernel or enable a sysctl flag that was patched in by the
>> downstream developers. On distributions that enable user namespaces by
>> default, even users that don't wish to use them in the first place will
>> be affected by vulnerabilities.
>
> Again I disagree.  I see distributions waiting to enable user namespaces
> until they mature and until they are interesting enough.  I do not see
> rushing to enable the newest features as wisdom, unless the point
> of your distribution is to enable people to play with the latest
> features.
>
> I suspect we are quickly coming to a point where user namespaces will be
> sufficiently compelling that they will be enabled more widely.
>
>
> At this point the most helpful things I can see to be done are:
> - Verify all userns-related fixes have made it back into 4.1.x
> - Play with and/or audit the userns code to see if more bugs can be
>   found.
> - Analyze user namespaces and see if they are uniquely worse than
>   anything else.
>
> I agree that if user namespaces pose a unique security challenge to
> the kernel we should do something about them.  I think it is a healthy
> question to ask.  For the conversation to be productive I think we need
> numbers and analysis, not just worst-case analysis based on fear.  To
> date all I see are teething pains.
>
> My back of the napkin analysis is that there are maybe 3,000 lines of
> code executed in user namespaces (mostly from fs/namespace.c) that
> are not otherwise reachable from unprivileged users, while there are
> perhaps 100,000 - 250,000 lines of code reachable by unprivileged users
> (not counting drivers).

At the risk of pointing out a can of worms, the attack surface also
includes things like the iptables configuration APIs, parsers, and
filter/conntrack/action modules.

--Andy


* Re: [PATCH] userns/capability: Add user namespace capability
       [not found]                 ` <CALCETrWKN+Uzw_TYqVTGatNZ3LT5RbSM1WuYPoXeKQs9Yw_qjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-10-22 21:44                   ` Eric W. Biederman
  0 siblings, 0 replies; 21+ messages in thread
From: Eric W. Biederman @ 2015-10-22 21:44 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tobias Markus, Serge E. Hallyn, linux-kernel@vger.kernel.org,
	Al Viro, Serge Hallyn, Andrew Morton, Christoph Lameter,
	Michael Kerrisk (man-pages), LSM List, Linux API, linux-man,
	Richard Weinberger, Yves-Alexis Perez, Austin S Hemmelgarn,
	Linux Containers

Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> writes:

> At the risk of pointing out a can of worms, the attack surface also
> includes things like the iptables configuration APIs, parsers, and
> filter/conntrack/action modules.

It is worth noting that module auto-load does not happen if the
triggering code does not have the proper permissions in the initial user
namespace.

I agree that is another piece of code that should be counted.  How that
compares to the other 130,000 or so lines of code in the network stack
that an unprivileged user can already cause to be exercised, I don't
know.  In my back of the napkin swag I had totally forgotten to count
anything in the network stack.

A lot of the netfilter code that I have read and looked at is
comparatively simple and clean so I don't expect there is much risk
except from the sheer volume of code there.

It is also tricky to count because the entire network side of the
networking stack is exposed to hostile users on the internet so anything
except the configuration is already exposed to hostile users.  The
average check entry is 15-20 lines long.  There appear to be 117 unique
check entry functions in the kernel so there may be another 2.5k lines of
code there.
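
The estimate above can be reproduced with a minimal sketch.  The
function count and per-function length are the figures from this
message; the helper name checkentry_line_range is hypothetical.

```python
# Figures quoted above: 117 check entry functions, 15-20 lines each.
CHECK_ENTRY_FUNCTIONS = 117
LINES_PER_FUNCTION = (15, 20)


def checkentry_line_range(count: int, per_function: tuple[int, int]) -> tuple[int, int]:
    """Return the (low, high) total line count for `count` functions."""
    low, high = per_function
    return count * low, count * high


if __name__ == "__main__":
    low, high = checkentry_line_range(CHECK_ENTRY_FUNCTIONS, LINES_PER_FUNCTION)
    print(f"check entry code: about {low} to {high} lines")
```

The product comes out somewhat below 2.5k, so that figure reads as a
rounded-up upper bound rather than an exact count.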

Hmm.  And we have not had any design issues with the network stack.

Absent design issues, where the code even when implemented correctly
has the wrong semantics, we are left with the probability of exploitable
buggy code.  I suspect we have enough code even without user namespaces
enabled that the probability of exploitable buggy code somewhere in the
code that unprivileged users can cause to be run is > 50%.

I wonder if there are any good statistical models that give realistic
estimates of those things.
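
One very simple model, offered only as a sketch and not as something
proposed in this thread: treat exploitable bugs as independent events
occurring at a constant rate per thousand lines of code (KLOC).  That
Poisson assumption is an illustration that real code certainly
violates, and the function names below are hypothetical.  Under it,
the chance of at least one exploitable bug in a code base of a given
size is 1 - exp(-rate * size).

```python
import math


def p_at_least_one_bug(kloc: float, bugs_per_kloc: float) -> float:
    """Probability of one or more exploitable bugs under a Poisson model.

    ASSUMPTION: bugs are independent and occur at a constant rate per
    KLOC.  This is a toy model, not a validated estimator.
    """
    return 1.0 - math.exp(-kloc * bugs_per_kloc)


def density_for_probability(kloc: float, probability: float) -> float:
    """Bug rate per KLOC at which p_at_least_one_bug equals `probability`."""
    if not 0.0 <= probability < 1.0:
        raise ValueError("probability must be in [0, 1)")
    if kloc <= 0.0:
        raise ValueError("kloc must be positive")
    return -math.log(1.0 - probability) / kloc


if __name__ == "__main__":
    # Code sizes taken from the estimates in this thread, in KLOC.
    for kloc in (100.0, 250.0):
        rate = density_for_probability(kloc, 0.5)
        print(f"{kloc:.0f} KLOC: 50% threshold at {rate:.4f} bugs per KLOC")
```

Read this way, the "> 50%" guess is a claim about the exploitable-bug
rate exceeding a threshold that shrinks as the reachable code grows.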

Eric


Thread overview: 21+ messages
2015-10-17 15:58 [PATCH] userns/capability: Add user namespace capability Tobias Markus
     [not found] ` <5622700C.9090107-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
2015-10-17 20:17   ` Richard Weinberger
2015-10-18 20:13     ` Tobias Markus
     [not found]       ` <5623FD86.2030609-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
2015-10-18 20:21         ` Richard Weinberger
     [not found]           ` <5623FF36.8080800-/L3Ra7n9ekc@public.gmane.org>
2015-10-18 20:41             ` Tobias Markus
2015-10-18 20:48               ` Richard Weinberger
     [not found]                 ` <56240599.3050903-/L3Ra7n9ekc@public.gmane.org>
2015-10-18 21:49                   ` Tobias Markus
2015-10-18 22:06                     ` Richard Weinberger
2015-10-19  0:28         ` Mike Frysinger
2015-10-17 21:55   ` Serge E. Hallyn
     [not found]     ` <20151017215501.GA22900-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2015-10-18 20:13       ` Tobias Markus
     [not found]         ` <5623FD82.4030902-gyUQdkDHmHmHXe+LvDLADg@public.gmane.org>
2015-10-19  1:41           ` Serge E. Hallyn
     [not found]             ` <20151019014112.GA1683-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2015-10-19 12:36               ` Yves-Alexis Perez
     [not found]                 ` <1445258180.4099.18.camel-8fiUuRrzOP0dnm+yROfE0A@public.gmane.org>
2015-10-19 12:48                   ` Richard Weinberger
2015-10-22 20:45           ` Eric W. Biederman
     [not found]             ` <87twpi63ai.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2015-10-22 21:02               ` Andy Lutomirski
     [not found]                 ` <CALCETrWKN+Uzw_TYqVTGatNZ3LT5RbSM1WuYPoXeKQs9Yw_qjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-22 21:44                   ` Eric W. Biederman
2015-10-19 14:24   ` Austin S Hemmelgarn
     [not found]     ` <5624FD3B.2050401-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-10-21 18:53       ` Andy Lutomirski
     [not found]         ` <CALCETrWfZ9hXvLPtJnZhU-ZdoUbYNo-QSydMPvP6Q7Rp0oCQaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-10-21 19:13           ` Austin S Hemmelgarn
2015-10-22 17:10             ` Andy Lutomirski
