* [PATCH] nsenter: fix ability to enter unprivileged containers
@ 2016-04-18 12:26 James Bottomley
2016-04-18 14:33 ` Yuriy M. Kaminskiy
2016-04-18 15:37 ` James Bottomley
0 siblings, 2 replies; 12+ messages in thread
From: James Bottomley @ 2016-04-18 12:26 UTC (permalink / raw)
To: util-linux; +Cc: ebiederm
If you enter it first, you lose privilege for subsequent namespace
enters,see issue
https://github.com/karelzak/util-linux/issues/315
The fix is to enter the user namespace last of all.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
diff --git a/sys-utils/nsenter.c b/sys-utils/nsenter.c
index d8690db..1525f15 100644
--- a/sys-utils/nsenter.c
+++ b/sys-utils/nsenter.c
@@ -52,13 +52,13 @@ static struct namespace_file {
* first. This gives an unprivileged user the potential to
* enter the other namespaces.
*/
- { .nstype = CLONE_NEWUSER, .name = "ns/user", .fd = -1 },
{ .nstype = CLONE_NEWCGROUP,.name = "ns/cgroup", .fd = -1 },
{ .nstype = CLONE_NEWIPC, .name = "ns/ipc", .fd = -1 },
{ .nstype = CLONE_NEWUTS, .name = "ns/uts", .fd = -1 },
{ .nstype = CLONE_NEWNET, .name = "ns/net", .fd = -1 },
{ .nstype = CLONE_NEWPID, .name = "ns/pid", .fd = -1 },
{ .nstype = CLONE_NEWNS, .name = "ns/mnt", .fd = -1 },
+ { .nstype = CLONE_NEWUSER, .name = "ns/user", .fd = -1 },
{ .nstype = 0, .name = NULL, .fd = -1 }
};
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 12:26 [PATCH] nsenter: fix ability to enter unprivileged containers James Bottomley
@ 2016-04-18 14:33 ` Yuriy M. Kaminskiy
2016-04-18 15:51 ` Yuriy M. Kaminskiy
2016-04-18 15:37 ` James Bottomley
1 sibling, 1 reply; 12+ messages in thread
From: Yuriy M. Kaminskiy @ 2016-04-18 14:33 UTC (permalink / raw)
To: util-linux
James Bottomley
<James.Bottomley@HansenPartnership.com>
writes:
> If you enter it first, you lose privilege for subsequent namespace
> enters,see issue
>
> https://github.com/karelzak/util-linux/issues/315
>
> The fix is to enter the user namespace last of all.
I verified that with *current*/unpatched nsenter,
$ unshare -rm sleep inf &
$ nsenter -t $! -U -m --preserve
works as expected (from regular user [and with unprivileged userns
enabled]).
With this patch it *won't* work [verified], of course (as you'll need
root privileges in userns before joining mount-ns, and you can only
obtain them by entering userns first).
Of course, you can workaround it by invoking nsenter twice:
$ nsenter -t $! -U --preserve nsenter -t $! -m
but same could be said about issue 315: you can workaround it by
manually splitting entering mount-ns and user-ns, something like
# nsenter --mount=/run/build-container/aarch64 nsenter --user=/run/build-container/user
or (if /run/build-container/user is not visible inside mount-ns)
# nsenter --mount=/run/build-container/aarch64 nsenter --user=/dev/fd/3 3</run/build-container/user
(disclaimer: unverified; on my kernel mount-bind fails for mount-ns fds).
> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
>
> diff --git a/sys-utils/nsenter.c b/sys-utils/nsenter.c
> index d8690db..1525f15 100644
> --- a/sys-utils/nsenter.c
> +++ b/sys-utils/nsenter.c
> @@ -52,13 +52,13 @@ static struct namespace_file {
> * first. This gives an unprivileged user the potential to
> * enter the other namespaces.
> */
> - { .nstype = CLONE_NEWUSER, .name = "ns/user", .fd = -1 },
> { .nstype = CLONE_NEWCGROUP,.name = "ns/cgroup", .fd = -1 },
> { .nstype = CLONE_NEWIPC, .name = "ns/ipc", .fd = -1 },
> { .nstype = CLONE_NEWUTS, .name = "ns/uts", .fd = -1 },
> { .nstype = CLONE_NEWNET, .name = "ns/net", .fd = -1 },
> { .nstype = CLONE_NEWPID, .name = "ns/pid", .fd = -1 },
> { .nstype = CLONE_NEWNS, .name = "ns/mnt", .fd = -1 },
> + { .nstype = CLONE_NEWUSER, .name = "ns/user", .fd = -1 },
> { .nstype = 0, .name = NULL, .fd = -1 }
> };
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 12:26 [PATCH] nsenter: fix ability to enter unprivileged containers James Bottomley
2016-04-18 14:33 ` Yuriy M. Kaminskiy
@ 2016-04-18 15:37 ` James Bottomley
2016-04-18 15:50 ` James Bottomley
2016-04-18 17:11 ` Karel Zak
1 sibling, 2 replies; 12+ messages in thread
From: James Bottomley @ 2016-04-18 15:37 UTC (permalink / raw)
To: util-linux; +Cc: ebiederm
OK, so if you want me to reply properly, you're going to have to keep
my address in the cc list.
> > If you enter it first, you lose privilege for subsequent
> namespace
> > enters,see issue
> >
> > https://github.com/karelzak/util-linux/issues/315
> >
> > The fix is to enter the user namespace last of all.
>
> I verified that with *current*/unpatched nsenter,
>
> $ unshare -rm sleep inf &
> $ nsenter -t $! -U -m --preserve
>
> works as expected (from regular user [and with unprivileged userns
> enabled]).
>
> With this patch it *won't* work [verified], of course (as you'll need
> root privileges in userns before joining mount-ns, and you can only
> obtain them by entering userns first).
So we're using userns for different things. I'm using it to remove
privilege (so on my userns implementation root in the host enters but
on becoming root in the userns, it can do nothing other than write to
its own files) and you're using it to enhance privilege. It looks like
these two things will always be mutually exclusive, so perhaps we need
an extra flag to nsenter to say do the userns first or last?
> Of course, you can workaround it by invoking nsenter twice:
>
> $ nsenter -t $! -U --preserve nsenter -t $! -m
>
> but same could be said about issue 315: you can workaround it by
> manually splitting entering mount-ns and user-ns, something like
>
> # nsenter --mount=/run/build-container/aarch64 nsenter -
> -user=/run/build-container/user
>
> or (if /run/build-container/user is not visible inside mount-ns)
That's right, it isn't
> # nsenter --mount=/run/build-container/aarch64 nsenter -
> -user=/dev/fd/3 3</run/build-container/user
It should work, but for some inexplicable reason it's giving EINVAL.
# nsenter --mount=/run/build-container/aarch64 3</run/build-container/user
# ls -l /proc/self/fd
total 0
lrwx------ 1 root root 64 Apr 18 15:31 0 -> /dev/pts/1
lrwx------ 1 root root 64 Apr 18 15:31 1 -> /dev/pts/1
lrwx------ 1 root root 64 Apr 18 15:31 2 -> /dev/pts/1
lr-x------ 1 root root 64 Apr 18 15:31 3 -> /run/build-container/user
lr-x------ 1 root root 64 Apr 18 15:31 4 -> /proc/10304/fd
# nsenter --user=/proc/self/fd/3
nsenter: reassociate to namespace 'ns/user' failed: Invalid argument
I think it's because the fd wasn't properly opened by the shell
> (disclaimer: unverified; on my kernel mount-bind fails for mount-ns
> fds).
That's probably because you're running systemd. Systemd sets all
subtrees to shared and you can only bind mount a mount namespace file
descriptor on to a private subtree.
James
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 15:37 ` James Bottomley
@ 2016-04-18 15:50 ` James Bottomley
2016-04-18 17:11 ` Karel Zak
1 sibling, 0 replies; 12+ messages in thread
From: James Bottomley @ 2016-04-18 15:50 UTC (permalink / raw)
To: util-linux; +Cc: ebiederm
On Mon, 2016-04-18 at 11:37 -0400, James Bottomley wrote:
> > # nsenter --mount=/run/build-container/aarch64 nsenter -
> > -user=/dev/fd/3 3</run/build-container/user
>
> It should work, but for some inexplicable reason it's giving EINVAL.
>
> # nsenter --mount=/run/build-container/aarch64 3</run/build
> -container/user
> # ls -l /proc/self/fd
> total 0
> lrwx------ 1 root root 64 Apr 18 15:31 0 -> /dev/pts/1
> lrwx------ 1 root root 64 Apr 18 15:31 1 -> /dev/pts/1
> lrwx------ 1 root root 64 Apr 18 15:31 2 -> /dev/pts/1
> lr-x------ 1 root root 64 Apr 18 15:31 3 -> /run/build-container/user
> lr-x------ 1 root root 64 Apr 18 15:31 4 -> /proc/10304/fd
> # nsenter --user=/proc/self/fd/3
> nsenter: reassociate to namespace 'ns/user' failed: Invalid argument
>
> I think it's because the fd wasn't properly opened by the shell
Actually, just to follow up, this is specifically a problem of the type
of container I'm running: it's actually an architecture emulation
container, so after I've done the mount namespace enter, I'm running
under aarch64 emulation and, apparently, qemu has some problem with the
setns system call. This does, however, mean I have to enter both
namespaces together ...
James
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 14:33 ` Yuriy M. Kaminskiy
@ 2016-04-18 15:51 ` Yuriy M. Kaminskiy
0 siblings, 0 replies; 12+ messages in thread
From: Yuriy M. Kaminskiy @ 2016-04-18 15:51 UTC (permalink / raw)
To: util-linux
yumkam@gmail.com (Yuriy M. Kaminskiy) writes:
> # nsenter --mount=/run/build-container/aarch64 nsenter --user=/dev/fd/3 3</run/build-container/user
>
> (disclaimer: unverified;
Just for record,
1) above workaround verified/works;
2) and this part:
> on my kernel mount-bind fails for mount-ns fds).
was *my* mistake (I somehow missed mount-ns peculiarity about incompatibility
with shared propagation [it is documented in unshare(1)])
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 15:37 ` James Bottomley
2016-04-18 15:50 ` James Bottomley
@ 2016-04-18 17:11 ` Karel Zak
2016-04-18 17:28 ` James Bottomley
2016-04-18 19:40 ` James Bottomley
1 sibling, 2 replies; 12+ messages in thread
From: Karel Zak @ 2016-04-18 17:11 UTC (permalink / raw)
To: James Bottomley; +Cc: util-linux, ebiederm
On Mon, Apr 18, 2016 at 11:37:34AM -0400, James Bottomley wrote:
> OK, so if you want me to reply properly, you're going to have to keep
> my address in the cc list.
>
> > > If you enter it first, you lose privilege for subsequent
> > namespace
> > > enters,see issue
> > >
> > > https://github.com/karelzak/util-linux/issues/315
> > >
> > > The fix is to enter the user namespace last of all.
> >
> > I verified that with *current*/unpatched nsenter,
> >
> > $ unshare -rm sleep inf &
> > $ nsenter -t $! -U -m --preserve
> >
> > works as expected (from regular user [and with unprivileged userns
> > enabled]).
> >
> > With this patch it *won't* work [verified], of course (as you'll need
> > root privileges in userns before joining mount-ns, and you can only
> > obtain them by entering userns first).
>
> So we're using userns for different things. I'm using it to remove
> privilege (so on my userns implementation root in the host enters but
> on becoming root in the userns, it can do nothing other than write to
> its own files) and you're using it to enhance privilege. It looks like
> these two things will always be mutually exclusive, so perhaps we need
> an extra flag to nsenter to say do the userns first or last?
That's what I have talked about at github -- see Eric's comment in the
code, the user NS is the first in the array for a good reason. May be
it would be really better to add --user-{first,last} options to
specify when you want to enter user NS.
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 17:11 ` Karel Zak
@ 2016-04-18 17:28 ` James Bottomley
2016-04-18 18:26 ` Eric W. Biederman
2016-04-18 19:40 ` James Bottomley
1 sibling, 1 reply; 12+ messages in thread
From: James Bottomley @ 2016-04-18 17:28 UTC (permalink / raw)
To: Karel Zak; +Cc: util-linux, ebiederm
On Mon, 2016-04-18 at 19:11 +0200, Karel Zak wrote:
> On Mon, Apr 18, 2016 at 11:37:34AM -0400, James Bottomley wrote:
> > OK, so if you want me to reply properly, you're going to have to keep
> > my address in the cc list.
> >
> > > > If you enter it first, you lose privilege for subsequent
> > > namespace
> > > > enters,see issue
> > > >
> > > > https://github.com/karelzak/util-linux/issues/315
> > > >
> > > > The fix is to enter the user namespace last of all.
> > >
> > > I verified that with *current*/unpatched nsenter,
> > >
> > > $ unshare -rm sleep inf &
> > > $ nsenter -t $! -U -m --preserve
> > >
> > > works as expected (from regular user [and with unprivileged userns
> > > enabled]).
> > >
> > > With this patch it *won't* work [verified], of course (as you'll need
> > > root privileges in userns before joining mount-ns, and you can only
> > > obtain them by entering userns first).
> >
> > So we're using userns for different things. I'm using it to remove
> > privilege (so on my userns implementation root in the host enters but
> > on becoming root in the userns, it can do nothing other than write to
> > its own files) and you're using it to enhance privilege. It looks like
> > these two things will always be mutually exclusive, so perhaps we need
> > an extra flag to nsenter to say do the userns first or last?
>
> That's what I have talked about at github -- see Eric's comment in the
> code, the user NS is the first in the array for a good reason. May be
> it would be really better to add --user-{first,last} options to
> specify when you want to enter user NS.
OK, I'll code this up; hang on.
James
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 17:28 ` James Bottomley
@ 2016-04-18 18:26 ` Eric W. Biederman
2016-04-18 20:56 ` James Bottomley
0 siblings, 1 reply; 12+ messages in thread
From: Eric W. Biederman @ 2016-04-18 18:26 UTC (permalink / raw)
To: James Bottomley; +Cc: Karel Zak, util-linux
James Bottomley <James.Bottomley@HansenPartnership.com> writes:
> On Mon, 2016-04-18 at 19:11 +0200, Karel Zak wrote:
>> On Mon, Apr 18, 2016 at 11:37:34AM -0400, James Bottomley wrote:
>> > OK, so if you want me to reply properly, you're going to have to keep
>> > my address in the cc list.
>> >
>> > > > If you enter it first, you lose privilege for subsequent
>> > > namespace
>> > > > enters,see issue
>> > > >
>> > > > https://github.com/karelzak/util-linux/issues/315
>> > > >
>> > > > The fix is to enter the user namespace last of all.
>> > >
>> > > I verified that with *current*/unpatched nsenter,
>> > >
>> > > $ unshare -rm sleep inf &
>> > > $ nsenter -t $! -U -m --preserve
>> > >
>> > > works as expected (from regular user [and with unprivileged userns
>> > > enabled]).
>> > >
>> > > With this patch it *won't* work [verified], of course (as you'll need
>> > > root privileges in userns before joining mount-ns, and you can only
>> > > obtain them by entering userns first).
>> >
>> > So we're using userns for different things. I'm using it to remove
>> > privilege (so on my userns implementation root in the host enters but
>> > on becoming root in the userns, it can do nothing other than write to
>> > its own files) and you're using it to enhance privilege. It looks like
>> > these two things will always be mutually exclusive, so perhaps we need
>> > an extra flag to nsenter to say do the userns first or last?
>>
>> That's what I have talked about at github -- see Eric's comment in the
>> code, the user NS is the first in the array for a good reason. May be
>> it would be really better to add --user-{first,last} options to
>> specify when you want to enter user NS.
>
> OK, I'll code this up; hang on.
I think we can do this even better with two passes to setns.
A first pass before the user namespace is set (that ignores failures),
and a second pass that sets the user namespace first as happens today.
That should satisfy both cases without flags, and would remove the need
to remember/guess which kind of container people are using.
Eric
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 17:11 ` Karel Zak
2016-04-18 17:28 ` James Bottomley
@ 2016-04-18 19:40 ` James Bottomley
1 sibling, 0 replies; 12+ messages in thread
From: James Bottomley @ 2016-04-18 19:40 UTC (permalink / raw)
To: Karel Zak; +Cc: util-linux, ebiederm
On Mon, 2016-04-18 at 19:11 +0200, Karel Zak wrote:
> On Mon, Apr 18, 2016 at 11:37:34AM -0400, James Bottomley wrote:
> > OK, so if you want me to reply properly, you're going to have to
> > keep
> > my address in the cc list.
> >
> > > > If you enter it first, you lose privilege for subsequent
> > > namespace
> > > > enters,see issue
> > > >
> > > > https://github.com/karelzak/util-linux/issues/315
> > > >
> > > > The fix is to enter the user namespace last of all.
> > >
> > > I verified that with *current*/unpatched nsenter,
> > >
> > > $ unshare -rm sleep inf &
> > > $ nsenter -t $! -U -m --preserve
> > >
> > > works as expected (from regular user [and with unprivileged
> > > userns
> > > enabled]).
> > >
> > > With this patch it *won't* work [verified], of course (as you'll
> > > need
> > > root privileges in userns before joining mount-ns, and you can
> > > only
> > > obtain them by entering userns first).
> >
> > So we're using userns for different things. I'm using it to remove
> > privilege (so on my userns implementation root in the host enters
> > but
> > on becoming root in the userns, it can do nothing other than write
> > to
> > its own files) and you're using it to enhance privilege. It looks
> > like
> > these two things will always be mutually exclusive, so perhaps we
> > need
> > an extra flag to nsenter to say do the userns first or last?
>
> That's what I have talked about at github -- see Eric's comment in
> the
> code, the user NS is the first in the array for a good reason. May be
> it would be really better to add --user-{first,last} options to
> specify when you want to enter user NS.
How does this look; it's just got a --user-last switch because --user
-first is current behaviour.
James
---
>From eee660d40179ed4a186f6c5a73d5596058f87473 Mon Sep 17 00:00:00 2001
From: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: Fri, 15 Apr 2016 08:10:20 -0700
Subject: [PATCH] nsenter: add --user-last flag to enter the user namespace
last
We have two use cases for user namespaces, one to elevate the
privilege of an unprivileged user, in which case we have to enter the
user namespace before all other namespaces (otherwise there isn't
enough permission to enter any other namespace). And the other one is
where we're deprivileging a user and thus have to enter the user
namespace last (because that's the point at which we lose the
privileges).
This fixes
https://github.com/karelzak/util-linux/issues/315
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
diff --git a/sys-utils/nsenter.1 b/sys-utils/nsenter.1
index ea5992e..3acf8da 100644
--- a/sys-utils/nsenter.1
+++ b/sys-utils/nsenter.1
@@ -154,6 +154,10 @@ always sets UID for user namespaces, the default is 0.
Don't modify UID and GID when enter user namespace. The default is to
drops supplementary groups and sets GID and UID to 0.
.TP
+\fB\-\-user\-last\fR
+Enter the user namespace last rather than first. You need this option
+if the usernamespace deprivileges the current process.
+.TP
\fB\-r\fR, \fB\-\-root\fR[=\fIdirectory\fR]
Set the root directory. If no directory is specified, set the root directory to
the root directory of the target process. If directory is specified, set the
diff --git a/sys-utils/nsenter.c b/sys-utils/nsenter.c
index d8690db..9d82621 100644
--- a/sys-utils/nsenter.c
+++ b/sys-utils/nsenter.c
@@ -48,9 +48,10 @@ static struct namespace_file {
} namespace_files[] = {
/* Careful the order is significant in this array.
*
- * The user namespace comes first, so that it is entered
- * first. This gives an unprivileged user the potential to
- * enter the other namespaces.
+ * The user namespace comes either first or last: first if
+ * you're using it to increase your privilege and so enter the
+ * other namespaces and last if you're using it to decrease
+ * your privilege (see --user-last flag).
*/
{ .nstype = CLONE_NEWUSER, .name = "ns/user", .fd = -1 },
{ .nstype = CLONE_NEWCGROUP,.name = "ns/cgroup", .fd = -1 },
@@ -88,6 +89,7 @@ static void usage(int status)
fputs(_(" -r, --root[=<dir>] set the root directory\n"), out);
fputs(_(" -w, --wd[=<dir>] set the working directory\n"), out);
fputs(_(" -F, --no-fork do not fork before exec'ing <program>\n"), out);
+ fputs(_(" --user-last enter the user namespace last intstead of first\n"), out);
#ifdef HAVE_LIBSELINUX
fputs(_(" -Z, --follow-context set SELinux context according to --target PID\n"), out);
#endif
@@ -176,7 +178,8 @@ static void continue_as_child(void)
int main(int argc, char *argv[])
{
enum {
- OPT_PRESERVE_CRED = CHAR_MAX + 1
+ OPT_PRESERVE_CRED = CHAR_MAX + 1,
+ OPT_USER_LAST,
};
static const struct option longopts[] = {
{ "help", no_argument, NULL, 'h' },
@@ -195,6 +198,7 @@ int main(int argc, char *argv[])
{ "wd", optional_argument, NULL, 'w' },
{ "no-fork", no_argument, NULL, 'F' },
{ "preserve-credentials", no_argument, NULL, OPT_PRESERVE_CRED },
+ { "user-last", no_argument, NULL, OPT_USER_LAST },
#ifdef HAVE_LIBSELINUX
{ "follow-context", no_argument, NULL, 'Z' },
#endif
@@ -202,7 +206,7 @@ int main(int argc, char *argv[])
};
struct namespace_file *nsfile;
- int c, namespaces = 0, setgroups_nerrs = 0, preserve_cred = 0;
+ int c, namespaces = 0, setgroups_nerrs = 0, preserve_cred = 0, user_last = 0;
bool do_rd = false, do_wd = false, force_uid = false, force_gid = false;
int do_fork = -1; /* unknown yet */
uid_t uid = 0;
@@ -297,6 +301,9 @@ int main(int argc, char *argv[])
case OPT_PRESERVE_CRED:
preserve_cred = 1;
break;
+ case OPT_USER_LAST:
+ user_last = 1;
+ break;
#ifdef HAVE_LIBSELINUX
case 'Z':
selinux = 1;
@@ -321,6 +328,19 @@ int main(int argc, char *argv[])
freecon(scon);
}
#endif
+
+ if (user_last) {
+ /*
+ * swap the first and last entries in the namespace_files
+ * array (so swap the user and mount entries)
+ */
+ int user_new = ARRAY_SIZE(namespace_files) - 2;
+ struct namespace_file nsf = namespace_files[user_new];
+
+ namespace_files[user_new] = namespace_files[0];
+ namespace_files[0] = nsf;
+ }
+
/*
* Open remaining namespace and directory descriptors.
*/
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 18:26 ` Eric W. Biederman
@ 2016-04-18 20:56 ` James Bottomley
2016-04-18 21:31 ` Eric W. Biederman
2016-04-22 9:05 ` Karel Zak
0 siblings, 2 replies; 12+ messages in thread
From: James Bottomley @ 2016-04-18 20:56 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Karel Zak, util-linux
On Mon, 2016-04-18 at 13:26 -0500, Eric W. Biederman wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> writes:
>
> > On Mon, 2016-04-18 at 19:11 +0200, Karel Zak wrote:
> > > On Mon, Apr 18, 2016 at 11:37:34AM -0400, James Bottomley wrote:
> > > > OK, so if you want me to reply properly, you're going to have
> > > > to keep
> > > > my address in the cc list.
> > > >
> > > > > > If you enter it first, you lose privilege for subsequent
> > > > > namespace
> > > > > > enters,see issue
> > > > > >
> > > > > > https://github.com/karelzak/util-linux/issues/315
> > > > > >
> > > > > > The fix is to enter the user namespace last of all.
> > > > >
> > > > > I verified that with *current*/unpatched nsenter,
> > > > >
> > > > > $ unshare -rm sleep inf &
> > > > > $ nsenter -t $! -U -m --preserve
> > > > >
> > > > > works as expected (from regular user [and with unprivileged
> > > > > userns
> > > > > enabled]).
> > > > >
> > > > > With this patch it *won't* work [verified], of course (as
> > > > > you'll need
> > > > > root privileges in userns before joining mount-ns, and you
> > > > > can only
> > > > > obtain them by entering userns first).
> > > >
> > > > So we're using userns for different things. I'm using it to
> > > > remove
> > > > privilege (so on my userns implementation root in the host
> > > > enters but
> > > > on becoming root in the userns, it can do nothing other than
> > > > write to
> > > > its own files) and you're using it to enhance privilege. It
> > > > looks like
> > > > these two things will always be mutually exclusive, so perhaps
> > > > we need
> > > > an extra flag to nsenter to say do the userns first or last?
> > >
> > > That's what I have talked about at github -- see Eric's comment
> > > in the
> > > code, the user NS is the first in the array for a good reason.
> > > May be
> > > it would be really better to add --user-{first,last} options to
> > > specify when you want to enter user NS.
> >
> > OK, I'll code this up; hang on.
>
> I think we can do this even better with two passes to setns.
>
> A first pass before the user namespace is set (that ignores
> failures),
> and a second pass that sets the user namespace first as happens
> today.
>
> That should satisfy both cases without flags, and would remove the
> need
> to remember/guess which kind of container people are using.
Makes sense, so like this?
James
---
>From 39cecc604a6c997c328affea52e65d6f67898bf5 Mon Sep 17 00:00:00 2001
From: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: Fri, 15 Apr 2016 08:10:20 -0700
Subject: [PATCH] nsenter: enter namespaces in two passes
We have two use cases for user namespaces, one to elevate the
privilege of an unprivileged user, in which case we have to enter the
user namespace before all other namespaces (otherwise there isn't
enough permission to enter any other namespace). And the other one is
where we're deprivileging a user and thus have to enter the user
namespace last (because that's the point at which we lose the
privileges). On the first pass, we start at the position one after
the user namespace clearing the file descriptors as we close them
after calling setns(). If setns() fails on the first pass, ignore the
failure assuming that it will succeed after we enter the user
namespace.
This fixes
https://github.com/karelzak/util-linux/issues/315
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
diff --git a/sys-utils/nsenter.c b/sys-utils/nsenter.c
index d8690db..8dbd22b 100644
--- a/sys-utils/nsenter.c
+++ b/sys-utils/nsenter.c
@@ -48,9 +48,11 @@ static struct namespace_file {
} namespace_files[] = {
/* Careful the order is significant in this array.
*
- * The user namespace comes first, so that it is entered
- * first. This gives an unprivileged user the potential to
- * enter the other namespaces.
+ * The user namespace comes either first or last: first if
+ * you're using it to increase your privilege and last if
+ * you're using it to decrease. We enter the namespaces in
+ * two passes starting initially from offset 1 and then offset
+ * 0 if that fails.
*/
{ .nstype = CLONE_NEWUSER, .name = "ns/user", .fd = -1 },
{ .nstype = CLONE_NEWCGROUP,.name = "ns/cgroup", .fd = -1 },
@@ -202,7 +204,7 @@ int main(int argc, char *argv[])
};
struct namespace_file *nsfile;
- int c, namespaces = 0, setgroups_nerrs = 0, preserve_cred = 0;
+ int c, pass, namespaces = 0, setgroups_nerrs = 0, preserve_cred = 0;
bool do_rd = false, do_wd = false, force_uid = false, force_gid = false;
int do_fork = -1; /* unknown yet */
uid_t uid = 0;
@@ -353,19 +355,31 @@ int main(int argc, char *argv[])
}
/*
- * Now that we know which namespaces we want to enter, enter them.
+ * Now that we know which namespaces we want to enter, enter
+ * them. Do this in two passes, not entering the user
+ * namespace on the first pass. So if we're deprivileging the
+ * container we'll enter the user namespace last and if we're
+ * privileging it then we enter the usernamespace first
+ * (because the initial setns will fail).
*/
- for (nsfile = namespace_files; nsfile->nstype; nsfile++) {
- if (nsfile->fd < 0)
- continue;
- if (nsfile->nstype == CLONE_NEWPID && do_fork == -1)
- do_fork = 1;
- if (setns(nsfile->fd, nsfile->nstype))
- err(EXIT_FAILURE,
- _("reassociate to namespace '%s' failed"),
- nsfile->name);
- close(nsfile->fd);
- nsfile->fd = -1;
+ for (pass = 0; pass < 2; pass ++) {
+ for (nsfile = namespace_files + 1 - pass; nsfile->nstype; nsfile++) {
+ if (nsfile->fd < 0)
+ continue;
+ if (nsfile->nstype == CLONE_NEWPID && do_fork == -1)
+ do_fork = 1;
+ if (setns(nsfile->fd, nsfile->nstype)) {
+ if (pass != 0)
+ err(EXIT_FAILURE,
+ _("reassociate to namespace '%s' failed"),
+ nsfile->name);
+ else
+ continue;
+ }
+
+ close(nsfile->fd);
+ nsfile->fd = -1;
+ }
}
/* Remember the current working directory if I'm not changing it */
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 20:56 ` James Bottomley
@ 2016-04-18 21:31 ` Eric W. Biederman
2016-04-22 9:05 ` Karel Zak
1 sibling, 0 replies; 12+ messages in thread
From: Eric W. Biederman @ 2016-04-18 21:31 UTC (permalink / raw)
To: James Bottomley; +Cc: Karel Zak, util-linux
James Bottomley <James.Bottomley@HansenPartnership.com> writes:
> On Mon, 2016-04-18 at 13:26 -0500, Eric W. Biederman wrote:
>> I think we can do this even better with two passes to setns.
>>
>> A first pass before the user namespace is set (that ignores
>> failures),
>> and a second pass that sets the user namespace first as happens
>> today.
>>
>> That should satisfy both cases without flags, and would remove the
>> need
>> to remember/guess which kind of container people are using.
>
> Makes sense, so like this?
That looks like it will work, and for the right reasons.
> ---
> From 39cecc604a6c997c328affea52e65d6f67898bf5 Mon Sep 17 00:00:00 2001
> From: James Bottomley <James.Bottomley@HansenPartnership.com>
> Date: Fri, 15 Apr 2016 08:10:20 -0700
> Subject: [PATCH] nsenter: enter namespaces in two passes
>
> We have two use cases for user namespaces, one to elevate the
> privilege of an unprivileged user, in which case we have to enter the
> user namespace before all other namespaces (otherwise there isn't
> enough permission to enter any other namespace). And the other one is
> where we're deprivileging a user and thus have to enter the user
> namespace last (because that's the point at which we lose the
> privileges). On the first pass, we start at the position one after
> the user namespace clearing the file descriptors as we close them
> after calling setns(). If setns() fails on the first pass, ignore the
> failure assuming that it will succeed after we enter the user
> namespace.
>
> This fixes
>
> https://github.com/karelzak/util-linux/issues/315
>
> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> diff --git a/sys-utils/nsenter.c b/sys-utils/nsenter.c
> index d8690db..8dbd22b 100644
> --- a/sys-utils/nsenter.c
> +++ b/sys-utils/nsenter.c
> @@ -48,9 +48,11 @@ static struct namespace_file {
> } namespace_files[] = {
> /* Careful the order is significant in this array.
> *
> - * The user namespace comes first, so that it is entered
> - * first. This gives an unprivileged user the potential to
> - * enter the other namespaces.
> + * The user namespace comes either first or last: first if
> + * you're using it to increase your privilege and last if
> + * you're using it to decrease. We enter the namespaces in
> + * two passes starting initially from offset 1 and then offset
> + * 0 if that fails.
> */
> { .nstype = CLONE_NEWUSER, .name = "ns/user", .fd = -1 },
> { .nstype = CLONE_NEWCGROUP,.name = "ns/cgroup", .fd = -1 },
> @@ -202,7 +204,7 @@ int main(int argc, char *argv[])
> };
>
> struct namespace_file *nsfile;
> - int c, namespaces = 0, setgroups_nerrs = 0, preserve_cred = 0;
> + int c, pass, namespaces = 0, setgroups_nerrs = 0, preserve_cred = 0;
> bool do_rd = false, do_wd = false, force_uid = false, force_gid = false;
> int do_fork = -1; /* unknown yet */
> uid_t uid = 0;
> @@ -353,19 +355,31 @@ int main(int argc, char *argv[])
> }
>
> /*
> - * Now that we know which namespaces we want to enter, enter them.
> + * Now that we know which namespaces we want to enter, enter
> + * them. Do this in two passes, not entering the user
> + * namespace on the first pass. So if we're deprivileging the
> + * container we'll enter the user namespace last and if we're
> + * privileging it then we enter the usernamespace first
> + * (because the initial setns will fail).
> */
> - for (nsfile = namespace_files; nsfile->nstype; nsfile++) {
> - if (nsfile->fd < 0)
> - continue;
> - if (nsfile->nstype == CLONE_NEWPID && do_fork == -1)
> - do_fork = 1;
> - if (setns(nsfile->fd, nsfile->nstype))
> - err(EXIT_FAILURE,
> - _("reassociate to namespace '%s' failed"),
> - nsfile->name);
> - close(nsfile->fd);
> - nsfile->fd = -1;
> + for (pass = 0; pass < 2; pass ++) {
> + for (nsfile = namespace_files + 1 - pass; nsfile->nstype; nsfile++) {
> + if (nsfile->fd < 0)
> + continue;
> + if (nsfile->nstype == CLONE_NEWPID && do_fork == -1)
> + do_fork = 1;
> + if (setns(nsfile->fd, nsfile->nstype)) {
> + if (pass != 0)
> + err(EXIT_FAILURE,
> + _("reassociate to namespace '%s' failed"),
> + nsfile->name);
> + else
> + continue;
> + }
> +
> + close(nsfile->fd);
> + nsfile->fd = -1;
> + }
> }
>
> /* Remember the current working directory if I'm not changing it */
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] nsenter: fix ability to enter unprivileged containers
2016-04-18 20:56 ` James Bottomley
2016-04-18 21:31 ` Eric W. Biederman
@ 2016-04-22 9:05 ` Karel Zak
1 sibling, 0 replies; 12+ messages in thread
From: Karel Zak @ 2016-04-22 9:05 UTC (permalink / raw)
To: James Bottomley; +Cc: Eric W. Biederman, util-linux
On Mon, Apr 18, 2016 at 04:56:40PM -0400, James Bottomley wrote:
> diff --git a/sys-utils/nsenter.c b/sys-utils/nsenter.c
> index d8690db..8dbd22b 100644
> --- a/sys-utils/nsenter.c
> +++ b/sys-utils/nsenter.c
Applied, thanks!
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2016-04-22 9:05 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-18 12:26 [PATCH] nsenter: fix ability to enter unprivileged containers James Bottomley
2016-04-18 14:33 ` Yuriy M. Kaminskiy
2016-04-18 15:51 ` Yuriy M. Kaminskiy
2016-04-18 15:37 ` James Bottomley
2016-04-18 15:50 ` James Bottomley
2016-04-18 17:11 ` Karel Zak
2016-04-18 17:28 ` James Bottomley
2016-04-18 18:26 ` Eric W. Biederman
2016-04-18 20:56 ` James Bottomley
2016-04-18 21:31 ` Eric W. Biederman
2016-04-22 9:05 ` Karel Zak
2016-04-18 19:40 ` James Bottomley
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox