util-linux.vger.kernel.org archive mirror
* [PATCH 1/2] unshare: add some examples
@ 2014-12-28 20:23 Lubomir Rintel
  2014-12-28 20:23 ` [PATCH 2/2] unshare: allow persisting namespaces Lubomir Rintel
  2015-01-12 11:41 ` [PATCH 1/2] unshare: add some examples Karel Zak
  0 siblings, 2 replies; 8+ messages in thread
From: Lubomir Rintel @ 2014-12-28 20:23 UTC (permalink / raw)
  To: util-linux; +Cc: Mikhail Gusarov, Lubomir Rintel

...and fix one typo.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
---
 sys-utils/unshare.1 | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/sys-utils/unshare.1 b/sys-utils/unshare.1
index 1aa9bcb..3f9eccd 100644
--- a/sys-utils/unshare.1
+++ b/sys-utils/unshare.1
@@ -1,5 +1,5 @@
 .\" Process this file with
-.\" groff -man -Tascii lscpu.1
+.\" groff -man -Tascii unshare.1
 .\"
 .TH UNSHARE 1 "July 2014" "util-linux" "User Commands"
 .SH NAME
@@ -91,6 +91,20 @@ Display version information and exit.
 .TP
 .BR \-h , " \-\-help"
 Display help text and exit.
+.SH EXAMPLES
+.TP
+.B # unshare --fork --pid --mount-proc readlink /proc/self
+.TQ
+1
+.br
+Establish a PID namespace, ensure we're PID 1 in it against newly mounted 
+procfs instance.
+.TP
+.B $ unshare --map-root-user --user sh -c whoami
+.TQ
+root
+.br
+Establish a user namespace as an unprivileged user with a root user within it.
 .SH SEE ALSO
 .BR unshare (2),
 .BR clone (2),
-- 
2.1.0



* [PATCH 2/2] unshare: allow persisting namespaces
  2014-12-28 20:23 [PATCH 1/2] unshare: add some examples Lubomir Rintel
@ 2014-12-28 20:23 ` Lubomir Rintel
  2015-01-06 13:03   ` Karel Zak
  2015-02-15 12:10   ` Mike Frysinger
  2015-01-12 11:41 ` [PATCH 1/2] unshare: add some examples Karel Zak
  1 sibling, 2 replies; 8+ messages in thread
From: Lubomir Rintel @ 2014-12-28 20:23 UTC (permalink / raw)
  To: util-linux; +Cc: Mikhail Gusarov, Lubomir Rintel

Bind mount the namespace file to a given location after creating it if
requested (analogously to what "ip netns" and other tools do). This makes
it possible for a namespace to survive with no processes running while
processes can enter it with nsenter(1):

  # unshare --uts=utsns hostname behemoth
  # nsenter --uts=utsns hostname
  behemoth
  # umount utsns; rm utsns
  #

The ugly bit about this patch is the clone(2) call, arguably not our
fault. The stack size glibc requires for its clone(2) wrapper is not
documented anywhere and its semantics (stack growth direction) is arch
dependent. We could figure it out by comparing a return value of a helper
function that would return an address of its local variable with caller's
local variable address, but I guess that would be even more messed-up.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
---
 sys-utils/unshare.1 |  58 +++++++++++++++-------
 sys-utils/unshare.c | 135 +++++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 159 insertions(+), 34 deletions(-)

diff --git a/sys-utils/unshare.1 b/sys-utils/unshare.1
index 3f9eccd..625a03a 100644
--- a/sys-utils/unshare.1
+++ b/sys-utils/unshare.1
@@ -11,8 +11,15 @@ unshare \- run program with some namespaces unshared from parent
 .RI [ arguments ]
 .SH DESCRIPTION
 Unshares the indicated namespaces from the parent process and then executes
-the specified \fIprogram\fR.  The namespaces to be unshared are indicated via
-options.  Unshareable namespaces are:
+the specified \fIprogram\fR.
+.PP
+The namespaces can optionally be persisted by mounting to a file system path
+and entered with \fBnsenter\fR even after \fIprogram\fR terminates.  Once a
+persistent namespace is no longer needed it can be unpersisted with \fBumount\fR
+and the name can be removed with \fBrm\fR.
+.PP
+The namespaces to be unshared are indicated via options.  Unshareable namespaces
+are:
 .TP
 .BR "mount namespace"
 Mounting and unmounting filesystems will not affect the rest of the system
@@ -48,24 +55,31 @@ The process will have a distinct set of UIDs, GIDs and capabilities.
 See \fBclone\fR(2) for the exact semantics of the flags.
 .SH OPTIONS
 .TP
-.BR \-i , " \-\-ipc"
-Unshare the IPC namespace.
+\fB\-i\fR, \fB\-\-ipc\fR[=\fIfile\fR]
+Unshare the IPC namespace.  If file is specified, persist the IPC namespace at
+the given path.
 .TP
-.BR \-m , " \-\-mount"
-Unshare the mount namespace.
+\fB\-m\fR, \fB\-\-mount\fR[=\fIfile\fR]
+Unshare the mount namespace.  If file is specified, fork (as if \fB\-\-fork\fR was
+specified) and persist the mount namespace at the given path in the parent's mount
+namespace.
 .TP
-.BR \-n , " \-\-net"
-Unshare the network namespace.
+\fB\-n\fR, \fB\-\-net\fR[=\fIfile\fR]
+Unshare the network namespace.  If file is specified, persist the network namespace at
+the given path.
 .TP
-.BR \-p , " \-\-pid"
-Unshare the pid namespace.
+\fB\-p\fR, \fB\-\-pid\fR[=\fIfile\fR]
+Unshare the pid namespace.  If file is specified, persist the pid namespace at
+the given path.
 See also the \fB--fork\fP and \fB--mount-proc\fP options.
 .TP
-.BR \-u , " \-\-uts"
-Unshare the UTS namespace.
+\fB\-u\fR, \fB\-\-uts\fR[=\fIfile\fR]
+Unshare the UTS namespace.  If file is specified, persist the UTS namespace at
+the given path.
 .TP
-.BR \-U , " \-\-user"
-Unshare the user namespace.
+\fB\-U\fR, \fB\-\-user\fR[=\fIfile\fR]
+Unshare the user namespace.  If file is specified, persist the user namespace at
+the given path.
 .TP
 .BR \-f , " \-\-fork"
 Fork the specified \fIprogram\fR as a child process of \fBunshare\fR rather than
@@ -105,14 +119,26 @@ procfs instance.
 root
 .br
 Establish a user namespace as an unprivileged user with a root user within it.
+.TP
+.B # unshare --uts=utsns hostname behemoth
+.TQ
+.B # nsenter --uts=utsns hostname
+.TQ
+behemoth
+.TQ
+.B # umount utsns; rm utsns
+.br
+Create a persistent UTS namespace and tear it down.
 .SH SEE ALSO
 .BR unshare (2),
 .BR clone (2),
-.BR mount (8)
+.BR mount (8),
+.BR umount (8)
 .SH BUGS
 None known so far.
 .SH AUTHOR
-Mikhail Gusarov <dottedmag@dottedmag.net>
+Mikhail Gusarov <dottedmag@dottedmag.net>,
+Lubomir Rintel <lkundrak@v3.sk>
 .SH AVAILABILITY
 The unshare command is part of the util-linux package and is available from
 ftp://ftp.kernel.org/pub/linux/utils/util-linux/.
diff --git a/sys-utils/unshare.c b/sys-utils/unshare.c
index 933f621..a8569b9 100644
--- a/sys-utils/unshare.c
+++ b/sys-utils/unshare.c
@@ -2,6 +2,7 @@
  * unshare(1) - command-line interface for unshare(2)
  *
  * Copyright (C) 2009 Mikhail Gusarov <dottedmag@dottedmag.net>
+ * Copyright (C) 2013,2014 Lubomir Rintel <lkundrak@v3.sk>
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -24,8 +25,10 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include <setjmp.h>
 #include <sys/wait.h>
 #include <sys/mount.h>
+#include <sys/stat.h>
 
 /* we only need some defines missing in sys/mount.h, no libmount linkage */
 #include <libmount.h>
@@ -39,6 +42,41 @@
 #include "pathnames.h"
 #include "all-io.h"
 
+static struct namespace_file {
+	int nstype;
+	const char *proc_name;
+	const char *target_name;
+} namespace_files[] = {
+	{ .nstype = CLONE_NEWUSER, .proc_name = "ns/user", .target_name = NULL },
+	{ .nstype = CLONE_NEWIPC,  .proc_name = "ns/ipc",  .target_name = NULL },
+	{ .nstype = CLONE_NEWUTS,  .proc_name = "ns/uts",  .target_name = NULL },
+	{ .nstype = CLONE_NEWNET,  .proc_name = "ns/net",  .target_name = NULL },
+	{ .nstype = CLONE_NEWPID,  .proc_name = "ns/pid",  .target_name = NULL },
+	{ .nstype = CLONE_NEWNS,   .proc_name = "ns/mnt",  .target_name = NULL },
+	{ .nstype = 0, .proc_name = NULL, .target_name = NULL }
+};
+
+int c, forkit = 0, maproot = 0;
+const char *procmnt = NULL;
+
+static int ns_path(int nstype, const char *path)
+{
+	struct namespace_file *nsfile;
+
+	if (path == NULL)
+		return 0;
+
+	for (nsfile = namespace_files; nsfile->nstype; nsfile++) {
+		if (nstype != nsfile->nstype)
+			continue;
+
+		nsfile->target_name = path;
+		break;
+	}
+
+	return 1;
+}
+
 static void map_id(const char *file, uint32_t from, uint32_t to)
 {
 	char *buf;
@@ -64,12 +102,12 @@ static void usage(int status)
 		program_invocation_short_name);
 
 	fputs(USAGE_OPTIONS, out);
-	fputs(_(" -m, --mount               unshare mounts namespace\n"), out);
-	fputs(_(" -u, --uts                 unshare UTS namespace (hostname etc)\n"), out);
-	fputs(_(" -i, --ipc                 unshare System V IPC namespace\n"), out);
-	fputs(_(" -n, --net                 unshare network namespace\n"), out);
-	fputs(_(" -p, --pid                 unshare pid namespace\n"), out);
-	fputs(_(" -U, --user                unshare user namespace\n"), out);
+	fputs(_(" -m, --mount[=<file>]      unshare mounts namespace\n"), out);
+	fputs(_(" -u, --uts[=<file>]        unshare UTS namespace (hostname etc)\n"), out);
+	fputs(_(" -i, --ipc[=<file>]        unshare System V IPC namespace\n"), out);
+	fputs(_(" -n, --net[=<file>]        unshare network namespace\n"), out);
+	fputs(_(" -p, --pid[=<file>]        unshare pid namespace\n"), out);
+	fputs(_(" -U, --user[=<file>]       unshare user namespace\n"), out);
 	fputs(_(" -f, --fork                fork before launching <program>\n"), out);
 	fputs(_("     --mount-proc[=<dir>]  mount proc filesystem first (implies --mount)\n"), out);
 	fputs(_(" -r, --map-root-user       map current user to root (implies --user)\n"), out);
@@ -82,6 +120,61 @@ static void usage(int status)
 	exit(status);
 }
 
+static void persist_ns(pid_t pid)
+{
+	struct namespace_file *nsfile;
+
+	for (nsfile = namespace_files; nsfile->nstype; nsfile++) {
+		char pathbuf[PATH_MAX];
+
+		if (!nsfile->target_name)
+			continue;
+
+		snprintf(pathbuf, sizeof(pathbuf), "/proc/%u/%s", pid,
+			nsfile->proc_name);
+
+		umount(nsfile->target_name);
+		unlink(nsfile->target_name);
+
+		if (-1 == mknod(nsfile->target_name, 0666, 0)) {
+			warn(_("failed to create %s"), nsfile->target_name);
+			continue;
+		}
+
+		if (-1 == mount(pathbuf, nsfile->target_name, NULL, MS_BIND, NULL)) {
+			warn(_("mount %s failed"), nsfile->target_name);
+			unlink(nsfile->target_name);
+		}
+	}
+}
+
+static int in_child (void *arg)
+{
+	jmp_buf *child = arg;
+
+	longjmp(*child, 1);
+}
+
+#define STACK_SIZE 0x100000
+static pid_t unshare_fork(int unshare_flags)
+{
+	/* Twice the size, as we might be running of parisc
+	 * or metag where stack grows the other way. *sigh* */
+	static char stack[2*STACK_SIZE];
+	static jmp_buf child;
+	pid_t pid;
+
+	if (setjmp (child))
+		return 0;
+
+	pid = clone(in_child, &stack[STACK_SIZE],
+		SIGCHLD | unshare_flags, &child);
+	if (pid != -1)
+		persist_ns(pid);
+
+	return pid;
+}
+
 int main(int argc, char *argv[])
 {
 	enum {
@@ -90,12 +183,12 @@ int main(int argc, char *argv[])
 	static const struct option longopts[] = {
 		{ "help", no_argument, 0, 'h' },
 		{ "version", no_argument, 0, 'V'},
-		{ "mount", no_argument, 0, 'm' },
-		{ "uts", no_argument, 0, 'u' },
-		{ "ipc", no_argument, 0, 'i' },
-		{ "net", no_argument, 0, 'n' },
-		{ "pid", no_argument, 0, 'p' },
-		{ "user", no_argument, 0, 'U' },
+		{ "mount", optional_argument, 0, 'm' },
+		{ "uts", optional_argument, 0, 'u' },
+		{ "ipc", optional_argument, 0, 'i' },
+		{ "net", optional_argument, 0, 'n' },
+		{ "pid", optional_argument, 0, 'p' },
+		{ "user", optional_argument, 0, 'U' },
 		{ "fork", no_argument, 0, 'f' },
 		{ "mount-proc", optional_argument, 0, OPT_MOUNTPROC },
 		{ "map-root-user", no_argument, 0, 'r' },
@@ -103,8 +196,6 @@ int main(int argc, char *argv[])
 	};
 
 	int unshare_flags = 0;
-	int c, forkit = 0, maproot = 0;
-	const char *procmnt = NULL;
 	uid_t real_euid = geteuid();
 	gid_t real_egid = getegid();;
 
@@ -125,21 +216,28 @@ int main(int argc, char *argv[])
 			return EXIT_SUCCESS;
 		case 'm':
 			unshare_flags |= CLONE_NEWNS;
+			if (ns_path(CLONE_NEWNS, optarg))
+				forkit = 1;
 			break;
 		case 'u':
 			unshare_flags |= CLONE_NEWUTS;
+			ns_path(CLONE_NEWUTS, optarg);
 			break;
 		case 'i':
 			unshare_flags |= CLONE_NEWIPC;
+			ns_path(CLONE_NEWIPC, optarg);
 			break;
 		case 'n':
 			unshare_flags |= CLONE_NEWNET;
+			ns_path(CLONE_NEWNET, optarg);
 			break;
 		case 'p':
 			unshare_flags |= CLONE_NEWPID;
+			ns_path(CLONE_NEWPID, optarg);
 			break;
 		case 'U':
 			unshare_flags |= CLONE_NEWUSER;
+			ns_path(CLONE_NEWUSER, optarg);
 			break;
 		case OPT_MOUNTPROC:
 			unshare_flags |= CLONE_NEWNS;
@@ -154,12 +252,9 @@ int main(int argc, char *argv[])
 		}
 	}
 
-	if (-1 == unshare(unshare_flags))
-		err(EXIT_FAILURE, _("unshare failed"));
-
 	if (forkit) {
 		int status;
-		pid_t pid = fork();
+		pid_t pid = unshare_fork(unshare_flags);
 
 		switch(pid) {
 		case -1:
@@ -175,6 +270,10 @@ int main(int argc, char *argv[])
 				kill(getpid(), WTERMSIG(status));
 			err(EXIT_FAILURE, _("child exit failed"));
 		}
+	} else {
+		if (-1 == unshare(unshare_flags))
+			err(EXIT_FAILURE, _("unshare failed"));
+		persist_ns(getpid());
 	}
 
 	if (maproot) {
-- 
2.1.0



* Re: [PATCH 2/2] unshare: allow persisting namespaces
  2014-12-28 20:23 ` [PATCH 2/2] unshare: allow persisting namespaces Lubomir Rintel
@ 2015-01-06 13:03   ` Karel Zak
  2015-01-06 17:11     ` Eric W. Biederman
  2015-02-15 12:10   ` Mike Frysinger
  1 sibling, 1 reply; 8+ messages in thread
From: Karel Zak @ 2015-01-06 13:03 UTC (permalink / raw)
  To: Lubomir Rintel; +Cc: util-linux, Mikhail Gusarov, Eric W. Biederman

On Sun, Dec 28, 2014 at 09:23:38PM +0100, Lubomir Rintel wrote:
> Bind mount the namespace file to a given location after creating it if
> requested (analogously to what "ip netns" and other tools do). This makes
> it possible for a namespace to survive with no processes running while
> processes can enter it with nsenter(1):
> 
>   # unshare --uts=utsns hostname behemoth
>   # nsenter --uts=utsns hostname
>   behemoth

Nice, especially since we already support the same concept in nsenter.

But I guess that "empty namespace" (without any running process) is
impossible for PID namespaces, right? It would be nice to add a note
about it to the man page.

It would also be nice to add an example with --mount-proc, because in
that case you need to specify --mount=<file> too:

 # unshare --pid=ns-bind-pid --mount=ns-bind-mnt --mount-proc 

another session:

 # nsenter --pid=ns-bind-pid --mount=ns-bind-mnt

> The ugly bit about this patch is the clone(2) call, arguably not our

Can you please elaborate a little on why we need clone() and what's wrong
with fork()+unshare()? I'd like to have a comment about it in the code.

> +static void persist_ns(pid_t pid)
> +{
> +	struct namespace_file *nsfile;
> +
> +	for (nsfile = namespace_files; nsfile->nstype; nsfile++) {
> +		char pathbuf[PATH_MAX];
> +
> +		if (!nsfile->target_name)
> +			continue;
> +
> +		snprintf(pathbuf, sizeof(pathbuf), "/proc/%u/%s", pid,
> +			nsfile->proc_name);
> +
> +		umount(nsfile->target_name);
> +		unlink(nsfile->target_name);
> +
> +		if (-1 == mknod(nsfile->target_name, 0666, 0)) {
> +			warn(_("failed to create %s"), nsfile->target_name);
> +			continue;
> +		}
> +
> +		if (-1 == mount(pathbuf, nsfile->target_name, NULL, MS_BIND, NULL)) {
> +			warn(_("mount %s failed"), nsfile->target_name);
> +			unlink(nsfile->target_name);
> +		}
> +	}
> +}

 Would it be better to use err() rather than warn()? It's strange to continue
 and ignore errors in this case. The current result on errors is a mess.

    Karel

> +static int in_child (void *arg)
> +{
> +	jmp_buf *child = arg;
> +
> +	longjmp(*child, 1);
> +}
> +
> +#define STACK_SIZE 0x100000
> +static pid_t unshare_fork(int unshare_flags)
> +{
> +	/* Twice the size, as we might be running of parisc
> +	 * or metag where stack grows the other way. *sigh* */
> +	static char stack[2*STACK_SIZE];
> +	static jmp_buf child;
> +	pid_t pid;
> +
> +	if (setjmp (child))
> +		return 0;
> +
> +	pid = clone(in_child, &stack[STACK_SIZE],
> +		SIGCHLD | unshare_flags, &child);
> +	if (pid != -1)
> +		persist_ns(pid);
> +
> +	return pid;
> +}
> +
>  int main(int argc, char *argv[])
>  {
>  	enum {
> @@ -90,12 +183,12 @@ int main(int argc, char *argv[])
>  	static const struct option longopts[] = {
>  		{ "help", no_argument, 0, 'h' },
>  		{ "version", no_argument, 0, 'V'},
> -		{ "mount", no_argument, 0, 'm' },
> -		{ "uts", no_argument, 0, 'u' },
> -		{ "ipc", no_argument, 0, 'i' },
> -		{ "net", no_argument, 0, 'n' },
> -		{ "pid", no_argument, 0, 'p' },
> -		{ "user", no_argument, 0, 'U' },
> +		{ "mount", optional_argument, 0, 'm' },
> +		{ "uts", optional_argument, 0, 'u' },
> +		{ "ipc", optional_argument, 0, 'i' },
> +		{ "net", optional_argument, 0, 'n' },
> +		{ "pid", optional_argument, 0, 'p' },
> +		{ "user", optional_argument, 0, 'U' },
>  		{ "fork", no_argument, 0, 'f' },
>  		{ "mount-proc", optional_argument, 0, OPT_MOUNTPROC },
>  		{ "map-root-user", no_argument, 0, 'r' },
> @@ -103,8 +196,6 @@ int main(int argc, char *argv[])
>  	};
>  
>  	int unshare_flags = 0;
> -	int c, forkit = 0, maproot = 0;
> -	const char *procmnt = NULL;
>  	uid_t real_euid = geteuid();
>  	gid_t real_egid = getegid();;
>  
> @@ -125,21 +216,28 @@ int main(int argc, char *argv[])
>  			return EXIT_SUCCESS;
>  		case 'm':
>  			unshare_flags |= CLONE_NEWNS;
> +			if (ns_path(CLONE_NEWNS, optarg))
> +				forkit = 1;
>  			break;
>  		case 'u':
>  			unshare_flags |= CLONE_NEWUTS;
> +			ns_path(CLONE_NEWUTS, optarg);
>  			break;
>  		case 'i':
>  			unshare_flags |= CLONE_NEWIPC;
> +			ns_path(CLONE_NEWIPC, optarg);
>  			break;
>  		case 'n':
>  			unshare_flags |= CLONE_NEWNET;
> +			ns_path(CLONE_NEWNET, optarg);
>  			break;
>  		case 'p':
>  			unshare_flags |= CLONE_NEWPID;
> +			ns_path(CLONE_NEWPID, optarg);
>  			break;
>  		case 'U':
>  			unshare_flags |= CLONE_NEWUSER;
> +			ns_path(CLONE_NEWUSER, optarg);
>  			break;
>  		case OPT_MOUNTPROC:
>  			unshare_flags |= CLONE_NEWNS;
> @@ -154,12 +252,9 @@ int main(int argc, char *argv[])
>  		}
>  	}
>  
> -	if (-1 == unshare(unshare_flags))
> -		err(EXIT_FAILURE, _("unshare failed"));
> -
>  	if (forkit) {
>  		int status;
> -		pid_t pid = fork();
> +		pid_t pid = unshare_fork(unshare_flags);
>  
>  		switch(pid) {
>  		case -1:
> @@ -175,6 +270,10 @@ int main(int argc, char *argv[])
>  				kill(getpid(), WTERMSIG(status));
>  			err(EXIT_FAILURE, _("child exit failed"));
>  		}
> +	} else {
> +		if (-1 == unshare(unshare_flags))
> +			err(EXIT_FAILURE, _("unshare failed"));
> +		persist_ns(getpid());
>  	}
>  
>  	if (maproot) {
> -- 
> 2.1.0
> 

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com


* Re: [PATCH 2/2] unshare: allow persisting namespaces
  2015-01-06 13:03   ` Karel Zak
@ 2015-01-06 17:11     ` Eric W. Biederman
  2015-01-06 17:21       ` Karel Zak
  0 siblings, 1 reply; 8+ messages in thread
From: Eric W. Biederman @ 2015-01-06 17:11 UTC (permalink / raw)
  To: Karel Zak; +Cc: Lubomir Rintel, util-linux, Mikhail Gusarov

Karel Zak <kzak@redhat.com> writes:

> On Sun, Dec 28, 2014 at 09:23:38PM +0100, Lubomir Rintel wrote:
>> Bind mount the namespace file to a given location after creating it if
>> requested (analogously to what "ip netns" and other tools do). This makes
>> it possible for a namespace to survive with no processes running while
>> processes can enter it with nsenter(1):
>> 
>>   # unshare --uts=utsns hostname behemoth
>>   # nsenter --uts=utsns hostname
>>   behemoth
>
> Nice, especially since we already support the same concept in nsenter.
>
> But I guess that "empty namespace" (without any running process) is
> impossible for PID namespaces, right? It would be nice to add a note
> about it to the man page.

No.  An empty pid namespace is valid.   An empty pid namespace is one
in which an init process has not entered the pid namespace, or one in
which the init process has exited (and thus no more processes are
allowed).

So an empty pid namespace is a little weird but valid.

The implementation details of the patch completely baffle me.  I can't
see a reason for things being implemented with clone for example.

Eric


* Re: [PATCH 2/2] unshare: allow persisting namespaces
  2015-01-06 17:11     ` Eric W. Biederman
@ 2015-01-06 17:21       ` Karel Zak
  2015-01-06 17:44         ` Eric W. Biederman
  0 siblings, 1 reply; 8+ messages in thread
From: Karel Zak @ 2015-01-06 17:21 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Lubomir Rintel, util-linux, Mikhail Gusarov

On Tue, Jan 06, 2015 at 11:11:49AM -0600, Eric W. Biederman wrote:
> No.  An empty pid namespace is valid.   An empty pid namespace is one
> in which an init process has not entered the pid namespace, or one in

But if I create a PID namespace (unshare/clone), then I'm the init
process... how can I create an empty PID namespace (from userspace)?

> which the init process has exited (and thus no more processes are
> allowed).

yes, this makes sense

> So an empty pid namespace is a little weird but valid.
> 
> The implementation details of the patch completely baffle me.  I can't
> see a reason for things being implemented with clone for example.

Yes, this part of the patch is strange, but I like the basic idea
of the patch: making it possible to create an empty namespace and
then enter it later with nsenter.

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com


* Re: [PATCH 2/2] unshare: allow persisting namespaces
  2015-01-06 17:21       ` Karel Zak
@ 2015-01-06 17:44         ` Eric W. Biederman
  0 siblings, 0 replies; 8+ messages in thread
From: Eric W. Biederman @ 2015-01-06 17:44 UTC (permalink / raw)
  To: Karel Zak; +Cc: Lubomir Rintel, util-linux, Mikhail Gusarov

Karel Zak <kzak@redhat.com> writes:

> On Tue, Jan 06, 2015 at 11:11:49AM -0600, Eric W. Biederman wrote:
>> No.  An empty pid namespace is valid.   An empty pid namespace is one
>> in which an init process has not entered the pid namespace, or one in
>
> But if I create a PID namespace (unshare/clone), then I'm the init
> process... how can I create an empty PID namespace (from userspace)?

Unshare creates an empty PID namespace.  Your first child when you fork
becomes the init process.  You cannot change your current pid namespace,
only the pid namespace for your children.
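
As an illustration of that behaviour, here is a minimal sketch (not code
from the patch). The helper name first_child_is_init() is hypothetical, and
the call is assumed to need CAP_SYS_ADMIN, so it reports -1 when
unshare(2) is refused:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: unshare the PID namespace, fork once and check
 * which PID the first child sees for itself.
 * Returns 1 if the child saw PID 1, 0 if it did not, and -1 if
 * unshare(2), fork(2) or waitpid(2) failed (e.g. EPERM). */
static int first_child_is_init(void)
{
	pid_t self = getpid();
	pid_t pid;
	int status;

	if (unshare(CLONE_NEWPID) != 0)
		return -1;

	/* The caller itself stays in its original PID namespace. */
	assert(getpid() == self);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(getpid() == 1 ? 0 : 1);	/* child reports via exit status */

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	return WEXITSTATUS(status) == 0;
}
```

Between the unshare(2) call and the fork(2) call the new namespace exists
but holds no processes, which is the "empty" state described above.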

>> which the init process has exited (and thus no more processes are
>> allowed).
>
> yes, this makes sense
>
>> So an empty pid namespace is a little weird but valid.
>> 
>> The implementation details of the patch completely baffle me.  I can't
>> see a reason for things being implemented with clone for example.
>
> Yes, this part of the patch is strange, but I like the basic idea
> of the patch: making it possible to create an empty namespace and
> then enter it later with nsenter.

The idea of making a new namespace and making it possible to enter it
later with nsenter seems reasonable. 

But really that should be just a matter of adding the C equivalent of
"mount --bind /proc/self/ns/$TYPE $FILENAME" which should be a very
trivial addition to unshare.
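
A rough sketch of what that C equivalent could look like, for illustration
only. The helper name persist_self_ns() and its error handling are
assumptions, not code from the patch; it would be called after unshare(2)
has succeeded:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mount.h>
#include <unistd.h>

/* Hypothetical helper: pin the caller's namespace of the given type
 * ("uts", "net", ...) at target by bind-mounting /proc/self/ns/<type>
 * over it.  Returns 0 on success, -1 with errno set on failure. */
static int persist_self_ns(const char *type, const char *target)
{
	char src[64];
	int fd, n, saved;

	assert(type != NULL && target != NULL);

	n = snprintf(src, sizeof(src), "/proc/self/ns/%s", type);
	if (n < 0 || (size_t)n >= sizeof(src)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	/* A bind mount needs an existing file to mount over.  O_EXCL makes
	 * sure we only ever unlink a file that we created ourselves. */
	fd = open(target, O_WRONLY | O_CREAT | O_EXCL, 0444);
	if (fd < 0)
		return -1;
	close(fd);

	if (mount(src, target, NULL, MS_BIND, NULL) != 0) {
		saved = errno;
		unlink(target);		/* do not leave an empty file behind */
		errno = saved;
		return -1;
	}

	return 0;
}
```

Unlike the persist_ns() in the patch, this sketch refuses to replace an
existing target and reports the first error to the caller instead of
warning and carrying on.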

Eric


* Re: [PATCH 1/2] unshare: add some examples
  2014-12-28 20:23 [PATCH 1/2] unshare: add some examples Lubomir Rintel
  2014-12-28 20:23 ` [PATCH 2/2] unshare: allow persisting namespaces Lubomir Rintel
@ 2015-01-12 11:41 ` Karel Zak
  1 sibling, 0 replies; 8+ messages in thread
From: Karel Zak @ 2015-01-12 11:41 UTC (permalink / raw)
  To: Lubomir Rintel; +Cc: util-linux, Mikhail Gusarov

On Sun, Dec 28, 2014 at 09:23:37PM +0100, Lubomir Rintel wrote:
>  sys-utils/unshare.1 | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)

 Applied, thanks.

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com


* Re: [PATCH 2/2] unshare: allow persisting namespaces
  2014-12-28 20:23 ` [PATCH 2/2] unshare: allow persisting namespaces Lubomir Rintel
  2015-01-06 13:03   ` Karel Zak
@ 2015-02-15 12:10   ` Mike Frysinger
  1 sibling, 0 replies; 8+ messages in thread
From: Mike Frysinger @ 2015-02-15 12:10 UTC (permalink / raw)
  To: Lubomir Rintel; +Cc: util-linux, Mikhail Gusarov


On 28 Dec 2014 21:23, Lubomir Rintel wrote:
> Bind mount the namespace file to a given location after creating it if
> requested (analogously to what "ip netns" and other tools do).

since i found out about `ip netns`, i've wanted this in unshare ;).  although 
the two implementations seem to differ: iproute uses a common location 
(/run/netns/$NAME) while this implementation requires specifying the full path 
all the time.  would it be possible to rectify this ?

maybe if you give it a plain name, it defaults to a common location ?  so 
something like this would "just work":
  $ ip netns add foo
  $ unshare --net=foo ...
(yes, i'm aware of `ip netns exec ...`)

using /run/${type}ns/ for all paths seems a bit ugly ... maybe claim 
/run/ns/${type}/ instead ?
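
a minimal sketch of that lookup rule, purely as an assumption of how it
could work (the patch itself always treats the argument as a path).  the
helper name resolve_ns_path() is hypothetical, and /run/ns/<type>/ is only
the location floated above, not something iproute or util-linux uses:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn the argument of --<type>=<name> into a path.
 * A name containing '/' is used verbatim; a plain name is placed under
 * the assumed common directory /run/ns/<type>/.
 * Returns 0 on success, -1 if the result does not fit into out. */
static int resolve_ns_path(const char *type, const char *name,
			   char *out, size_t outsz)
{
	int n;

	assert(type != NULL && name != NULL && out != NULL);

	if (strchr(name, '/'))
		n = snprintf(out, outsz, "%s", name);
	else
		n = snprintf(out, outsz, "/run/ns/%s/%s", type, name);

	return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

with this rule a relative file in the current directory has to be written
as ./utsns, which changes the meaning of the --uts=utsns example in the
commit message.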

> The ugly bit about this patch is the clone(2) call, arguably not our
> fault. The stack size glibc requires for its clone(2) wrapper is not
> documented anywhere and its semantics (stack growth direction) is arch
> dependent. We could figure it out by comparing a return value of a helper
> function that would return an address of its local variable with caller's
> local variable address, but I guess that would be even more messed-up.

are you sure this is strictly a glibc requirement ?  seems like it's mostly 
hardware/ABI related (certainly direction is).  i'd also point out that ia64 
doesn't implement clone either ... it has __clone2().
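
for reference, the helper-function trick from the commit message could be
sketched like this.  it is a heuristic that assumes GCC or clang (for the
noinline attribute) and an ordinary contiguous call stack; all three
function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compare the address of a local in the callee with one in the caller.
 * noinline keeps the two locals in separate stack frames. */
static int __attribute__((noinline)) callee_is_below(volatile char *caller_local)
{
	volatile char callee_local = 0;

	return (uintptr_t)&callee_local < (uintptr_t)caller_local;
}

/* Returns 1 if the stack appears to grow towards lower addresses. */
static int stack_grows_down(void)
{
	volatile char caller_local = 0;

	return callee_is_below(&caller_local);
}

/* Pick the address to hand to clone(2) as the child stack: the top of
 * the buffer when the stack grows down, its bottom otherwise. */
static void *clone_stack_start(char *buf, size_t size)
{
	assert(buf != NULL);

	return stack_grows_down() ? buf + size : buf;
}
```

that would let the stack buffer be STACK_SIZE rather than 2*STACK_SIZE, at
the cost of relying on compiler behaviour that the C standard does not
guarantee.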

> +static struct namespace_file {

const

> +	int nstype;
> +	const char *proc_name;
> +	const char *target_name;
> +} namespace_files[] = {
> +	{ .nstype = CLONE_NEWUSER, .proc_name = "ns/user", .target_name = NULL },
> +	{ .nstype = CLONE_NEWIPC,  .proc_name = "ns/ipc",  .target_name = NULL },
> +	{ .nstype = CLONE_NEWUTS,  .proc_name = "ns/uts",  .target_name = NULL },
> +	{ .nstype = CLONE_NEWNET,  .proc_name = "ns/net",  .target_name = NULL },
> +	{ .nstype = CLONE_NEWPID,  .proc_name = "ns/pid",  .target_name = NULL },
> +	{ .nstype = CLONE_NEWNS,   .proc_name = "ns/mnt",  .target_name = NULL },
> +	{ .nstype = 0, .proc_name = NULL, .target_name = NULL }

use ARRAY_SIZE instead and you don't need the sentinel entry
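
roughly like this, as a sketch only.  ARRAY_SIZE is defined locally here so
the example stands alone (util-linux is assumed to provide its own), and
the assert on an unknown type is an addition that the patch does not have:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

#ifndef ARRAY_SIZE
# define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
#endif

static struct namespace_file {
	int nstype;
	const char *proc_name;
	const char *target_name;
} namespace_files[] = {
	/* no terminating { 0, NULL, NULL } entry needed */
	{ .nstype = CLONE_NEWUSER, .proc_name = "ns/user" },
	{ .nstype = CLONE_NEWIPC,  .proc_name = "ns/ipc"  },
	{ .nstype = CLONE_NEWUTS,  .proc_name = "ns/uts"  },
	{ .nstype = CLONE_NEWNET,  .proc_name = "ns/net"  },
	{ .nstype = CLONE_NEWPID,  .proc_name = "ns/pid"  },
	{ .nstype = CLONE_NEWNS,   .proc_name = "ns/mnt"  },
};

/* Remember where the namespace of the given type should be persisted.
 * Returns 0 if no path was given, 1 otherwise. */
static int ns_path(int nstype, const char *path)
{
	size_t i;

	if (path == NULL)
		return 0;

	for (i = 0; i < ARRAY_SIZE(namespace_files); i++) {
		if (namespace_files[i].nstype == nstype) {
			namespace_files[i].target_name = path;
			return 1;
		}
	}

	assert(0 && "unknown namespace type");
	return 1;
}
```

note that the array itself cannot be const as long as ns_path() stores
target_name in it.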

> +int c, forkit = 0, maproot = 0;
> +const char *procmnt = NULL;

static

> +	fputs(_(" -m, --mount[=<file>]      unshare mounts namespace\n"), out);
> +	fputs(_(" -u, --uts[=<file>]        unshare UTS namespace (hostname etc)\n"), out);
> +	fputs(_(" -i, --ipc[=<file>]        unshare System V IPC namespace\n"), out);
> +	fputs(_(" -n, --net[=<file>]        unshare network namespace\n"), out);
> +	fputs(_(" -p, --pid[=<file>]        unshare pid namespace\n"), out);
> +	fputs(_(" -U, --user[=<file>]       unshare user namespace\n"), out);

probably want <path> instead of <file> since it can be either

> +static void persist_ns(pid_t pid)
> +{
> +	struct namespace_file *nsfile;
> +
> +	for (nsfile = namespace_files; nsfile->nstype; nsfile++) {
> +		char pathbuf[PATH_MAX];
> +
> +		if (!nsfile->target_name)
> +			continue;
> +
> +		snprintf(pathbuf, sizeof(pathbuf), "/proc/%u/%s", pid,
> +			nsfile->proc_name);

use xasprintf to avoid the PATH_MAX constant
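
something along these lines, as a sketch.  it uses plain asprintf(3), a
GNU/BSD extension, so that it stands alone; xasprintf is assumed to be the
util-linux wrapper that exits on allocation failure.  the helper name
ns_proc_path() is hypothetical:

```c
#define _GNU_SOURCE	/* for asprintf() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: build "/proc/<pid>/<proc_name>" in a buffer of
 * exactly the needed size.  Returns a malloc'd string that the caller
 * must free(), or NULL if the allocation failed. */
static char *ns_proc_path(pid_t pid, const char *proc_name)
{
	char *path;

	assert(proc_name != NULL);

	/* pid_t is signed, so print it through long rather than %u */
	if (asprintf(&path, "/proc/%ld/%s", (long)pid, proc_name) < 0)
		return NULL;

	return path;
}
```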

> +		if (-1 == mknod(nsfile->target_name, 0666, 0)) {
> +			warn(_("failed to create %s"), nsfile->target_name);
> +			continue;
> +		}
> +
> +		if (-1 == mount(pathbuf, nsfile->target_name, NULL, MS_BIND, NULL)) {
> +			warn(_("mount %s failed"), nsfile->target_name);
> +			unlink(nsfile->target_name);

generally the codebase uses the other style -- constants go on the right

> +static int in_child (void *arg)

no space before the (
-mike



