util-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] unshare: manage binfmt_misc mounts
@ 2024-06-10 17:33 Laurent Vivier
  2024-06-10 17:33 ` [PATCH 1/2] unshare: mount binfmt_misc Laurent Vivier
  2024-06-10 17:33 ` [PATCH 2/2] unshare: load binfmt_misc interpreter Laurent Vivier
  0 siblings, 2 replies; 3+ messages in thread
From: Laurent Vivier @ 2024-06-10 17:33 UTC (permalink / raw)
  To: util-linux; +Cc: Laurent Vivier

Since linux v6.7 and
commit 21ca59b365c0 ("binfmt_misc: enable sandboxed mounts"),
binfmt_misc can be mountable in a non-initial user namespace by
a non privileged user.

Extend unshare to manage it:

- add --mount-binfmt[=<dir>] to mount binfmt_misc filesystem, this
  results in clearing inherited interpreters from the previous namespace

- add -l, --load-interp <file> to load a binfmt_misc interpreter at startup.

  The interpreter is loaded from the initial fileystem if the 'F' flags is
  provided, otherwise from inside the new namespace
  This makes possible to start a chroot of another architecture without
  being root.

For instance:

  With 'F' flag, load the interpreter from the initial namespace:

    $ /bin/qemu-m68k-static --version
    qemu-m68k version 8.2.2 (qemu-8.2.2-1.fc40)
    Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
    $ unshare --map-root-user --fork --pid --load-interp=":qemu-m68k:M::\\x7fELF\\x01\\x02\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x00\\x04:\\xff\\xff\\xff\\xff\\xff\\xff\\xfe\\x00\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xfe\\xff\\xff:/bin/qemu-m68k-static:OCF" --root=chroot/m68k/sid
    # QEMU_VERSION= ls
    qemu-m68k version 8.2.2 (qemu-8.2.2-1.fc40)
    Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
    # /qemu-m68k  --version
    qemu-m68k version 8.0.50 (v8.0.0-340-gb1cff5e2da95)
    Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

  Without 'F' flag, from inside the namespace:

    $ unshare --map-root-user --fork --pid --load-interp=":qemu-m68k:M::\\x7fELF\\x01\\x02\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x00\\x04:\\xff\\xff\\xff\\xff\\xff\\xff\\xfe\\x00\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xfe\\xff\\xff:/qemu-m68k:OC" --root=chroot/m68k/sid
    # QEMU_VERSION= ls
    qemu-m68k version 8.0.50 (v8.0.0-340-gb1cff5e2da95)
    Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers
    # /qemu-m68k  --version
    qemu-m68k version 8.0.50 (v8.0.0-340-gb1cff5e2da95)
    Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

Laurent Vivier (2):
  unshare: mount binfmt_misc
  unshare: load binfmt_misc interpreter

 include/pathnames.h      |  2 ++
 sys-utils/unshare.1.adoc | 13 ++++++++
 sys-utils/unshare.c      | 64 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 78 insertions(+), 1 deletion(-)

-- 
2.45.2


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/2] unshare: mount binfmt_misc
  2024-06-10 17:33 [PATCH 0/2] unshare: manage binfmt_misc mounts Laurent Vivier
@ 2024-06-10 17:33 ` Laurent Vivier
  2024-06-10 17:33 ` [PATCH 2/2] unshare: load binfmt_misc interpreter Laurent Vivier
  1 sibling, 0 replies; 3+ messages in thread
From: Laurent Vivier @ 2024-06-10 17:33 UTC (permalink / raw)
  To: util-linux; +Cc: Laurent Vivier

add --mount-binfmt[=<dir>] to mount binfmt_misc filesystem,
this results in clearing inherited interpreters from the previous namespace

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
---
 include/pathnames.h      |  2 ++
 sys-utils/unshare.1.adoc |  3 +++
 sys-utils/unshare.c      | 16 ++++++++++++++++
 3 files changed, 21 insertions(+)

diff --git a/include/pathnames.h b/include/pathnames.h
index 81fa405f63c7..6a7f506951d5 100644
--- a/include/pathnames.h
+++ b/include/pathnames.h
@@ -204,6 +204,8 @@
 /* sysctl fs paths */
 #define _PATH_PROC_SYS_FS	"/proc/sys/fs"
 #define _PATH_PROC_PIPE_MAX_SIZE	_PATH_PROC_SYS_FS "/pipe-max-size"
+#define _PATH_PROC_BINFMT_MISC	_PATH_PROC_SYS_FS "/binfmt_misc"
+#define _PATH_PROC_BINFMT_MISC_REGISTER	_PATH_PROC_BINFMT_MISC "/register"
 
 /* irqtop paths */
 #define _PATH_PROC_INTERRUPTS	"/proc/interrupts"
diff --git a/sys-utils/unshare.1.adoc b/sys-utils/unshare.1.adoc
index e6201e28fffd..48d1a5579282 100644
--- a/sys-utils/unshare.1.adoc
+++ b/sys-utils/unshare.1.adoc
@@ -90,6 +90,9 @@ When *unshare* terminates, have _signame_ be sent to the forked child process. C
 *--mount-proc*[**=**__mountpoint__]::
 Just before running the program, mount the proc filesystem at _mountpoint_ (default is _/proc_). This is useful when creating a new PID namespace. It also implies creating a new mount namespace since the _/proc_ mount would otherwise mess up existing programs on the system. The new proc filesystem is explicitly mounted as private (with *MS_PRIVATE*|*MS_REC*).
 
+*--mount-binfmt*[**=**__mountpoint__]::
+Just before running the program, mount the binfmt_misc filesystem at _mountpoint_ (default is /proc/sys/fs/binfmt_misc).  It also implies creating a new mount namespace since the binfmt_misc mount would otherwise mess up existing programs on the system.  The new binfmt_misc filesystem is explicitly mounted as private (with *MS_PRIVATE*|*MS_REC*).
+
 **--map-user=**__uid|name__::
 Run the program only after the current effective user ID has been mapped to _uid_. If this option is specified multiple times, the last occurrence takes precedence. This option implies *--user*.
 
diff --git a/sys-utils/unshare.c b/sys-utils/unshare.c
index 57f3b8744fb5..06a9a427c524 100644
--- a/sys-utils/unshare.c
+++ b/sys-utils/unshare.c
@@ -760,6 +760,7 @@ static void __attribute__((__noreturn__)) usage(void)
 	fputs(_(" --kill-child[=<signame>]  when dying, kill the forked child (implies --fork)\n"
 		"                             defaults to SIGKILL\n"), out);
 	fputs(_(" --mount-proc[=<dir>]      mount proc filesystem first (implies --mount)\n"), out);
+	fputs(_(" --mount-binfmt[=<dir>]    mount binfmt filesystem first (implies --user and --mount)\n"), out);
 	fputs(_(" --propagation slave|shared|private|unchanged\n"
 	        "                           modify mount propagation in mount namespace\n"), out);
 	fputs(_(" --setgroups allow|deny    control the setgroups syscall in user namespaces\n"), out);
@@ -783,6 +784,7 @@ int main(int argc, char *argv[])
 {
 	enum {
 		OPT_MOUNTPROC = CHAR_MAX + 1,
+		OPT_MOUNTBINFMT,
 		OPT_PROPAGATION,
 		OPT_SETGROUPS,
 		OPT_KILLCHILD,
@@ -811,6 +813,7 @@ int main(int argc, char *argv[])
 		{ "fork",          no_argument,       NULL, 'f'             },
 		{ "kill-child",    optional_argument, NULL, OPT_KILLCHILD   },
 		{ "mount-proc",    optional_argument, NULL, OPT_MOUNTPROC   },
+		{ "mount-binfmt",  optional_argument, NULL, OPT_MOUNTBINFMT },
 		{ "map-user",      required_argument, NULL, OPT_MAPUSER     },
 		{ "map-users",     required_argument, NULL, OPT_MAPUSERS    },
 		{ "map-group",     required_argument, NULL, OPT_MAPGROUP    },
@@ -839,6 +842,7 @@ int main(int argc, char *argv[])
 	struct map_range *groupmap = NULL;
 	int kill_child_signo = 0; /* 0 means --kill-child was not used */
 	const char *procmnt = NULL;
+	const char *binfmt_mnt = NULL;
 	const char *newroot = NULL;
 	const char *newdir = NULL;
 	pid_t pid_bind = 0, pid_idmap = 0;
@@ -913,6 +917,12 @@ int main(int argc, char *argv[])
 			unshare_flags |= CLONE_NEWNS;
 			procmnt = optarg ? optarg : "/proc";
 			break;
+		case OPT_MOUNTBINFMT:
+			unshare_flags |= CLONE_NEWNS | CLONE_NEWUSER;
+			binfmt_mnt = optarg ? optarg : _PATH_PROC_BINFMT_MISC;
+			if (!optarg && !procmnt)
+				procmnt = "/proc";
+			break;
 		case OPT_MAPUSER:
 			unshare_flags |= CLONE_NEWUSER;
 			mapuser = get_user(optarg, _("failed to parse uid"));
@@ -1178,6 +1188,12 @@ int main(int argc, char *argv[])
 			err(EXIT_FAILURE, _("mount %s failed"), procmnt);
 	}
 
+	if (binfmt_mnt) {
+		if (mount("binfmt_misc", binfmt_mnt, "binfmt_misc",
+			  MS_NOSUID|MS_NOEXEC|MS_NODEV, NULL) != 0)
+			err(EXIT_FAILURE, _("mount %s failed"), binfmt_mnt);
+	}
+
 	if (force_gid) {
 		if (setgroups(0, NULL) != 0)	/* drop supplementary groups */
 			err(EXIT_FAILURE, _("setgroups failed"));
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/2] unshare: load binfmt_misc interpreter
  2024-06-10 17:33 [PATCH 0/2] unshare: manage binfmt_misc mounts Laurent Vivier
  2024-06-10 17:33 ` [PATCH 1/2] unshare: mount binfmt_misc Laurent Vivier
@ 2024-06-10 17:33 ` Laurent Vivier
  1 sibling, 0 replies; 3+ messages in thread
From: Laurent Vivier @ 2024-06-10 17:33 UTC (permalink / raw)
  To: util-linux; +Cc: Laurent Vivier

add -l, --load-interp <file> to load a binfmt_misc interpreter at startup.

The interpreter is loaded from the initial fileystem if the 'F' flags is
provided, otherwise from inside the new namespace
This makes possible to start a chroot of another architecture without
being root.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
---
 sys-utils/unshare.1.adoc | 10 +++++++++
 sys-utils/unshare.c      | 48 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/sys-utils/unshare.1.adoc b/sys-utils/unshare.1.adoc
index 48d1a5579282..24ac6fb01867 100644
--- a/sys-utils/unshare.1.adoc
+++ b/sys-utils/unshare.1.adoc
@@ -138,6 +138,9 @@ Set the user ID which will be used in the entered namespace.
 *-G*, *--setgid* _gid_::
 Set the group ID which will be used in the entered namespace and drop supplementary groups.
 
+*-l*, **--load-interp=**__file__::
+Load binfmt_misc definition in the namespace (implies *--mount-binfmt*).
+
 *--monotonic* _offset_::
 Set the offset of *CLOCK_MONOTONIC* which will be used in the entered time namespace. This option requires unsharing a time namespace with *--time*.
 
@@ -256,6 +259,13 @@ up 21 hours, 30 minutes
 up 9 years, 28 weeks, 1 day, 2 hours, 50 minutes
 ....
 
+The following example execute a chroot into the directory /chroot/powerpc/jessie and install the interpreter /bin/qemu-ppc-static to execute the powerpc binaries.
+If the interpreter is defined with the flag F, the interpreter is loaded before the chroot otherwise the interpreter is loaded from inside the chroot.
+
+....
+$  unshare --map-root-user --fork --pid --load-interp=":qemu-ppc:M::\\x7fELF\x01\\x02\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x00\\x14:\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\x00\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xfe\\xff\\xff:/bin/qemu-ppc-static:OCF" --root=/chroot/powerpc/jessie /bin/bash -l
+....
+
 == AUTHORS
 
 mailto:dottedmag@dottedmag.net[Mikhail Gusarov],
diff --git a/sys-utils/unshare.c b/sys-utils/unshare.c
index 06a9a427c524..7b7b24138056 100644
--- a/sys-utils/unshare.c
+++ b/sys-utils/unshare.c
@@ -725,6 +725,31 @@ static pid_t map_ids_from_child(int *fd, uid_t mapuser,
 	exit(EXIT_SUCCESS);
 }
 
+static int is_fixed(const char *interp)
+{
+	const char *flags;
+
+	flags = strrchr(interp, ':');
+
+	return strchr(flags, 'F') != NULL;
+}
+
+static void load_interp(const char *interp)
+{
+	int fd;
+
+	fd = open(_PATH_PROC_BINFMT_MISC_REGISTER, O_WRONLY);
+	if (fd < 0)
+		err(EXIT_FAILURE, _("cannot open %s"),
+		    _PATH_PROC_BINFMT_MISC_REGISTER);
+
+	if (write_all(fd, interp, strlen(interp)))
+		err(EXIT_FAILURE, _("write failed %s"),
+		    _PATH_PROC_BINFMT_MISC_REGISTER);
+
+	close(fd);
+}
+
 static void __attribute__((__noreturn__)) usage(void)
 {
 	FILE *out = stdout;
@@ -772,6 +797,7 @@ static void __attribute__((__noreturn__)) usage(void)
 	fputs(_(" -G, --setgid <gid>        set gid in entered namespace\n"), out);
 	fputs(_(" --monotonic <offset>      set clock monotonic offset (seconds) in time namespaces\n"), out);
 	fputs(_(" --boottime <offset>       set clock boottime offset (seconds) in time namespaces\n"), out);
+	fputs(_(" -l, --load-interp <file>  load binfmt definition in the namespace (implies --mount-binfmt)\n"), out);
 
 	fputs(USAGE_SEPARATOR, out);
 	fprintf(out, USAGE_HELP_OPTIONS(27));
@@ -830,6 +856,7 @@ int main(int argc, char *argv[])
 		{ "wd",		   required_argument, NULL, 'w'		    },
 		{ "monotonic",     required_argument, NULL, OPT_MONOTONIC   },
 		{ "boottime",      required_argument, NULL, OPT_BOOTTIME    },
+		{ "load-interp",   required_argument, NULL, 'l'		    },
 		{ NULL, 0, NULL, 0 }
 	};
 
@@ -846,6 +873,7 @@ int main(int argc, char *argv[])
 	const char *newroot = NULL;
 	const char *newdir = NULL;
 	pid_t pid_bind = 0, pid_idmap = 0;
+	const char *newinterp = NULL;
 	pid_t pid = 0;
 #ifdef UL_HAVE_PIDFD
 	int fd_parent_pid = -1;
@@ -868,7 +896,7 @@ int main(int argc, char *argv[])
 	textdomain(PACKAGE);
 	close_stdout_atexit();
 
-	while ((c = getopt_long(argc, argv, "+fhVmuinpCTUrR:w:S:G:c", longopts, NULL)) != -1) {
+	while ((c = getopt_long(argc, argv, "+fhVmuinpCTUrR:w:S:G:cl:", longopts, NULL)) != -1) {
 		switch (c) {
 		case 'f':
 			forkit = 1;
@@ -1008,6 +1036,15 @@ int main(int argc, char *argv[])
 			boottime = strtos64_or_err(optarg, _("failed to parse boottime offset"));
 			force_boottime = 1;
 			break;
+		case 'l':
+			unshare_flags |= CLONE_NEWNS | CLONE_NEWUSER;
+			if (!binfmt_mnt) {
+				if (!procmnt)
+					procmnt = "/proc";
+				binfmt_mnt = _PATH_PROC_BINFMT_MISC;
+			}
+			newinterp = optarg;
+			break;
 
 		case 'h':
 			usage();
@@ -1162,6 +1199,13 @@ int main(int argc, char *argv[])
 	if ((unshare_flags & CLONE_NEWNS) && propagation)
 		set_propagation(propagation);
 
+	if (newinterp && is_fixed(newinterp)) {
+		if (mount("binfmt_misc", _PATH_PROC_BINFMT_MISC, "binfmt_misc",
+			  MS_NOSUID|MS_NOEXEC|MS_NODEV, NULL) != 0)
+			err(EXIT_FAILURE, _("mount %s failed"), _PATH_PROC_BINFMT_MISC);
+		load_interp(newinterp);
+	}
+
 	if (newroot) {
 		if (chroot(newroot) != 0)
 			err(EXIT_FAILURE,
@@ -1193,6 +1237,8 @@ int main(int argc, char *argv[])
 			  MS_NOSUID|MS_NOEXEC|MS_NODEV, NULL) != 0)
 			err(EXIT_FAILURE, _("mount %s failed"), binfmt_mnt);
 	}
+	if (newinterp && !is_fixed(newinterp))
+		load_interp(newinterp);
 
 	if (force_gid) {
 		if (setgroups(0, NULL) != 0)	/* drop supplementary groups */
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-06-10 17:58 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-10 17:33 [PATCH 0/2] unshare: manage binfmt_misc mounts Laurent Vivier
2024-06-10 17:33 ` [PATCH 1/2] unshare: mount binfmt_misc Laurent Vivier
2024-06-10 17:33 ` [PATCH 2/2] unshare: load binfmt_misc interpreter Laurent Vivier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).