* Re: Building a clean namespace and MS_BIND across namespaces is now disabled
2005-11-20 12:23 Building a clean namespace and MS_BIND across namespaces is now disabled Eric W. Biederman
@ 2005-11-20 23:04 ` Serge E. Hallyn
2005-11-21 0:01 ` Eric W. Biederman
2005-11-21 23:13 ` Ram Pai
1 sibling, 1 reply; 4+ messages in thread
From: Serge E. Hallyn @ 2005-11-20 23:04 UTC
To: Eric W. Biederman
Cc: Al Viro, linux-fsdevel, Ram Pai, Miklos Szeredi,
Christoph Hellwig, Jamie Lokier
[-- Attachment #1: Type: text/plain, Size: 1262 bytes --]
Quoting Eric W. Biederman (ebiederm@xmission.com):
> Currently I am looking at what it takes to build a namespace
> from scratch.
>
> Intuitively I am thinking one of two forms:
>
> > pid = clone(..., CLONE_NEWNS, ...);
> > if (pid == 0) {
> >     umount2("/", MNT_DETACH);
> >     mount(NULL, "/", "ramfs", 0, NULL);
> >     chdir("/");
> >     chroot("/");
> > }
>
> > root_fd = open("path", O_DIRECTORY | O_RDONLY);
> > pid = clone(..., CLONE_NEWNS, ...);
> > if (pid == 0) {
> >     umount2("/", MNT_DETACH);
> >     fchdir(root_fd);
> >     mount(".", "/", NULL, MS_BIND, NULL);
> >     chroot(".");
> > }
Why not do

    do_clone_namespace
    mount -t ramfs none /build
    # set up a fs under /build
    mkdir /build/oldmount
    cd /build
    pivot_root . oldmount
    umount -l oldmount

? That's what I used to do in chroot_ns.c (attached) for bsdjail
(www.sf.net/projects/linuxjail).
Hope I didn't completely misunderstand your question...
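
For reference, the recipe above might look roughly like this in C. It is
only a sketch under stated assumptions: enter_fresh_root() is a made-up
name; it uses unshare(2), which appeared in kernels newer than the ones
discussed in this thread, in place of clone(CLONE_NEWNS); it marks "/"
private first because pivot_root() refuses a shared parent mount; and it
skips the "set up a fs" step, so the new root is an empty ramfs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): move into a private mount
 * namespace and make a fresh ramfs mounted at new_root the root directory.
 * Returns 0 on success or a negative errno value. Needs CAP_SYS_ADMIN.
 */
static int enter_fresh_root(const char *new_root)
{
	char put_old[PATH_MAX];

	assert(new_root != NULL);
	if (snprintf(put_old, sizeof(put_old), "%s/oldmount", new_root)
	    >= (int)sizeof(put_old))
		return -ENAMETOOLONG;

	/* do_clone_namespace: give this process its own copy of the mounts */
	if (unshare(CLONE_NEWNS) < 0)
		return -errno;
	/* assumption: "/" may be mounted shared, which pivot_root() rejects */
	if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
		return -errno;
	/* mount -t ramfs none /build */
	if (mount("none", new_root, "ramfs", 0, NULL) < 0)
		return -errno;
	/* mkdir /build/oldmount */
	if (mkdir(put_old, 0700) < 0)
		return -errno;
	/* cd /build; pivot_root . oldmount */
	if (chdir(new_root) < 0)
		return -errno;
	if (syscall(SYS_pivot_root, ".", "oldmount") < 0)
		return -errno;
	/* umount -l oldmount */
	if (chdir("/") < 0)
		return -errno;
	if (umount2("/oldmount", MNT_DETACH) < 0)
		return -errno;
	return 0;
}
```

A caller with CAP_SYS_ADMIN would run enter_fresh_root("/build") and then
populate the new root; on failure the return value is a negative errno
such as -EPERM.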
> This leads me to the second part of my puzzle. When you have
> multiple namespaces around it can be handy to mount a filesystem
> from a different namespace. Especially if you want to derive
> your new namespace from an old one.
Again, is it acceptable to do this ahead of time before doing
pivot_root (but after cloning the namespace)?
-serge
[-- Attachment #2: chroot_ns.c --]
[-- Type: text/x-csrc, Size: 3637 bytes --]
/*
* chroot_ns.c
* Author: Serge Hallyn <serue@us.ibm.com>
* Date: Jan 25, 2005
*
* This version acts as "chroot" using namespaces.
*
* Usage:
* chroot_ns -u /mnt/d6 mnt
* This will create a new filesystem namespace, make /mnt/d6 the root
* of the filesystem, place the old root under /mnt and immediately
* unmount it, then run /bin/sh in the new filesystem.
*
* Note that pivot_root requires the new root to be under a different
* vfsmount. If you get the following error:
* pivot_root: Device or resource busy
* then try the following command first:
*
* mount --bind <newroot> <newroot>
*
* Now you should be able to call chroot_ns <newroot>.
*
* Copyright (C) 2004 International Business Machines <serue@us.ibm.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
*/
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
#include <linux/unistd.h>
#include <sys/syscall.h>
#include <sys/mount.h>
#ifndef CLONE_NEWNS
#define CLONE_NEWNS 0x00020000
#endif
#ifndef MNT_DETACH
#define MNT_DETACH 0x00000002
#endif
#define MAX_PATH 256
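/*
 * The _syscallN() macros are no longer provided by current kernel headers,
 * so call the raw system calls through syscall(2) instead.
 */
static inline int clone(int flags, int foo)
{
	return syscall(SYS_clone, flags, foo);
}
static int pivot_root(const char *new_root, const char *put_old)
{
	return syscall(SYS_pivot_root, new_root, put_old);
}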
void usage(char *cmd)
{
	printf("Usage: %s [-u] <new_root> [<old_root>] [<command>]\n", cmd);
	printf(" Perform <command> under a new namespace with <new_root>\n");
	printf(" as the root of the filesystem.\n");
	printf(" If -u is specified, the old root will be unmounted before"
	       " <command> is executed.\n");
	/* <old_root> is appended to <new_root> to form the put_old path */
	printf(" <old_root> is relative to the new root.");
	printf(" If unspecified, <old_root> is '/mnt'.\n");
	printf(" If unspecified, <command> is '/bin/sh'.\n");
	exit(-EINVAL);
}
#define OLD_ROOT "mnt"
#define CMD "/bin/sh"
int main(int argc, char *argv[])
{
	int pid = clone(CLONE_NEWNS | SIGCHLD,0);
	int ret;
	char *new_root, *old_root, *cmd, *argv0;
	char full_oldroot[MAX_PATH];
	int do_umount;

	if (pid == -1) {
		fprintf(stderr, "Permission denied on clone.\n");
		fprintf(stderr, "You must have CAP_SYS_ADMIN to clone a"
			" fs namespace.\n");
		exit(-1);
	}
	if (pid != 0) {
		/* parent: wait for the child and pass on its exit status */
		if (waitpid(pid, &ret, 0) == -1 || !WIFEXITED(ret))
			exit(-1);
		exit(WEXITSTATUS(ret));
	}

	/* child: runs in the new namespace from here on */
	argv0 = argv[0];
	if (argc > 1 && strcmp(argv[1], "-u") == 0) {
		do_umount = 1;
		argv++;
		argc--;
	} else
		do_umount = 0;
	if (argc < 2 || strcmp(argv[1], "-h") == 0)
		usage(argv0);
	new_root = argv[1];
	if (argc > 2)
		old_root = argv[2];
	else
		old_root = OLD_ROOT;
	if (argc > 3)
		cmd = argv[3];
	else
		cmd = CMD;
	if (strlen(old_root) + strlen(new_root) >= MAX_PATH-1) {
		printf("paths too long.\n");
		return -1;
	}
	snprintf(full_oldroot, MAX_PATH, "%s/%s", new_root, old_root);

	/* jump into the new root directory */
	printf("going into %s\n", new_root);
	ret = chdir(new_root);
	if (ret) {
		perror("chdir");
		exit(2);
	}

	/* pivot root */
	printf("switching %s and %s\n", new_root, full_oldroot);
	ret = pivot_root(new_root, full_oldroot);
	if (ret) {
		perror("pivot_root");
		printf("Try \"mount --bind %s %s\"\n", new_root, new_root);
		exit(ret);
	}

	/* unmount if requested */
	if (do_umount) {
		ret = umount2(old_root, MNT_DETACH);
		if (ret) {
			perror("umount");
			exit(2);
		}
	}

	/* Execute the command */
	execl(cmd, cmd, NULL);
	perror("execl");
	fprintf(stderr, "Cannot exec %s.\n", cmd);
	exit(-1);
}
* Re: Building a clean namespace and MS_BIND across namespaces is now disabled
2005-11-20 12:23 Building a clean namespace and MS_BIND across namespaces is now disabled Eric W. Biederman
2005-11-20 23:04 ` Serge E. Hallyn
@ 2005-11-21 23:13 ` Ram Pai
1 sibling, 0 replies; 4+ messages in thread
From: Ram Pai @ 2005-11-21 23:13 UTC
To: Eric W. Biederman
Cc: Al Viro, linux-fsdevel, Miklos Szeredi, Christoph Hellwig,
Jamie Lokier
On Sun, 2005-11-20 at 04:23, Eric W. Biederman wrote:
> Currently I am looking at what it takes to build a namespace
> from scratch.
>
> Intuitively I am thinking one of two forms:
>
> > pid = clone(..., CLONE_NEWNS, ...);
> > if (pid == 0) {
> >     umount2("/", MNT_DETACH);
> >     mount(NULL, "/", "ramfs", 0, NULL);
> >     chdir("/");
> >     chroot("/");
> > }
I don't see why this should fail.
When a new namespace is created, the new task's fs->root and fs->pwd
are set appropriately to the corresponding mounts in the new namespace.
Take a look at copy_namespace(). What am I missing?
>
> > root_fd = open("path", O_DIRECTORY | O_RDONLY);
> > pid = clone(..., CLONE_NEWNS, ...);
> > if (pid == 0) {
> >     umount2("/", MNT_DETACH);
> >     fchdir(root_fd);
> >     mount(".", "/", NULL, MS_BIND, NULL);
> >     chroot(".");
> > }
>
> In practice the only form that seems to work is:
>
> > pid = clone(..., CLONE_NEWNS, ...);
> > if (pid == 0) {
> >     chdir("path");
> >     mount(".", ".", NULL, MS_BIND, NULL);
> >     chdir("path");
> >     mount(".", "/", NULL, MS_MOVE, NULL);
> >     chroot(".");
> > }
>
> Both of the failing forms fail miserably because while MNT_DETACH
> works fine, afterwards current->fs->pwd and current->fs->root
> both point to directories that are no longer part of a namespace,
> so check_mnt fails. In addition there appears to be no way to
> set current->fs->pwd or current->fs->root to a valid directory
> in the current namespace afterwards.
I guess your requirement is:
1) create a new namespace
2) get rid of all the mounts in the new namespace
3) stitch new mounts into the new namespace, selectively using the
   ones from the old namespace.
Right? Steps (1) and (2) can be done with the new 2.6.15* kernel.
Step (3) cannot be done because bind mounts across namespaces have been
disabled. But if all you want is to selectively get rid of some
mounts in the new namespace, why not just umount them?
RP
>
> Without some form of unmounting all of the filesystems my
> namespace is cluttered with all kinds of mounts I don't want
> to see, and can never use. By walking through /proc/self/mounts I can
> remove all but /. Even limiting the problem to a stack of mounts
> on / if that stack gets deep enough it is still ugly and confusing
> to look at.
>
> Like the umount case, mount(... "/") also does not
> update current->fs->pwd and current->fs->root. The
> latter can be worked around by using a temporary mount point
> and using MS_MOVE, so the semantics I want are possible
> but I still get a cluttered namespace with junk that is just
> confusing to see.
>
> The least intrusive fix I can think of would be to add a MNT_DETACH
> option to mount so I would be able to request that instead of stacking
> mounts all underlying mounts at the given mount point would be
> unmounted, as the mount is performed.
>
> ...
>
> This leads me to the second part of my puzzle. When you have
> multiple namespaces around it can be handy to mount a filesystem
> from a different namespace. Especially if you want to derive
> your new namespace from an old one.
>
> In most versions of 2.6 this can be implemented by opening
> a directory, and then when you want to mount it:
> fchdir(dir_fd);
> mount(".", "/some/path", NULL, MS_BIND, NULL);
>
> With the latest version of 2.6 this ability was removed in:
> ccd48bc7fac284caf704dcdcafd223a24f70bccf
>
> Is there a correctness implication I am missing here? Since
> you can fchdir to the directory it doesn't look like there are any
> security implications. It looks like any correctness problems were
> fixed in: 68b47139ea94ab6d05e89c654db8daa99e9a232c
>
> Eric