From: Rob Landley <rob-VoJi6FS/r0vR7s880joybQ@public.gmane.org>
To: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
"Eric W. Biederman"
<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: Andrew Vagin <avagin-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
Andrey Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
Linux FS Devel
<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Andrey Vagin <avagin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Alexander Viro
<viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Cyrill Gorcunov
<gorcunov-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
Serge Hallyn
<serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Subject: Re: [PATCH] [RFC] mnt: add ability to clone mntns starting with the current root
Date: Wed, 08 Oct 2014 16:36:01 -0500
Message-ID: <5435AE41.20105@landley.net>
In-Reply-To: <CALCETrVSxYr=Oa29qHNL-GoifS26U8TfpreGY+KN7g926YgHUw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 10/08/14 14:31, Andy Lutomirski wrote:
> On Wed, Oct 8, 2014 at 12:23 PM, Eric W. Biederman
> <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> wrote:
>> Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> writes:
>>>> Maybe we want to say that rootfs should not be used if we are going to
>>>> create containers...
>>
>> Today it is an assumption of the vfs that rootfs is mounted. With
>> rootfs mounted and pivot_root at the base of the mount stack you can
>> make as minimal a set of mounts as the vfs allows.
>>
>> Removing rootfs from the vfs requires an audit of everything that
>> manipulates mounts. It is not remotely a local exercise.
>
> Would it be a less invasive audit to allow different mount namespaces
> to have different rootfses?
I.e., the same way different namespaces have different init tasks?
The abstraction that containers implement here should be logically
consistent.
>>> Could we have an extra rootfs-like fs that is always completely empty,
>>> doesn't allow any writes, and can sit at the bottom of container
>>> namespace hierarchies? If so, and if we add a new syscall that's like
>>> pivot_root (or unshare) but prunes the hierarchy, then we could switch
>>> to that rootfs then.
>>
>> Or equally have something that guarantees that rootfs is empty and
>> read-only at the time the normal root filesystem is mounted. That is
>> certainly a much more localized change if we want to go there.
>>
>> I am half tempted to suggest that mount --move /some/path / be updated
>> to make the old / just go away (perhaps to be replaced with a read-only
>> empty rootfs). That gets us into figuring out if we break userspace
>> which is a big challenge.
>
> Hence my argument for a new syscall or entirely new operation.
I'm still waiting for somebody to explain to me why chroot() shouldn't
be changed to do this instead of adding a new syscall. (At least when
mount namespace support is enabled.)
> mount(2) and friends are way too multiplexed right now. I just found
> yet another security bug due to the insanely complicated semantics of
> the vfs syscalls. (Yes, a different one from the one yesterday.)
As the guy who rewrote busybox mount 3 times, and who just implemented a
brand new one (toybox) from scratch:
It's a bit fiddly, yes.
> A new operation kills several birds with one stone. It could look like:
>
> int mntns_change_root(int dfd, const char *path, int flags);
>
> return -EPERM if chrooted.
Really?
> Returns -EINVAL if path (relative to dfd) isn't a mountpoint.
Requiring that chroot() only be called on mountpoints would break
existing semantics, which gets us back to a new system call instead of
changing the behavior of the existing one.
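
For illustration, here is a rough userspace sketch of the kind of mountpoint test the proposed -EINVAL rule implies. The helper name looks_like_mountpoint() is hypothetical and appears nowhere in the thread. It uses the classic st_dev comparison heuristic, which misses bind mounts from the same filesystem and does not handle mounts on regular files, so it is not how the kernel itself would decide.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not from the thread): returns 1 if the directory
 * at path (resolved relative to dfd) looks like a mount point, 0 if it
 * does not, or a negative errno value on failure.
 *
 * Heuristic only: a directory is treated as a mount point when it lives
 * on a different device than its parent, or when it is its own parent
 * (the root of the caller's view of the filesystem).
 */
int looks_like_mountpoint(int dfd, const char *path)
{
    struct stat self, parent;
    char up[PATH_MAX];
    int n;

    /* Build "<path>/.." so we can stat the parent directory. */
    n = snprintf(up, sizeof(up), "%s/..", path);
    if (n < 0 || (size_t)n >= sizeof(up))
        return -ENAMETOOLONG;

    if (fstatat(dfd, path, &self, 0) < 0)
        return -errno;
    /* Fails with ENOTDIR when path is not a directory. */
    if (fstatat(dfd, up, &parent, 0) < 0)
        return -errno;

    if (self.st_dev != parent.st_dev)
        return 1;               /* crossed a filesystem boundary */
    if (self.st_ino == parent.st_ino)
        return 1;               /* ".." is the directory itself: root */
    return 0;
}
```

Passing AT_FDCWD as dfd resolves path relative to the current directory, mirroring the dfd/path convention in the proposed signature.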
If I recall, the first line of pushback against merging the openvz code
as is was "buckets of new syscalls". Pushback against adding a new
system call is understandable. Why can't we fix chroot() now that we
have the tools to do so?
> Otherwise it disconnects path from the existing
> hierarchy, attaches a permanently-empty read-only rootfs under it,
> makes it the root of the mntns, and does the root refs fixup. The old
> hierarchy gets thrown out.
We have a chroot() syscall. We don't use it for containers because it
doesn't do what we want. Does it currently do what _anybody_ wants?
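
For context, the sequence container runtimes commonly use today instead of chroot() can be sketched roughly as below. This is a minimal sketch, not code from anyone in this thread: the function name enter_new_root() is made up for illustration, it needs CAP_SYS_ADMIN, and it follows the pivot_root(".", ".") idiom described in the pivot_root(2) manual page. As far as I know pivot_root() refuses to run when the current root is the rootfs itself, and the sequence does nothing about rootfs remaining at the base of the mount stack, which is the gap being discussed here.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: move the calling process into a private mount
 * namespace whose root is new_root. Returns 0 on success or a negative
 * errno value on failure.
 */
int enter_new_root(const char *new_root)
{
    /* Get a private copy of the mount tree. */
    if (unshare(CLONE_NEWNS) < 0)
        return -errno;

    /* Stop mount events from propagating back to the parent namespace. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
        return -errno;

    /* pivot_root() requires new_root to be a mount point: bind it onto itself. */
    if (mount(new_root, new_root, NULL, MS_BIND | MS_REC, NULL) < 0)
        return -errno;

    if (chdir(new_root) < 0)
        return -errno;

    /* glibc provides no wrapper for pivot_root(), so use syscall(). */
    if (syscall(SYS_pivot_root, ".", ".") < 0)
        return -errno;

    /* The old root is now stacked at "."; lazily detach it. */
    if (umount2(".", MNT_DETACH) < 0)
        return -errno;

    if (chdir("/") < 0)
        return -errno;

    return 0;
}
```

That is seven steps to approximate what a single chroot()-like call might be expected to do, which is roughly the complaint.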
> Systemd could use this, too.
While that's a strong argument against it, I'm willing to overlook it.
Rob