From: Serge Hallyn <serge.hallyn@ubuntu.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Rob Landley <rob@landley.net>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andrew Vagin <avagin@parallels.com>,
	Andrey Vagin <avagin@openvz.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>,
	Andrey Vagin <avagin@gmail.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Pavel Emelyanov <xemul@parallels.com>,
	Serge Hallyn <serge.hallyn@canonical.com>
Subject: Re: [PATCH] [RFC] mnt: add ability to clone mntns starting with the current root
Date: Wed, 8 Oct 2014 23:38:54 +0000
Message-ID: <20141008233854.GG31366@ubuntumail>
In-Reply-To: <CALCETrXapWTiFw2CC1m43fs9yuHuesXxXtmHh-5F3J_bUYeRxg@mail.gmail.com>

Quoting Andy Lutomirski (luto@amacapital.net):
> On Wed, Oct 8, 2014 at 2:36 PM, Rob Landley <rob@landley.net> wrote:
> > On 10/08/14 14:31, Andy Lutomirski wrote:
> >> On Wed, Oct 8, 2014 at 12:23 PM, Eric W. Biederman
> >> <ebiederm@xmission.com> wrote:
> >>> Andy Lutomirski <luto@amacapital.net> writes:
> >>>>> Maybe we want to say that rootfs should not be used if we are going to
> >>>>> create containers...
> >>>
> >>> Today it is an assumption of the vfs that rootfs is mounted.  With
> >>> rootfs mounted and pivot_root at the base of the mount stack you can
> >>> make as minimal of a set of mounts as the vfs allows.
> >>>
> >>> Removing rootfs from the vfs requires an audit of everything that
> >>> manipulates mounts.  It is not remotely a local exercise.
> >>
> >> Would it be a less invasive audit to allow different mount namespaces
> >> to have different rootfses?
> >
> > I.E. The same way different namespaces have different init tasks?
> >
> > The abstraction containers have implemented here should be logically
> > consistent.
> >
> >>>> Could we have an extra rootfs-like fs that is always completely empty,
> >>>> doesn't allow any writes, and can sit at the bottom of container
> >>>> namespace hierarchies?  If so, and if we add a new syscall that's like
> >>>> pivot_root (or unshare) but prunes the hierarchy, then we could switch
> >>>> to that rootfs then.
> >>>
> >>> Or equally have something that guarantees that rootfs is empty and
> >>> read-only at the time the normal root filesystem is mounted.  That is
> >>> certainly a much more localized change if we want to go there.
> >>>
> >>> I am half tempted to suggest that mount --move /some/path / be updated
> >>> to make the old / just go away (perhaps to be replaced with a read-only
> >>> empty rootfs).  That gets us into figuring out if we break userspace
> >>> which is a big challenge.
> >>
> >> Hence my argument for a new syscall or entirely new operation.
> >
> > I'm still waiting for somebody to explain to me why chroot() shouldn't
> > be changed to do this instead of adding a new syscall. (At least when
> > mount namespace support is enabled.)
> 
> Because chroot has no effect on the namespace at all.  If you fork and
> the child chroots, the parent isn't chrooted.  And, more importantly
> for my example, if a process has its cwd as /foo, and then it forks
> and the child chroots, then the parent's ".." isn't changed as a result of
> the chroot.
> 
> >
> >> mount(2) and friends are way too multiplexed right now.  I just found
> >> yet another security bug due to the insanely complicated semantics of
> >> the vfs syscalls.  (Yes, a different one from the one yesterday.)
> >
> > As the guy who rewrote busybox mount 3 times, and who just implemented a
> > brand new one (toybox) from scratch:
> >
> > It's a bit fiddly, yes.
> >
> >> A new operation kills several birds with one stone.  It could look like:
> >>
> >> int mntns_change_root(int dfd, const char *path, int flags);
> >>
> >> return -EPERM if chrooted.
> >
> > Really?
> 
> Now that CVE-2014-7970 is public: what the heck is pivot_root supposed
> to do if the caller is chrooted?  The current behavior is obviously
> incorrect (it leaks memory), but it's not entirely clear to me what
> should happen.  I think it should either be disallowed or should have
> well-defined semantics.
> 
> For simplicity, if a new syscall for this is added, then I think that
> the caller-is-chrooted case should be disallowed.  If someone needs it
> and can articulate what the semantics should be, then I have no
> problem with allowing it going forward.

It's not that I'd have a need for that, but rather if for some
reason I started out chrooted due to some bogus initramfs, I'd
prefer to not have to feel like a criminal and escape the chroot
first.
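
The point quoted above, that chroot() changes only the calling process's root directory and leaves the parent (and the mount namespace) alone, can be checked from userspace. The following is a minimal sketch, not code from this thread: the helper name parent_root_unchanged_after_child_chroot() and its return convention are made up for illustration. It forks, lets the child attempt chroot(), and then compares the device and inode numbers of "/" in the parent before and after.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): fork a child that tries to
 * chroot() into new_root, then check that the parent's "/" still names
 * the same directory as before the fork.
 *
 * Returns 1 if the parent's root is unchanged, 0 if it changed, and -1 on
 * error. If child_chrooted is non-NULL, it is set to 1 when the child's
 * chroot() succeeded and to 0 when it failed (for example with EPERM when
 * the caller lacks CAP_SYS_CHROOT).
 */
int parent_root_unchanged_after_child_chroot(const char *new_root,
                                             int *child_chrooted)
{
    struct stat before, after;
    int status;
    pid_t pid;

    assert(new_root != NULL);

    if (stat("/", &before) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: chroot() affects only this process's root directory. */
        if (chroot(new_root) != 0)
            _exit(1);
        if (chdir("/") != 0)
            _exit(2);
        _exit(0);
    }

    /* Parent: wait for the child, then look at "/" again. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (child_chrooted)
        *child_chrooted = WIFEXITED(status) && WEXITSTATUS(status) == 0;

    if (stat("/", &after) != 0)
        return -1;

    return before.st_dev == after.st_dev && before.st_ino == after.st_ino;
}
```

Calling the helper with an existing directory such as "/tmp" should return 1 whether or not the child had the privilege to chroot. Run as root, the child's chroot() succeeds and the parent's root is still untouched, which is why chroot() on its own cannot prune a mount namespace.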
