From: "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Michael Kerrisk
	<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Subject: Re: [PATCH] Forbid invocation of kexec_load() outside initial PID namespace
Date: Mon, 6 Aug 2012 19:24:46 +0000
Message-ID: <20120806192446.GA29269@mail.hallyn.com>
In-Reply-To: <87r4rjn84y.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>

Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:
> 
> > Quoting Daniel P. Berrange (berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> >> From: "Daniel P. Berrange" <berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> >> 
> >> The following commit
> >> 
> >>     commit cf3f89214ef6a33fad60856bc5ffd7bb2fc4709b
> >>     Author: Daniel Lezcano <daniel.lezcano-GANU6spQydw@public.gmane.org>
> >>     Date:   Wed Mar 28 14:42:51 2012 -0700
> >> 
> >>     pidns: add reboot_pid_ns() to handle the reboot syscall
> >> 
> >> introduced custom handling of the reboot() syscall when invoked
> >> from a non-initial PID namespace. The intent was that a process
> >> in a container can be allowed to keep CAP_SYS_BOOT and execute
> >> reboot() to shutdown/reboot just their private container, rather
> >> than the host.
> >> 
> >> Unfortunately the kexec_load() syscall also relies on the
> >> CAP_SYS_BOOT capability. So by allowing a container to keep
> >> this capability to safely invoke reboot(), they mistakenly
> >> also gain the ability to use kexec_load(). The solution is
> >> to make kexec_load() return -EPERM if invoked from a PID
> >> namespace that is not the initial namespace.
> >> 
> >> Signed-off-by: Daniel P. Berrange <berrange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> >> Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> >
> > Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> >
> > (Please see my previous email explaining why I believe the pidns
> > is an appropriate check)
> 
> Serge as to your objections.
> 
> If we define kexec_load in terms of the pid namespace then something
> makes sense, but the error should be EINVAL, or something of that sort.

Makes sense.
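
To make the EPERM/EINVAL distinction concrete, here is a rough userspace
sketch of the check order being discussed.  It is not kernel code:
struct caller and kexec_load_precheck() are hypothetical names invented
for illustration, and the two booleans merely stand in for the real
capability and PID namespace tests.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the calling task; not a kernel structure. */
struct caller {
	bool has_cap_sys_boot;	/* models "caller holds CAP_SYS_BOOT" */
	bool in_init_pid_ns;	/* models "caller is in the initial PID namespace" */
};

/*
 * Model of the entry checks discussed for kexec_load().
 * Returns 0 if the call may proceed, or a negative errno value.
 */
static int kexec_load_precheck(const struct caller *c)
{
	/* Permission failure: the caller does not hold the capability. */
	if (!c->has_cap_sys_boot)
		return -EPERM;

	/*
	 * The caller does have permission, but the operation is defined
	 * only for the initial PID namespace, so this is an invalid
	 * request (EINVAL) rather than a permission problem (EPERM).
	 */
	if (!c->in_init_pid_ns)
		return -EINVAL;

	return 0;
}
```

Under these assumptions, a container task that keeps CAP_SYS_BOOT gets
-EINVAL, while a task without the capability gets -EPERM regardless of
its namespace.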

> That is what we did with reboot.  We defined reboot in terms of the pid
> namespace.
> 
> Not defining kexec_load in terms of the pid namespace and then returning
> EPERM because we happen to have CAP_SYS_BOOT for other reasons is
> semantically horrible.
> 
> At the end of the day the effect is the same, but I think it matters a
> great deal in how we think about things.
> 
> We have CAP_SYS_BOOT in the initial user namespace.  We do have
> permission to make the system call.
> 
> So I continue to see this patch the way it is currently constructed as
> broken.
> 
> Nacked-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>

I do also prefer splitting the capability.  Michael Kerrisk, do you
have any good suggestions for better names than CAP_RESTART (for
killing or restarting /sbin/init) and CAP_BOOT (for kexec and/or
hardware resets)?  Maybe CAP_RESTART_USER and CAP_RESTART_HW?
(CAP_SYS_BOOT being an alias for both for backward compatibility)
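
As a rough illustration of the split, the following userspace sketch
models the two proposed capabilities as bits, with the old capability
as an alias for both.  Everything here is an assumption made for
illustration: the bit values are not real kernel capability numbers,
CAP_SYS_BOOT_COMPAT is a made-up name for the alias, and the helper
functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; NOT real kernel capability numbers. */
#define CAP_RESTART_USER	(UINT32_C(1) << 0)	/* kill or restart /sbin/init */
#define CAP_RESTART_HW		(UINT32_C(1) << 1)	/* kexec and/or hardware resets */

/* Backward-compatible alias: the old capability grants both halves. */
#define CAP_SYS_BOOT_COMPAT	(CAP_RESTART_USER | CAP_RESTART_HW)

/* True if every bit in 'needed' is present in 'held'. */
static bool has_caps(uint32_t held, uint32_t needed)
{
	return (held & needed) == needed;
}

/* Restarting a container's init needs only the "user" half. */
static bool may_restart_init(uint32_t held)
{
	return has_caps(held, CAP_RESTART_USER);
}

/* Loading a kexec image needs the "hardware" half. */
static bool may_kexec_load(uint32_t held)
{
	return has_caps(held, CAP_RESTART_HW);
}
```

With this model a container granted only CAP_RESTART_USER can restart
its own init but cannot load a kexec image, while existing users that
hold the combined alias keep both abilities.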


