public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Serge Hallyn <serge.hallyn@canonical.com>
To: "Bruno Prémont" <bonbons@linux-vserver.org>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>,
	containers@lists.linux-foundation.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	LXC Development <Lxc-devel@lists.sourceforge.net>
Subject: Re: [RFC] catching sys_reboot syscall
Date: Thu, 11 Aug 2011 13:10:23 -0500	[thread overview]
Message-ID: <20110811181022.GA12307@peqn> (raw)
In-Reply-To: <20110811190456.77ff9280@neptune.home>

Quoting Bruno Prémont (bonbons@linux-vserver.org):
> On Thu, 11 August 2011 Daniel Lezcano <daniel.lezcano@free.fr> wrote:
> > On 08/11/2011 06:30 PM, Bruno Prémont wrote:
> > > On Wed, 10 August 2011 Daniel Lezcano <daniel.lezcano@free.fr> wrote:
> > >> On 08/10/2011 10:10 PM, Bruno Prémont wrote:
> > >>> Hi Daniel,
> > >>>
> > >>> [I'm adding containers ml as we had a discussion there some time ago
> > >>>  for this feature]
> > >> [ ... ]
> > >>
> > >>>> +    if (cmd == LINUX_REBOOT_CMD_RESTART2)
> > >>>> +        if (strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1) < 0)
> > >>>> +            return -EFAULT;
> > >>>> +
> > >>>> +    /* If we are not in the initial pid namespace, we send a signal
> > >>>> +     * to the parent of this init pid namespace, notifying a shutdown
> > >>>> +     * occured */
> > >>>> +    if (pid_ns != &init_pid_ns)
> > >>>> +        pid_namespace_reboot(pid_ns, cmd, buffer);
> > >>> Should there be a return here?
> > >>> Or does pid_namespace_reboot() never return by submitting signal to
> > >>> parent?
> > >> Yes, it does not return a value, like 'do_notify_parent_cldstop'
> > > So execution flow continues reaching the whole "host reboot code"?
> > >
> > > That's not so good as it then prevents using CAP_SYS_BOOT inside PID namespace
> > > to limit access to rebooting the container from inside as giving a process
> > > inside container CAP_SYS_BOOT would cause host to reboot (and when not given
> > > process inside container would get -EPERM in all cases).
> > >
> > > Wouldn't the following be better?:
> > > ...
> > > +
> > > +    /* We only trust the superuser with rebooting the system. */
> > > +    if (!capable(CAP_SYS_BOOT))
> > > +        return -EPERM;
> > > +
> > > +    /* If we are not in the initial pid namespace, we send a signal
> > > +     * to the parent of this init pid namespace, notifying a shutdown
> > > +     * occured */
> > > +    if (pid_ns != &init_pid_ns) {
> > > +        pid_namespace_reboot(pid_ns, cmd, buffer);
> > > +        return 0;
> > > +    }
> > > +
> > >      mutex_lock(&reboot_mutex);
> > >      switch (cmd) {
> > > ...
> > >
> > >
> > > If I misunderstood, please correct me.
> > 
> > Yep, this is what I did at the beginning but I realized I was closing
> > the door for future applications using the pid namespaces. The pid
> > namespace could be used by another kind of application, not a container,
> > running some administrative tasks so they may want to shutdown the host
> > from a different pid namespace.
> > 
> > For this reason, to prevent this execution flow, the container has to
> > drop the CAP_SYS_BOOT in addition of taking care of the SIGCHLD signal
> > with CLDREBOOT.
> 
> Ok, though for later source code readers to know adding/extending comment
> would be nice.
> Maybe something like
> 
> +    /* If we are not in the initial pid namespace, we send a signal
> +     * to the parent of this init pid namespace, notifying a shutdown
> +     * occured
> +     * NOTE: if process has CAP_SYS_BOOT it will additionally have the
> +     * same effect as if it was not namespaced */
> 
> 
> How would all of this integrate with the ongoing work on user namespaces?
> Maybe that one should later be the differentiator for who may or may not
> trigger the host reboot.

Right, then you'll be able to do:

	if (ns_capable(current_pid_ns()->user_ns, CAP_SYS_BOOT)) {
		// do container reboot stuff
	}

	if (!capable(CAP_SYS_BOOT)
		return;

	// do host reboot stuff

The first checks whether the task has privilege against the user ns
which owns his pid_ns.

The second one is a synonym for

	if (!ns_capable(&init_user_ns, CAP_SYS_BOOT))

where init_user_ns is the owner of all physical resources.

Right now pid_ns doesn't yet have a user_ns owner (except in my patch over
here: http://kernel.ubuntu.com/git?p=serge/linux-syslogns.git;a=commit;h=63556e9a39bcd75ec4a88333425800905013c73e )
and, if it did, well you can't yet do enough in a user namespace to run
a container.  But that'll be the ideal.  Hopefully soon...

-serge

  parent reply	other threads:[~2011-08-11 18:10 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-08-08 21:14 [RFC] catching sys_reboot syscall Daniel Lezcano
2011-08-10 20:10 ` Bruno Prémont
2011-08-10 20:49   ` Daniel Lezcano
2011-08-11 16:30     ` Bruno Prémont
2011-08-11 16:49       ` Daniel Lezcano
2011-08-11 17:04         ` Bruno Prémont
2011-08-11 18:10           ` [lxc-devel] " Daniel Lezcano
2011-08-11 18:10           ` Serge Hallyn [this message]
2011-08-11 18:40             ` [PATCH] add pid->user_ns Serge Hallyn
2011-08-20 11:03 ` [RFC] catching sys_reboot syscall Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110811181022.GA12307@peqn \
    --to=serge.hallyn@canonical.com \
    --cc=Lxc-devel@lists.sourceforge.net \
    --cc=bonbons@linux-vserver.org \
    --cc=containers@lists.linux-foundation.org \
    --cc=daniel.lezcano@free.fr \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox