From: Andrew Morton <akpm@linux-foundation.org>
To: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: serge.hallyn@canonical.com, oleg@redhat.com,
containers@lists.linux-foundation.org, gkurz@fr.ibm.com,
linux-kernel@vger.kernel.org,
Michael Kerrisk <mtk.manpages@gmail.com>,
Ulrich Drepper <drepper@redhat.com>
Subject: Re: [PATCH 1/1][V3] Add reboot_pid_ns to handle the reboot syscall
Date: Tue, 6 Dec 2011 17:16:17 -0800 [thread overview]
Message-ID: <20111206171617.e31bc3a6.akpm@linux-foundation.org> (raw)
In-Reply-To: <1323030290-22216-2-git-send-email-daniel.lezcano@free.fr>
On Sun, 4 Dec 2011 21:24:50 +0100
Daniel Lezcano <daniel.lezcano@free.fr> wrote:
> In the case of a child pid namespace, rebooting the system does not
> really makes sense. When the pid namespace is used in conjunction
> with the other namespaces in order to create a linux container, the
> reboot syscall leads to some problems.
>
> A container can reboot the host. That can be fixed by dropping
> the sys_reboot capability but we are unable to correctly to poweroff/
> halt/reboot a container and the container stays stuck at the shutdown
> time with the container's init process waiting indefinitively.
>
> After several attempts, no solution from userspace was found to reliabily
> handle the shutdown from a container.
>
> This patch propose to store the reboot value in the 16 upper bits of the
> exit code from the processes belonging to a pid namespace which has
> rebooted. When the reboot syscall is called and we are not in the initial
> pid namespace, we kill the pid namespace.
>
> By this way the parent process of the child pid namespace to know if
> it rebooted or not and take the right decision.
hm, modifying the exit code in this manner is a strange interface. I
didn't see that coming. Perhaps some additional justification for this
idea should be added to the changelog, along with discussion of
alternative schemes. I don't immediately see any problems with it,
but, odd... I wonder what potential it has to upset existing
userspace.
Also, do we need to reserve all upper 16 bits of the exit code? It
would be better to formally leave some free for future use.
Also, this alters the interface of wait4() and friends, yes? If so,
that should be documented in the manpages. Please Cc Michael on such
changes.
Also, this affects the data delivered by taskstats, I believe. Please
check this, test it, document it in the changelog and update
getdelays.c appropriately.
Also, glibc might be affected. For symmetry we might want to add a
WIFREBOOT() or something. And we now expect waitid() to fill in extra
bits in siginfo_t.si_status, which assumes that glibc (and other
libc's!) aren't using a u8 in there somewhere. etcetera. This all
should be tested, and reviewed by Uli (please).
>
> ...
>
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -444,6 +444,9 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
> magic2 != LINUX_REBOOT_MAGIC2C))
> return -EINVAL;
>
> + if (task_active_pid_ns(current) != &init_pid_ns)
> + return reboot_pid_ns(task_active_pid_ns(current), cmd);
This adds a bunch of useless code if CONFIG_PID_NS=n. It would be
better to do
#ifdef CONFIG_PID_NS
extern void pidns_handle_reboot(int cmd);
#else
static inline void pidns_handle_reboot(int cmd)
{
}
#endif
next prev parent reply other threads:[~2011-12-07 1:16 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-04 20:24 [PATCH 0/1][V3] Handle reboot in a child pid namespace Daniel Lezcano
2011-12-04 20:24 ` [PATCH 1/1][V3] Add reboot_pid_ns to handle the reboot syscall Daniel Lezcano
2011-12-05 18:35 ` Serge Hallyn
2011-12-05 20:42 ` Oleg Nesterov
2011-12-05 21:16 ` Daniel Lezcano
2011-12-05 21:17 ` Daniel Lezcano
2011-12-07 1:16 ` Andrew Morton [this message]
2011-12-07 15:12 ` Oleg Nesterov
2011-12-07 21:36 ` Daniel Lezcano
2011-12-04 21:27 ` [PATCH 0/1][V3] Handle reboot in a child pid namespace Henrique de Moraes Holschuh
2011-12-04 23:08 ` Daniel Lezcano
2011-12-05 20:49 ` Daniel Lezcano
2011-12-05 20:51 ` Oleg Nesterov
2011-12-05 20:50 ` Oleg Nesterov
2011-12-05 22:38 ` Miquel van Smoorenburg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111206171617.e31bc3a6.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=containers@lists.linux-foundation.org \
--cc=daniel.lezcano@free.fr \
--cc=drepper@redhat.com \
--cc=gkurz@fr.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mtk.manpages@gmail.com \
--cc=oleg@redhat.com \
--cc=serge.hallyn@canonical.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox