From: ebiederm@xmission.com (Eric W. Biederman)
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Pavel Emelyanov <xemul@parallels.com>,
Cyrill Gorcunov <gorcunov@openvz.org>,
Louis Rilling <louis.rilling@kerlabs.com>,
Mike Galbraith <efault@gmx.de>
Subject: [PATCH 2/3] pidns: Guarantee that the pidns init will be the last pidns process reaped.
Date: Sun, 06 May 2012 17:35:02 -0700
Message-ID: <8762c87rrd.fsf_-_@xmission.com>
In-Reply-To: <87havs7rvv.fsf_-_@xmission.com> (Eric W. Biederman's message of "Sun, 06 May 2012 17:32:20 -0700")
This change extends the thread group zombie leader logic to work for pid
namespaces. The task with pid 1 is declared the pid namespace leader.
A pid namespace with no more processes is detected by observing that the
init task is a zombie in an empty thread group, and that the init task
has no children.

Instead of moving lingering EXIT_DEAD tasks off of init's ->children
list, we now block init from exiting until those children have
self-reaped and removed themselves. This guarantees that the init task
is the last task in a pid namespace to be reaped.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
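For readers following the logic outside the kernel tree, here is a minimal
user-space sketch of the two conditions this patch relies on. It is not
kernel code: struct toy_task, the toy_* helpers and the TOY_* states are
hypothetical stand-ins, and the integer counters only approximate
thread_group_empty() and list_empty(&tsk->children).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few task_struct facts the patch tests. */
enum toy_exit_state { TOY_RUNNING, TOY_ZOMBIE, TOY_DEAD };

struct toy_task {
	bool is_pidns_init;	/* models is_child_reaper(task_pid(tsk)) */
	int other_threads;	/* > 0 models !thread_group_empty(tsk) */
	int children;		/* > 0 models !list_empty(&tsk->children) */
	enum toy_exit_state exit_state;
};

/* Models delay_pidns_leader(): init waits while threads or children remain. */
static bool toy_delay_pidns_leader(const struct toy_task *t)
{
	assert(t != NULL);
	return t->is_pidns_init && (t->other_threads > 0 || t->children > 0);
}

/* Models the release_task() check: a zombie leader with nothing left behind. */
static bool toy_leader_ready(const struct toy_task *leader)
{
	assert(leader != NULL);
	return leader->other_threads == 0 &&
	       leader->exit_state == TOY_ZOMBIE &&
	       leader->children == 0;
}

/*
 * Models one lingering child removing itself from init's children list.
 * Returns true when that child was the last one, i.e. when it is now the
 * child's job to notify or release the zombie init.
 */
static bool toy_child_self_reaps(struct toy_task *init)
{
	assert(init != NULL && init->children > 0);
	init->children--;
	return toy_leader_ready(init);
}
```

In this model, a zombie init with two lingering children stays delayed after
the first child goes away and becomes ready only when the second one does,
which is the ordering the changelog describes.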
kernel/exit.c | 46 +++++++++++++++++++++++++++++++++++-----------
1 files changed, 35 insertions(+), 11 deletions(-)
diff --git a/kernel/exit.c b/kernel/exit.c
index d8bd3b42..7269260 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -164,6 +164,16 @@ static void delayed_put_task_struct(struct rcu_head *rhp)
put_task_struct(tsk);
}
+static bool pidns_leader(struct task_struct *tsk)
+{
+ return is_child_reaper(task_pid(tsk));
+}
+
+static bool delay_pidns_leader(struct task_struct *tsk)
+{
+ return pidns_leader(tsk) &&
+ (!thread_group_empty(tsk) || !list_empty(&tsk->children));
+}
void release_task(struct task_struct * p)
{
@@ -183,15 +193,23 @@ repeat:
__exit_signal(p);
/*
- * If we are the last non-leader member of the thread
- * group, and the leader is zombie, then notify the
- * group leader's parent process. (if it wants notification.)
+ * If we are the last non-leader member of the thread group,
+ * or the last non-leader member of the pid namespace, and the
+ * leader is zombie, then notify the leader's parent
+ * process. (if it wants notification.)
*/
zap_leader = 0;
- leader = p->group_leader;
- if (leader != p && thread_group_empty(leader) && leader->exit_state == EXIT_ZOMBIE) {
+ leader = NULL;
+ /* Do we need to worry about our thread_group or our pidns leader? */
+ if (p != p->group_leader)
+ leader = p->group_leader;
+ else if (pidns_leader(p->real_parent))
+ leader = p->real_parent;
+
+ if (leader && thread_group_empty(leader) &&
+ leader->exit_state == EXIT_ZOMBIE && list_empty(&leader->children)) {
/*
- * If we were the last child thread and the leader has
+ * If we were the last task in the group and the leader has
* exited already, and the leader's parent ignores SIGCHLD,
* then we are the one who should release the leader.
*/
@@ -720,11 +738,10 @@ static struct task_struct *find_new_reaper(struct task_struct *father)
zap_pid_ns_processes(pid_ns);
write_lock_irq(&tasklist_lock);
/*
- * We can not clear ->child_reaper or leave it alone.
- * There may by stealth EXIT_DEAD tasks on ->children,
- * forget_original_parent() must move them somewhere.
+ * Move all lingering EXIT_DEAD tasks onto the
+ * children list of init's thread group leader.
*/
- pid_ns->child_reaper = init_pid_ns.child_reaper;
+ pid_ns->child_reaper = father->group_leader;
} else if (father->signal->has_child_subreaper) {
struct task_struct *reaper;
@@ -798,6 +815,12 @@ static void forget_original_parent(struct task_struct *father)
exit_ptrace(father);
reaper = find_new_reaper(father);
+ /* Return immediately if we aren't going to reparent anything */
+ if (unlikely(reaper == father)) {
+ write_unlock_irq(&tasklist_lock);
+ return;
+ }
+
list_for_each_entry_safe(p, n, &father->children, sibling) {
struct task_struct *t = p;
do {
@@ -853,6 +876,7 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
autoreap = do_notify_parent(tsk, sig);
} else if (thread_group_leader(tsk)) {
autoreap = thread_group_empty(tsk) &&
+ !delay_pidns_leader(tsk) &&
do_notify_parent(tsk, tsk->exit_signal);
} else {
autoreap = true;
@@ -1579,7 +1603,7 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
}
/* we don't reap group leaders with subthreads */
- if (!delay_group_leader(p))
+ if (!delay_group_leader(p) && !delay_pidns_leader(p))
return wait_task_zombie(wo, p);
/*
--
1.7.5.4