From: "Serge E. Hallyn" <serue@us.ibm.com>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Pavel Emelyanov <xemul@openvz.org>,
Robert Rex <robert.rex@exasol.com>,
Roland McGrath <roland@redhat.com>,
Sukadev Bhattiprolu <sukadev@us.ibm.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] pid_ns: (BUG 11391) change ->child_reaper when init->group_leader exits
Date: Tue, 26 Aug 2008 21:24:03 -0500
Message-ID: <20080827022403.GA2320@us.ibm.com>
In-Reply-To: <20080824154912.GA3780@tv-sign.ru>

Quoting Oleg Nesterov (oleg@tv-sign.ru):
> We don't change pid_ns->child_reaper when the main thread of the
> subnamespace init exits. As Robert Rex <robert.rex@exasol.com>
> pointed out this is wrong.

I think there was another way this could go wrong as well: if the main
thread exited but its sighand had another user, then child_reaper would
not be changed, zap_pid_ns_processes() would not be called, and the
children would continue to run and, at death, try to signal the dead
main thread.

> Yes, the re-parenting itself works correctly, but if the reparented
> task exits it needs ->parent->nsproxy->pid_ns in do_notify_parent(),
> and if the main thread is zombie its ->nsproxy was already cleared
> by exit_task_namespaces().
>
> Introduce the new function, find_new_reaper(), which finds the new
> ->parent for the re-parenting and changes ->child_reaper if needed.
> Kill the now unneeded exit_child_reaper().
>
> Also move the changing of ->child_reaper from zap_pid_ns_processes()
> to find_new_reaper(), this consolidates the games with ->child_reaper
> and makes it stable under tasklist_lock.
>
> Reported-by: Robert Rex <robert.rex@exasol.com>
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

Sat on this for a bit - the only thing about this that worried me was
that zap_pid_ns_processes() is moved much further down in do_exit().
After looking I see no reason why that would be a problem, though.

Acked-by: Serge Hallyn <serue@us.ibm.com>

Thanks, Oleg.

>
> kernel/pid_namespace.c | 6 ---
> kernel/exit.c | 78 +++++++++++++++++++++----------------------------
> 2 files changed, 34 insertions(+), 50 deletions(-)
>
> --- 2.6.27-rc4/kernel/pid_namespace.c~2_SWITCH_REAPER 2008-08-24 17:22:59.000000000 +0400
> +++ 2.6.27-rc4/kernel/pid_namespace.c 2008-08-24 18:22:44.000000000 +0400
> @@ -179,12 +179,6 @@ void zap_pid_ns_processes(struct pid_nam
> rc = sys_wait4(-1, NULL, __WALL, NULL);
> } while (rc != -ECHILD);
>
> - /*
> - * We can not clear ->child_reaper or leave it alone.
> - * There may by stealth EXIT_DEAD tasks on ->children,
> - * forget_original_parent() must move them somewhere.
> - */
> - pid_ns->child_reaper = init_pid_ns.child_reaper;
> acct_exit_ns(pid_ns);
> return;
> }
> --- 2.6.27-rc4/kernel/exit.c~2_SWITCH_REAPER 2008-08-24 16:46:51.000000000 +0400
> +++ 2.6.27-rc4/kernel/exit.c 2008-08-24 18:22:37.000000000 +0400
> @@ -831,26 +831,50 @@ static void reparent_thread(struct task_
> * the child reaper process (ie "init") in our pid
> * space.
> */
> +static struct task_struct *find_new_reaper(struct task_struct *father)
> +{
> + struct pid_namespace *pid_ns = task_active_pid_ns(father);
> + struct task_struct *thread;
> +
> + thread = father;
> + while_each_thread(father, thread) {
> + if (thread->flags & PF_EXITING)
> + continue;
> + if (unlikely(pid_ns->child_reaper == father))
> + pid_ns->child_reaper = thread;
> + return thread;
> + }
> +
> + if (unlikely(pid_ns->child_reaper == father)) {
> + write_unlock_irq(&tasklist_lock);
> + if (unlikely(pid_ns == &init_pid_ns))
> + panic("Attempted to kill init!");
> +
> + zap_pid_ns_processes(pid_ns);
> + write_lock_irq(&tasklist_lock);
> + /*
> + * We can not clear ->child_reaper or leave it alone.
> + * There may by stealth EXIT_DEAD tasks on ->children,
> + * forget_original_parent() must move them somewhere.
> + */
> + pid_ns->child_reaper = init_pid_ns.child_reaper;
> + }
> +
> + return pid_ns->child_reaper;
> +}
> +
> static void forget_original_parent(struct task_struct *father)
> {
> - struct task_struct *p, *n, *reaper = father;
> + struct task_struct *p, *n, *reaper;
> LIST_HEAD(ptrace_dead);
>
> write_lock_irq(&tasklist_lock);
> -
> + reaper = find_new_reaper(father);
> /*
> * First clean up ptrace if we were using it.
> */
> ptrace_exit(father, &ptrace_dead);
>
> - do {
> - reaper = next_thread(reaper);
> - if (reaper == father) {
> - reaper = task_child_reaper(father);
> - break;
> - }
> - } while (reaper->flags & PF_EXITING);
> -
> list_for_each_entry_safe(p, n, &father->children, sibling) {
> p->real_parent = reaper;
> if (p->parent == father) {
> @@ -959,39 +983,6 @@ static void check_stack_usage(void)
> static inline void check_stack_usage(void) {}
> #endif
>
> -static inline void exit_child_reaper(struct task_struct *tsk)
> -{
> - if (likely(tsk->group_leader != task_child_reaper(tsk)))
> - return;
> -
> - if (tsk->nsproxy->pid_ns == &init_pid_ns)
> - panic("Attempted to kill init!");
> -
> - /*
> - * @tsk is the last thread in the 'cgroup-init' and is exiting.
> - * Terminate all remaining processes in the namespace and reap them
> - * before exiting @tsk.
> - *
> - * Note that @tsk (last thread of cgroup-init) may not necessarily
> - * be the child-reaper (i.e main thread of cgroup-init) of the
> - * namespace i.e the child_reaper may have already exited.
> - *
> - * Even after a child_reaper exits, we let it inherit orphaned children,
> - * because, pid_ns->child_reaper remains valid as long as there is
> - * at least one living sub-thread in the cgroup init.
> -
> - * This living sub-thread of the cgroup-init will be notified when
> - * a child inherited by the 'child-reaper' exits (do_notify_parent()
> - * uses __group_send_sig_info()). Further, when reaping child processes,
> - * do_wait() iterates over children of all living sub threads.
> -
> - * i.e even though 'child_reaper' thread is listed as the parent of the
> - * orphaned children, any living sub-thread in the cgroup-init can
> - * perform the role of the child_reaper.
> - */
> - zap_pid_ns_processes(tsk->nsproxy->pid_ns);
> -}
> -
> NORET_TYPE void do_exit(long code)
> {
> struct task_struct *tsk = current;
> @@ -1051,7 +1042,6 @@ NORET_TYPE void do_exit(long code)
> }
> group_dead = atomic_dec_and_test(&tsk->signal->live);
> if (group_dead) {
> - exit_child_reaper(tsk);
> hrtimer_cancel(&tsk->signal->real_timer);
> exit_itimers(tsk->signal);
> }
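
For readers following along, the selection order in find_new_reaper() above
can be sketched in user space roughly as below. This is only an illustrative
model, not kernel code: struct task, struct pid_ns, pick_reaper() and the
fallback argument are hypothetical stand-ins (fallback plays the role of
init_pid_ns.child_reaper), and the locking, the panic for the global init,
and the zap_pid_ns_processes() call are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; NOT the kernel's task_struct. */
struct task {
	const char *name;
	bool exiting;		/* models PF_EXITING */
	struct task *next;	/* circular thread-group list, models next_thread() */
};

struct pid_ns {
	struct task *child_reaper;
};

/*
 * Models the selection order of find_new_reaper():
 *  1. prefer the first sibling thread that is not exiting, and hand it
 *     the child_reaper role if the exiting father held it;
 *  2. otherwise, if father was the namespace reaper, fall back to the
 *     reaper passed in (assumed to model init_pid_ns.child_reaper);
 *  3. return whatever the namespace reaper is now.
 */
struct task *pick_reaper(struct task *father, struct pid_ns *ns,
			 struct task *fallback)
{
	for (struct task *t = father->next; t != father; t = t->next) {
		if (t->exiting)
			continue;
		if (ns->child_reaper == father)
			ns->child_reaper = t;
		return t;
	}

	if (ns->child_reaper == father) {
		/* The real code zaps the namespace at this point. */
		ns->child_reaper = fallback;
	}

	return ns->child_reaper;
}
```

With a three-thread group whose main thread is the namespace reaper, an
exiting main thread hands the role to the first live sibling; only when the
last thread of the group exits does the namespace fall back to the outer
reaper.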