From: Oleg Nesterov <oleg@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
LKML <linux-kernel@vger.kernel.org>,
Pavel Emelyanov <xemul@parallels.com>,
Cyrill Gorcunov <gorcunov@openvz.org>,
Louis Rilling <louis.rilling@kerlabs.com>,
Mike Galbraith <efault@gmx.de>
Subject: Re: [PATCH] pidns: Guarantee that the pidns init will be the last pidns process reaped. v2
Date: Wed, 23 May 2012 16:52:39 +0200 [thread overview]
Message-ID: <20120523145239.GA20378@redhat.com> (raw)
In-Reply-To: <20120522122315.c3f2118c.akpm@linux-foundation.org>
On 05/22, Andrew Morton wrote:
>
> On Mon, 21 May 2012 18:20:31 -0600
> ebiederm@xmission.com (Eric W. Biederman) wrote:
>
> > + /* If we are the last child process in a pid namespace
>
> like this:
> /*
> * If ...
>
> > + * to be reaped notify the child_reaper.
>
> s/reaped/reaped,/
>
> More seriously, it isn't a very good comment. It tells us "what" the
> code is doing (which is pretty obvious from reading it), but it didn't
> tell us "why" it is doing this. Why do PID namespaces need special
> handling here? What's the backstory??
Well, this is documented in the changelog but I agree, this needs some
documentation.
Perhaps zap_pid_ns_processes() should document that it waits for the
stealth EXIT_DEAD tasks, and __unhash_process() can simply say
"see zap_pid_ns_processes()".
> > @@ -189,6 +189,17 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
> > rc = sys_wait4(-1, NULL, __WALL, NULL);
> > } while (rc != -ECHILD);
> >
> > + read_lock(&tasklist_lock);
> > + for (;;) {
> > + __set_current_state(TASK_UNINTERRUPTIBLE);
> > + if (list_empty(¤t->children))
> > + break;
> > + read_unlock(&tasklist_lock);
> > + schedule();
> > + read_lock(&tasklist_lock);
> > + }
> > + read_unlock(&tasklist_lock);
>
> Well.
>
> a) This loop can leave the thread in state TASK_UNINTERRUPTIBLE,
> which looks wrong.
OOPS. You fixed this in *-fix.patch, thanks.
> b) Given that the waking side is also testing list_empty(), I think
> you might need set_current_state() here, with the barrier. So that
> this thread gets the correct view of the list_head wrt the waker's
> view.
We rely on tasklist_lock, note that the waking side takes it for writing.
> c) Did we really need to bang on tasklist_lock so many times? There
> doesn't seem a nicer way of structuring this :(
See above.
We do not really need tasklist_lock to wait for list_empty(children),
we could add a couple of barriers or use wait_event(wait_chldexit).
But the explicit usage of tasklist makes the code more understandable,
this this way we obviously can't race with the child doing release_task(),
say, we can't return before the last detach_pid(PIDTYPE_PID).
As for re-structuring, I'd suggest
for (;;) {
bool need_wait = false;
read_lock(&tasklist_lock);
if (!list_empty(¤t->children)) {
__set_current_state(TASK_UNINTERRUPTIBLE);
need_wait = true;
}
read_unlock(&tasklist_lock);
if (!need_wait)
break;
schedule();
}
but this is subjective.
Oleg.
next prev parent reply other threads:[~2012-05-23 14:54 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-28 9:19 [RFC PATCH] namespaces: fix leak on fork() failure Mike Galbraith
2012-04-28 14:26 ` Oleg Nesterov
2012-04-29 4:13 ` Mike Galbraith
2012-04-29 7:57 ` Eric W. Biederman
2012-04-29 9:49 ` Mike Galbraith
2012-04-29 16:58 ` Oleg Nesterov
2012-04-30 2:59 ` Eric W. Biederman
2012-04-30 3:25 ` Mike Galbraith
2012-05-02 12:40 ` Oleg Nesterov
2012-05-02 17:37 ` Eric W. Biederman
2012-04-30 3:01 ` [PATCH] " Mike Galbraith
[not found] ` <m1zk9rmyh4.fsf@fess.ebiederm.org>
2012-05-01 20:42 ` Andrew Morton
2012-05-03 3:12 ` Mike Galbraith
2012-05-03 14:56 ` Mike Galbraith
2012-05-04 4:27 ` Mike Galbraith
2012-05-04 7:55 ` Eric W. Biederman
2012-05-04 8:34 ` Mike Galbraith
2012-05-04 9:45 ` Mike Galbraith
2012-05-04 14:13 ` Eric W. Biederman
2012-05-04 14:49 ` Mike Galbraith
2012-05-04 15:36 ` Eric W. Biederman
2012-05-04 16:57 ` Mike Galbraith
2012-05-04 20:29 ` Eric W. Biederman
2012-05-05 5:56 ` Mike Galbraith
2012-05-05 6:08 ` Mike Galbraith
2012-05-05 7:12 ` Mike Galbraith
2012-05-05 11:37 ` Eric W. Biederman
2012-05-07 21:51 ` [PATCH] vfs: Speed up deactivate_super for non-modular filesystems Eric W. Biederman
2012-05-07 22:17 ` Al Viro
2012-05-07 23:56 ` Paul E. McKenney
2012-05-08 1:07 ` Eric W. Biederman
2012-05-08 4:53 ` Mike Galbraith
2012-05-09 7:55 ` Nick Piggin
2012-05-09 11:02 ` Eric W. Biederman
2012-05-09 11:02 ` Eric W. Biederman
2012-05-15 8:40 ` Nick Piggin
2012-05-16 0:34 ` Eric W. Biederman
2012-05-16 0:34 ` Eric W. Biederman
2012-05-09 13:59 ` Paul E. McKenney
2012-05-04 8:03 ` [PATCH] Re: [RFC PATCH] namespaces: fix leak on fork() failure Eric W. Biederman
2012-05-04 8:19 ` Mike Galbraith
2012-05-04 8:54 ` Mike Galbraith
2012-05-07 0:32 ` [PATCH 0/3] pidns: Closing the pid namespace exit race Eric W. Biederman
2012-05-07 0:33 ` [PATCH 1/3] pidns: Use task_active_pid_ns in do_notify_parent Eric W. Biederman
2012-05-07 0:35 ` [PATCH 2/3] pidns: Guarantee that the pidns init will be the last pidns process reaped Eric W. Biederman
2012-05-08 22:50 ` Andrew Morton
2012-05-16 18:39 ` Oleg Nesterov
2012-05-16 19:34 ` Oleg Nesterov
2012-05-16 20:54 ` Eric W. Biederman
2012-05-17 17:00 ` Oleg Nesterov
2012-05-17 21:46 ` Eric W. Biederman
2012-05-18 12:39 ` Oleg Nesterov
2012-05-19 0:03 ` Eric W. Biederman
2012-05-21 12:44 ` Oleg Nesterov
2012-05-22 0:16 ` Eric W. Biederman
2012-05-22 0:20 ` [PATCH] pidns: Guarantee that the pidns init will be the last pidns process reaped. v2 Eric W. Biederman
2012-05-22 16:54 ` Oleg Nesterov
2012-05-22 19:23 ` Andrew Morton
2012-05-23 14:52 ` Oleg Nesterov [this message]
2012-05-25 15:15 ` [PATCH -mm] pidns-guarantee-that-the-pidns-init-will-be-the-last-pidns-process-r eaped-v2-fix-fix Oleg Nesterov
2012-05-25 15:59 ` [PATCH -mm 0/1] pidns: find_new_reaper() can no longer switch to init_pid_ns.child_reaper Oleg Nesterov
2012-05-25 16:00 ` [PATCH -mm 1/1] " Oleg Nesterov
2012-05-25 21:43 ` Eric W. Biederman
2012-05-27 19:10 ` [PATCH v2 -mm 0/1] " Oleg Nesterov
2012-05-27 19:11 ` [PATCH v2 -mm 1/1] " Oleg Nesterov
2012-05-29 6:34 ` Eric W. Biederman
2012-05-25 21:25 ` [PATCH -mm] pidns-guarantee-that-the-pidns-init-will-be-the-last-pidns-process-r eaped-v2-fix-fix Eric W. Biederman
2012-05-27 18:41 ` [PATCH -mm v2] " Oleg Nesterov
2012-05-07 0:35 ` [PATCH 3/3] pidns: Make killed children autoreap Eric W. Biederman
2012-05-08 22:51 ` Andrew Morton
2012-04-30 13:57 ` [RFC PATCH] namespaces: fix leak on fork() failure Mike Galbraith
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120523145239.GA20378@redhat.com \
--to=oleg@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=ebiederm@xmission.com \
--cc=efault@gmx.de \
--cc=gorcunov@openvz.org \
--cc=linux-kernel@vger.kernel.org \
--cc=louis.rilling@kerlabs.com \
--cc=xemul@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.