From: Oleg Nesterov <oleg@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
LKML <linux-kernel@vger.kernel.org>,
Pavel Emelyanov <xemul@parallels.com>,
Cyrill Gorcunov <gorcunov@openvz.org>,
Louis Rilling <louis.rilling@kerlabs.com>,
Mike Galbraith <efault@gmx.de>
Subject: Re: [PATCH] pidns: Guarantee that the pidns init will be the last pidns process reaped. v2
Date: Wed, 23 May 2012 16:52:39 +0200 [thread overview]
Message-ID: <20120523145239.GA20378@redhat.com> (raw)
In-Reply-To: <20120522122315.c3f2118c.akpm@linux-foundation.org>
On 05/22, Andrew Morton wrote:
>
> On Mon, 21 May 2012 18:20:31 -0600
> ebiederm@xmission.com (Eric W. Biederman) wrote:
>
> > + /* If we are the last child process in a pid namespace
>
> like this:
> /*
> * If ...
>
> > + * to be reaped notify the child_reaper.
>
> s/reaped/reaped,/
>
> More seriously, it isn't a very good comment. It tells us "what" the
> code is doing (which is pretty obvious from reading it), but it didn't
> tell us "why" it is doing this. Why do PID namespaces need special
> handling here? What's the backstory??
Well, this is documented in the changelog but I agree, this needs some
documentation.
Perhaps zap_pid_ns_processes() should document that it waits for the
stealth EXIT_DEAD tasks, and __unhash_process() can simply say
"see zap_pid_ns_processes()".
> > @@ -189,6 +189,17 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
> > rc = sys_wait4(-1, NULL, __WALL, NULL);
> > } while (rc != -ECHILD);
> >
> > + read_lock(&tasklist_lock);
> > + for (;;) {
> > + __set_current_state(TASK_UNINTERRUPTIBLE);
> > + if (list_empty(¤t->children))
> > + break;
> > + read_unlock(&tasklist_lock);
> > + schedule();
> > + read_lock(&tasklist_lock);
> > + }
> > + read_unlock(&tasklist_lock);
>
> Well.
>
> a) This loop can leave the thread in state TASK_UNINTERRUPTIBLE,
> which looks wrong.
OOPS. You fixed this in *-fix.patch, thanks.
> b) Given that the waking side is also testing list_empty(), I think
> you might need set_current_state() here, with the barrier. So that
> this thread gets the correct view of the list_head wrt the waker's
> view.
We rely on tasklist_lock, note that the waking side takes it for writing.
> c) Did we really need to bang on tasklist_lock so many times? There
> doesn't seem a nicer way of structuring this :(
See above.
We do not really need tasklist_lock to wait for list_empty(children),
we could add a couple of barriers or use wait_event(wait_chldexit).
But the explicit usage of tasklist makes the code more understandable,
this this way we obviously can't race with the child doing release_task(),
say, we can't return before the last detach_pid(PIDTYPE_PID).
As for re-structuring, I'd suggest
for (;;) {
bool need_wait = false;
read_lock(&tasklist_lock);
if (!list_empty(¤t->children)) {
__set_current_state(TASK_UNINTERRUPTIBLE);
need_wait = true;
}
read_unlock(&tasklist_lock);
if (!need_wait)
break;
schedule();
}
but this is subjective.
Oleg.
next prev parent reply other threads:[~2012-05-23 14:54 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-28 9:19 [RFC PATCH] namespaces: fix leak on fork() failure Mike Galbraith
2012-04-28 14:26 ` Oleg Nesterov
2012-04-29 4:13 ` Mike Galbraith
2012-04-29 7:57 ` Eric W. Biederman
2012-04-29 9:49 ` Mike Galbraith
2012-04-29 16:58 ` Oleg Nesterov
2012-04-30 2:59 ` Eric W. Biederman
2012-04-30 3:25 ` Mike Galbraith
2012-05-02 12:40 ` Oleg Nesterov
2012-05-02 17:37 ` Eric W. Biederman
2012-04-30 3:01 ` [PATCH] " Mike Galbraith
[not found] ` <m1zk9rmyh4.fsf@fess.ebiederm.org>
2012-05-01 20:42 ` Andrew Morton
2012-05-03 3:12 ` Mike Galbraith
2012-05-03 14:56 ` Mike Galbraith
2012-05-04 4:27 ` Mike Galbraith
2012-05-04 7:55 ` Eric W. Biederman
2012-05-04 8:34 ` Mike Galbraith
2012-05-04 9:45 ` Mike Galbraith
2012-05-04 14:13 ` Eric W. Biederman
2012-05-04 14:49 ` Mike Galbraith
2012-05-04 15:36 ` Eric W. Biederman
2012-05-04 16:57 ` Mike Galbraith
2012-05-04 20:29 ` Eric W. Biederman
2012-05-05 5:56 ` Mike Galbraith
2012-05-05 6:08 ` Mike Galbraith
2012-05-05 7:12 ` Mike Galbraith
2012-05-05 11:37 ` Eric W. Biederman
2012-05-07 21:51 ` [PATCH] vfs: Speed up deactivate_super for non-modular filesystems Eric W. Biederman
2012-05-07 22:17 ` Al Viro
2012-05-07 23:56 ` Paul E. McKenney
2012-05-08 1:07 ` Eric W. Biederman
2012-05-08 4:53 ` Mike Galbraith
2012-05-09 7:55 ` Nick Piggin
2012-05-09 11:02 ` Eric W. Biederman
2012-05-15 8:40 ` Nick Piggin
2012-05-16 0:34 ` Eric W. Biederman
2012-05-09 13:59 ` Paul E. McKenney
2012-05-04 8:03 ` [PATCH] Re: [RFC PATCH] namespaces: fix leak on fork() failure Eric W. Biederman
2012-05-04 8:19 ` Mike Galbraith
2012-05-04 8:54 ` Mike Galbraith
2012-05-07 0:32 ` [PATCH 0/3] pidns: Closing the pid namespace exit race Eric W. Biederman
2012-05-07 0:33 ` [PATCH 1/3] pidns: Use task_active_pid_ns in do_notify_parent Eric W. Biederman
2012-05-07 0:35 ` [PATCH 2/3] pidns: Guarantee that the pidns init will be the last pidns process reaped Eric W. Biederman
2012-05-08 22:50 ` Andrew Morton
2012-05-16 18:39 ` Oleg Nesterov
2012-05-16 19:34 ` Oleg Nesterov
2012-05-16 20:54 ` Eric W. Biederman
2012-05-17 17:00 ` Oleg Nesterov
2012-05-17 21:46 ` Eric W. Biederman
2012-05-18 12:39 ` Oleg Nesterov
2012-05-19 0:03 ` Eric W. Biederman
2012-05-21 12:44 ` Oleg Nesterov
2012-05-22 0:16 ` Eric W. Biederman
2012-05-22 0:20 ` [PATCH] pidns: Guarantee that the pidns init will be the last pidns process reaped. v2 Eric W. Biederman
2012-05-22 16:54 ` Oleg Nesterov
2012-05-22 19:23 ` Andrew Morton
2012-05-23 14:52 ` Oleg Nesterov [this message]
2012-05-25 15:15 ` [PATCH -mm] pidns-guarantee-that-the-pidns-init-will-be-the-last-pidns-process-r eaped-v2-fix-fix Oleg Nesterov
2012-05-25 15:59 ` [PATCH -mm 0/1] pidns: find_new_reaper() can no longer switch to init_pid_ns.child_reaper Oleg Nesterov
2012-05-25 16:00 ` [PATCH -mm 1/1] " Oleg Nesterov
2012-05-25 21:43 ` Eric W. Biederman
2012-05-27 19:10 ` [PATCH v2 -mm 0/1] " Oleg Nesterov
2012-05-27 19:11 ` [PATCH v2 -mm 1/1] " Oleg Nesterov
2012-05-29 6:34 ` Eric W. Biederman
2012-05-25 21:25 ` [PATCH -mm] pidns-guarantee-that-the-pidns-init-will-be-the-last-pidns-process-r eaped-v2-fix-fix Eric W. Biederman
2012-05-27 18:41 ` [PATCH -mm v2] " Oleg Nesterov
2012-05-07 0:35 ` [PATCH 3/3] pidns: Make killed children autoreap Eric W. Biederman
2012-05-08 22:51 ` Andrew Morton
2012-04-30 13:57 ` [RFC PATCH] namespaces: fix leak on fork() failure Mike Galbraith
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120523145239.GA20378@redhat.com \
--to=oleg@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=ebiederm@xmission.com \
--cc=efault@gmx.de \
--cc=gorcunov@openvz.org \
--cc=linux-kernel@vger.kernel.org \
--cc=louis.rilling@kerlabs.com \
--cc=xemul@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).