From: Oleg Nesterov <oleg@redhat.com>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Pavel Emelyanov <xemul@parallels.com>,
Serge Hallyn <serge.hallyn@canonical.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Tejun Heo <tj@kernel.org>, Vasiliy Kulikov <segoon@openwall.com>,
Andrew Vagin <avagin@openvz.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] fs, proc: Introduce the /proc/<pid>/children entry v2
Date: Fri, 9 Dec 2011 17:55:36 +0100 [thread overview]
Message-ID: <20111209165536.GA25268@redhat.com> (raw)
In-Reply-To: <20111209154930.GT21678@moon>
On 12/09, Cyrill Gorcunov wrote:
>
> On Fri, Dec 09, 2011 at 04:30:09PM +0100, Oleg Nesterov wrote:
> ...
> > >
> > > It is a potential problem, from the lock-hold point of view and
> > > also it can cause large scheduling latencies. What's involved in
> > > making ->children an rcu-protected list?
> >
> > At first glance, this doesn't look trivial... forget_original_parent()
> > abuses ->sibling.
> >
>
> But wait, forget_original_parent may move task out of the original
> list and put it in dead_children list, right? Then release_task
> may call delayed_put_task_struct under as call_rcu which should
> wait until we finish reading in our rcu section, isn't it?
>
> Even if we get inconsistent picture of children (ie we get pid
> of shildren which just getting dead) I think it's fine.
The problen is (one of the problems, in fact), list_move() changes
->next.
To simplify, let's talk about children_seq_start() your patch adds.
Suppose that we make ->children/sibling "rcu-safe" (we replace
__unhash_process()->list_del_init() with list_del_rcu, and so on).
Now, children_seq_start() does:
rcu_read_lock();
list_for_each(child, task->children)
...;
Suppose that this task exits and reparents the child we are looking at.
Once reparent_leader() moves it into another list, list_for_each()
can never stop.
In short. forget_original_parent() changes the _head_ of the list,
in some sense.
> > But yes, it is not really nice to hold tasklist_lock here. May be
> > we can change this code so that every iteration records the reported
> > task_struct and then tries to continue. This means we should verify
> > that ->real_parent is still the same under tasklist, but at least
> > this way we do not hold it throughout.
> >
>
> And if real_parent is changed,
... or if list_empty(->sibling) == T
> I should simply skip such task and
> continue, right?
I think you should restart in this (hopefully unlikely) case. Yes, this
means the potentional data loss, but I guess we can't avoid this in any
case. For example, with the current patch children_seq_start() can miss
the _live_ thread if the already reported process was reaped.
> > Personally I'd even prefer /proc/pid/children/ directory (like
> > /proc/pid/task), but I guess this needs much more complications.
> >
>
> I fear so. Oleg, if it possible I would like to bring in as minimum
> code as I can ;)
Yes, yes, I agree.
Oleg.
next prev parent reply other threads:[~2011-12-09 17:01 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-06 18:10 [PATCH] fs, proc: Introduce the /proc/<pid>/children entry v2 Cyrill Gorcunov
2011-12-06 22:33 ` Andrew Morton
2011-12-06 22:35 ` Cyrill Gorcunov
2011-12-07 9:41 ` Cyrill Gorcunov
2011-12-07 18:27 ` Tejun Heo
2011-12-07 18:43 ` Cyrill Gorcunov
2011-12-07 18:53 ` Oleg Nesterov
2011-12-07 19:03 ` Cyrill Gorcunov
2011-12-07 19:19 ` Cyrill Gorcunov
2011-12-07 20:34 ` Cyrill Gorcunov
2011-12-07 21:52 ` Cyrill Gorcunov
2011-12-08 16:35 ` Oleg Nesterov
2011-12-08 16:50 ` Cyrill Gorcunov
2011-12-08 21:28 ` Cyrill Gorcunov
2011-12-08 21:54 ` Andrew Morton
2011-12-08 22:21 ` Cyrill Gorcunov
2011-12-09 15:30 ` Oleg Nesterov
2011-12-09 15:49 ` Cyrill Gorcunov
2011-12-09 16:55 ` Oleg Nesterov [this message]
2011-12-09 17:11 ` Cyrill Gorcunov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111209165536.GA25268@redhat.com \
--to=oleg@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=avagin@openvz.org \
--cc=gorcunov@gmail.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=segoon@openwall.com \
--cc=serge.hallyn@canonical.com \
--cc=tj@kernel.org \
--cc=xemul@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.