From: Shailabh Nagar <nagar@watson.ibm.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Andrew Morton <akpm@osdl.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
Jay Lan <jlan@sgi.com>, Chris Sturtivant <csturtiv@sgi.com>,
Paul Jackson <pj@sgi.com>, Balbir Singh <balbir@in.ibm.com>,
Chandra Seetharaman <sekharan@us.ibm.com>,
Jamal <hadi@cyberus.ca>, netdev <netdev@vger.kernel.org>
Subject: Re: [Patch 6/6] per task delay accounting taskstats interface: fix clone skbs for each listener
Date: Wed, 12 Jul 2006 01:41:49 -0400 [thread overview]
Message-ID: <1152682909.6054.29.camel@localhost.localdomain> (raw)
In-Reply-To: <20060711111525.GA11175@gondor.apana.org.au>
On Tue, 2006-07-11 at 07:15, Herbert Xu wrote:
> On Tue, Jul 11, 2006 at 03:57:31AM -0700, Andrew Morton wrote:
> >
> > > >> down_write(&listeners->sem);
> > > >> list_for_each_entry_safe(s, tmp, &listeners->list, list) {
> > > >> - ret = genlmsg_unicast(skb, s->pid);
> > > >> + skb_next = NULL;
> > > >> + if (!list_islast(&s->list, &listeners->list)) {
> > > >> + skb_next = skb_clone(skb_cur, GFP_KERNEL);
> > > >
> > > > If we do a GFP_KERNEL allocation with this semaphore held, and the
> > > > oom-killer tries to kill something to satisfy the allocation, and the
> > > > killed task gets stuck on that semaphore, I wonder of the box locks up.
> > >
> > > We do GFP_KERNEL inside semaphores/mutexes in lots of places. So if this
> > > can deadlock with the oom-killer we probably should fix that, preferably
> > > by having GFP_KERNEL fail in that case.
> >
> > This lock is special, in that it's taken on the exit() path (I think). So
> > it can block tasks which are trying to exit.
>
> Sorry, missed the context.
>
> If there is a deadlock then it's not just this allocation that you
> need worry about. There is also an allocation within genlmsg_uniast
> that would be GFP_KERNEL.
Remove down_write() from taskstats code invoked on the exit() path.
In send_cpu_listeners(), which is called on the exit path,
a down_write() was protecting operations like skb_clone() and
genlmsg_unicast() that do GFP_KERNEL allocations. If the oom-killer
decides to kill tasks to satisfy the allocations,the exit of those
tasks could block on the same semphore.
The down_write() was only needed to allow removal of invalid
listeners from the listener list. The patch converts the down_write
to a down_read and defers the removal to a separate critical region.
This ensures that even if the oom-killer is called, no other task's
exit is blocked as it can still acquire another down_read.
Thanks to Andrew Morton & Herbert Xu for pointing out the oom
related pitfalls, and to Chandra Seetharaman for suggesting this
fix instead of using something more complex like RCU.
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@watson.ibm.com>
---
kernel/taskstats.c | 24 +++++++++++++++++++-----
1 files changed, 19 insertions(+), 5 deletions(-)
Index: linux-2.6.18-rc1/kernel/taskstats.c
===================================================================
--- linux-2.6.18-rc1.orig/kernel/taskstats.c 2006-07-11 00:13:00.000000000 -0400
+++ linux-2.6.18-rc1/kernel/taskstats.c 2006-07-12 00:07:53.000000000 -0400
@@ -51,6 +51,7 @@ __read_mostly = {
struct listener {
struct list_head list;
pid_t pid;
+ char valid;
};
struct listener_list {
@@ -127,7 +128,7 @@ static int send_cpu_listeners(struct sk_
struct listener *s, *tmp;
struct sk_buff *skb_next, *skb_cur = skb;
void *reply = genlmsg_data(genlhdr);
- int rc, ret;
+ int rc, ret, delcount = 0;
rc = genlmsg_end(skb, reply);
if (rc < 0) {
@@ -137,7 +138,7 @@ static int send_cpu_listeners(struct sk_
rc = 0;
listeners = &per_cpu(listener_array, cpu);
- down_write(&listeners->sem);
+ down_read(&listeners->sem);
list_for_each_entry_safe(s, tmp, &listeners->list, list) {
skb_next = NULL;
if (!list_islast(&s->list, &listeners->list)) {
@@ -150,14 +151,26 @@ static int send_cpu_listeners(struct sk_
}
ret = genlmsg_unicast(skb_cur, s->pid);
if (ret == -ECONNREFUSED) {
- list_del(&s->list);
- kfree(s);
+ s->valid = 0;
+ delcount++;
rc = ret;
}
skb_cur = skb_next;
}
- up_write(&listeners->sem);
+ up_read(&listeners->sem);
+
+ if (!delcount)
+ return rc;
+ /* Delete invalidated entries */
+ down_write(&listeners->sem);
+ list_for_each_entry_safe(s, tmp, &listeners->list, list) {
+ if (!s->valid) {
+ list_del(&s->list);
+ kfree(s);
+ }
+ }
+ up_write(&listeners->sem);
return rc;
}
@@ -290,6 +303,7 @@ static int add_del_listener(pid_t pid, c
goto cleanup;
s->pid = pid;
INIT_LIST_HEAD(&s->list);
+ s->valid = 1;
listeners = &per_cpu(listener_array, cpu);
down_write(&listeners->sem);
prev parent reply other threads:[~2006-07-12 5:41 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1152591838.14142.114.camel@localhost.localdomain>
2006-07-11 4:36 ` [Patch 6/6] per task delay accounting taskstats interface: fix clone skbs for each listener Shailabh Nagar
2006-07-11 10:05 ` Andrew Morton
2006-07-11 10:28 ` Herbert Xu
2006-07-11 10:57 ` Andrew Morton
2006-07-11 11:15 ` Herbert Xu
2006-07-11 17:44 ` Shailabh Nagar
2006-07-12 5:41 ` Shailabh Nagar [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1152682909.6054.29.camel@localhost.localdomain \
--to=nagar@watson.ibm.com \
--cc=akpm@osdl.org \
--cc=balbir@in.ibm.com \
--cc=csturtiv@sgi.com \
--cc=hadi@cyberus.ca \
--cc=herbert@gondor.apana.org.au \
--cc=jlan@sgi.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pj@sgi.com \
--cc=sekharan@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).