* soft lockups/OOM after unix socket fixes
From: dann frazier @ 2008-11-20 22:03 UTC (permalink / raw)
To: netdev
hey,
I'm noticing that if I run the unix.c program from [1] in a tight
loop on 2.6.28-rc5, I get frequent soft lockups. Eventually, the OOM
killer kicks in and starts taking out other processes.
I've posted a console log here:
http://free.linux.hp.com/~dannf/unix-soft-lockups.log
[1] http://marc.info/?l=linux-netdev&m=122593044330973&w=2
--
dann frazier
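For readers without access to [1]: the sketch below is not the unix.c program referenced above. It only shows, under the assumption of a Linux system, how one process queues a file descriptor over an AF_UNIX socket with SCM_RIGHTS, which is the operation that puts FDs "in flight" for the garbage collector. The helper name send_fd() is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send one file descriptor over an AF_UNIX socket as SCM_RIGHTS
 * ancillary data, together with a single payload byte 'x'.
 * Returns 0 on success, -1 on error.  (Hypothetical helper.)
 */
int send_fd(int sock, int fd)
{
	char byte = 'x';
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment of buf */
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	/* One control message carrying exactly one descriptor. */
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}
```

A caller would typically create a socketpair(AF_UNIX, ...) and call send_fd(sv[0], some_fd); until the peer receives the message, the descriptor stays referenced by the socket's receive queue.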
* [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
From: dann frazier @ 2008-11-25 23:17 UTC (permalink / raw)
To: netdev; +Cc: eteo, davem
This is an implementation of David Miller's suggested fix in:
https://bugzilla.redhat.com/show_bug.cgi?id=470201
Paraphrasing the description from the above report, it makes sendmsg()
block while UNIX garbage collection is in progress. This avoids a
situation where child processes continue to queue new FDs over an
AF_UNIX socket to a parent which is in the exit path and running
garbage collection on these FDs. This contention can result in soft
lockups and OOM-killing of unrelated processes.
Signed-off-by: dann frazier <dannf@hp.com>
---
include/net/af_unix.h | 1 +
net/unix/af_unix.c | 2 ++
net/unix/garbage.c | 18 +++++++++++++++---
3 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index c29ff1d..1614d78 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -9,6 +9,7 @@
extern void unix_inflight(struct file *fp);
extern void unix_notinflight(struct file *fp);
extern void unix_gc(void);
+extern void wait_for_unix_gc(void);
#define UNIX_HASH_SIZE 256
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 8bde9bf..b0785ef 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1341,6 +1341,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
if (NULL == siocb->scm)
siocb->scm = &tmp_scm;
+ wait_for_unix_gc();
err = scm_send(sock, msg, siocb->scm);
if (err < 0)
return err;
@@ -1491,6 +1492,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
if (NULL == siocb->scm)
siocb->scm = &tmp_scm;
+ wait_for_unix_gc();
err = scm_send(sock, msg, siocb->scm);
if (err < 0)
return err;
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 6d4a9a8..cf1b0b0 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -80,6 +80,7 @@
#include <linux/file.h>
#include <linux/proc_fs.h>
#include <linux/mutex.h>
+#include <linux/wait.h>
#include <net/sock.h>
#include <net/af_unix.h>
@@ -91,6 +92,7 @@
static LIST_HEAD(gc_inflight_list);
static LIST_HEAD(gc_candidates);
static DEFINE_SPINLOCK(unix_gc_lock);
+static DECLARE_WAIT_QUEUE_HEAD(unix_gc_wait);
unsigned int unix_tot_inflight;
@@ -266,12 +268,21 @@ static void inc_inflight_move_tail(struct unix_sock *u)
list_move_tail(&u->link, &gc_candidates);
}
-/* The external entry point: unix_gc() */
+static bool gc_in_progress = false;
-void unix_gc(void)
+void wait_for_unix_gc(void)
{
- static bool gc_in_progress = false;
+ int error;
+
+ do {
+ error = wait_event_interruptible(unix_gc_wait,
+ gc_in_progress == false);
+ } while(error);
+}
+/* The external entry point: unix_gc() */
+void unix_gc(void)
+{
struct unix_sock *u;
struct unix_sock *next;
struct sk_buff_head hitlist;
@@ -376,6 +387,7 @@ void unix_gc(void)
/* All candidates should have been detached by now. */
BUG_ON(!list_empty(&gc_candidates));
gc_in_progress = false;
+ wake_up(&unix_gc_wait);
out:
spin_unlock(&unix_gc_lock);
* Re: [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
From: David Miller @ 2008-11-26 5:19 UTC (permalink / raw)
To: dannf; +Cc: netdev, eteo
From: dann frazier <dannf@hp.com>
Date: Tue, 25 Nov 2008 16:17:13 -0700
> -void unix_gc(void)
> +void wait_for_unix_gc(void)
> {
> - static bool gc_in_progress = false;
> + int error;
> +
> + do {
> + error = wait_event_interruptible(unix_gc_wait,
> + gc_in_progress == false);
> + } while(error);
> +}
If you want a truly uninterruptible wait, simply use wait_event().
Could you make this change and resubmit your patch?
Thanks Dann!
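The gate being discussed can be approximated in userspace as follows. This is a rough analogy only, not kernel code: a pthread mutex and condition variable stand in for the wait queue, and the names gc_begin(), gc_end(), wait_for_gc() and gc_running() are hypothetical. The point it illustrates is that the waiter re-checks the condition in a loop and does not return early, which is what wait_event() provides without an explicit retry loop around it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t gc_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gc_wait = PTHREAD_COND_INITIALIZER;
static bool gc_in_progress;

/* Collector side: mark collection as running. */
void gc_begin(void)
{
	pthread_mutex_lock(&gc_lock);
	gc_in_progress = true;
	pthread_mutex_unlock(&gc_lock);
}

/* Collector side: clear the flag and wake every waiting sender. */
void gc_end(void)
{
	pthread_mutex_lock(&gc_lock);
	gc_in_progress = false;
	pthread_cond_broadcast(&gc_wait);
	pthread_mutex_unlock(&gc_lock);
}

/* Sender side: sleep until no collection is running. */
void wait_for_gc(void)
{
	pthread_mutex_lock(&gc_lock);
	while (gc_in_progress)	/* re-check after every wakeup */
		pthread_cond_wait(&gc_wait, &gc_lock);
	pthread_mutex_unlock(&gc_lock);
}

/* Helper for callers that only want to observe the flag. */
bool gc_running(void)
{
	bool running;

	pthread_mutex_lock(&gc_lock);
	running = gc_in_progress;
	pthread_mutex_unlock(&gc_lock);
	return running;
}
```

In this analogy a sender thread would call wait_for_gc() before queuing a descriptor, while the collector brackets its work with gc_begin() and gc_end().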
* [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
From: dann frazier @ 2008-11-26 17:04 UTC (permalink / raw)
To: David Miller; +Cc: netdev, eteo
This is an implementation of David Miller's suggested fix in:
https://bugzilla.redhat.com/show_bug.cgi?id=470201
It has been updated to use wait_event() instead of
wait_event_interruptible().
Paraphrasing the description from the above report, it makes sendmsg()
block while UNIX garbage collection is in progress. This avoids a
situation where child processes continue to queue new FDs over an
AF_UNIX socket to a parent which is in the exit path and running
garbage collection on these FDs. This contention can result in soft
lockups and OOM-killing of unrelated processes.
Signed-off-by: dann frazier <dannf@hp.com>
---
include/net/af_unix.h | 1 +
net/unix/af_unix.c | 2 ++
net/unix/garbage.c | 13 ++++++++++---
3 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index c29ff1d..1614d78 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -9,6 +9,7 @@
extern void unix_inflight(struct file *fp);
extern void unix_notinflight(struct file *fp);
extern void unix_gc(void);
+extern void wait_for_unix_gc(void);
#define UNIX_HASH_SIZE 256
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index eb90f77..66d5ac4 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1343,6 +1343,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
if (NULL == siocb->scm)
siocb->scm = &tmp_scm;
+ wait_for_unix_gc();
err = scm_send(sock, msg, siocb->scm);
if (err < 0)
return err;
@@ -1493,6 +1494,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
if (NULL == siocb->scm)
siocb->scm = &tmp_scm;
+ wait_for_unix_gc();
err = scm_send(sock, msg, siocb->scm);
if (err < 0)
return err;
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 6d4a9a8..abb3ab3 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -80,6 +80,7 @@
#include <linux/file.h>
#include <linux/proc_fs.h>
#include <linux/mutex.h>
+#include <linux/wait.h>
#include <net/sock.h>
#include <net/af_unix.h>
@@ -91,6 +92,7 @@
static LIST_HEAD(gc_inflight_list);
static LIST_HEAD(gc_candidates);
static DEFINE_SPINLOCK(unix_gc_lock);
+static DECLARE_WAIT_QUEUE_HEAD(unix_gc_wait);
unsigned int unix_tot_inflight;
@@ -266,12 +268,16 @@ static void inc_inflight_move_tail(struct unix_sock *u)
list_move_tail(&u->link, &gc_candidates);
}
-/* The external entry point: unix_gc() */
+static bool gc_in_progress = false;
-void unix_gc(void)
+void wait_for_unix_gc(void)
{
- static bool gc_in_progress = false;
+ wait_event(unix_gc_wait, gc_in_progress == false);
+}
+/* The external entry point: unix_gc() */
+void unix_gc(void)
+{
struct unix_sock *u;
struct unix_sock *next;
struct sk_buff_head hitlist;
@@ -376,6 +382,7 @@ void unix_gc(void)
/* All candidates should have been detached by now. */
BUG_ON(!list_empty(&gc_candidates));
gc_in_progress = false;
+ wake_up(&unix_gc_wait);
out:
spin_unlock(&unix_gc_lock);
* Re: [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
From: David Miller @ 2008-11-26 23:32 UTC (permalink / raw)
To: dannf; +Cc: netdev, eteo
From: dann frazier <dannf@dannf.org>
Date: Wed, 26 Nov 2008 10:04:02 -0700
> This is an implementation of David Miller's suggested fix in:
> https://bugzilla.redhat.com/show_bug.cgi?id=470201
>
> It has been updated to use wait_event() instead of
> wait_event_interruptible().
>
> Paraphrasing the description from the above report, it makes sendmsg()
> block while UNIX garbage collection is in progress. This avoids a
> situation where child processes continue to queue new FDs over a
> AF_UNIX socket to a parent which is in the exit path and running
> garbage collection on these FDs. This contention can result in soft
> lockups and oom-killing of unrelated processes.
>
> Signed-off-by: dann frazier <dannf@hp.com>
Applied, thanks a lot Dann.
* Re: [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
From: dann frazier @ 2008-12-01 20:17 UTC (permalink / raw)
To: David Miller; +Cc: netdev, eteo
On Wed, Nov 26, 2008 at 03:32:43PM -0800, David Miller wrote:
> From: dann frazier <dannf@dannf.org>
> Date: Wed, 26 Nov 2008 10:04:02 -0700
>
> > This is an implementation of David Miller's suggested fix in:
> > https://bugzilla.redhat.com/show_bug.cgi?id=470201
> >
> > It has been updated to use wait_event() instead of
> > wait_event_interruptible().
> >
> > Paraphrasing the description from the above report, it makes sendmsg()
> > block while UNIX garbage collection is in progress. This avoids a
> > situation where child processes continue to queue new FDs over a
> > AF_UNIX socket to a parent which is in the exit path and running
> > garbage collection on these FDs. This contention can result in soft
> > lockups and oom-killing of unrelated processes.
> >
> > Signed-off-by: dann frazier <dannf@hp.com>
>
> Applied, thanks a lot Dann.
I was asked if this patch may introduce blocking during operations on
non-blocking sockets. Should we update wait_for_unix_gc (and its
callers) to something like this?
int wait_for_unix_gc(bool can_block)
{
	if (!can_block)
		return gc_in_progress ? -EWOULDBLOCK : 0;

	wait_event(unix_gc_wait, gc_in_progress == false);
	return 0;
}
--
dann frazier
* Re: [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
From: David Miller @ 2008-12-01 21:16 UTC (permalink / raw)
To: dannf; +Cc: netdev, eteo
From: dann frazier <dannf@hp.com>
Date: Mon, 1 Dec 2008 13:17:04 -0700
> On Wed, Nov 26, 2008 at 03:32:43PM -0800, David Miller wrote:
> > From: dann frazier <dannf@dannf.org>
> > Date: Wed, 26 Nov 2008 10:04:02 -0700
> >
> > > This is an implementation of David Miller's suggested fix in:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=470201
> > >
> > > It has been updated to use wait_event() instead of
> > > wait_event_interruptible().
> > >
> > > Paraphrasing the description from the above report, it makes sendmsg()
> > > block while UNIX garbage collection is in progress. This avoids a
> > > situation where child processes continue to queue new FDs over a
> > > AF_UNIX socket to a parent which is in the exit path and running
> > > garbage collection on these FDs. This contention can result in soft
> > > lockups and oom-killing of unrelated processes.
> > >
> > > Signed-off-by: dann frazier <dannf@hp.com>
> >
> > Applied, thanks a lot Dann.
>
> I was asked if this patch may introduce blocking during operations on
> non-blocking sockets. Should we update wait_for_unix_gc (and its
> callers) to something like this?
No, it's just like waiting for a GFP_KERNEL memory allocation.
Non-blocking doesn't mean "never will sleep".
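To illustrate the distinction from userspace: what O_NONBLOCK is generally understood to promise is that a call will not wait for a socket-level condition such as free buffer space, and will fail with EAGAIN/EWOULDBLOCK instead. It says nothing about brief sleeps inside the kernel. A minimal sketch, assuming Linux and default socket buffer sizes; fill_until_wouldblock() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Write to a non-blocking AF_UNIX stream socket whose peer never
 * reads.  Returns the errno of the first failing send(), 0 if no
 * send failed within the cap, or -1 if setup failed.
 */
int fill_until_wouldblock(void)
{
	int sv[2];
	char buf[4096] = { 0 };
	int flags, saved = 0, i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Add O_NONBLOCK to the sending end, keeping other flags. */
	flags = fcntl(sv[0], F_GETFL);
	if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	/* Cap the loop (256 MiB) so the sketch always terminates. */
	for (i = 0; i < (1 << 16); i++) {
		if (send(sv[0], buf, sizeof(buf), 0) < 0) {
			saved = errno;	/* buffer full: call returns, no wait */
			break;
		}
	}

	close(sv[0]);
	close(sv[1]);
	return saved;
}
```

Once the peer's receive buffer is full, send() returns an error immediately rather than waiting for the peer to read; that is the waiting O_NONBLOCK avoids, and it is separate from the kind of sleep described in the reply above.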