* soft lockups/OOM after unix socket fixes
@ 2008-11-20 22:03 dann frazier
  2008-11-25 23:17 ` [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector dann frazier
  0 siblings, 1 reply; 7+ messages in thread
From: dann frazier @ 2008-11-20 22:03 UTC
  To: netdev

hey,
 I'm noticing that if I run the unix.c program from [1] in a tight
loop on 2.6.28-rc5, I get frequent soft lockups. Eventually, the OOM
killer kicks in and starts taking out other processes. I've posted a
console log here:
  http://free.linux.hp.com/~dannf/unix-soft-lockups.log

[1] http://marc.info/?l=linux-netdev&m=122593044330973&w=2

-- 
dann frazier

* [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
  2008-11-20 22:03 soft lockups/OOM after unix socket fixes dann frazier
@ 2008-11-25 23:17 ` dann frazier
  2008-11-26 5:19 ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: dann frazier @ 2008-11-25 23:17 UTC
  To: netdev; +Cc: eteo, davem

This is an implementation of David Miller's suggested fix in:
  https://bugzilla.redhat.com/show_bug.cgi?id=470201

Paraphrasing the description from the above report, it makes sendmsg()
block while UNIX garbage collection is in progress. This avoids a
situation where child processes continue to queue new FDs over an
AF_UNIX socket to a parent which is in the exit path and running
garbage collection on these FDs. This contention can result in soft
lockups and oom-killing of unrelated processes.

Signed-off-by: dann frazier <dannf@hp.com>
---
 include/net/af_unix.h |    1 +
 net/unix/af_unix.c    |    2 ++
 net/unix/garbage.c    |   18 +++++++++++++++---
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index c29ff1d..1614d78 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -9,6 +9,7 @@
 extern void unix_inflight(struct file *fp);
 extern void unix_notinflight(struct file *fp);
 extern void unix_gc(void);
+extern void wait_for_unix_gc(void);
 
 #define UNIX_HASH_SIZE 256
 
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 8bde9bf..b0785ef 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1341,6 +1341,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
+	wait_for_unix_gc();
 	err = scm_send(sock, msg, siocb->scm);
 	if (err < 0)
 		return err;
@@ -1491,6 +1492,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
+	wait_for_unix_gc();
 	err = scm_send(sock, msg, siocb->scm);
 	if (err < 0)
 		return err;
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 6d4a9a8..cf1b0b0 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -80,6 +80,7 @@
 #include <linux/file.h>
 #include <linux/proc_fs.h>
 #include <linux/mutex.h>
+#include <linux/wait.h>
 
 #include <net/sock.h>
 #include <net/af_unix.h>
@@ -91,6 +92,7 @@
 static LIST_HEAD(gc_inflight_list);
 static LIST_HEAD(gc_candidates);
 static DEFINE_SPINLOCK(unix_gc_lock);
+static DECLARE_WAIT_QUEUE_HEAD(unix_gc_wait);
 
 unsigned int unix_tot_inflight;
 
@@ -266,12 +268,21 @@ static void inc_inflight_move_tail(struct unix_sock *u)
 		list_move_tail(&u->link, &gc_candidates);
 }
 
-/* The external entry point: unix_gc() */
+static bool gc_in_progress = false;
 
-void unix_gc(void)
+void wait_for_unix_gc(void)
 {
-	static bool gc_in_progress = false;
+	int error;
+
+	do {
+		error = wait_event_interruptible(unix_gc_wait,
+						 gc_in_progress == false);
+	} while(error);
+}
 
+/* The external entry point: unix_gc() */
+void unix_gc(void)
+{
 	struct unix_sock *u;
 	struct unix_sock *next;
 	struct sk_buff_head hitlist;
@@ -376,6 +387,7 @@ void unix_gc(void)
 	/* All candidates should have been detached by now. */
 	BUG_ON(!list_empty(&gc_candidates));
 	gc_in_progress = false;
+	wake_up(&unix_gc_wait);
 
 out:
 	spin_unlock(&unix_gc_lock);

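For readers less familiar with how FDs come to be "queued over an AF_UNIX socket": descriptors are passed as SCM_RIGHTS ancillary data on sendmsg(), and while such a message sits unread in the receiver's queue the kernel counts the file as in flight. The following is a minimal userspace sketch of that mechanism, assuming Linux. The helper names send_fd() and recv_fd() are made up for illustration and are not taken from the unix.c reproducer referenced in this thread.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: queue descriptor 'fd' on AF_UNIX socket 'sock'. */
int send_fd(int sock, int fd)
{
	char byte = 'x';	/* one data byte accompanies the ancillary data */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor, or -1 on failure. */
int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

Between send_fd() and the matching recv_fd(), the passed file is held only by the queued message. A sender that keeps calling send_fd() while the receiver never reads is the kind of load the garbage collector has to chase in the scenario described above.
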
* Re: [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
  2008-11-25 23:17 ` [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector dann frazier
@ 2008-11-26 5:19 ` David Miller
  2008-11-26 17:04 ` dann frazier
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2008-11-26 5:19 UTC
  To: dannf; +Cc: netdev, eteo

From: dann frazier <dannf@hp.com>
Date: Tue, 25 Nov 2008 16:17:13 -0700

> -void unix_gc(void)
> +void wait_for_unix_gc(void)
>  {
> -	static bool gc_in_progress = false;
> +	int error;
> +
> +	do {
> +		error = wait_event_interruptible(unix_gc_wait,
> +						 gc_in_progress == false);
> +	} while(error);
> +}

If you want a truly uninterruptible wait, simply use
wait_event().

Could you make this change and resubmit your patch?

Thanks Dann!

* [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
  2008-11-26 5:19 ` David Miller
@ 2008-11-26 17:04 ` dann frazier
  2008-11-26 23:32 ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: dann frazier @ 2008-11-26 17:04 UTC
  To: David Miller; +Cc: netdev, eteo

This is an implementation of David Miller's suggested fix in:
  https://bugzilla.redhat.com/show_bug.cgi?id=470201

It has been updated to use wait_event() instead of
wait_event_interruptible().

Paraphrasing the description from the above report, it makes sendmsg()
block while UNIX garbage collection is in progress. This avoids a
situation where child processes continue to queue new FDs over an
AF_UNIX socket to a parent which is in the exit path and running
garbage collection on these FDs. This contention can result in soft
lockups and oom-killing of unrelated processes.

Signed-off-by: dann frazier <dannf@hp.com>
--
 include/net/af_unix.h |    1 +
 net/unix/af_unix.c    |    2 ++
 net/unix/garbage.c    |   13 ++++++++++---
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index c29ff1d..1614d78 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -9,6 +9,7 @@
 extern void unix_inflight(struct file *fp);
 extern void unix_notinflight(struct file *fp);
 extern void unix_gc(void);
+extern void wait_for_unix_gc(void);
 
 #define UNIX_HASH_SIZE 256
 
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index eb90f77..66d5ac4 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1343,6 +1343,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
+	wait_for_unix_gc();
 	err = scm_send(sock, msg, siocb->scm);
 	if (err < 0)
 		return err;
@@ -1493,6 +1494,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
+	wait_for_unix_gc();
 	err = scm_send(sock, msg, siocb->scm);
 	if (err < 0)
 		return err;
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 6d4a9a8..abb3ab3 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -80,6 +80,7 @@
 #include <linux/file.h>
 #include <linux/proc_fs.h>
 #include <linux/mutex.h>
+#include <linux/wait.h>
 
 #include <net/sock.h>
 #include <net/af_unix.h>
@@ -91,6 +92,7 @@
 static LIST_HEAD(gc_inflight_list);
 static LIST_HEAD(gc_candidates);
 static DEFINE_SPINLOCK(unix_gc_lock);
+static DECLARE_WAIT_QUEUE_HEAD(unix_gc_wait);
 
 unsigned int unix_tot_inflight;
 
@@ -266,12 +268,16 @@ static void inc_inflight_move_tail(struct unix_sock *u)
 		list_move_tail(&u->link, &gc_candidates);
 }
 
-/* The external entry point: unix_gc() */
+static bool gc_in_progress = false;
 
-void unix_gc(void)
+void wait_for_unix_gc(void)
 {
-	static bool gc_in_progress = false;
+	wait_event(unix_gc_wait, gc_in_progress == false);
+}
 
+/* The external entry point: unix_gc() */
+void unix_gc(void)
+{
 	struct unix_sock *u;
 	struct unix_sock *next;
 	struct sk_buff_head hitlist;
@@ -376,6 +382,7 @@ void unix_gc(void)
 	/* All candidates should have been detached by now. */
 	BUG_ON(!list_empty(&gc_candidates));
 	gc_in_progress = false;
+	wake_up(&unix_gc_wait);
 
 out:
 	spin_unlock(&unix_gc_lock);

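The gating pattern in this patch (senders sleep while a flag says collection is running, and the collector wakes them when it clears the flag) can be tried out in userspace. What follows is only an analogy built on POSIX threads, not kernel code: a mutex plus condition variable stand in for the wait queue, and the names wait_for_gc(), gc_begin(), gc_end(), sender() and sends_completed are invented for this sketch. Unlike the kernel's wait_event(), the condition variable requires the flag to be checked under a lock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

/* Userspace stand-ins for gc_in_progress and unix_gc_wait. */
static pthread_mutex_t gc_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gc_done = PTHREAD_COND_INITIALIZER;
static bool gc_in_progress = false;

/* Counts senders that got past the gate (illustration only). */
static atomic_int sends_completed;

/* Analogue of wait_for_unix_gc(): sleep until no collection is running. */
void wait_for_gc(void)
{
	pthread_mutex_lock(&gc_lock);
	while (gc_in_progress)
		pthread_cond_wait(&gc_done, &gc_lock);
	pthread_mutex_unlock(&gc_lock);
}

/* Analogue of the point in unix_gc() where gc_in_progress becomes true. */
void gc_begin(void)
{
	pthread_mutex_lock(&gc_lock);
	gc_in_progress = true;
	pthread_mutex_unlock(&gc_lock);
}

/* Analogue of "gc_in_progress = false; wake_up(&unix_gc_wait);". */
void gc_end(void)
{
	pthread_mutex_lock(&gc_lock);
	gc_in_progress = false;
	pthread_cond_broadcast(&gc_done);
	pthread_mutex_unlock(&gc_lock);
}

/* A sender thread: wait at the gate, then "send". */
void *sender(void *arg)
{
	(void)arg;
	wait_for_gc();
	atomic_fetch_add(&sends_completed, 1);
	return NULL;
}
```

A thread started with pthread_create(&t, NULL, sender, NULL) after gc_begin() stays parked in wait_for_gc() and only increments sends_completed once gc_end() runs. That mirrors the intent of the patch: new sends cannot pile up more in-flight work while a collection pass is under way.
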
* Re: [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
  2008-11-26 17:04 ` dann frazier
@ 2008-11-26 23:32 ` David Miller
  2008-12-01 20:17 ` dann frazier
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2008-11-26 23:32 UTC
  To: dannf; +Cc: netdev, eteo

From: dann frazier <dannf@dannf.org>
Date: Wed, 26 Nov 2008 10:04:02 -0700

> This is an implementation of David Miller's suggested fix in:
>   https://bugzilla.redhat.com/show_bug.cgi?id=470201
>
> It has been updated to use wait_event() instead of
> wait_event_interruptible().
>
> Paraphrasing the description from the above report, it makes sendmsg()
> block while UNIX garbage collection is in progress. This avoids a
> situation where child processes continue to queue new FDs over an
> AF_UNIX socket to a parent which is in the exit path and running
> garbage collection on these FDs. This contention can result in soft
> lockups and oom-killing of unrelated processes.
>
> Signed-off-by: dann frazier <dannf@hp.com>

Applied, thanks a lot Dann.

* Re: [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
  2008-11-26 23:32 ` David Miller
@ 2008-12-01 20:17 ` dann frazier
  2008-12-01 21:16 ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: dann frazier @ 2008-12-01 20:17 UTC
  To: David Miller; +Cc: netdev, eteo

On Wed, Nov 26, 2008 at 03:32:43PM -0800, David Miller wrote:
> From: dann frazier <dannf@dannf.org>
> Date: Wed, 26 Nov 2008 10:04:02 -0700
>
> > This is an implementation of David Miller's suggested fix in:
> >   https://bugzilla.redhat.com/show_bug.cgi?id=470201
> >
> > It has been updated to use wait_event() instead of
> > wait_event_interruptible().
> >
> > Paraphrasing the description from the above report, it makes sendmsg()
> > block while UNIX garbage collection is in progress. This avoids a
> > situation where child processes continue to queue new FDs over an
> > AF_UNIX socket to a parent which is in the exit path and running
> > garbage collection on these FDs. This contention can result in soft
> > lockups and oom-killing of unrelated processes.
> >
> > Signed-off-by: dann frazier <dannf@hp.com>
>
> Applied, thanks a lot Dann.

I was asked if this patch may introduce blocking during operations on
non-blocking sockets. Should we update wait_for_unix_gc (and its
callers) to something like this?

int wait_for_unix_gc(bool can_block)
{
	if (!can_block)
		return gc_in_progress ? -EWOULDBLOCK : 0;

	wait_event(unix_gc_wait, gc_in_progress == false);
	return 0;
}

-- 
dann frazier

* Re: [PATCH] Fix soft lockups/OOM issues w/ unix garbage collector
  2008-12-01 20:17 ` dann frazier
@ 2008-12-01 21:16 ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2008-12-01 21:16 UTC
  To: dannf; +Cc: netdev, eteo

From: dann frazier <dannf@hp.com>
Date: Mon, 1 Dec 2008 13:17:04 -0700

> On Wed, Nov 26, 2008 at 03:32:43PM -0800, David Miller wrote:
> > From: dann frazier <dannf@dannf.org>
> > Date: Wed, 26 Nov 2008 10:04:02 -0700
> >
> > > This is an implementation of David Miller's suggested fix in:
> > >   https://bugzilla.redhat.com/show_bug.cgi?id=470201
> > >
> > > It has been updated to use wait_event() instead of
> > > wait_event_interruptible().
> > >
> > > Paraphrasing the description from the above report, it makes sendmsg()
> > > block while UNIX garbage collection is in progress. This avoids a
> > > situation where child processes continue to queue new FDs over an
> > > AF_UNIX socket to a parent which is in the exit path and running
> > > garbage collection on these FDs. This contention can result in soft
> > > lockups and oom-killing of unrelated processes.
> > >
> > > Signed-off-by: dann frazier <dannf@hp.com>
> >
> > Applied, thanks a lot Dann.
>
> I was asked if this patch may introduce blocking during operations on
> non-blocking sockets. Should we update wait_for_unix_gc (and its
> callers) to something like this?

No, it's just like waiting for a GFP_KERNEL memory allocation.

Non-blocking doesn't mean "never will sleep".

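To make the distinction in this reply concrete from the userspace side: what O_NONBLOCK changes is that a send which would have to wait for the peer to drain its queue fails with EAGAIN (or EWOULDBLOCK) instead of waiting. It makes no promise about short internal sleeps such as a memory allocation or, after this patch, a GC pass. The sketch below only demonstrates the first half of that statement, assuming Linux; fill_until_refused() is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch 'sock' to non-blocking mode and keep
 * sending until the kernel refuses more data. Returns the errno from
 * the first failing call.
 */
int fill_until_refused(int sock)
{
	char chunk[4096];
	int flags = fcntl(sock, F_GETFL, 0);

	if (flags < 0 || fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0)
		return errno;

	memset(chunk, 0, sizeof(chunk));
	for (;;) {
		/* With nobody reading, the peer's queue eventually fills. */
		if (send(sock, chunk, sizeof(chunk), 0) < 0)
			return errno;
	}
}
```

Called on one end of a socketpair(AF_UNIX, SOCK_STREAM, 0, sv) whose other end is never read, this is expected to return EAGAIN once the queue is full. Each individual send() before that point may still have slept briefly inside the kernel, which is the case the reply above is describing.
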