netdev.vger.kernel.org archive mirror
* [PATCH] net: Fix the condition passed to sk_wait_event()
@ 2010-10-03  9:45 Nagendra Tomar
  2010-10-04  3:42 ` David Miller
  0 siblings, 1 reply; 2+ messages in thread
From: Nagendra Tomar @ 2010-10-03  9:45 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, davem

Resending, since this is the only patch now. Thanks.

---
This patch fixes the condition (3rd arg) passed to sk_wait_event() in
sk_stream_wait_memory(). The incorrect check in sk_stream_wait_memory()
causes the following soft lockup in tcp_sendmsg() when the global TCP
memory pool is exhausted.

>>> snip <<<

localhost kernel: BUG: soft lockup - CPU#3 stuck for 11s! [sshd:6429]
localhost kernel: CPU 3:
localhost kernel: RIP: 0010:[sk_stream_wait_memory+0xcd/0x200]  [sk_stream_wait_memory+0xcd/0x200] sk_stream_wait_memory+0xcd/0x200
localhost kernel: 
localhost kernel: Call Trace:
localhost kernel:  [sk_stream_wait_memory+0x1b1/0x200] sk_stream_wait_memory+0x1b1/0x200
localhost kernel:  [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel:  [ipv6:tcp_sendmsg+0x6e6/0xe90] tcp_sendmsg+0x6e6/0xce0
localhost kernel:  [sock_aio_write+0x126/0x140] sock_aio_write+0x126/0x140
localhost kernel:  [xfs:do_sync_write+0xf1/0x130] do_sync_write+0xf1/0x130
localhost kernel:  [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40
localhost kernel:  [hrtimer_start+0xe3/0x170] hrtimer_start+0xe3/0x170
localhost kernel:  [vfs_write+0x185/0x190] vfs_write+0x185/0x190
localhost kernel:  [sys_write+0x50/0x90] sys_write+0x50/0x90
localhost kernel:  [system_call+0x7e/0x83] system_call+0x7e/0x83

>>> snip <<<


When the global TCP memory pool is exhausted, the condition that
sk_stream_wait_memory() passes to sk_wait_event() evaluates to true, because
both sk_stream_memory_free() and vm_wait are true. sk_wait_event() therefore
does *not* call schedule_timeout(), and sk_stream_wait_memory() returns to the
caller immediately without sleeping. The caller then retries the allocation,
which fails again, calls sk_stream_wait_memory() again, and so on.
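
To illustrate the logic, here is a minimal userspace sketch (not kernel code)
of the old and new conditions. It assumes, as described above, that
sk_wait_event() sleeps only when its condition evaluates to false. struct
wait_state and the helper names are hypothetical stand-ins for the sk fields
and calls that appear in the patch.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state the real condition reads. */
struct wait_state {
	bool err;           /* stands in for sk->sk_err != 0 */
	bool send_shutdown; /* stands in for sk->sk_shutdown & SEND_SHUTDOWN */
	bool memory_free;   /* stands in for sk_stream_memory_free(sk) */
	bool vm_wait;       /* stands in for vm_wait != 0 */
};

/* Condition before the fix. */
static bool old_condition(const struct wait_state *s)
{
	return !s->err && !s->send_shutdown && s->memory_free && s->vm_wait;
}

/* Condition after the fix. */
static bool new_condition(const struct wait_state *s)
{
	return s->err || s->send_shutdown || (s->memory_free && !s->vm_wait);
}

/* Assumption: the wait macro sleeps only when the condition is false. */
static bool would_sleep(bool condition)
{
	return !condition;
}
```

With memory_free and vm_wait both set (the global exhaustion case),
old_condition() is true, so would_sleep() is false and the caller spins;
new_condition() is false, so the caller sleeps.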


Signed-off-by: Nagendra Singh Tomar <tomer_iisc@yahoo.com>
---
--- linux-2.6.35.7/net/core/stream.c.orig	2010-03-25 07:37:58.000000000 +0530
+++ linux-2.6.35.7/net/core/stream.c	2010-03-25 07:42:16.000000000 +0530
@@ -144,10 +144,10 @@ int sk_stream_wait_memory(struct sock *s
 
 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
 		sk->sk_write_pending++;
-		sk_wait_event(sk, &current_timeo, !sk->sk_err &&
-						  !(sk->sk_shutdown & SEND_SHUTDOWN) &&
-						  sk_stream_memory_free(sk) &&
-						  vm_wait);
+		sk_wait_event(sk, &current_timeo, sk->sk_err ||
+						  (sk->sk_shutdown & SEND_SHUTDOWN) ||
+						  (sk_stream_memory_free(sk) &&
+						  !vm_wait));
 		sk->sk_write_pending--;
 
 		if (vm_wait) {
---


* Re: [PATCH] net: Fix the condition passed to sk_wait_event()
  2010-10-03  9:45 [PATCH] net: Fix the condition passed to sk_wait_event() Nagendra Tomar
@ 2010-10-04  3:42 ` David Miller
  0 siblings, 0 replies; 2+ messages in thread
From: David Miller @ 2010-10-04  3:42 UTC (permalink / raw)
  To: tomer_iisc; +Cc: netdev, linux-kernel

From: Nagendra Tomar <tomer_iisc@yahoo.com>
Date: Sun, 3 Oct 2010 02:45:06 -0700 (PDT)

> This patch fixes the condition (3rd arg) passed to sk_wait_event() in
> sk_stream_wait_memory(). The incorrect check in sk_stream_wait_memory()
> causes the following soft lockup in tcp_sendmsg() when the global TCP
> memory pool is exhausted.
 ...
> When the global TCP memory pool is exhausted, the condition that
> sk_stream_wait_memory() passes to sk_wait_event() evaluates to true, because
> both sk_stream_memory_free() and vm_wait are true. sk_wait_event() therefore
> does *not* call schedule_timeout(), and sk_stream_wait_memory() returns to the
> caller immediately without sleeping. The caller then retries the allocation,
> which fails again, calls sk_stream_wait_memory() again, and so on.
> 
> 
> Signed-off-by: Nagendra Singh Tomar <tomer_iisc@yahoo.com>

Applied, thanks.

This bug was introduced by the following commit, which I noted in
the commit message for the fix:

--------------------
commit c1cbe4b7ad0bc4b1d98ea708a3fecb7362aa4088
Author: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Date:   Tue Dec 13 23:22:19 2005 -0800

    [NET]: Avoid atomic xchg() for non-error case
    
    It also looks like there were 2 places where the test on sk_err was
    missing from the event wait logic (in sk_stream_wait_connect and
    sk_stream_wait_memory), while the rest of the sock_error() users look
    to be doing the right thing.  This version of the patch fixes those,
    and cleans up a few places that were testing ->sk_err directly.
    
    Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/include/net/sock.h b/include/net/sock.h
index 982b4ec..0fbae85 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1166,7 +1166,10 @@ static inline int sock_queue_err_skb(struct sock *sk, struct sk_buff *skb)
  
 static inline int sock_error(struct sock *sk)
 {
-	int err = xchg(&sk->sk_err, 0);
+	int err;
+	if (likely(!sk->sk_err))
+		return 0;
+	err = xchg(&sk->sk_err, 0);
 	return -err;
 }
 
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index ea616e3..fb031fe 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -287,10 +287,9 @@ int bt_sock_wait_state(struct sock *sk, int state, unsigned long timeo)
 		timeo = schedule_timeout(timeo);
 		lock_sock(sk);
 
-		if (sk->sk_err) {
-			err = sock_error(sk);
+		err = sock_error(sk);
+		if (err)
 			break;
-		}
 	}
 	set_current_state(TASK_RUNNING);
 	remove_wait_queue(sk->sk_sleep, &wait);
diff --git a/net/bluetooth/l2cap.c b/net/bluetooth/l2cap.c
index e3bb11c..95f33cc 100644
--- a/net/bluetooth/l2cap.c
+++ b/net/bluetooth/l2cap.c
@@ -767,8 +767,9 @@ static int l2cap_sock_sendmsg(struct kiocb *iocb, struct socket *sock, struct ms
 
 	BT_DBG("sock %p, sk %p", sock, sk);
 
-	if (sk->sk_err)
-		return sock_error(sk);
+	err = sock_error(sk);
+	if (err)
+		return err;
 
 	if (msg->msg_flags & MSG_OOB)
 		return -EOPNOTSUPP;
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 9cb00dc..6481814 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -637,8 +637,9 @@ static int sco_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 
 	BT_DBG("sock %p, sk %p", sock, sk);
 
-	if (sk->sk_err)
-		return sock_error(sk);
+	err = sock_error(sk);
+	if (err)
+		return err;
 
 	if (msg->msg_flags & MSG_OOB)
 		return -EOPNOTSUPP;
diff --git a/net/core/stream.c b/net/core/stream.c
index 15bfd03..35e2525 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -55,8 +55,9 @@ int sk_stream_wait_connect(struct sock *sk, long *timeo_p)
 	int done;
 
 	do {
-		if (sk->sk_err)
-			return sock_error(sk);
+		int err = sock_error(sk);
+		if (err)
+			return err;
 		if ((1 << sk->sk_state) & ~(TCPF_SYN_SENT | TCPF_SYN_RECV))
 			return -EPIPE;
 		if (!*timeo_p)
@@ -67,6 +68,7 @@ int sk_stream_wait_connect(struct sock *sk, long *timeo_p)
 		prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE);
 		sk->sk_write_pending++;
 		done = sk_wait_event(sk, timeo_p,
+				     !sk->sk_err &&
 				     !((1 << sk->sk_state) & 
 				       ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)));
 		finish_wait(sk->sk_sleep, &wait);
@@ -137,7 +139,9 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
 
 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
 		sk->sk_write_pending++;
-		sk_wait_event(sk, &current_timeo, sk_stream_memory_free(sk) &&
+		sk_wait_event(sk, &current_timeo, !sk->sk_err && 
+						  !(sk->sk_shutdown & SEND_SHUTDOWN) &&
+						  sk_stream_memory_free(sk) &&
 						  vm_wait);
 		sk->sk_write_pending--;
 
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 6f92f9c..f121f7d 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1438,8 +1438,9 @@ static int irda_recvmsg_stream(struct kiocb *iocb, struct socket *sock,
 			/*
 			 *	POSIX 1003.1g mandates this order.
 			 */
-			if (sk->sk_err)
-				ret = sock_error(sk);
+			ret = sock_error(sk);
+			if (ret)
+				break;
 			else if (sk->sk_shutdown & RCV_SHUTDOWN)
 				;
 			else if (noblock)
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index c3f0b07..b6d3df5 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -566,10 +566,9 @@ static int llc_wait_data(struct sock *sk, long timeo)
 		/*
 		 * POSIX 1003.1g mandates this order.
 		 */
-		if (sk->sk_err) {
-			rc = sock_error(sk);
+		rc = sock_error(sk);
+		if (rc)
 			break;
-		}
 		rc = 0;
 		if (sk->sk_shutdown & RCV_SHUTDOWN)
 			break;
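
The sock_error() change in the commit above can be sketched in userspace with
C11 atomics. This is an illustration under assumptions, not the kernel
implementation: struct fake_sock and fake_sock_error() are hypothetical names,
and atomic_exchange() stands in for the kernel's xchg(). The plain read lets
the common no-error case skip the atomic exchange entirely.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the one struct sock field used here. */
struct fake_sock {
	atomic_int err; /* pending error code, 0 when there is none */
};

/*
 * Return the pending error as a negative value and clear it.
 * Fast path: a cheap load returns early when no error is pending.
 * Slow path: an atomic exchange fetches and clears the error in one step.
 */
static int fake_sock_error(struct fake_sock *sk)
{
	if (atomic_load_explicit(&sk->err, memory_order_relaxed) == 0)
		return 0;
	return -atomic_exchange(&sk->err, 0);
}
```

Callers can then invoke the helper unconditionally and test its return value,
which is the pattern the hunks above switch to.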

