* [TCP]: Fix sock_orphan dead lock
[not found] ` <20060429111507.GA3077@gondor.apana.org.au>
@ 2006-04-29 13:13 ` Herbert Xu
2006-05-04 6:32 ` David S. Miller
0 siblings, 1 reply; 5+ messages in thread
From: Herbert Xu @ 2006-04-29 13:13 UTC (permalink / raw)
To: Ingo Molnar; +Cc: David S. Miller, Arjan van de Ven, netdev
[-- Attachment #1: Type: text/plain, Size: 2191 bytes --]
On Sat, Apr 29, 2006 at 09:15:07PM +1000, herbert wrote:
>
> Unfortunately this is only true for TCP. All of the connectionless
> protocols use the callback lock without the socket lock so it does
> still serve a purpose.
>
> I'd be happy to see your patch included.
I've just changed my mind :) Instead of disabling BH on the read_lock
callers, we can instead move sock_orphan outside bh_lock_sock.
[TCP]: Fix sock_orphan dead lock
Calling sock_orphan inside bh_lock_sock in tcp_close can lead to dead
locks. For example, the inet_diag code holds sk_callback_lock without
disabling BH. If an inbound packet arrives during that admittedly tiny
window, it will cause a dead lock on bh_lock_sock. Another possible
path would be through sock_wfree if the network device driver frees the
tx skb in process context with BH enabled.
We can fix this by moving sock_orphan out of bh_lock_sock.
The tricky bit is to work out when we need to destroy the socket
ourselves and when it has already been destroyed by someone else.
By moving sock_orphan before the release_sock we can solve this
problem. This is because as long as we own the socket lock its
state cannot change.
So we simply record the socket state before the release_sock
and then check the state again after we regain the socket lock.
If the socket state has transitioned to TCP_CLOSE in the time being,
we know that the socket has been destroyed. Otherwise the socket is
still ours to keep.
Note that I've also moved the increment on the orphan count forward.
This may look like a problem as we're increasing it even if the socket
is just about to be destroyed where it'll be decreased again. However,
this simply enlarges a window that already exists. This also changes
the orphan count test by one.
Considering what the orphan count is meant to do this is no big deal.
This problem was discoverd by Ingo Molnar using his lock validator.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[-- Attachment #2: tcp-sock-orphan.patch --]
[-- Type: text/plain, Size: 1337 bytes --]
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 87f68e7..e2b7b80 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1468,6 +1468,7 @@
{
struct sk_buff *skb;
int data_was_unread = 0;
+ int state;
lock_sock(sk);
sk->sk_shutdown = SHUTDOWN_MASK;
@@ -1544,6 +1545,11 @@
sk_stream_wait_close(sk, timeout);
adjudge_to_death:
+ state = sk->sk_state;
+ sock_hold(sk);
+ sock_orphan(sk);
+ atomic_inc(sk->sk_prot->orphan_count);
+
/* It is the last release_sock in its life. It will remove backlog. */
release_sock(sk);
@@ -1555,8 +1561,9 @@
bh_lock_sock(sk);
BUG_TRAP(!sock_owned_by_user(sk));
- sock_hold(sk);
- sock_orphan(sk);
+ /* Have we already been destroyed by a softirq or backlog? */
+ if (state != TCP_CLOSE && sk->sk_state == TCP_CLOSE)
+ goto out;
/* This is a (useful) BSD violating of the RFC. There is a
* problem with TCP as specified in that the other end could
@@ -1584,7 +1591,6 @@
if (tmo > TCP_TIMEWAIT_LEN) {
inet_csk_reset_keepalive_timer(sk, tcp_fin_time(sk));
} else {
- atomic_inc(sk->sk_prot->orphan_count);
tcp_time_wait(sk, TCP_FIN_WAIT2, tmo);
goto out;
}
@@ -1603,7 +1609,6 @@
NET_INC_STATS_BH(LINUX_MIB_TCPABORTONMEMORY);
}
}
- atomic_inc(sk->sk_prot->orphan_count);
if (sk->sk_state == TCP_CLOSE)
inet_csk_destroy_sock(sk);
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [TCP]: Fix sock_orphan dead lock
2006-04-29 13:13 ` [TCP]: Fix sock_orphan dead lock Herbert Xu
@ 2006-05-04 6:32 ` David S. Miller
2006-05-04 7:17 ` [DCCP]: " Herbert Xu
0 siblings, 1 reply; 5+ messages in thread
From: David S. Miller @ 2006-05-04 6:32 UTC (permalink / raw)
To: herbert; +Cc: mingo, arjan, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Sat, 29 Apr 2006 23:13:20 +1000
> On Sat, Apr 29, 2006 at 09:15:07PM +1000, herbert wrote:
> >
> > Unfortunately this is only true for TCP. All of the connectionless
> > protocols use the callback lock without the socket lock so it does
> > still serve a purpose.
> >
> > I'd be happy to see your patch included.
>
> I've just changed my mind :) Instead of disabling BH on the read_lock
> callers, we can instead move sock_orphan outside bh_lock_sock.
>
> [TCP]: Fix sock_orphan dead lock
Ok, it took me a while to review this one as verifying such
a set of changes is always tricky, but looks good!
Applied, thanks a lot.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [DCCP]: Fix sock_orphan dead lock
2006-05-04 6:32 ` David S. Miller
@ 2006-05-04 7:17 ` Herbert Xu
2006-05-04 15:16 ` Arnaldo Carvalho de Melo
2006-05-06 0:09 ` David S. Miller
0 siblings, 2 replies; 5+ messages in thread
From: Herbert Xu @ 2006-05-04 7:17 UTC (permalink / raw)
To: David S. Miller; +Cc: mingo, arjan, netdev
[-- Attachment #1: Type: text/plain, Size: 1782 bytes --]
On Wed, May 03, 2006 at 11:32:23PM -0700, David S. Miller wrote:
>
> Ok, it took me a while to review this one as verifying such
> a set of changes is always tricky, but looks good!
Thanks. It took me quite a while to assure myself that it was OK
as well :)
Anyway, the same issue affects DCCP (and SCTP too probably). Here
is a patch that does the same thing for DCCP.
[DCCP]: Fix sock_orphan dead lock
Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead
locks. For example, the inet_diag code holds sk_callback_lock without
disabling BH. If an inbound packet arrives during that admittedly tiny
window, it will cause a dead lock on bh_lock_sock. Another possible
path would be through sock_wfree if the network device driver frees the
tx skb in process context with BH enabled.
We can fix this by moving sock_orphan out of bh_lock_sock.
The tricky bit is to work out when we need to destroy the socket
ourselves and when it has already been destroyed by someone else.
By moving sock_orphan before the release_sock we can solve this
problem. This is because as long as we own the socket lock its
state cannot change.
So we simply record the socket state before the release_sock
and then check the state again after we regain the socket lock.
If the socket state has transitioned to DCCP_CLOSED in the time being,
we know that the socket has been destroyed. Otherwise the socket is
still ours to keep.
This problem was discoverd by Ingo Molnar using his lock validator.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[-- Attachment #2: dccp-sock-orphan.patch --]
[-- Type: text/plain, Size: 1111 bytes --]
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 1ff7328..2e0ee83 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -848,6 +848,7 @@
void dccp_close(struct sock *sk, long timeout)
{
struct sk_buff *skb;
+ int state;
lock_sock(sk);
@@ -882,6 +883,11 @@
sk_stream_wait_close(sk, timeout);
adjudge_to_death:
+ state = sk->sk_state;
+ sock_hold(sk);
+ sock_orphan(sk);
+ atomic_inc(sk->sk_prot->orphan_count);
+
/*
* It is the last release_sock in its life. It will remove backlog.
*/
@@ -894,8 +900,9 @@
bh_lock_sock(sk);
BUG_TRAP(!sock_owned_by_user(sk));
- sock_hold(sk);
- sock_orphan(sk);
+ /* Have we already been destroyed by a softirq or backlog? */
+ if (state != DCCP_CLOSED && sk->sk_state == DCCP_CLOSED)
+ goto out;
/*
* The last release_sock may have processed the CLOSE or RESET
@@ -915,12 +922,12 @@
#endif
}
- atomic_inc(sk->sk_prot->orphan_count);
if (sk->sk_state == DCCP_CLOSED)
inet_csk_destroy_sock(sk);
/* Otherwise, socket is reprieved until protocol close. */
+out:
bh_unlock_sock(sk);
local_bh_enable();
sock_put(sk);
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [DCCP]: Fix sock_orphan dead lock
2006-05-04 7:17 ` [DCCP]: " Herbert Xu
@ 2006-05-04 15:16 ` Arnaldo Carvalho de Melo
2006-05-06 0:09 ` David S. Miller
1 sibling, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-05-04 15:16 UTC (permalink / raw)
To: Herbert Xu; +Cc: David S. Miller, mingo, arjan, netdev
On 5/4/06, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Wed, May 03, 2006 at 11:32:23PM -0700, David S. Miller wrote:
> >
> > Ok, it took me a while to review this one as verifying such
> > a set of changes is always tricky, but looks good!
>
> Thanks. It took me quite a while to assure myself that it was OK
> as well :)
>
> Anyway, the same issue affects DCCP (and SCTP too probably). Here
> is a patch that does the same thing for DCCP.
>
> [DCCP]: Fix sock_orphan dead lock
>
> Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead
> locks. For example, the inet_diag code holds sk_callback_lock without
> disabling BH. If an inbound packet arrives during that admittedly tiny
> window, it will cause a dead lock on bh_lock_sock. Another possible
> path would be through sock_wfree if the network device driver frees the
> tx skb in process context with BH enabled.
>
> We can fix this by moving sock_orphan out of bh_lock_sock.
>
> The tricky bit is to work out when we need to destroy the socket
> ourselves and when it has already been destroyed by someone else.
>
> By moving sock_orphan before the release_sock we can solve this
> problem. This is because as long as we own the socket lock its
> state cannot change.
>
> So we simply record the socket state before the release_sock
> and then check the state again after we regain the socket lock.
> If the socket state has transitioned to DCCP_CLOSED in the time being,
> we know that the socket has been destroyed. Otherwise the socket is
> still ours to keep.
>
> This problem was discoverd by Ingo Molnar using his lock validator.
>
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks Herbert,
Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
- Arnaldo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [DCCP]: Fix sock_orphan dead lock
2006-05-04 7:17 ` [DCCP]: " Herbert Xu
2006-05-04 15:16 ` Arnaldo Carvalho de Melo
@ 2006-05-06 0:09 ` David S. Miller
1 sibling, 0 replies; 5+ messages in thread
From: David S. Miller @ 2006-05-06 0:09 UTC (permalink / raw)
To: herbert; +Cc: mingo, arjan, netdev
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 4 May 2006 17:17:50 +1000
> Anyway, the same issue affects DCCP (and SCTP too probably). Here
> is a patch that does the same thing for DCCP.
Thanks for catching that, applied.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2006-05-06 0:09 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20060424122659.GA26882@elte.hu>
[not found] ` <20060424145353.GA29318@elte.hu>
[not found] ` <20060425010549.GA28889@gondor.apana.org.au>
[not found] ` <20060425131036.GA28018@elte.hu>
[not found] ` <20060425232930.GB7740@gondor.apana.org.au>
[not found] ` <20060426053821.GA31107@elte.hu>
[not found] ` <20060427123404.GA20489@gondor.apana.org.au>
[not found] ` <20060429111507.GA3077@gondor.apana.org.au>
2006-04-29 13:13 ` [TCP]: Fix sock_orphan dead lock Herbert Xu
2006-05-04 6:32 ` David S. Miller
2006-05-04 7:17 ` [DCCP]: " Herbert Xu
2006-05-04 15:16 ` Arnaldo Carvalho de Melo
2006-05-06 0:09 ` David S. Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).