From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andi Kleen Subject: [RFC] [PATCH] Don't destroy TCP sockets twice Date: Fri, 6 Aug 2010 13:05:11 +0200 Message-ID: <20100806110511.GA16448@basil.fritz.box> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii To: netdev@vger.kernel.org Return-path: Received: from one.firstfloor.org ([213.235.205.2]:56601 "EHLO one.firstfloor.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1759365Ab0HFLFN (ORCPT ); Fri, 6 Aug 2010 07:05:13 -0400 Received: from basil.firstfloor.org (p5B3C940C.dip0.t-ipconnect.de [91.60.148.12]) by one.firstfloor.org (Postfix) with ESMTP id 4E4B91F08072 for ; Fri, 6 Aug 2010 13:05:12 +0200 (CEST) Content-Disposition: inline Sender: netdev-owner@vger.kernel.org List-ID: While working on something else I noticed that tcp_v4/6_destroy_sock() can get called twice on a socket. This happens because when a reset or similar happens tcp_done destroys the connection socket state, and then eventually when the socket is released it is destroyed again. Problem here is that the socket can actually still transmit packets even after having been destroyed by tcp_done: this happens because tcp_close is called after tcp_done, tcp_close calls tcp_send_fin which then sends the FIN at least. With my local modifications this lead to crashes (I destroyed some additional state in destroy_sock that is needed for sending any packets). But I think it's broken even without any changes: tcp_*_destroy sock removes a lot of state, including MD5 state, congestion control state and some others. At least for MD5 it's easy to argue that the the FIN still needs this state. I haven't tested that, but I think what will happen is that you'll see FINs which are not MD5 signed after a reset (there is no crash because zero MD5 state is ignored) There might be other similar problems with congestion avoidance state and the FIN packet. Overall it also seems inefficient to destroy twice. The only thing in tcp_*_destroy sock that I think really needs to be done is clearing the timers, but tcp_done does this already too. So this patch changes tcp_done to not call inet_csk_destroy_sock() and leave the destruction to the later final release or close. The only drawback is that the buffers could be freed a bit later for the RST case (e.g. when there is a reference to user space it will wait for the user close), but that doesn't seem like a big issue. I made this a RFC for now because it needs some more review. Signed-off-by: Andi Kleen diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 6596b4f..f7cae15 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3171,8 +3171,6 @@ void tcp_done(struct sock *sk) if (!sock_flag(sk, SOCK_DEAD)) sk->sk_state_change(sk); - else - inet_csk_destroy_sock(sk); } EXPORT_SYMBOL_GPL(tcp_done);