From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40840) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cgSes-00006N-Mt for qemu-devel@nongnu.org; Wed, 22 Feb 2017 03:52:24 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cgSep-0005nt-JN for qemu-devel@nongnu.org; Wed, 22 Feb 2017 03:52:22 -0500 Received: from [45.249.212.188] (port=2936 helo=dggrg02-dlp.huawei.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.71) (envelope-from ) id 1cgSeo-0005n9-NG for qemu-devel@nongnu.org; Wed, 22 Feb 2017 03:52:19 -0500 References: <1487735198-127300-1-git-send-email-zhang.zhanghailiang@huawei.com> <1487735198-127300-3-git-send-email-zhang.zhanghailiang@huawei.com> <58AD4F8F.5030703@huawei.com> From: Hailiang Zhang Message-ID: <58AD5125.4010306@huawei.com> Date: Wed, 22 Feb 2017 16:51:49 +0800 MIME-Version: 1.0 In-Reply-To: <58AD4F8F.5030703@huawei.com> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jason Wang , zhangchen.fnst@cn.fujitsu.com, lizhijian@cn.fujitsu.com Cc: xuquan8@huawei.com, qemu-devel@nongnu.org, pss.wulizhen@huawei.com On 2017/2/22 16:45, Hailiang Zhang wrote: > On 2017/2/22 16:07, Jason Wang wrote: >> >> >> On 2017年02月22日 11:46, zhanghailiang wrote: >>> After a net connection is closed, we didn't clear its releated resources >>> in connection_track_table, which will lead to memory leak. >> >> Not a real leak but would lead reset of hash table if too many closed >> connections. >> > > Yes, you are right, there will be lots of stale connection data in hash table > if we don't remove it while it is been closed. Which > >>> >>> Let't track the state of net connection, if it is closed, its related >>> resources will be cleared up. >> >> The issue is the state were tracked partially, do we need a full state >> machine here? >> > > Not, IMHO, we only care about the last state of it, because, we will do nothing > even if we track the intermedial states. > >>> >>> Signed-off-by: zhanghailiang >>> --- >>> net/colo.h | 4 +++ >>> net/filter-rewriter.c | 70 +++++++++++++++++++++++++++++++++++++++++++++------ >>> 2 files changed, 67 insertions(+), 7 deletions(-) >>> >>> diff --git a/net/colo.h b/net/colo.h >>> index 7c524f3..cd9027f 100644 >>> --- a/net/colo.h >>> +++ b/net/colo.h >>> @@ -18,6 +18,7 @@ >>> #include "slirp/slirp.h" >>> #include "qemu/jhash.h" >>> #include "qemu/timer.h" >>> +#include "slirp/tcp.h" >>> >>> #define HASHTABLE_MAX_SIZE 16384 >>> >>> @@ -69,6 +70,9 @@ typedef struct Connection { >>> * run once in independent tcp connection >>> */ >>> int syn_flag; >>> + >>> + int tcp_state; /* TCP FSM state */ >>> + tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */ >>> } Connection; >>> >>> uint32_t connection_key_hash(const void *opaque); >>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c >>> index c4ab91c..7e7ec35 100644 >>> --- a/net/filter-rewriter.c >>> +++ b/net/filter-rewriter.c >>> @@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt) >>> } >>> >>> /* handle tcp packet from primary guest */ >>> -static int handle_primary_tcp_pkt(NetFilterState *nf, >>> +static int handle_primary_tcp_pkt(RewriterState *rf, >>> Connection *conn, >>> - Packet *pkt) >>> + Packet *pkt, ConnectionKey *key) >>> { >>> struct tcphdr *tcp_pkt; >>> >>> @@ -97,15 +97,45 @@ static int handle_primary_tcp_pkt(NetFilterState *nf, >>> tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset); >>> >>> net_checksum_calculate((uint8_t *)pkt->data, pkt->size); >>> + /* >>> + * Case 1: >>> + * The *server* side of this connect is VM, *client* tries to close >>> + * the connection. >>> + * >>> + * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1' >>> + * packet from server side. From this point, we can ensure that there >>> + * will be no packets in the connection, except that, some errors >>> + * happen between the path of 'filter object' and vNIC, if this rare >>> + * case really happen, we can still create a new connection, >>> + * So it is safe to remove the connection from connection_track_table. >>> + * >>> + */ >>> + if ((conn->tcp_state == TCPS_LAST_ACK) && >>> + (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) { >>> + fprintf(stderr, "Remove conn " >> >> Can this even compile? >> > > Oops, i forgot to remove it, will remove it in next version. > >>> + g_hash_table_remove(rf->connection_track_table, key); >>> + } >>> + } >>> + /* >>> + * Case 2: >>> + * The *server* side of this connect is VM, *server* tries to close >>> + * the connection. >>> + * >>> + * We got 'fin=1, ack=1' packet from client side, we need to >>> + * record the seq of 'fin=1, ack=1' packet. >>> + */ >>> + if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) { >>> + conn->fin_ack_seq = htonl(tcp_pkt->th_seq); >>> + conn->tcp_state = TCPS_LAST_ACK; >>> } >>> >>> return 0; >>> } >>> >>> /* handle tcp packet from secondary guest */ >>> -static int handle_secondary_tcp_pkt(NetFilterState *nf, >>> +static int handle_secondary_tcp_pkt(RewriterState *rf, >>> Connection *conn, >>> - Packet *pkt) >>> + Packet *pkt, ConnectionKey *key) >>> { >>> struct tcphdr *tcp_pkt; >>> >>> @@ -133,8 +163,34 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf, >>> tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset); >>> >>> net_checksum_calculate((uint8_t *)pkt->data, pkt->size); >>> + /* >>> + * Case 2: >>> + * The *server* side of this connect is VM, *server* tries to close >>> + * the connection. >>> + * >>> + * We got 'ack=1' packets from server side, it acks 'fin=1, ack=1' >>> + * packet from client side. Like Case 1, there should be no packets >>> + * in the connection from now know, But the difference here is >>> + * if the packet is lost, We will get the resent 'fin=1,ack=1' packet. >>> + * TODO: Fix above case. >>> + */ >>> + if ((conn->tcp_state == TCPS_LAST_ACK) && >>> + (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) { >>> + g_hash_table_remove(rf->connection_track_table, key); >>> + } >>> + } >>> + /* >>> + * Case 1: >>> + * The *server* side of this connect is VM, *client* tries to close >>> + * the connection. >>> + * >>> + * We got 'fin=1, ack=1' packet from server side, we need to >>> + * record the seq of 'fin=1, ack=1' packet. >>> + */ >>> + if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) { >>> + conn->fin_ack_seq = ntohl(tcp_pkt->th_seq); >>> + conn->tcp_state = TCPS_LAST_ACK; >> >> I thought the tcp_state should store the state of TCP from the view of >> secondary VM? So TCPS_LAST_ACK is wrong and bring lots of confusion. And >> the handle of active close needs more states here. E.g if connection is >> in FIN_WAIT_2, the connection is only half closed, remote peer can still >> send packet to us unless we receive a FIN. >> > > Yes, i know what you mean, actually, here, we try to only track the last > two steps for closing a connection, that is 'fin=1,ack=1,seq=2,ack=u+1' ^ 'FIN=1,ACK=1,seq=w,ack=u+1' > and 'ack=1,seq=u+1,ack=w+1', because if we get a 'fin=1,ack=1', we can ^ 'ACK=1,seq=u+1,ack=w+1' ^ 'FIN=1,ACK=1' > ensure that the 'fin=1,seq=u' packet has been posted. > ^ 'FIN=1,seq=u' > Another reason is we may can't track the 'fin=1,seq=u' packet while > we start COLO while one connection is closing, which the 'fin=1,seq=u' packet > has been posted. > > Actually, here, if we start COLO while one connection is closing, which the > 'fin=1,ack=1' has been posted, we can only track 'ack=1' packet. In this ^ 'FIN=1,ACK=1' Sorry for the typo. :) > case, the connection will be left in hash table for ever though it is harmless. > Any ideas for this case ? > > For the above codes question, i'd like to change tcp_state to tap_closing_wait, > is it OK ? > > Thanks. > Hailiang > >> Thanks >> >>> } >>> - >>> return 0; >>> } >>> >>> @@ -178,7 +234,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf, >>> >>> if (sender == nf->netdev) { >>> /* NET_FILTER_DIRECTION_TX */ >>> - if (!handle_primary_tcp_pkt(nf, conn, pkt)) { >>> + if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) { >>> qemu_net_queue_send(s->incoming_queue, sender, 0, >>> (const uint8_t *)pkt->data, pkt->size, NULL); >>> packet_destroy(pkt, NULL); >>> @@ -191,7 +247,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf, >>> } >>> } else { >>> /* NET_FILTER_DIRECTION_RX */ >>> - if (!handle_secondary_tcp_pkt(nf, conn, pkt)) { >>> + if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) { >>> qemu_net_queue_send(s->incoming_queue, sender, 0, >>> (const uint8_t *)pkt->data, pkt->size, NULL); >>> packet_destroy(pkt, NULL); >> >> >> . >>