From: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
To: Jason Wang <jasowang@redhat.com>,
zhangchen.fnst@cn.fujitsu.com, lizhijian@cn.fujitsu.com
Cc: xuquan8@huawei.com, qemu-devel@nongnu.org, pss.wulizhen@huawei.com
Subject: Re: [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table
Date: Wed, 22 Feb 2017 16:45:03 +0800 [thread overview]
Message-ID: <58AD4F8F.5030703@huawei.com> (raw)
In-Reply-To: <b0b04c02-7c97-ea04-1e38-99fd9344edae@redhat.com>
On 2017/2/22 16:07, Jason Wang wrote:
>
>
> On 2017年02月22日 11:46, zhanghailiang wrote:
>> After a net connection is closed, we didn't clear its releated resources
>> in connection_track_table, which will lead to memory leak.
>
> Not a real leak but would lead reset of hash table if too many closed
> connections.
>
Yes, you are right, there will be lots of stale connection data in hash table
if we don't remove it while it is been closed. Which
>>
>> Let't track the state of net connection, if it is closed, its related
>> resources will be cleared up.
>
> The issue is the state were tracked partially, do we need a full state
> machine here?
>
Not, IMHO, we only care about the last state of it, because, we will do nothing
even if we track the intermedial states.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>> net/colo.h | 4 +++
>> net/filter-rewriter.c | 70 +++++++++++++++++++++++++++++++++++++++++++++------
>> 2 files changed, 67 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/colo.h b/net/colo.h
>> index 7c524f3..cd9027f 100644
>> --- a/net/colo.h
>> +++ b/net/colo.h
>> @@ -18,6 +18,7 @@
>> #include "slirp/slirp.h"
>> #include "qemu/jhash.h"
>> #include "qemu/timer.h"
>> +#include "slirp/tcp.h"
>>
>> #define HASHTABLE_MAX_SIZE 16384
>>
>> @@ -69,6 +70,9 @@ typedef struct Connection {
>> * run once in independent tcp connection
>> */
>> int syn_flag;
>> +
>> + int tcp_state; /* TCP FSM state */
>> + tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */
>> } Connection;
>>
>> uint32_t connection_key_hash(const void *opaque);
>> diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
>> index c4ab91c..7e7ec35 100644
>> --- a/net/filter-rewriter.c
>> +++ b/net/filter-rewriter.c
>> @@ -60,9 +60,9 @@ static int is_tcp_packet(Packet *pkt)
>> }
>>
>> /* handle tcp packet from primary guest */
>> -static int handle_primary_tcp_pkt(NetFilterState *nf,
>> +static int handle_primary_tcp_pkt(RewriterState *rf,
>> Connection *conn,
>> - Packet *pkt)
>> + Packet *pkt, ConnectionKey *key)
>> {
>> struct tcphdr *tcp_pkt;
>>
>> @@ -97,15 +97,45 @@ static int handle_primary_tcp_pkt(NetFilterState *nf,
>> tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
>>
>> net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> + /*
>> + * Case 1:
>> + * The *server* side of this connect is VM, *client* tries to close
>> + * the connection.
>> + *
>> + * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1'
>> + * packet from server side. From this point, we can ensure that there
>> + * will be no packets in the connection, except that, some errors
>> + * happen between the path of 'filter object' and vNIC, if this rare
>> + * case really happen, we can still create a new connection,
>> + * So it is safe to remove the connection from connection_track_table.
>> + *
>> + */
>> + if ((conn->tcp_state == TCPS_LAST_ACK) &&
>> + (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>> + fprintf(stderr, "Remove conn "
>
> Can this even compile?
>
Oops, i forgot to remove it, will remove it in next version.
>> + g_hash_table_remove(rf->connection_track_table, key);
>> + }
>> + }
>> + /*
>> + * Case 2:
>> + * The *server* side of this connect is VM, *server* tries to close
>> + * the connection.
>> + *
>> + * We got 'fin=1, ack=1' packet from client side, we need to
>> + * record the seq of 'fin=1, ack=1' packet.
>> + */
>> + if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>> + conn->fin_ack_seq = htonl(tcp_pkt->th_seq);
>> + conn->tcp_state = TCPS_LAST_ACK;
>> }
>>
>> return 0;
>> }
>>
>> /* handle tcp packet from secondary guest */
>> -static int handle_secondary_tcp_pkt(NetFilterState *nf,
>> +static int handle_secondary_tcp_pkt(RewriterState *rf,
>> Connection *conn,
>> - Packet *pkt)
>> + Packet *pkt, ConnectionKey *key)
>> {
>> struct tcphdr *tcp_pkt;
>>
>> @@ -133,8 +163,34 @@ static int handle_secondary_tcp_pkt(NetFilterState *nf,
>> tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
>>
>> net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
>> + /*
>> + * Case 2:
>> + * The *server* side of this connect is VM, *server* tries to close
>> + * the connection.
>> + *
>> + * We got 'ack=1' packets from server side, it acks 'fin=1, ack=1'
>> + * packet from client side. Like Case 1, there should be no packets
>> + * in the connection from now know, But the difference here is
>> + * if the packet is lost, We will get the resent 'fin=1,ack=1' packet.
>> + * TODO: Fix above case.
>> + */
>> + if ((conn->tcp_state == TCPS_LAST_ACK) &&
>> + (ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) {
>> + g_hash_table_remove(rf->connection_track_table, key);
>> + }
>> + }
>> + /*
>> + * Case 1:
>> + * The *server* side of this connect is VM, *client* tries to close
>> + * the connection.
>> + *
>> + * We got 'fin=1, ack=1' packet from server side, we need to
>> + * record the seq of 'fin=1, ack=1' packet.
>> + */
>> + if ((tcp_pkt->th_flags & (TH_ACK | TH_FIN)) == (TH_ACK | TH_FIN)) {
>> + conn->fin_ack_seq = ntohl(tcp_pkt->th_seq);
>> + conn->tcp_state = TCPS_LAST_ACK;
>
> I thought the tcp_state should store the state of TCP from the view of
> secondary VM? So TCPS_LAST_ACK is wrong and bring lots of confusion. And
> the handle of active close needs more states here. E.g if connection is
> in FIN_WAIT_2, the connection is only half closed, remote peer can still
> send packet to us unless we receive a FIN.
>
Yes, i know what you mean, actually, here, we try to only track the last
two steps for closing a connection, that is 'fin=1,ack=1,seq=2,ack=u+1'
and 'ack=1,seq=u+1,ack=w+1', because if we get a 'fin=1,ack=1', we can
ensure that the 'fin=1,seq=u' packet has been posted.
Another reason is we may can't track the 'fin=1,seq=u' packet while
we start COLO while one connection is closing, which the 'fin=1,seq=u' packet
has been posted.
Actually, here, if we start COLO while one connection is closing, which the
'fin=1,ack=1' has been posted, we can only track 'ack=1' packet. In this
case, the connection will be left in hash table for ever though it is harmless.
Any ideas for this case ?
For the above codes question, i'd like to change tcp_state to tap_closing_wait,
is it OK ?
Thanks.
Hailiang
> Thanks
>
>> }
>> -
>> return 0;
>> }
>>
>> @@ -178,7 +234,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>>
>> if (sender == nf->netdev) {
>> /* NET_FILTER_DIRECTION_TX */
>> - if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
>> + if (!handle_primary_tcp_pkt(s, conn, pkt, &key)) {
>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>> (const uint8_t *)pkt->data, pkt->size, NULL);
>> packet_destroy(pkt, NULL);
>> @@ -191,7 +247,7 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
>> }
>> } else {
>> /* NET_FILTER_DIRECTION_RX */
>> - if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
>> + if (!handle_secondary_tcp_pkt(s, conn, pkt, &key)) {
>> qemu_net_queue_send(s->incoming_queue, sender, 0,
>> (const uint8_t *)pkt->data, pkt->size, NULL);
>> packet_destroy(pkt, NULL);
>
>
> .
>
next prev parent reply other threads:[~2017-02-22 8:45 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-02-22 3:46 [Qemu-devel] [PATCH v2 0/3] filter-rewriter: fix two bugs and one optimization zhanghailiang
2017-02-22 3:46 ` [Qemu-devel] [PATCH v2 1/3] net/colo: fix memory double free error zhanghailiang
2017-02-22 8:39 ` Zhang Chen
2017-02-22 3:46 ` [Qemu-devel] [PATCH v2 2/3] filter-rewriter: fix memory leak for connection in connection_track_table zhanghailiang
2017-02-22 8:07 ` Jason Wang
2017-02-22 8:45 ` Hailiang Zhang [this message]
2017-02-22 8:51 ` Hailiang Zhang
2017-02-23 4:16 ` Jason Wang
2017-02-27 3:11 ` Hailiang Zhang
2017-02-27 3:40 ` Jason Wang
2017-02-27 4:09 ` Hailiang Zhang
2017-02-27 5:35 ` Jason Wang
2017-02-27 6:53 ` Hailiang Zhang
2017-02-27 9:05 ` Jason Wang
2017-02-27 10:29 ` Hailiang Zhang
2017-02-28 3:14 ` Jason Wang
2017-02-22 3:46 ` [Qemu-devel] [PATCH v2 3/3] filter-rewriter: skip net_checksum_calculate() while offset = 0 zhanghailiang
2017-02-24 8:08 ` Zhang Chen
2017-02-24 8:23 ` Zhang Chen
2017-02-27 1:36 ` Hailiang Zhang
2017-02-27 3:44 ` Zhang Chen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=58AD4F8F.5030703@huawei.com \
--to=zhang.zhanghailiang@huawei.com \
--cc=jasowang@redhat.com \
--cc=lizhijian@cn.fujitsu.com \
--cc=pss.wulizhen@huawei.com \
--cc=qemu-devel@nongnu.org \
--cc=xuquan8@huawei.com \
--cc=zhangchen.fnst@cn.fujitsu.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.