* [PATCH] improvement for TCP connection tracking @ 2011-02-26 3:32 Pablo Neira Ayuso 2011-02-26 3:33 ` [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK Pablo Neira Ayuso 0 siblings, 1 reply; 12+ messages in thread From: Pablo Neira Ayuso @ 2011-02-26 3:32 UTC (permalink / raw) To: netfilter-devel; +Cc: kaber, kadlec Hi Patrick, Jozsef, I've been testing the following patch in my HA firewall setup with success. This patch provides better recovery in stress scenarios. Please, if you're OK with it, pass to it 2.6.38. --- Pablo Neira Ayuso (1): netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK net/netfilter/nf_conntrack_proto_tcp.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-26 3:32 [PATCH] improvement for TCP connection tracking Pablo Neira Ayuso @ 2011-02-26 3:33 ` Pablo Neira Ayuso 2011-02-26 6:15 ` Changli Gao 2011-02-26 18:30 ` Jozsef Kadlecsik 0 siblings, 2 replies; 12+ messages in thread From: Pablo Neira Ayuso @ 2011-02-26 3:33 UTC (permalink / raw) To: netfilter-devel; +Cc: kaber, kadlec Consider the following scenario: client firewall server | | | | syn | syn | |------------->|------------>| | | | | syn+ack | syn+ack | | x<------|<------------| syn+ack got lost! | | | | syn | syn | |------------->|------------>| | | | | syn+ack | syn+ack | |<-------------|<------------| | | | Note that the syn+ack is lost after we have seen it. Without this patch, the TCP tracking ignores the retransmitted SYN without checking if the sequence number is in the window. This patch also helps a lot to conntrackd in stress scenarios (assumming a client that generates lots of small TCP connections). During the failover, consider that the new primary has injected one outdated flow in SYN_RECV state (this is likely to happen if the conntrack event rate is high because the backup will be a bit delayed from the primary). With the current code, if the client starts a new fresh connection that matches the tuple, the SYN packet will be ignored without updating the state tracking, and the SYN+ACK in reply will blocked as it will not pass checkings III or IV (since all state tracking in the original direction is not initialized because of the SYN packet was ignored). Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- net/netfilter/nf_conntrack_proto_tcp.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 3fb2b73..be0b84d 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -142,12 +142,12 @@ static const u8 tcp_conntracks[2][6][TCP_CONNTRACK_MAX] = { { /* ORIGINAL */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ -/*syn*/ { sSS, sSS, sIG, sIG, sIG, sIG, sIG, sSS, sSS, sS2 }, +/*syn*/ { sSS, sSS, sSR, sIG, sIG, sIG, sIG, sSS, sSS, sS2 }, /* * sNO -> sSS Initialize a new connection * sSS -> sSS Retransmitted SYN * sS2 -> sS2 Late retransmitted SYN - * sSR -> sIG + * sSR -> sSR Retransmitted SYN, SYN/ACK got lost? * sES -> sIG Error: SYNs in window outside the SYN_SENT state * are errors. Receiver will reply with RST * and close the connection. ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-26 3:33 ` [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK Pablo Neira Ayuso @ 2011-02-26 6:15 ` Changli Gao 2011-02-26 18:30 ` Jozsef Kadlecsik 1 sibling, 0 replies; 12+ messages in thread From: Changli Gao @ 2011-02-26 6:15 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel, kaber, kadlec On Sat, Feb 26, 2011 at 11:33 AM, Pablo Neira Ayuso <pablo@netfilter.org> wrote: > net/netfilter/nf_conntrack_proto_tcp.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c > index 3fb2b73..be0b84d 100644 > --- a/net/netfilter/nf_conntrack_proto_tcp.c > +++ b/net/netfilter/nf_conntrack_proto_tcp.c > @@ -142,12 +142,12 @@ static const u8 tcp_conntracks[2][6][TCP_CONNTRACK_MAX] = { > { > /* ORIGINAL */ > /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ > -/*syn*/ { sSS, sSS, sIG, sIG, sIG, sIG, sIG, sSS, sSS, sS2 }, > +/*syn*/ { sSS, sSS, sSR, sIG, sIG, sIG, sIG, sSS, sSS, sS2 }, > /* > * sNO -> sSS Initialize a new connection > * sSS -> sSS Retransmitted SYN > * sS2 -> sS2 Late retransmitted SYN > - * sSR -> sIG > + * sSR -> sSR Retransmitted SYN, SYN/ACK got lost? > * sES -> sIG Error: SYNs in window outside the SYN_SENT state > * are errors. Receiver will reply with RST > * and close the connection. > I think you should change [reply][eSR][syn_ack] to ignore. Then the following code will be executed, and in fact, the following code is for you scenario according to the comments. 878 case TCP_CONNTRACK_IGNORE: 879 /* Ignored packets: 880 * 881 * Our connection entry may be out of sync, so ignore 882 * packets which may signal the real connection between 883 * the client and the server. 884 * 885 * a) SYN in ORIGINAL 886 * b) SYN/ACK in REPLY 887 * c) ACK in reply direction after initial SYN in original. 888 * 889 * If the ignored packet is invalid, the receiver will send 890 * a RST we'll catch below. 891 */ 892 if (index == TCP_SYNACK_SET 893 && ct->proto.tcp.last_index == TCP_SYN_SET 894 && ct->proto.tcp.last_dir != dir 895 && ntohl(th->ack_seq) == ct->proto.tcp.last_end) { 896 /* b) This SYN/ACK acknowledges a SYN that we earlier 897 * ignored as invalid. This means that the client and 898 * the server are both in sync, while the firewall is 899 * not. We get in sync from the previously annotated 900 * values. 901 */ 902 old_state = TCP_CONNTRACK_SYN_SENT; 903 new_state = TCP_CONNTRACK_SYN_RECV; 904 ct->proto.tcp.seen[ct->proto.tcp.last_dir].td_end = 905 ct->proto.tcp.last_end; 906 ct->proto.tcp.seen[ct->proto.tcp.last_dir].td_maxend = 907 ct->proto.tcp.last_end; 908 ct->proto.tcp.seen[ct->proto.tcp.last_dir].td_maxwin = 909 ct->proto.tcp.last_win == 0 ? 910 1 : ct->proto.tcp.last_win; 911 ct->proto.tcp.seen[ct->proto.tcp.last_dir].td_scale = 912 ct->proto.tcp.last_wscale; 913 ct->proto.tcp.seen[ct->proto.tcp.last_dir].flags = 914 ct->proto.tcp.last_flags; 915 memset(&ct->proto.tcp.seen[dir], 0, 916 sizeof(struct ip_ct_tcp_state)); 917 break; 918 } -- Regards, Changli Gao(xiaosuo@gmail.com) -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-26 3:33 ` [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK Pablo Neira Ayuso 2011-02-26 6:15 ` Changli Gao @ 2011-02-26 18:30 ` Jozsef Kadlecsik 2011-02-26 20:11 ` Pablo Neira Ayuso 1 sibling, 1 reply; 12+ messages in thread From: Jozsef Kadlecsik @ 2011-02-26 18:30 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel, Patrick McHardy Hi Pablo, On Sat, 26 Feb 2011, Pablo Neira Ayuso wrote: > Consider the following scenario: > > client firewall server > | | | > | syn | syn | > |------------->|------------>| > | | | > | syn+ack | syn+ack | > | x<------|<------------| syn+ack got lost! > | | | > | syn | syn | > |------------->|------------>| > | | | > | syn+ack | syn+ack | > |<-------------|<------------| > | | | > > Note that the syn+ack is lost after we have seen it. Without this > patch, the TCP tracking ignores the retransmitted SYN without > checking if the sequence number is in the window. Maybe I'm just blind, but I don't see what you try to achieve. Actually, with your patch you switch off the path which would handle the case when conntrack holds an outdated entry from whatever reason. Present code, cases: a. conntrack in sync, SYN/ACK lost and client resends the SYN - we ignore it but it does not do any harm b. conntrack entry is outdated, new SYN received - we ignore it but save the initialization data from it - when the reply SYN/ACK receives and it matches the saved data, we pick up the new connection With your patch, cases: a. conntrack in sync, SYN/ACK lost and client resends the SYN - SYN is out of window and will be marked as invalid b. conntrack entry is outdated, new SYN received - SYN is out of window and will be marked as invalid What do I miss? > This patch also helps a lot to conntrackd in stress scenarios > (assumming a client that generates lots of small TCP connections). > During the failover, consider that the new primary has injected > one outdated flow in SYN_RECV state (this is likely to happen if > the conntrack event rate is high because the backup will be a bit > delayed from the primary). With the current code, if the client > starts a new fresh connection that matches the tuple, the SYN > packet will be ignored without updating the state tracking, and > the SYN+ACK in reply will blocked as it will not pass checkings > III or IV (since all state tracking in the original direction is > not initialized because of the SYN packet was ignored). The last_dir entries are updated from the ignored SYN packet and thus the SYN+ACK won't be blocked - if the packet is a true answer to the SYN. > Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > --- > net/netfilter/nf_conntrack_proto_tcp.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c > index 3fb2b73..be0b84d 100644 > --- a/net/netfilter/nf_conntrack_proto_tcp.c > +++ b/net/netfilter/nf_conntrack_proto_tcp.c > @@ -142,12 +142,12 @@ static const u8 tcp_conntracks[2][6][TCP_CONNTRACK_MAX] = { > { > /* ORIGINAL */ > /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ > -/*syn*/ { sSS, sSS, sIG, sIG, sIG, sIG, sIG, sSS, sSS, sS2 }, > +/*syn*/ { sSS, sSS, sSR, sIG, sIG, sIG, sIG, sSS, sSS, sS2 }, > /* > * sNO -> sSS Initialize a new connection > * sSS -> sSS Retransmitted SYN > * sS2 -> sS2 Late retransmitted SYN > - * sSR -> sIG > + * sSR -> sSR Retransmitted SYN, SYN/ACK got lost? > * sES -> sIG Error: SYNs in window outside the SYN_SENT state > * are errors. Receiver will reply with RST > * and close the connection. > Bes regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-26 18:30 ` Jozsef Kadlecsik @ 2011-02-26 20:11 ` Pablo Neira Ayuso 2011-02-26 21:45 ` Jozsef Kadlecsik 0 siblings, 1 reply; 12+ messages in thread From: Pablo Neira Ayuso @ 2011-02-26 20:11 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel, Patrick McHardy [-- Attachment #1: Type: text/plain, Size: 2503 bytes --] Hi Jozsef, On 26/02/11 19:30, Jozsef Kadlecsik wrote: > Maybe I'm just blind, but I don't see what you try to achieve. Actually, > with your patch you switch off the path which would handle the case when > conntrack holds an outdated entry from whatever reason. > > Present code, cases: > > a. conntrack in sync, SYN/ACK lost and client resends the SYN > - we ignore it but it does not do any harm > b. conntrack entry is outdated, new SYN received > - we ignore it but save the initialization data from it > - when the reply SYN/ACK receives and it matches the saved data, > we pick up the new connection This is what it seems at first sight, but this is not what it happens if the conntrack is in SYN_RECV state. Initially, the SYN packet hits first item of 'b', thus we save data from it. But the SYN/ACK packet is considered a retransmission given that we're in SYN_RECV state. Therefore, we never hit the second item of 'b'. > With your patch, cases: > > a. conntrack in sync, SYN/ACK lost and client resends the SYN > - SYN is out of window and will be marked as invalid > b. conntrack entry is outdated, new SYN received > - SYN is out of window and will be marked as invalid Damn, you're right, this patch is broken, that SYN packet won't pass boundary III, IIRC. > What do I miss? That we're in SYN_RECV state :-). >> This patch also helps a lot to conntrackd in stress scenarios >> (assumming a client that generates lots of small TCP connections). >> During the failover, consider that the new primary has injected >> one outdated flow in SYN_RECV state (this is likely to happen if >> the conntrack event rate is high because the backup will be a bit >> delayed from the primary). With the current code, if the client >> starts a new fresh connection that matches the tuple, the SYN >> packet will be ignored without updating the state tracking, and >> the SYN+ACK in reply will blocked as it will not pass checkings >> III or IV (since all state tracking in the original direction is >> not initialized because of the SYN packet was ignored). > > The last_dir entries are updated from the ignored SYN packet and thus the > SYN+ACK won't be blocked - if the packet is a true answer to the SYN. This is what it happens with other states, but not for SYN_RECV. I have reworked the patch and the description. With this, the SYN_RECV state if we're out of sync receives the same handling. I have test it here, it works fine. Let me know if you're OK with it. [-- Attachment #2: tcp-syn-ack-recovery.patch --] [-- Type: text/x-patch, Size: 3240 bytes --] netfilter: nf_ct_tcp: fix out of sync scenario while in SYN_RECV From: Pablo Neira Ayuso <pablo@netfilter.org> This patch fixes the out of sync scenarios while in SYN_RECV state. Quoting Jozsef, what it happens if we are out of sync if the following: > b. conntrack entry is outdated, new SYN received > - (b1) we ignore it but save the initialization data from it > - (b2) when the reply SYN/ACK receives and it matches the saved data, > we pick up the new connection This is what it should happen if we are in SYN_RECV state. Initially, the SYN packet hits b1, thus we save data from it. But the SYN/ACK packet is considered a retransmission given that we're in SYN_RECV state. Therefore, we never hit b2 and we don't get in sync. To fix this, we check if we were already in SYN_RECV and the previous packet was a SYN. In that case, we enter the ignore case that get us in sync. This patch helps a lot to conntrackd in stress scenarios (assumming a client that generates lots of small TCP connections). During the failover, consider that the new primary has injected one outdated flow in SYN_RECV state (this is likely to happen if the conntrack event rate is high because the backup will be a bit delayed from the primary). With the current code, if the client starts a new fresh connection that matches the tuple, the SYN packet will be ignored without updating the state tracking, and the SYN+ACK in reply will blocked as it will not pass checkings III or IV (since all state tracking in the original direction is not initialized because of the SYN packet was ignored and the ignore case that get us in sync is not applied). Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- net/netfilter/nf_conntrack_proto_tcp.c | 15 +++++++++++++++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 3fb2b73..baa9450 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -842,6 +842,20 @@ static int tcp_packet(struct nf_conn *ct, tuple = &ct->tuplehash[dir].tuple; switch (new_state) { + case TCP_CONNTRACK_SYN_RECV: + /* Special case: we were in SYN_RECV state but we are out of + * sync. We have previously seen a SYN packet in the original + * direction which has been ignored, now we see the SYN/ACK + * packet but the TCP state transition considers that this is + * a retransmission. However, this is not. Jump to the code + * that get us in sync. */ + if (old_state == TCP_CONNTRACK_SYN_RECV && + index == TCP_SYNACK_SET && + ct->proto.tcp.last_index == TCP_SYN_SET && + ct->proto.tcp.last_dir != dir && + ntohl(th->ack_seq) == ct->proto.tcp.last_end) + goto syn_recv_out_of_sync; + break; case TCP_CONNTRACK_SYN_SENT: if (old_state < TCP_CONNTRACK_TIME_WAIT) break; @@ -899,6 +913,7 @@ static int tcp_packet(struct nf_conn *ct, * not. We get in sync from the previously annotated * values. */ +syn_recv_out_of_sync: old_state = TCP_CONNTRACK_SYN_SENT; new_state = TCP_CONNTRACK_SYN_RECV; ct->proto.tcp.seen[ct->proto.tcp.last_dir].td_end = ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-26 20:11 ` Pablo Neira Ayuso @ 2011-02-26 21:45 ` Jozsef Kadlecsik 2011-02-26 23:42 ` Pablo Neira Ayuso 0 siblings, 1 reply; 12+ messages in thread From: Jozsef Kadlecsik @ 2011-02-26 21:45 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel, Patrick McHardy, Changli Gao Hi Pablo, On Sat, 26 Feb 2011, Pablo Neira Ayuso wrote: > On 26/02/11 19:30, Jozsef Kadlecsik wrote: > > Maybe I'm just blind, but I don't see what you try to achieve. Actually, > > with your patch you switch off the path which would handle the case when > > conntrack holds an outdated entry from whatever reason. > > > > Present code, cases: > > > > a. conntrack in sync, SYN/ACK lost and client resends the SYN > > - we ignore it but it does not do any harm > > b. conntrack entry is outdated, new SYN received > > - we ignore it but save the initialization data from it > > - when the reply SYN/ACK receives and it matches the saved data, > > we pick up the new connection > > This is what it seems at first sight, but this is not what it happens if > the conntrack is in SYN_RECV state. > > Initially, the SYN packet hits first item of 'b', thus we save data from > it. But the SYN/ACK packet is considered a retransmission given that > we're in SYN_RECV state. Therefore, we never hit the second item of 'b'. Yes, you're right! We don't hit 'b' in the SYN_RECV state. > > With your patch, cases: > > > > a. conntrack in sync, SYN/ACK lost and client resends the SYN > > - SYN is out of window and will be marked as invalid > > b. conntrack entry is outdated, new SYN received > > - SYN is out of window and will be marked as invalid > > Damn, you're right, this patch is broken, that SYN packet won't pass > boundary III, IIRC. > > > What do I miss? > > That we're in SYN_RECV state :-). > > >> This patch also helps a lot to conntrackd in stress scenarios > >> (assumming a client that generates lots of small TCP connections). > >> During the failover, consider that the new primary has injected > >> one outdated flow in SYN_RECV state (this is likely to happen if > >> the conntrack event rate is high because the backup will be a bit > >> delayed from the primary). With the current code, if the client > >> starts a new fresh connection that matches the tuple, the SYN > >> packet will be ignored without updating the state tracking, and > >> the SYN+ACK in reply will blocked as it will not pass checkings > >> III or IV (since all state tracking in the original direction is > >> not initialized because of the SYN packet was ignored). > > > > The last_dir entries are updated from the ignored SYN packet and thus the > > SYN+ACK won't be blocked - if the packet is a true answer to the SYN. > > This is what it happens with other states, but not for SYN_RECV. > > I have reworked the patch and the description. With this, the SYN_RECV > state if we're out of sync receives the same handling. > > I have test it here, it works fine. Let me know if you're OK with it. The patch looks OK but I think Changli Gao is also right and it'd be simpler to set the [reply][synack][SR] state to sIG. What do you think? Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-26 21:45 ` Jozsef Kadlecsik @ 2011-02-26 23:42 ` Pablo Neira Ayuso 2011-02-27 0:00 ` Pablo Neira Ayuso 0 siblings, 1 reply; 12+ messages in thread From: Pablo Neira Ayuso @ 2011-02-26 23:42 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel, Patrick McHardy, Changli Gao [-- Attachment #1: Type: text/plain, Size: 485 bytes --] On 26/02/11 22:45, Jozsef Kadlecsik wrote: > On Sat, 26 Feb 2011, Pablo Neira Ayuso wrote: >> I have test it here, it works fine. Let me know if you're OK with it. > > The patch looks OK but I think Changli Gao is also right and it'd be > simpler to set the [reply][synack][SR] state to sIG. What do you think? I read his email before leaving and after I made the new patch. Indeed, his idea is simpler, here's a new patch. I tested it here, it works fine. Patrick, please apply! [-- Attachment #2: tcp-changli.patch --] [-- Type: text/x-patch, Size: 818 bytes --] Index: linux-2.6.37/net/netfilter/nf_conntrack_proto_tcp.c =================================================================== --- linux-2.6.37.orig/net/netfilter/nf_conntrack_proto_tcp.c 2011-02-26 20:14:44.000000000 +0000 +++ linux-2.6.37/net/netfilter/nf_conntrack_proto_tcp.c 2011-02-26 20:15:03.000000000 +0000 @@ -227,11 +227,11 @@ * sCL -> sIV */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ -/*synack*/ { sIV, sSR, sSR, sIG, sIG, sIG, sIG, sIG, sIG, sSR }, +/*synack*/ { sIV, sSR, sIG, sIG, sIG, sIG, sIG, sIG, sIG, sSR }, /* * sSS -> sSR Standard open. * sS2 -> sSR Simultaneous open - * sSR -> sSR Retransmitted SYN/ACK. + * sSR -> sIG Retransmitted SYN/ACK, ignore it. * sES -> sIG Late retransmitted SYN/ACK? * sFW -> sIG Might be SYN/ACK answering ignored SYN * sCW -> sIG ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-26 23:42 ` Pablo Neira Ayuso @ 2011-02-27 0:00 ` Pablo Neira Ayuso 2011-02-27 2:08 ` Igor 'Lo' (И.L.) 2011-02-27 15:22 ` Patrick McHardy 0 siblings, 2 replies; 12+ messages in thread From: Pablo Neira Ayuso @ 2011-02-27 0:00 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: netfilter-devel, Patrick McHardy, Changli Gao [-- Attachment #1: Type: text/plain, Size: 611 bytes --] On 27/02/11 00:42, Pablo Neira Ayuso wrote: > On 26/02/11 22:45, Jozsef Kadlecsik wrote: >> On Sat, 26 Feb 2011, Pablo Neira Ayuso wrote: >>> I have test it here, it works fine. Let me know if you're OK with it. >> >> The patch looks OK but I think Changli Gao is also right and it'd be >> simpler to set the [reply][synack][SR] state to sIG. What do you think? > > I read his email before leaving and after I made the new patch. > > Indeed, his idea is simpler, here's a new patch. I tested it here, it > works fine. > > Patrick, please apply! Hm, I forgot to include the description. New patch attached. [-- Attachment #2: tcp-changli.patch --] [-- Type: text/x-patch, Size: 2794 bytes --] netfilter: nf_ct_tcp: fix out of sync scenario while in SYN_RECV From: Pablo Neira Ayuso <pablo@netfilter.org> This patch fixes the out of sync scenarios while in SYN_RECV state. Quoting Jozsef, what it happens if we are out of sync if the following: > b. conntrack entry is outdated, new SYN received > - (b1) we ignore it but save the initialization data from it > - (b2) when the reply SYN/ACK receives and it matches the saved data, > we pick up the new connection This is what it should happen if we are in SYN_RECV state. Initially, the SYN packet hits b1, thus we save data from it. But the SYN/ACK packet is considered a retransmission given that we're in SYN_RECV state. Therefore, we never hit b2 and we don't get in sync. To fix this, we ignore SYN/ACK if we are in SYN_RECV. If the previous packet was a SYN, then we enter the ignore case that get us in sync. This patch helps a lot to conntrackd in stress scenarios (assumming a client that generates lots of small TCP connections). During the failover, consider that the new primary has injected one outdated flow in SYN_RECV state (this is likely to happen if the conntrack event rate is high because the backup will be a bit delayed from the primary). With the current code, if the client starts a new fresh connection that matches the tuple, the SYN packet will be ignored without updating the state tracking, and the SYN+ACK in reply will blocked as it will not pass checkings III or IV (since all state tracking in the original direction is not initialized because of the SYN packet was ignored and the ignore case that get us in sync is not applied). I posted a couple of patches before this one. Changli Gao spotted a simpler way to fix this problem. This patch implements his idea. Cc: Changli Gao <xiaosuo@gmail.com> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- net/netfilter/nf_conntrack_proto_tcp.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 3fb2b73..19bbab7 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -227,11 +227,11 @@ static const u8 tcp_conntracks[2][6][TCP_CONNTRACK_MAX] = { * sCL -> sIV */ /* sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sS2 */ -/*synack*/ { sIV, sSR, sSR, sIG, sIG, sIG, sIG, sIG, sIG, sSR }, +/*synack*/ { sIV, sSR, sIG, sIG, sIG, sIG, sIG, sIG, sIG, sSR }, /* * sSS -> sSR Standard open. * sS2 -> sSR Simultaneous open - * sSR -> sSR Retransmitted SYN/ACK. + * sSR -> sIG Retransmitted SYN/ACK, ignore it. * sES -> sIG Late retransmitted SYN/ACK? * sFW -> sIG Might be SYN/ACK answering ignored SYN * sCW -> sIG ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-27 0:00 ` Pablo Neira Ayuso @ 2011-02-27 2:08 ` Igor 'Lo' (И.L.) 2011-02-27 15:22 ` Patrick McHardy 1 sibling, 0 replies; 12+ messages in thread From: Igor 'Lo' (И.L.) @ 2011-02-27 2:08 UTC (permalink / raw) To: netfilter-devel It's a bit off-topic, but can it be recommended to hack into nf_ct_tcp somehow changing it's structure to assign each connection unique id and pass it with each retransmission event detected to nfqueue - without setting too complicated skbuffs / iptables rules? Or this is considered as dirty hacking and should be implemented other way? Reasons to do this: currently, trying to set up a small NFQUEUE-based program that will modify data in TCP streams, that causes retransmissions due to changed size thus requires SEQ/ACK tuning (and as far as I suspect, something deeper than a filter table affects the idea, but it's the separate question) and the network-caused retransmissions are a bit of pain to track. On 27 February 2011 02:00, Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > On 27/02/11 00:42, Pablo Neira Ayuso wrote: > > On 26/02/11 22:45, Jozsef Kadlecsik wrote: > >> On Sat, 26 Feb 2011, Pablo Neira Ayuso wrote: > >>> I have test it here, it works fine. Let me know if you're OK with it. > >> > >> The patch looks OK but I think Changli Gao is also right and it'd be > >> simpler to set the [reply][synack][SR] state to sIG. What do you think? > > > > I read his email before leaving and after I made the new patch. > > > > Indeed, his idea is simpler, here's a new patch. I tested it here, it > > works fine. > > > > Patrick, please apply! > > Hm, I forgot to include the description. New patch attached. -- С уважением, Игорь -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-27 0:00 ` Pablo Neira Ayuso 2011-02-27 2:08 ` Igor 'Lo' (И.L.) @ 2011-02-27 15:22 ` Patrick McHardy 2011-02-27 17:28 ` Jozsef Kadlecsik 1 sibling, 1 reply; 12+ messages in thread From: Patrick McHardy @ 2011-02-27 15:22 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: Jozsef Kadlecsik, netfilter-devel, Changli Gao On 27.02.2011 01:00, Pablo Neira Ayuso wrote: > On 27/02/11 00:42, Pablo Neira Ayuso wrote: >> On 26/02/11 22:45, Jozsef Kadlecsik wrote: >>> On Sat, 26 Feb 2011, Pablo Neira Ayuso wrote: >>>> I have test it here, it works fine. Let me know if you're OK with it. >>> >>> The patch looks OK but I think Changli Gao is also right and it'd be >>> simpler to set the [reply][synack][SR] state to sIG. What do you think? >> >> I read his email before leaving and after I made the new patch. >> >> Indeed, his idea is simpler, here's a new patch. I tested it here, it >> works fine. >> >> Patrick, please apply! > > Hm, I forgot to include the description. New patch attached. Thanks. I'll apply this tommorrow, assuming Jozsef is fine with this version. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-27 15:22 ` Patrick McHardy @ 2011-02-27 17:28 ` Jozsef Kadlecsik 2011-02-28 17:04 ` Patrick McHardy 0 siblings, 1 reply; 12+ messages in thread From: Jozsef Kadlecsik @ 2011-02-27 17:28 UTC (permalink / raw) To: Patrick McHardy; +Cc: Pablo Neira Ayuso, netfilter-devel, Changli Gao On Sun, 27 Feb 2011, Patrick McHardy wrote: > On 27.02.2011 01:00, Pablo Neira Ayuso wrote: > > On 27/02/11 00:42, Pablo Neira Ayuso wrote: > >> On 26/02/11 22:45, Jozsef Kadlecsik wrote: > >>> On Sat, 26 Feb 2011, Pablo Neira Ayuso wrote: > >>>> I have test it here, it works fine. Let me know if you're OK with it. > >>> > >>> The patch looks OK but I think Changli Gao is also right and it'd be > >>> simpler to set the [reply][synack][SR] state to sIG. What do you think? > >> > >> I read his email before leaving and after I made the new patch. > >> > >> Indeed, his idea is simpler, here's a new patch. I tested it here, it > >> works fine. > >> > >> Patrick, please apply! > > > > Hm, I forgot to include the description. New patch attached. > > Thanks. I'll apply this tommorrow, assuming Jozsef is fine with > this version. Yes, of course Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK 2011-02-27 17:28 ` Jozsef Kadlecsik @ 2011-02-28 17:04 ` Patrick McHardy 0 siblings, 0 replies; 12+ messages in thread From: Patrick McHardy @ 2011-02-28 17:04 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Pablo Neira Ayuso, netfilter-devel, Changli Gao Am 27.02.2011 18:28, schrieb Jozsef Kadlecsik: >>> > > Hm, I forgot to include the description. New patch attached. >> > >> > Thanks. I'll apply this tommorrow, assuming Jozsef is fine with >> > this version. > Yes, of course > > Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Applied, thanks everyone. ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2011-02-28 17:04 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-02-26 3:32 [PATCH] improvement for TCP connection tracking Pablo Neira Ayuso 2011-02-26 3:33 ` [PATCH] netfilter: nf_ct_tcp: better handling for SYN retransmissions after SYN+ACK Pablo Neira Ayuso 2011-02-26 6:15 ` Changli Gao 2011-02-26 18:30 ` Jozsef Kadlecsik 2011-02-26 20:11 ` Pablo Neira Ayuso 2011-02-26 21:45 ` Jozsef Kadlecsik 2011-02-26 23:42 ` Pablo Neira Ayuso 2011-02-27 0:00 ` Pablo Neira Ayuso 2011-02-27 2:08 ` Igor 'Lo' (И.L.) 2011-02-27 15:22 ` Patrick McHardy 2011-02-27 17:28 ` Jozsef Kadlecsik 2011-02-28 17:04 ` Patrick McHardy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).