* [PATCH] Fix NAT TCP sequence adjustment
@ 2005-04-02 20:24 Phil Oester
2005-04-03 12:26 ` Milos Wimmer
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Phil Oester @ 2005-04-02 20:24 UTC (permalink / raw)
To: netfilter-devel
[-- Attachment #1: Type: text/plain, Size: 1631 bytes --]
In adjust_tcp_sequence, we track the sequence number of any adjustments
in the correction_pos variable. The seq stored is based upon the left
side of the window of the NAT box -- not of the original sender.
Later, in ip_nat_seq_adjust, we compare the correction_pos variable to
the seq of the original sender to determine whether this is a new packet
or a retransmission (i.e. should we apply offset_before or offset_after).
So we are comparing the post-adjustment seq to a pre-adjustment seq.
This works ok under normal circumstances, like a single FTP transfer, since
the before/after comparison luckily works out. But in situations like
an FTP 'mget', it can fail depending on how many files are transferred.
For example, assume the strlen of the PORT command from the client is 26, and
it gets adjusted to 27 by the NAT box (sizediff=1). Under this scenario,
the mget will fail to transfer file #26 since at that point the before/after
comparison in ip_nat_seq_adjust flips the other way, and the adjustment
becomes offset_before instead of the appropriate offset_after. Obviously
for larger sizediffs, the failure comes earlier.
The solution is to store the client's seq number in correction_pos instead
of the seq of the NAT box. The below makes this change, as well as cleans up a
number of broken DEBUGP statements and a couple of u32->u16 casts.
As noted, without this patch an mget fails at (sizediff * strlen(PORT...).
With this patch, I have successfully transferred 500 files under both
scenarios: sizediff > 0 and < 0.
This closes bugzilla #308.
Phil
Signed-off-by: Phil Oester <kernel@linuxace.com>
[-- Attachment #2: patch-ftp2 --]
[-- Type: text/plain, Size: 4201 bytes --]
diff -ru linux-orig/net/ipv4/netfilter/ip_conntrack_ftp.c linux-new/net/ipv4/netfilter/ip_conntrack_ftp.c
--- linux-orig/net/ipv4/netfilter/ip_conntrack_ftp.c 2005-03-02 02:38:38.000000000 -0500
+++ linux-new/net/ipv4/netfilter/ip_conntrack_ftp.c 2005-04-01 21:11:12.389774176 -0500
@@ -252,7 +252,7 @@
}
/* Look up to see if we're just after a \n. */
-static int find_nl_seq(u16 seq, const struct ip_ct_ftp_master *info, int dir)
+static int find_nl_seq(u32 seq, const struct ip_ct_ftp_master *info, int dir)
{
unsigned int i;
@@ -263,7 +263,7 @@
}
/* We don't update if it's older than what we have. */
-static void update_nl_seq(u16 nl_seq, struct ip_ct_ftp_master *info, int dir)
+static void update_nl_seq(u32 nl_seq, struct ip_ct_ftp_master *info, int dir)
{
unsigned int i, oldest = NUM_SEQ_TO_REMEMBER;
@@ -330,9 +330,10 @@
/* Look up to see if we're just after a \n. */
if (!find_nl_seq(ntohl(th->seq), ct_ftp_info, dir)) {
/* Now if this ends in \n, update ftp info. */
- DEBUGP("ip_conntrack_ftp_help: wrong seq pos %s(%u) or %s(%u)\n",
- ct_ftp_info->seq_aft_nl[0][dir]
- old_seq_aft_nl_set ? "":"(UNSET) ", old_seq_aft_nl);
+ DEBUGP("ip_conntrack_ftp_help: wrong seq pos %s%u vs %u\n",
+ ct_ftp_info->seq_aft_nl[0][dir] ? "":"(UNSET)",
+ ct_ftp_info->seq_aft_nl[0][dir],
+ ntohl(th->seq));
ret = NF_ACCEPT;
goto out_update_nl;
}
@@ -373,7 +374,7 @@
goto out_update_nl;
}
- DEBUGP("conntrack_ftp: match `%s' (%u bytes at %u)\n",
+ DEBUGP("conntrack_ftp: match `%.26s' (%u bytes at %u)\n",
fb_ptr + matchoff, matchlen, ntohl(th->seq) + matchoff);
/* Allocate expectation which will be inserted */
@@ -440,7 +441,7 @@
/* Now if this ends in \n, update ftp info. Seq may have been
* adjusted by NAT code. */
if (ends_in_nl)
- update_nl_seq(seq, ct_ftp_info,dir);
+ update_nl_seq(seq, ct_ftp_info, dir);
out:
UNLOCK_BH(&ip_ftp_lock);
return ret;
diff -ru linux-orig/net/ipv4/netfilter/ip_nat_helper.c linux-new/net/ipv4/netfilter/ip_nat_helper.c
--- linux-orig/net/ipv4/netfilter/ip_nat_helper.c 2005-03-02 02:37:49.000000000 -0500
+++ linux-new/net/ipv4/netfilter/ip_nat_helper.c 2005-04-02 12:45:42.787580776 -0500
@@ -57,17 +57,16 @@
enum ip_conntrack_info ctinfo)
{
int dir;
- struct ip_nat_seq *this_way, *other_way;
+ struct ip_nat_seq *this_way;
+ u32 adjustedseq;
- DEBUGP("ip_nat_resize_packet: old_size = %u, new_size = %u\n",
- (*skb)->len, new_size);
+ DEBUGP("adjust_tcp_sequence: sizediff = %i\n", sizediff);
dir = CTINFO2DIR(ctinfo);
this_way = &ct->nat.info.seq[dir];
- other_way = &ct->nat.info.seq[!dir];
- DEBUGP("ip_nat_resize_packet: Seq_offset before: ");
+ DEBUGP("adjust_tcp_sequence: Seq_offset before: ");
DUMP_OFFSET(this_way);
LOCK_BH(&ip_nat_seqofs_lock);
@@ -76,15 +75,16 @@
* correction, record it: we don't handle more than one
* adjustment in the window, but do deal with common case of a
* retransmit */
+ adjustedseq = seq - this_way->offset_after;
if (this_way->offset_before == this_way->offset_after
- || before(this_way->correction_pos, seq)) {
- this_way->correction_pos = seq;
+ || before(this_way->correction_pos, adjustedseq)) {
+ this_way->correction_pos = adjustedseq;
this_way->offset_before = this_way->offset_after;
this_way->offset_after += sizediff;
}
UNLOCK_BH(&ip_nat_seqofs_lock);
- DEBUGP("ip_nat_resize_packet: Seq_offset after: ");
+ DEBUGP("adjust_tcp_sequence: Seq_offset after: ");
DUMP_OFFSET(this_way);
}
@@ -116,7 +116,7 @@
skb->len);
skb_put(skb, rep_len - match_len);
} else {
- DEBUGP("ip_nat_mangle_packet: Shrinking packet from "
+ DEBUGP("ip_nat_mangle_packet: Shrinking packet by "
"%u from %u bytes\n", match_len - rep_len,
skb->len);
__skb_trim(skb, skb->len + rep_len - match_len);
@@ -291,7 +291,7 @@
- natseq->offset_before;
new_end_seq = htonl(new_end_seq);
- DEBUGP("sack_adjust: start_seq: %d->%d, end_seq: %d->%d\n",
+ DEBUGP("sack_adjust: start_seq: %u->%u, end_seq: %u->%u\n",
ntohl(sack->start_seq), new_start_seq,
ntohl(sack->end_seq), new_end_seq);
^ permalink raw reply [flat|nested] 32+ messages in thread* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-02 20:24 [PATCH] Fix NAT TCP sequence adjustment Phil Oester @ 2005-04-03 12:26 ` Milos Wimmer 2005-04-03 19:26 ` Patrick McHardy 2005-05-31 9:17 ` Rusty Russell 2 siblings, 0 replies; 32+ messages in thread From: Milos Wimmer @ 2005-04-03 12:26 UTC (permalink / raw) To: Phil Oester; +Cc: netfilter-devel On Sat, 2 Apr 2005, Phil Oester wrote: > As noted, without this patch an mget fails at (sizediff * strlen(PORT...). > With this patch, I have successfully transferred 500 files under both > scenarios: sizediff > 0 and < 0. Yes, I tried it just now and I can confirm it works great. Thank you for your work, Milos Wimmer ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-02 20:24 [PATCH] Fix NAT TCP sequence adjustment Phil Oester 2005-04-03 12:26 ` Milos Wimmer @ 2005-04-03 19:26 ` Patrick McHardy 2005-04-03 23:53 ` Phil Oester 2005-05-31 9:17 ` Rusty Russell 2 siblings, 1 reply; 32+ messages in thread From: Patrick McHardy @ 2005-04-03 19:26 UTC (permalink / raw) To: Phil Oester; +Cc: netfilter-devel Phil Oester wrote: > In adjust_tcp_sequence, we track the sequence number of any adjustments > in the correction_pos variable. The seq stored is based upon the left > side of the window of the NAT box -- not of the original sender. > > Later, in ip_nat_seq_adjust, we compare the correction_pos variable to > the seq of the original sender to determine whether this is a new packet > or a retransmission (i.e. should we apply offset_before or offset_after). > So we are comparing the post-adjustment seq to a pre-adjustment seq. > > The solution is to store the client's seq number in correction_pos instead > of the seq of the NAT box. The below makes this change, as well as cleans up a > number of broken DEBUGP statements and a couple of u32->u16 casts. Great work Phil. One question though: You want to store the pre-adjusted sequence number. What if the packet is a retransmission and offset_before has been applied? If I understand correctly, depending on the delta between offset_after and offset_before, this might cause the before(...) test to give a false positive and screw up the state. In case I'm wrong, could you a patch containing only the necessary changes? I think the final fix for this problem should go in -stable, ideally it would only be a single line "seq -= this_way->offset_after". Regards Patrick > + adjustedseq = seq - this_way->offset_after; > if (this_way->offset_before == this_way->offset_after > - || before(this_way->correction_pos, seq)) { > - this_way->correction_pos = seq; > + || before(this_way->correction_pos, adjustedseq)) { > + this_way->correction_pos = adjustedseq; > this_way->offset_before = this_way->offset_after; > this_way->offset_after += sizediff; > } > UNLOCK_BH(&ip_nat_seqofs_lock); > > - DEBUGP("ip_nat_resize_packet: Seq_offset after: "); > + DEBUGP("adjust_tcp_sequence: Seq_offset after: "); > DUMP_OFFSET(this_way); > } ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-03 19:26 ` Patrick McHardy @ 2005-04-03 23:53 ` Phil Oester 2005-04-04 4:40 ` Phil Oester 0 siblings, 1 reply; 32+ messages in thread From: Phil Oester @ 2005-04-03 23:53 UTC (permalink / raw) To: Patrick McHardy; +Cc: netfilter-devel On Sun, Apr 03, 2005 at 09:26:22PM +0200, Patrick McHardy wrote: > Great work Phil. One question though: You want to store the pre-adjusted > sequence number. What if the packet is a retransmission and > offset_before has been applied? If I understand correctly, depending > on the delta between offset_after and offset_before, this might cause > the before(...) test to give a false positive and screw up the state. > In case I'm wrong, could you a patch containing only the necessary > changes? I think the final fix for this problem should go in -stable, > ideally it would only be a single line "seq -= this_way->offset_after". I think you are correct, retransmission would falsely trigger the before. Testing a new patch...will likely submit tomorrow. Phil ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-03 23:53 ` Phil Oester @ 2005-04-04 4:40 ` Phil Oester 2005-04-04 8:27 ` Patrick McHardy 0 siblings, 1 reply; 32+ messages in thread From: Phil Oester @ 2005-04-04 4:40 UTC (permalink / raw) To: Patrick McHardy; +Cc: netfilter-devel [-- Attachment #1: Type: text/plain, Size: 1739 bytes --] On Sun, Apr 03, 2005 at 04:53:20PM -0700, Phil Oester wrote: > On Sun, Apr 03, 2005 at 09:26:22PM +0200, Patrick McHardy wrote: > > Great work Phil. One question though: You want to store the pre-adjusted > > sequence number. What if the packet is a retransmission and > > offset_before has been applied? If I understand correctly, depending > > on the delta between offset_after and offset_before, this might cause > > the before(...) test to give a false positive and screw up the state. > > In case I'm wrong, could you a patch containing only the necessary > > changes? I think the final fix for this problem should go in -stable, > > ideally it would only be a single line "seq -= this_way->offset_after". > > I think you are correct, retransmission would falsely trigger the before. > Testing a new patch...will likely submit tomorrow. If the retransmitted packet were the same packet as the last one which was mangled, the seq will be adjusted by offset_before, which at this point is the same adjustment as the first packet received via offset_after. So, seq == correction_pos for the retransmitted packet, and this is fairly trivial to handle. This is the only case where I can see before() triggering a false positive. As per your request, below is only the bare minimum patch for -stable, but the other patch to ip_conntrack_ftp for u16->u32 is likely another good candidate for -stable. On another note, It would be helpful if you published your tree somewhere so patches could be based upon it...then I would not have wasted time sending the u16->u32 patch which someone else submitted but which is not yet in mainline or -bk. Time to consider a -nf snapshot? Phil Signed-off-by: Phil Oester <kernel@linuxace.com> [-- Attachment #2: patch-ftpstable --] [-- Type: text/plain, Size: 1042 bytes --] diff -ru linux-orig/net/ipv4/netfilter/ip_nat_helper.c linux-stable/net/ipv4/netfilter/ip_nat_helper.c --- linux-orig/net/ipv4/netfilter/ip_nat_helper.c 2005-03-02 02:37:49.000000000 -0500 +++ linux-stable/net/ipv4/netfilter/ip_nat_helper.c 2005-04-04 00:22:28.929366600 -0400 @@ -76,11 +76,14 @@ * correction, record it: we don't handle more than one * adjustment in the window, but do deal with common case of a * retransmit */ - if (this_way->offset_before == this_way->offset_after - || before(this_way->correction_pos, seq)) { - this_way->correction_pos = seq; - this_way->offset_before = this_way->offset_after; - this_way->offset_after += sizediff; + if (seq != this_way->correction_pos) { + seq -= this_way->offset_after; + if (this_way->offset_before == this_way->offset_after + || before(this_way->correction_pos, seq)) { + this_way->correction_pos = seq; + this_way->offset_before = this_way->offset_after; + this_way->offset_after += sizediff; + } } UNLOCK_BH(&ip_nat_seqofs_lock); ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-04 4:40 ` Phil Oester @ 2005-04-04 8:27 ` Patrick McHardy 2005-04-04 20:47 ` Phil Oester 2005-04-27 0:44 ` Rusty Russell 0 siblings, 2 replies; 32+ messages in thread From: Patrick McHardy @ 2005-04-04 8:27 UTC (permalink / raw) To: Phil Oester; +Cc: Harald Welte, Rusty Russell, netfilter-devel Phil Oester wrote: > If the retransmitted packet were the same packet as the last one which was > mangled, the seq will be adjusted by offset_before, which at this point is > the same adjustment as the first packet received via offset_after. So, > seq == correction_pos for the retransmitted packet, and this is fairly trivial > to handle. This is the only case where I can see before() triggering a false > positive. What you are trying to do is to reconstruct the original sequence number, so lets consider the possible cases: 1. tcph->seq - offset_before <= correction_pos Since offset_before < offset_after, tcph->seq - offset_after is also before correction_pos and the original sequence number must have been tcph->seq - offset_before. In this case the calculated sequence number is wrong, but no false positive can trigger. 2. tcph->seq - offset_after > correction_pos Since offset_after > offset_before, tcph->seq - offset_before is also after correction_pos and the original sequence number must have been tcph->seq - offset_after. This case is also handled correctly, the position is updated. 3. tcph->seq - offset_before > this_way->correction_pos && tcph->seq - offset_after <= this_way->correction_pos In this case we can't determine whether this packet was a retransmission and what the original sequence number was. Assuming it was offset_after leaves the case where it really was offset_before handled incorrectly. In this case it was a retransmit and we don't want to update the state, using offset_after results in an incorrect sequence number that is before correction_pos, so this case is also handled correctly. > + if (seq != this_way->correction_pos) { > + seq -= this_way->offset_after; > + if (this_way->offset_before == this_way->offset_after > + || before(this_way->correction_pos, seq)) { > + this_way->correction_pos = seq; > + this_way->offset_before = this_way->offset_after; > + this_way->offset_after += sizediff; > + } > } So in conclusion it seems simply doing "seq -= this_way->offset_after" without the seq != this_way->correction_pos check handles all cases correctly. Please someone verify I didn't make a mistake. I think a comment explaining the cases would also be appropriate. > As per your request, below is only the bare minimum patch for -stable, > but the other patch to ip_conntrack_ftp for u16->u32 is likely another good > candidate for -stable. Please send an updated patch without the new check and with a comment. Regarding the u16 patch, I haven't heard of problems related to this bug, so its not a candidate yet. > On another note, It would be helpful if you published your tree somewhere so > patches could be based upon it...then I would not have wasted time sending > the u16->u32 patch which someone else submitted but which is not yet in > mainline or -bk. Time to consider a -nf snapshot? Exporting using bitkeeper is painless, for normal diffs I first need to figure out how to work with triggers. I'll try to do this some time soon. Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-04 8:27 ` Patrick McHardy @ 2005-04-04 20:47 ` Phil Oester 2005-04-05 7:32 ` Patrick McHardy 2005-04-06 4:48 ` Phil Oester 2005-04-27 0:44 ` Rusty Russell 1 sibling, 2 replies; 32+ messages in thread From: Phil Oester @ 2005-04-04 20:47 UTC (permalink / raw) To: Patrick McHardy; +Cc: Harald Welte, Rusty Russell, netfilter-devel On Mon, Apr 04, 2005 at 10:27:30AM +0200, Patrick McHardy wrote: > So in conclusion it seems simply doing "seq -= this_way->offset_after" > without the seq != this_way->correction_pos check handles all cases > correctly. Please someone verify I didn't make a mistake. I think a > comment explaining the cases would also be appropriate. I think you are only thinking in terms of positive sizediff, while this is not always the case. Perhaps easiest to explain by observed example. ip_nat_seq_adjust: seq 467631623 after co_pos 467631577 Adjusting sequence number from 467631623->467631622 ... ip_nat_mangle_packet: Shrinking packet by 1 from 66 bytes adjust_tcp_sequence: sizediff = -1 adjust_tcp_sequence: Seq_offset before: offset_before=0, offset_after=-1, correction_pos=467631577 adjust_tcp_sequence: Seq_offset after: offset_before=-1, offset_after=-2, correction_pos=467631623 Given the above, assume we now receive a retransmit of this packet. 467631623 will have offset_before added to it by ip_nat_seq_adjust, becoming 467631622. It will then have offset_after subtracted from it in adjust_tcp_sequence, making it 467631624, which will then cause a false positive on the before() test. If we change the example above to a positive sizediff: ip_nat_seq_adjust: seq 467631623 after co_pos 467631577 Adjusting sequence number from 467631623->467631624 ... ip_nat_mangle_packet: Extending packet by 1 from 66 bytes adjust_tcp_sequence: sizediff = 1 adjust_tcp_sequence: Seq_offset before: offset_before=0, offset_after=1, correction_pos=467631577 adjust_tcp_sequence: Seq_offset after: offset_before=1, offset_after=2, correction_pos=467631623 Assuming a retransmit in this example, 467631623 will have offset_before added to it by ip_nat_seq_adjust, becoming 467631624. It will then have offset_after subtracted from it in adjust_tcp_sequence, making it 467631622 which will correctly evaluate as being before correction_pos. So as it stands, only retransmits on positive sizediff are handled properly. But my original check to compensate for this was flawed, since seq != correction_pos when entering adjust_tcp_sequence - but it does seem clear we cannot blindly -= offset_after. I think this will handle all permutations, since it essentially reverses ip_nat_seq_adjust's adjustment before checking for retransmit. if (seq - this_way->offset_before != this_way->correction_pos) { seq -= this_way->offset_after; if (this_way->offset_before == this_way->offset_after || before(this_way->correction_pos, seq)) { this_way->correction_pos = seq; this_way->offset_before = this_way->offset_after; this_way->offset_after += sizediff; } } Ack? Phil ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-04 20:47 ` Phil Oester @ 2005-04-05 7:32 ` Patrick McHardy 2005-04-05 13:33 ` Patch lifetime " Amin Azez 2005-04-06 4:48 ` Phil Oester 1 sibling, 1 reply; 32+ messages in thread From: Patrick McHardy @ 2005-04-05 7:32 UTC (permalink / raw) To: Phil Oester; +Cc: Harald Welte, Rusty Russell, netfilter-devel Phil Oester wrote: > On Mon, Apr 04, 2005 at 10:27:30AM +0200, Patrick McHardy wrote: > >>So in conclusion it seems simply doing "seq -= this_way->offset_after" >>without the seq != this_way->correction_pos check handles all cases >>correctly. Please someone verify I didn't make a mistake. I think a >>comment explaining the cases would also be appropriate. > > > I think you are only thinking in terms of positive sizediff, while this > is not always the case. Perhaps easiest to explain by observed example. Damn, you're right. > Ack? Let me think about it again, I'll get back to you. Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Patch lifetime Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-05 7:32 ` Patrick McHardy @ 2005-04-05 13:33 ` Amin Azez 2005-04-10 20:49 ` Harald Welte 0 siblings, 1 reply; 32+ messages in thread From: Amin Azez @ 2005-04-05 13:33 UTC (permalink / raw) To: netfilter-devel I apologise for this dumb question, which really applies to most of the patches submitted here. I oberve that Phils' fine patch has undergone some peer review and is just about to be "accepted". What happens next? Does it hit pom-ng some short time later? When will it hit iptables cvs head and snapshots? When will it (will it ever) become part of the core kernel distribution? What I am really trying to find out is the best way to track such patches through their lifetime until I no longer need to. Amin ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Patch lifetime Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-05 13:33 ` Patch lifetime " Amin Azez @ 2005-04-10 20:49 ` Harald Welte 0 siblings, 0 replies; 32+ messages in thread From: Harald Welte @ 2005-04-10 20:49 UTC (permalink / raw) To: Amin Azez; +Cc: netfilter-devel [-- Attachment #1: Type: text/plain, Size: 1271 bytes --] On Tue, Apr 05, 2005 at 02:33:38PM +0100, Amin Azez wrote: > I apologise for this dumb question, which really applies to most of the > patches submitted here. > > I oberve that Phils' fine patch has undergone some peer review and is > just about to be "accepted". > > What happens next? Does it hit pom-ng some short time later? pom-ng is not for bugfixes, but for new matches/targets/helpers. > When will it hit iptables cvs head and snapshots? iptables (userspace) patches are usually applied immediately. > When will it (will it ever) become part of the core kernel distribution? after some more testing and synchronizin with the kernel release cycle > What I am really trying to find out is the best way to track such > patches through their lifetime until I no longer need to. http://patchwork.netfilter.org/ will help, once it's fully running (thanks jeremy) -- - Harald Welte <laforge@netfilter.org> http://netfilter.org/ ============================================================================ "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-04 20:47 ` Phil Oester 2005-04-05 7:32 ` Patrick McHardy @ 2005-04-06 4:48 ` Phil Oester 2005-04-18 1:42 ` Patrick McHardy 1 sibling, 1 reply; 32+ messages in thread From: Phil Oester @ 2005-04-06 4:48 UTC (permalink / raw) To: Patrick McHardy; +Cc: Harald Welte, Rusty Russell, netfilter-devel [-- Attachment #1: Type: text/plain, Size: 527 bytes --] On Mon, Apr 04, 2005 at 01:47:16PM -0700, Phil Oester wrote: > So as it stands, only retransmits on positive sizediff are handled > properly. But my original check to compensate for this was flawed, > since seq != correction_pos when entering adjust_tcp_sequence - but > it does seem clear we cannot blindly -= offset_after. I've become more convinced that this is the correct approach. Anticipating that you will too, below is an updated patch with comments added. Phil Signed-off-by: Phil Oester <kernel@linuxace.com> [-- Attachment #2: patch-ftpstable --] [-- Type: text/plain, Size: 1599 bytes --] diff -ru linux-orig/net/ipv4/netfilter/ip_nat_helper.c linux-stable/net/ipv4/netfilter/ip_nat_helper.c --- linux-orig/net/ipv4/netfilter/ip_nat_helper.c 2005-03-02 02:37:49.000000000 -0500 +++ linux-stable/net/ipv4/netfilter/ip_nat_helper.c 2005-04-06 00:42:48.736295496 -0400 @@ -72,15 +72,23 @@ LOCK_BH(&ip_nat_seqofs_lock); - /* SYN adjust. If it's uninitialized, or this is after last - * correction, record it: we don't handle more than one - * adjustment in the window, but do deal with common case of a - * retransmit */ - if (this_way->offset_before == this_way->offset_after - || before(this_way->correction_pos, seq)) { - this_way->correction_pos = seq; - this_way->offset_before = this_way->offset_after; - this_way->offset_after += sizediff; + /* SYN adjust. In the case of a retransmit, ip_nat_seq_adjust has + * added offset_before to client's seq, so subtract it to test + * for retransmit here */ + if (seq - this_way->offset_before != this_way->correction_pos) { + /* If it's uninitialized, or this is after last correction, + * record it. We don't handle more than one adjustment in + * the window */ + seq -= this_way->offset_after; + if (this_way->offset_before == this_way->offset_after + || before(this_way->correction_pos, seq)) { + /* Since ip_nat_seq_adjust has added + * offset_after, subtract it to store + * the client's seq in correction_pos */ + this_way->correction_pos = seq; + this_way->offset_before = this_way->offset_after; + this_way->offset_after += sizediff; + } } UNLOCK_BH(&ip_nat_seqofs_lock); ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-06 4:48 ` Phil Oester @ 2005-04-18 1:42 ` Patrick McHardy 2005-04-19 0:58 ` Phil Oester 0 siblings, 1 reply; 32+ messages in thread From: Patrick McHardy @ 2005-04-18 1:42 UTC (permalink / raw) To: Phil Oester; +Cc: Harald Welte, Rusty Russell, netfilter-devel Phil Oester wrote: > On Mon, Apr 04, 2005 at 01:47:16PM -0700, Phil Oester wrote: > >>So as it stands, only retransmits on positive sizediff are handled >>properly. But my original check to compensate for this was flawed, >>since seq != correction_pos when entering adjust_tcp_sequence - but >>it does seem clear we cannot blindly -= offset_after. > > > I've become more convinced that this is the correct approach. Anticipating > that you will too, below is an updated patch with comments added. I have a strong feeling that it is wrong. As you correctly pointed out, my previous attempt to argue about its correctness was based on the wrong assumption that offset_before < offset_after, while this is only one possible case. So let's try again. The sequence number is adjusted like this: if (after(ntohl(tcph->seq), this_way->correction_pos)) newseq = ntohl(tcph->seq) + this_way->offset_after; else newseq = ntohl(tcph->seq) + this_way->offset_before; and we need the original sequence number to determine whether this is a retransmission and, if not, to store it as new correction_pos: /* SYN adjust. If it's uninitialized, or this is after last * correction, record it: we don't handle more than one * adjustment in the window, but do deal with common case of a * retransmit */ if (this_way->offset_before == this_way->offset_after || before(this_way->correction_pos, seq)) { this_way->correction_pos = seq; this_way->offset_before = this_way->offset_after; this_way->offset_after += sizediff; } We have six cases to consider (2.3. is the interesting one): 1. offset_before < offset_after 1.1. tcph->seq - offset_before <= correction_pos Since offset_before < offset_after, tcph->seq - offset_after is also before correction_pos and the original sequence number must have been tcph->seq - offset_before. Using offset_after results in an incorrect sequence number, but the packet is still correctly identified as a retransmission. 1.2. tcph->seq - offset_after > correction_pos Since offset_before < offset_after, tcph->seq - offset_before is also after correction_pos and the original sequence number must have been tcph->seq - offset_after. Using offset_after is obviously correct. 1.3. tcph->seq - offset_before > correction_pos && tcph->seq - offset_after <= correction_pos In this case we can't determine whether the packet is a retransmission and what the original sequence number was. Using offset_after results in an incorrect sequence number when really offset_before was used, but the packet is still correctly detected as a retransmission. 2. offset_after < offset_before 2.1. tcph->seq - offset_after <= correction_pos Since offset_after < offset_before, tcph->seq - offset_before is also before correction_pos and the original sequence number must have been tcph->seq - offset_before. Using offset_after results in an incorrect sequence number, but the packet is still correctly identified as a retransmission. 2.2. tcph->seq - offset_before > correction_pos Since offset_after < offset_before, tcph->seq - offset_after is also after correction_pos and the original sequence number must have been tcph->seq - offset_after. Using offset_after is obviously correct. 2.3. tcph->seq - offset_after > correction_pos && tcph->seq - offset_before <= correction_pos In this case we can't determine whether the packet is a retransmission and what the original sequence number was. Using offset_after results in an incorrect sequence number when really offset_before was used. Unlike in 1.3., in this case the packet is (by precondition) _not_ correctly identified as a retransmission and causes a false positive. It is impossible to handle the last case without more information, since it is correct in some cases, in some it isn't. So it looks like we need to store this information in the skb. We don't have much choice, nfcache and about 29 bits in nfctinfo are available. nfctinfo could be split in two u16 fields to avoid changes to lots of unrelated code, nfcache should die IMO. Any other ideas? Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-18 1:42 ` Patrick McHardy @ 2005-04-19 0:58 ` Phil Oester 2005-04-20 15:03 ` Patrick McHardy 0 siblings, 1 reply; 32+ messages in thread From: Phil Oester @ 2005-04-19 0:58 UTC (permalink / raw) To: Patrick McHardy; +Cc: Harald Welte, Rusty Russell, netfilter-devel On Mon, Apr 18, 2005 at 03:42:38AM +0200, Patrick McHardy wrote: > We have six cases to consider No...let's simplify it a bit. At this point the only debate is over retransmit handling, since I think we are in agreement that storing the client's seq in correction_pos (not the adjusted seq) is appropriate. So that solves 1/2 of the puzzle. Since we now store the correct c_p as noted above, we are guaranteed that ip_nat_seq_adjust will make the correct determination of whether a packet is a retransmission. Now assume a retransmit arrives, has offset_before added to it, and reaches adjust_tcp_sequence. With my proposed test: if (seq - this_way->offset_before != this_way->correction_pos) which compares the pre-ip_nat_seq_adjust adjustment to the current c_p, I have difficulty finding any scenario which this would fail to correctly find a retransmit. > 2.3. tcph->seq - offset_after > correction_pos && > tcph->seq - offset_before <= correction_pos > > In this case we can't determine whether the packet is a retransmission > and what the original sequence number was. I think you are failing to recognize that in the first transmit of a given packet, offset_before and offset_after are adjusted, so that by the time the retransmit arrives, the tests are different than they were on the first pass. That possible? If not, can you provide some sample seq/o_a/o_b numbers which you believe would fail this test? Phil ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-19 0:58 ` Phil Oester @ 2005-04-20 15:03 ` Patrick McHardy 2005-04-20 15:53 ` Phil Oester 0 siblings, 1 reply; 32+ messages in thread From: Patrick McHardy @ 2005-04-20 15:03 UTC (permalink / raw) To: Phil Oester; +Cc: Harald Welte, Rusty Russell, netfilter-devel Phil Oester wrote: > If not, can you provide some sample seq/o_a/o_b numbers which you believe > would fail this test? offset_before: 1000 offset_after: -1000 correction_pos: 1000000 Sequence number (pre-adjustment): 999500 (retransmit) Sequence number (post-adjustment): 1000500 if (seq - this_way->offset_before != this_way->correction_pos) adjusted seq - offset_before = 999500 => passes the test adjusted seq - offset_after = 1001500 => not detected as retransmit. You assume only identical retransmits of the packet that caused the last adjustment. It could also be an older packet or have different boundaries. This brings us to a different problem, the sequence number at which the correction occured should be stored, not the first sequence number contained in the packet. But this can be done in a seperate fix. Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-20 15:03 ` Patrick McHardy @ 2005-04-20 15:53 ` Phil Oester 2005-04-20 16:07 ` Patrick McHardy 0 siblings, 1 reply; 32+ messages in thread From: Phil Oester @ 2005-04-20 15:53 UTC (permalink / raw) To: Patrick McHardy; +Cc: Harald Welte, Rusty Russell, netfilter-devel On Wed, Apr 20, 2005 at 05:03:40PM +0200, Patrick McHardy wrote: > offset_before: 1000 > offset_after: -1000 > correction_pos: 1000000 > > Sequence number (pre-adjustment): 999500 (retransmit) > Sequence number (post-adjustment): 1000500 > > if (seq - this_way->offset_before != this_way->correction_pos) > > adjusted seq - offset_before = 999500 => passes the test > adjusted seq - offset_after = 1001500 => not detected as retransmit. > > You assume only identical retransmits of the packet that caused > the last adjustment. It could also be an older packet or have > different boundaries. Yes, and I believe this is a safe assumption. Take FTP for example. Packets containing the PORT command traverse this function, but not packets containing the file itself. If a PORT packet is lost, you will not receive a PORT packet for a different file before the first is retransmitted. So it is safe to assume that the adjustments are occurring sequentially -- not out of order. So with this assumption, the above example becomes impossible - the 999500 retransmit would have been dealt with prior to a new correction_pos being set. This is a very FTP-centric view, but can you think of other protocols helpers which would not follow this sequential pattern? > This brings us to a different problem, > the sequence number at which the correction occured should be > stored, not the first sequence number contained in the packet. > But this can be done in a seperate fix. Given the above assumption, I don't think this is a concern. Phil ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-20 15:53 ` Phil Oester @ 2005-04-20 16:07 ` Patrick McHardy 2005-04-20 17:24 ` Phil Oester 0 siblings, 1 reply; 32+ messages in thread From: Patrick McHardy @ 2005-04-20 16:07 UTC (permalink / raw) To: Phil Oester; +Cc: Harald Welte, Rusty Russell, netfilter-devel Phil Oester wrote: > On Wed, Apr 20, 2005 at 05:03:40PM +0200, Patrick McHardy wrote: > >>You assume only identical retransmits of the packet that caused >>the last adjustment. It could also be an older packet or have >>different boundaries. > > Yes, and I believe this is a safe assumption. Take FTP for example. Packets > containing the PORT command traverse this function, but not packets containing > the file itself. If a PORT packet is lost, you will not receive a PORT > packet for a different file before the first is retransmitted. So it is safe > to assume that the adjustments are occurring sequentially -- not out of order IIRC I've seen an FTP client that continued issueing commands during a transfer, but I don't remeber which one exactly. > So with this assumption, the above example becomes impossible - the 999500 > retransmit would have been dealt with prior to a new correction_pos being > set. You forget about the network, which can duplicate, delay and reorder packets. If we don't handle this is it will screw up the state. > This is a very FTP-centric view, but can you think of other protocols helpers > which would not follow this sequential pattern? No, but this function should do the right thing anyway. >>This brings us to a different problem, >>the sequence number at which the correction occured should be >>stored, not the first sequence number contained in the packet. >>But this can be done in a seperate fix. > > Given the above assumption, I don't think this is a concern. It is. Assume a FTP control packet that contains a PORT command at some offset. The MTU changes and the packet is retransmitted as three seperate packets, with the PORT command contained in the third packet. The second packet will be incorrectly adjusted. Probably not a common condition, but easy to handle correctly, so why not. Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-20 16:07 ` Patrick McHardy @ 2005-04-20 17:24 ` Phil Oester 2005-04-20 17:50 ` Patrick McHardy 0 siblings, 1 reply; 32+ messages in thread From: Phil Oester @ 2005-04-20 17:24 UTC (permalink / raw) To: Patrick McHardy; +Cc: Harald Welte, Rusty Russell, netfilter-devel On Wed, Apr 20, 2005 at 06:07:45PM +0200, Patrick McHardy wrote: > IIRC I've seen an FTP client that continued issueing commands during a > transfer, but I don't remeber which one exactly. At this moment, multi-file FTP is broken for all clients -- should we fix this first, and worry about potential statistical outliers later? > >So with this assumption, the above example becomes impossible - the 999500 > >retransmit would have been dealt with prior to a new correction_pos being > >set. > > You forget about the network, which can duplicate, delay and reorder > packets. If we don't handle this is it will screw up the state. I'm aware of that, but again I claim that it is irrelevant in this case because the PORT commands will not be processed out of order (even if the data packets are), and thus the adjustments will occur sequentially. > >This is a very FTP-centric view, but can you think of other protocols > >helpers > >which would not follow this sequential pattern? > > No, but this function should do the right thing anyway. Agreed, but IMHO my approach is currently the best available fix for a problem which is affecting many people. If further testing shows it can be improved, that is great -- but I certainly don't think it's any worse than the current broken behaviour. > >>This brings us to a different problem, > >>the sequence number at which the correction occured should be > >>stored, not the first sequence number contained in the packet. > >>But this can be done in a seperate fix. > > > >Given the above assumption, I don't think this is a concern. > > It is. Assume a FTP control packet that contains a PORT command > at some offset. The MTU changes and the packet is retransmitted > as three seperate packets, with the PORT command contained in the > third packet. The second packet will be incorrectly adjusted. > Probably not a common condition, but easy to handle correctly, > so why not. Ok, another statistical outlier which can be thought about later. Phil ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-20 17:24 ` Phil Oester @ 2005-04-20 17:50 ` Patrick McHardy 2005-04-20 18:25 ` Phil Oester 0 siblings, 1 reply; 32+ messages in thread From: Patrick McHardy @ 2005-04-20 17:50 UTC (permalink / raw) To: Phil Oester; +Cc: Harald Welte, Rusty Russell, netfilter-devel Phil Oester wrote: > On Wed, Apr 20, 2005 at 06:07:45PM +0200, Patrick McHardy wrote: > >>You forget about the network, which can duplicate, delay and reorder >>packets. If we don't handle this is it will screw up the state. > > I'm aware of that, but again I claim that it is irrelevant in this case > because the PORT commands will not be processed out of order (even if the > data packets are), and thus the adjustments will occur sequentially. Why wouldn't they be processed out of order by nat? >>No, but this function should do the right thing anyway. > > Agreed, but IMHO my approach is currently the best available fix for a problem > which is affecting many people. If further testing shows it can be improved, > that is great -- but I certainly don't think it's any worse than the current > broken behaviour. I agree that the problem should be fixed ASAP. I'll do a proper fix now - either by moving seq adjustment after NAT helpers or by adding a flag to the skb. Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-20 17:50 ` Patrick McHardy @ 2005-04-20 18:25 ` Phil Oester 2005-04-20 21:39 ` Martijn Lievaart 2005-04-21 1:38 ` Patrick McHardy 0 siblings, 2 replies; 32+ messages in thread From: Phil Oester @ 2005-04-20 18:25 UTC (permalink / raw) To: Patrick McHardy; +Cc: Harald Welte, Rusty Russell, netfilter-devel On Wed, Apr 20, 2005 at 07:50:44PM +0200, Patrick McHardy wrote: > >I'm aware of that, but again I claim that it is irrelevant in this case > >because the PORT commands will not be processed out of order (even if the > >data packets are), and thus the adjustments will occur sequentially. > > Why wouldn't they be processed out of order by nat? Because a second transfer will not be started until the first completes in my experience. Thus, there will not be two 'PORT' packets floating around to potentially arrive out of order > I agree that the problem should be fixed ASAP. I'll do a proper fix > now - either by moving seq adjustment after NAT helpers or by adding > a flag to the skb. Doesn't sound like a -stable fix, but whatever you think is best. Phil ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-20 18:25 ` Phil Oester @ 2005-04-20 21:39 ` Martijn Lievaart 2005-04-21 1:41 ` Patrick McHardy 2005-04-21 1:38 ` Patrick McHardy 1 sibling, 1 reply; 32+ messages in thread From: Martijn Lievaart @ 2005-04-20 21:39 UTC (permalink / raw) To: Phil Oester; +Cc: Harald Welte, Rusty Russell, netfilter-devel, Patrick McHardy Phil Oester wrote: >On Wed, Apr 20, 2005 at 07:50:44PM +0200, Patrick McHardy wrote: > > >>>I'm aware of that, but again I claim that it is irrelevant in this case >>>because the PORT commands will not be processed out of order (even if the >>>data packets are), and thus the adjustments will occur sequentially. >>> >>> >>Why wouldn't they be processed out of order by nat? >> >> > >Because a second transfer will not be started until the first completes >in my experience. Thus, there will not be two 'PORT' packets floating >around to potentially arrive out of order > > > Some Windows graphical FTP frontends do multiple transfers. I think one should take these into account. OTOH, how often do PORT commands arrive out of sequence even given that it is possible? M4 ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-20 21:39 ` Martijn Lievaart @ 2005-04-21 1:41 ` Patrick McHardy 0 siblings, 0 replies; 32+ messages in thread From: Patrick McHardy @ 2005-04-21 1:41 UTC (permalink / raw) To: Martijn Lievaart; +Cc: Harald Welte, Rusty Russell, netfilter-devel Martijn Lievaart wrote: > Some Windows graphical FTP frontends do multiple transfers. I think one > should take these into account. OTOH, how often do PORT commands arrive > out of sequence even given that it is possible? These clients most likely open up multiple control connections. But it doesn't really matter, this function is not directly related to FTP and it needs to work properly in all cases. Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-20 18:25 ` Phil Oester 2005-04-20 21:39 ` Martijn Lievaart @ 2005-04-21 1:38 ` Patrick McHardy 2005-04-21 12:31 ` Milos Wimmer 2005-04-21 13:31 ` Jonas Berlin 1 sibling, 2 replies; 32+ messages in thread From: Patrick McHardy @ 2005-04-21 1:38 UTC (permalink / raw) To: Phil Oester; +Cc: Harald Welte, Rusty Russell, netfilter-devel [-- Attachment #1: Type: text/plain, Size: 685 bytes --] Phil Oester wrote: > On Wed, Apr 20, 2005 at 07:50:44PM +0200, Patrick McHardy wrote: > >>I agree that the problem should be fixed ASAP. I'll do a proper fix >>now - either by moving seq adjustment after NAT helpers or by adding >>a flag to the skb. > > Doesn't sound like a -stable fix, but whatever you think is best. Well, I also can't submit a patch that doesn't fix the problem properly to -stable. This patch moves helpers to a new hook two priorities below ip_conntrack_confirm() and sequence number adjustment to a new hook at the next priority, thereby restoreing the old order. Can someone affected by the problem please confirm that this patch fixes it? Thanks, Patrick [-- Attachment #2: x --] [-- Type: text/plain, Size: 7024 bytes --] ===== include/linux/netfilter_ipv4.h 1.7 vs edited ===== --- 1.7/include/linux/netfilter_ipv4.h 2004-03-30 06:24:38 +02:00 +++ edited/include/linux/netfilter_ipv4.h 2005-04-21 03:10:07 +02:00 @@ -62,6 +62,9 @@ NF_IP_PRI_FILTER = 0, NF_IP_PRI_NAT_SRC = 100, NF_IP_PRI_SELINUX_LAST = 225, + NF_IP_PRI_CONNTRACK_HELPER = INT_MAX - 2, + NF_IP_PRI_NAT_SEQ_ADJUST = INT_MAX - 1, + NF_IP_PRI_CONNTRACK_CONFIRM = INT_MAX, NF_IP_PRI_LAST = INT_MAX, }; ===== net/ipv4/netfilter/ip_conntrack_standalone.c 1.67 vs edited ===== --- 1.67/net/ipv4/netfilter/ip_conntrack_standalone.c 2005-03-18 19:38:59 +01:00 +++ edited/net/ipv4/netfilter/ip_conntrack_standalone.c 2005-04-21 03:10:44 +02:00 @@ -401,6 +401,16 @@ const struct net_device *out, int (*okfn)(struct sk_buff *)) { + /* We've seen it coming out the other side: confirm it */ + return ip_conntrack_confirm(pskb); +} + +static unsigned int ip_conntrack_help(unsigned int hooknum, + struct sk_buff **pskb, + const struct net_device *in, + const struct net_device *out, + int (*okfn)(struct sk_buff *)) +{ struct ip_conntrack *ct; enum ip_conntrack_info ctinfo; @@ -412,9 +422,7 @@ if (ret != NF_ACCEPT) return ret; } - - /* We've seen it coming out the other side: confirm it */ - return ip_conntrack_confirm(pskb); + return NF_ACCEPT; } static unsigned int ip_conntrack_defrag(unsigned int hooknum, @@ -516,13 +524,30 @@ .priority = NF_IP_PRI_CONNTRACK, }; +/* helpers */ +static struct nf_hook_ops ip_conntrack_helper_out_ops = { + .hook = ip_conntrack_help, + .owner = THIS_MODULE, + .pf = PF_INET, + .hooknum = NF_IP_POST_ROUTING, + .priority = NF_IP_PRI_CONNTRACK_HELPER, +}; + +static struct nf_hook_ops ip_conntrack_helper_in_ops = { + .hook = ip_conntrack_help, + .owner = THIS_MODULE, + .pf = PF_INET, + .hooknum = NF_IP_LOCAL_IN, + .priority = NF_IP_PRI_CONNTRACK_HELPER, +}; + /* Refragmenter; last chance. */ static struct nf_hook_ops ip_conntrack_out_ops = { .hook = ip_refrag, .owner = THIS_MODULE, .pf = PF_INET, .hooknum = NF_IP_POST_ROUTING, - .priority = NF_IP_PRI_LAST, + .priority = NF_IP_PRI_CONNTRACK_CONFIRM, }; static struct nf_hook_ops ip_conntrack_local_in_ops = { @@ -530,7 +555,7 @@ .owner = THIS_MODULE, .pf = PF_INET, .hooknum = NF_IP_LOCAL_IN, - .priority = NF_IP_PRI_LAST-1, + .priority = NF_IP_PRI_CONNTRACK_CONFIRM, }; /* Sysctl support */ @@ -831,10 +856,20 @@ printk("ip_conntrack: can't register local out hook.\n"); goto cleanup_inops; } + ret = nf_register_hook(&ip_conntrack_helper_in_ops); + if (ret < 0) { + printk("ip_conntrack: can't register local in helper hook.\n"); + goto cleanup_inandlocalops; + } + ret = nf_register_hook(&ip_conntrack_helper_out_ops); + if (ret < 0) { + printk("ip_conntrack: can't register postrouting helper hook.\n"); + goto cleanup_helperinops; + } ret = nf_register_hook(&ip_conntrack_out_ops); if (ret < 0) { printk("ip_conntrack: can't register post-routing hook.\n"); - goto cleanup_inandlocalops; + goto cleanup_helperoutops; } ret = nf_register_hook(&ip_conntrack_local_in_ops); if (ret < 0) { @@ -860,6 +895,10 @@ nf_unregister_hook(&ip_conntrack_local_in_ops); cleanup_inoutandlocalops: nf_unregister_hook(&ip_conntrack_out_ops); + cleanup_helperoutops: + nf_unregister_hook(&ip_conntrack_helper_out_ops); + cleanup_helperinops: + nf_unregister_hook(&ip_conntrack_helper_in_ops); cleanup_inandlocalops: nf_unregister_hook(&ip_conntrack_local_out_ops); cleanup_inops: ===== net/ipv4/netfilter/ip_nat_core.c 1.69 vs edited ===== --- 1.69/net/ipv4/netfilter/ip_nat_core.c 2005-02-07 06:48:35 +01:00 +++ edited/net/ipv4/netfilter/ip_nat_core.c 2005-04-21 03:05:08 +02:00 @@ -356,15 +356,6 @@ unsigned long statusbit; enum ip_nat_manip_type mtype = HOOK2MANIP(hooknum); - if (test_bit(IPS_SEQ_ADJUST_BIT, &ct->status) - && (hooknum == NF_IP_POST_ROUTING || hooknum == NF_IP_LOCAL_IN)) { - DEBUGP("ip_nat_core: adjusting sequence number\n"); - /* future: put this in a l4-proto specific function, - * and call this function here. */ - if (!ip_nat_seq_adjust(pskb, ct, ctinfo)) - return NF_DROP; - } - if (mtype == IP_NAT_MANIP_SRC) statusbit = IPS_SRC_NAT; else ===== net/ipv4/netfilter/ip_nat_standalone.c 1.44 vs edited ===== --- 1.44/net/ipv4/netfilter/ip_nat_standalone.c 2005-01-27 07:03:17 +01:00 +++ edited/net/ipv4/netfilter/ip_nat_standalone.c 2005-04-21 03:15:54 +02:00 @@ -230,6 +230,25 @@ return ret; } +static unsigned int +ip_nat_adjust(unsigned int hooknum, + struct sk_buff **pskb, + const struct net_device *in, + const struct net_device *out, + int (*okfn)(struct sk_buff *)) +{ + struct ip_conntrack *ct; + enum ip_conntrack_info ctinfo; + + ct = ip_conntrack_get(*pskb, &ctinfo); + if (ct && test_bit(IPS_SEQ_ADJUST_BIT, &ct->status)) { + DEBUGP("ip_nat_standalone: adjusting sequence number\n"); + if (!ip_nat_seq_adjust(pskb, ct, ctinfo)) + return NF_DROP; + } + return NF_ACCEPT; +} + /* We must be after connection tracking and before packet filtering. */ /* Before packet filtering, change destination */ @@ -250,6 +269,15 @@ .priority = NF_IP_PRI_NAT_SRC, }; +/* After conntrack, adjust sequence number */ +static struct nf_hook_ops ip_nat_adjust_out_ops = { + .hook = ip_nat_adjust, + .owner = THIS_MODULE, + .pf = PF_INET, + .hooknum = NF_IP_POST_ROUTING, + .priority = NF_IP_PRI_NAT_SEQ_ADJUST, +}; + /* Before packet filtering, change destination */ static struct nf_hook_ops ip_nat_local_out_ops = { .hook = ip_nat_local_fn, @@ -268,6 +296,16 @@ .priority = NF_IP_PRI_NAT_SRC, }; +/* After conntrack, adjust sequence number */ +static struct nf_hook_ops ip_nat_adjust_in_ops = { + .hook = ip_nat_adjust, + .owner = THIS_MODULE, + .pf = PF_INET, + .hooknum = NF_IP_LOCAL_IN, + .priority = NF_IP_PRI_NAT_SEQ_ADJUST, +}; + + static int init_or_cleanup(int init) { int ret = 0; @@ -296,10 +334,20 @@ printk("ip_nat_init: can't register out hook.\n"); goto cleanup_inops; } + ret = nf_register_hook(&ip_nat_adjust_in_ops); + if (ret < 0) { + printk("ip_nat_init: can't register adjust in hook.\n"); + goto cleanup_outops; + } + ret = nf_register_hook(&ip_nat_adjust_out_ops); + if (ret < 0) { + printk("ip_nat_init: can't register adjust out hook.\n"); + goto cleanup_adjustin_ops; + } ret = nf_register_hook(&ip_nat_local_out_ops); if (ret < 0) { printk("ip_nat_init: can't register local out hook.\n"); - goto cleanup_outops; + goto cleanup_adjustout_ops;; } ret = nf_register_hook(&ip_nat_local_in_ops); if (ret < 0) { @@ -312,6 +360,10 @@ nf_unregister_hook(&ip_nat_local_in_ops); cleanup_localoutops: nf_unregister_hook(&ip_nat_local_out_ops); + cleanup_adjustout_ops: + nf_unregister_hook(&ip_nat_adjust_out_ops); + cleanup_adjustin_ops: + nf_unregister_hook(&ip_nat_adjust_in_ops); cleanup_outops: nf_unregister_hook(&ip_nat_out_ops); cleanup_inops: ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-21 1:38 ` Patrick McHardy @ 2005-04-21 12:31 ` Milos Wimmer 2005-04-21 12:32 ` Patrick McHardy 2005-04-21 13:31 ` Jonas Berlin 1 sibling, 1 reply; 32+ messages in thread From: Milos Wimmer @ 2005-04-21 12:31 UTC (permalink / raw) To: Patrick McHardy; +Cc: Harald Welte, Rusty Russell, netfilter-devel On Thu, 21 Apr 2005, Patrick McHardy wrote: > Well, I also can't submit a patch that doesn't fix the problem properly > to -stable. This patch moves helpers to a new hook two priorities below > ip_conntrack_confirm() and sequence number adjustment to a new hook at > the next priority, thereby restoreing the old order. Can someone > affected by the problem please confirm that this patch fixes it? I applied your patch to vanilla kernel 2.6.11.7 and it works great. Regards, Milos ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-21 12:31 ` Milos Wimmer @ 2005-04-21 12:32 ` Patrick McHardy 0 siblings, 0 replies; 32+ messages in thread From: Patrick McHardy @ 2005-04-21 12:32 UTC (permalink / raw) To: Milos Wimmer; +Cc: Harald Welte, Rusty Russell, netfilter-devel Milos Wimmer wrote: > On Thu, 21 Apr 2005, Patrick McHardy wrote: > >> Can someone affected by the problem please confirm that this patch >> fixes it? > > I applied your patch to vanilla kernel 2.6.11.7 and it works great. Thanks Milos, I'll submit it tonight. Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-21 1:38 ` Patrick McHardy 2005-04-21 12:31 ` Milos Wimmer @ 2005-04-21 13:31 ` Jonas Berlin 2005-04-21 23:01 ` Patrick McHardy 1 sibling, 1 reply; 32+ messages in thread From: Jonas Berlin @ 2005-04-21 13:31 UTC (permalink / raw) To: Patrick McHardy; +Cc: Harald Welte, Rusty Russell, netfilter-devel -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Quoting Patrick McHardy on 2005-04-21 01:38 UTC: > Well, I also can't submit a patch that doesn't fix the problem properly > to -stable. This patch moves helpers to a new hook two priorities below > ip_conntrack_confirm() and sequence number adjustment to a new hook at > the next priority, thereby restoreing the old order. Can someone > affected by the problem please confirm that this patch fixes it? > +++ edited/include/linux/netfilter_ipv4.h 2005-04-21 03:10:07 +02:00 > @@ -62,6 +62,9 @@ > NF_IP_PRI_FILTER = 0, > NF_IP_PRI_NAT_SRC = 100, > NF_IP_PRI_SELINUX_LAST = 225, > + NF_IP_PRI_CONNTRACK_HELPER = INT_MAX - 2, > + NF_IP_PRI_NAT_SEQ_ADJUST = INT_MAX - 1, > + NF_IP_PRI_CONNTRACK_CONFIRM = INT_MAX, > NF_IP_PRI_LAST = INT_MAX, > }; Not that I know much about netfilter hooks or the importance of the newly added hooks really being the last ones to be executed, but.. Would there be something to lose in using INT_MAX/2 (or something) instead of INT_MAX as a base for the additions? And/or maybe have a bit of space between them in case something new pops up later? Of course you _can_ change them later, but there's always the risk that some 3rd party software out there basing themselves on a standard kernel including these additions, so I think it would be nice not to have to change them later.. That said, I have to repeat that I don't know much about these things, and thus my suggestions might be too generic. - -- - - xkr47 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFCZ6snxyF48ZTvn+4RAotTAJ9GZ1G8IETHyC3P5rZ9Z8WWD6RaTQCffE7I 0lxX0NDf8GwBtYt+LE+fReU= =esWQ -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-21 13:31 ` Jonas Berlin @ 2005-04-21 23:01 ` Patrick McHardy 0 siblings, 0 replies; 32+ messages in thread From: Patrick McHardy @ 2005-04-21 23:01 UTC (permalink / raw) To: Jonas Berlin; +Cc: Harald Welte, Rusty Russell, netfilter-devel Jonas Berlin wrote: > Would there be something to lose in using INT_MAX/2 (or something) > instead of INT_MAX as a base for the additions? And/or maybe have a bit > of space between them in case something new pops up later? Of course you > _can_ change them later, but there's always the risk that some 3rd party > software out there basing themselves on a standard kernel including > these additions, so I think it would be nice not to have to change them > later.. These hooks should be last. If anyone has a valid reason to call something after them they can post their code to the list and we can change it. Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-04 8:27 ` Patrick McHardy 2005-04-04 20:47 ` Phil Oester @ 2005-04-27 0:44 ` Rusty Russell 2005-04-27 10:27 ` Patrick McHardy 1 sibling, 1 reply; 32+ messages in thread From: Rusty Russell @ 2005-04-27 0:44 UTC (permalink / raw) To: Patrick McHardy; +Cc: Harald Welte, netfilter-devel On Mon, 2005-04-04 at 10:27 +0200, Patrick McHardy wrote: > Phil Oester wrote: > > If the retransmitted packet were the same packet as the last one which was > > mangled, the seq will be adjusted by offset_before, which at this point is > > the same adjustment as the first packet received via offset_after. So, > > seq == correction_pos for the retransmitted packet, and this is fairly trivial > > to handle. This is the only case where I can see before() triggering a false > > positive. > > What you are trying to do is to reconstruct the original sequence > number, so lets consider the possible cases: Please enhance nfsim-testsuite with these. I dislike adding new hooks, but to be honest I haven't had time to follow this conversation. At least a testcase will allow us to vet other solutions, and prevent us from breaking this again. Rusty. -- A bad analogy is like a leaky screwdriver -- Richard Braakman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-27 0:44 ` Rusty Russell @ 2005-04-27 10:27 ` Patrick McHardy 0 siblings, 0 replies; 32+ messages in thread From: Patrick McHardy @ 2005-04-27 10:27 UTC (permalink / raw) To: Rusty Russell; +Cc: Harald Welte, netfilter-devel Rusty Russell wrote: > Please enhance nfsim-testsuite with these. I dislike adding new hooks, > but to be honest I haven't had time to follow this conversation. At > least a testcase will allow us to vet other solutions, and prevent us > from breaking this again. I don't like adding hooks either, but it was the safest fix for 2.6.12. I'll try to add the tests when I find some time, but first I need to get nfsim to run properly on my 64bit machine. Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-04-02 20:24 [PATCH] Fix NAT TCP sequence adjustment Phil Oester 2005-04-03 12:26 ` Milos Wimmer 2005-04-03 19:26 ` Patrick McHardy @ 2005-05-31 9:17 ` Rusty Russell 2005-05-31 13:02 ` Patrick McHardy 2 siblings, 1 reply; 32+ messages in thread From: Rusty Russell @ 2005-05-31 9:17 UTC (permalink / raw) To: Phil Oester; +Cc: netfilter-devel, Patrick McHardy On Sat, 2005-04-02 at 12:24 -0800, Phil Oester wrote: > In adjust_tcp_sequence, we track the sequence number of any adjustments > in the correction_pos variable. The seq stored is based upon the left > side of the window of the NAT box -- not of the original sender. > > Later, in ip_nat_seq_adjust, we compare the correction_pos variable to > the seq of the original sender to determine whether this is a new packet > or a retransmission (i.e. should we apply offset_before or offset_after). > So we are comparing the post-adjustment seq to a pre-adjustment seq. This is tested in the testsuite, and sure enough, now the patch resulting from this discussion has been applied, it fails. Well done: your fix broke FTP worse than the original problem, it seems. I *did* ask for a test here, and this is what the testsuite is *for*. And now it works on x86_64, there's no excuse. Grrr... Rusty. -- A bad analogy is like a leaky screwdriver -- Richard Braakman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-05-31 9:17 ` Rusty Russell @ 2005-05-31 13:02 ` Patrick McHardy 2005-05-31 13:48 ` Rusty Russell 0 siblings, 1 reply; 32+ messages in thread From: Patrick McHardy @ 2005-05-31 13:02 UTC (permalink / raw) To: Rusty Russell; +Cc: netfilter-devel Hi Rusty, using up your flame early this year? :) Rusty Russell wrote: > This is tested in the testsuite, and sure enough, now the patch > resulting from this discussion has been applied, it fails. Well done: > your fix broke FTP worse than the original problem, it seems. I don't think so. First of all, the original problem was data corruption, so if my patch didn't blew up your harddisc, it's hardly worse. Second of all, I spent like 10 hours to verify the proposed fixes, and I am still convinced that it is correct. You didn't mention which testsuite fails, so I assume you're talking about 03NAT/65a-ftp-ack-adjust2.sim. Looking at it, I'm pretty sure it will start working once you resend the "Hi Marc\r\n" with every new session, so a newline is present before the PORT command. > I *did* ask for a test here, and this is what the testsuite is *for*. > And now it works on x86_64, there's no excuse. Great, my excuse now will be that you apparently already wrote one :) Regards Patrick ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-05-31 13:02 ` Patrick McHardy @ 2005-05-31 13:48 ` Rusty Russell 2005-05-31 14:35 ` Patrick McHardy 0 siblings, 1 reply; 32+ messages in thread From: Rusty Russell @ 2005-05-31 13:48 UTC (permalink / raw) To: Patrick McHardy; +Cc: netfilter-devel On Tue, 2005-05-31 at 15:02 +0200, Patrick McHardy wrote: > Second of all, I spent like 10 hours to verify the > proposed fixes, and I am still convinced that it is correct. Which shows exactly *why* we have a testsuite. Dammit, I didn't spend all those hours on it for fun. You spent *10* hours, and the testsuite runs in 5 seconds (60 seconds counting build time the first time). <sigh> I fixed the testsuite, so now when I get bored I can come up with a patch which removes those horrible hooks you added (sure, I may have to rewrite everything to do it). Rusty. -- A bad analogy is like a leaky screwdriver -- Richard Braakman ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] Fix NAT TCP sequence adjustment 2005-05-31 13:48 ` Rusty Russell @ 2005-05-31 14:35 ` Patrick McHardy 0 siblings, 0 replies; 32+ messages in thread From: Patrick McHardy @ 2005-05-31 14:35 UTC (permalink / raw) To: Rusty Russell; +Cc: netfilter-devel Rusty Russell wrote: > On Tue, 2005-05-31 at 15:02 +0200, Patrick McHardy wrote: > >> Second of all, I spent like 10 hours to verify the >>proposed fixes, and I am still convinced that it is correct. > > Which shows exactly *why* we have a testsuite. Dammit, I didn't spend > all those hours on it for fun. > > You spent *10* hours, and the testsuite runs in 5 seconds (60 seconds > counting build time the first time). I don't get your point - I know why we have a testsuite. You're complaining that I had more important things to do than add tests to a testsuite that didn't even run on my system until perhaps 4 hours ago, after spending considerable time fixing the fallout from your patches? In any case, without identifying the problems, you can't test for them. It looks like your tests won't even trigger the original problem, let alone the problem case I identified in Phil's proposed patch. So if I had relied on it, we would now have an incorrect fix. And no, the 10 hours won't be wasted, as I already said before, I'll add proper tests when nfsim runs on my system and I have time. > <sigh> Sigh indeed. > I fixed the testsuite, so now when I get bored I can come up with a > patch which removes those horrible hooks you added (sure, I may have to > rewrite everything to do it). Feel free to do so. Its not hard to replace them, the alternative just felt a bit unsafe for -rc3. ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2005-05-31 14:35 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-04-02 20:24 [PATCH] Fix NAT TCP sequence adjustment Phil Oester 2005-04-03 12:26 ` Milos Wimmer 2005-04-03 19:26 ` Patrick McHardy 2005-04-03 23:53 ` Phil Oester 2005-04-04 4:40 ` Phil Oester 2005-04-04 8:27 ` Patrick McHardy 2005-04-04 20:47 ` Phil Oester 2005-04-05 7:32 ` Patrick McHardy 2005-04-05 13:33 ` Patch lifetime " Amin Azez 2005-04-10 20:49 ` Harald Welte 2005-04-06 4:48 ` Phil Oester 2005-04-18 1:42 ` Patrick McHardy 2005-04-19 0:58 ` Phil Oester 2005-04-20 15:03 ` Patrick McHardy 2005-04-20 15:53 ` Phil Oester 2005-04-20 16:07 ` Patrick McHardy 2005-04-20 17:24 ` Phil Oester 2005-04-20 17:50 ` Patrick McHardy 2005-04-20 18:25 ` Phil Oester 2005-04-20 21:39 ` Martijn Lievaart 2005-04-21 1:41 ` Patrick McHardy 2005-04-21 1:38 ` Patrick McHardy 2005-04-21 12:31 ` Milos Wimmer 2005-04-21 12:32 ` Patrick McHardy 2005-04-21 13:31 ` Jonas Berlin 2005-04-21 23:01 ` Patrick McHardy 2005-04-27 0:44 ` Rusty Russell 2005-04-27 10:27 ` Patrick McHardy 2005-05-31 9:17 ` Rusty Russell 2005-05-31 13:02 ` Patrick McHardy 2005-05-31 13:48 ` Rusty Russell 2005-05-31 14:35 ` Patrick McHardy
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.