* Invalid SACK numbers in NAT'ed packets
@ 2008-04-23 15:22 Leonid Zeitlin
2008-04-23 19:31 ` Jozsef Kadlecsik
0 siblings, 1 reply; 9+ messages in thread
From: Leonid Zeitlin @ 2008-04-23 15:22 UTC (permalink / raw)
To: netfilter
Hi all,
I have the following issue that I need a comment on.
The setup. I have a Linux router with one NIC connected to a LAN and the
other connected directly to a Cisco 800 Series router. The Cisco box is
connected further to the ISP. The Cisco box creates an IPsec tunnel to a
remote site. It also does SNAT on packets coming through it from the LAN to
the remote site. It looks as follows:
LAN -> Linux router -> Cisco (NAT) -> IPsec tunnel -> remote site
The Linux router has the following iptables rules:
-A FORWARD -i eth0 -o eth3 -j ACCEPT
-A FORWARD -i eth3 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
with the default policy for the FORWARD chain being DROP. Here eth0 is the
LAN interface and eth3 is the interface facing the Cisco router. Thus
packets going from the LAN to the Cisco and on to the remote site are
allowed, as are the replies to them, but the remote site can't initiate a
connection to the LAN. So far so good: we can connect from the LAN to the
remote site successfully.
The problem. When uploading large files by FTP from the LAN to the remote
site, the connection randomly stalls.
Analysis of tcpdump captures shows that some ACK packets coming back from
the remote site are not forwarded by the Linux box from eth3 to eth0.
Apparently, Linux doesn't think they are part of the established connection.
Adding this rule, which lets all ACK packets through, fixes the problem:
-A FORWARD -i eth3 -o eth0 -p tcp --tcp-flags ACK ACK -j ACCEPT
I want to understand why these ACK packets are not allowed by the original
state match.
Further analysis of the tcpdump captures reveals the following. The Cisco
router changes TCP sequence numbers when it does NAT, so the sequence and
ACK numbers seen between the LAN and the Cisco box are not the same as
those on the other side, between the Cisco box and the remote site. The
problem is that the Cisco box does NOT change the sequence numbers inside
the TCP SACK option! So whenever a packet is lost and the remote site sends
an ACK packet carrying a SACK option, the Linux router sees SACK blocks
that refer to sequence numbers it has never seen on this connection. My
guess is that this is why netfilter does not consider the packet as
belonging to an established connection.
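To illustrate the mismatch described above, here is a minimal Python
sketch. All numbers and names in it (offset, sack_plausible and so on) are
hypothetical, and the check is a deliberate simplification, not netfilter's
actual window-tracking code. It only shows why a SACK block left in the
untranslated sequence space looks out of range to a box that has seen only
the LAN-side numbers.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap modulo 2**32


def seq_after(a, b):
    """True if sequence number a is later than b, allowing for wraparound."""
    return 0 < ((a - b) % MOD) < 2 ** 31


def sack_plausible(ack, sack_blocks, highest_sent_end):
    """Simplified sanity check (an assumption, not the kernel's logic):
    neither the ACK nor any SACK right edge may point beyond the highest
    byte the sender is known to have transmitted."""
    edges = [ack] + [right for _left, right in sack_blocks]
    return not any(seq_after(edge, highest_sent_end) for edge in edges)


# Hypothetical values: the NAT device shifts LAN-side sequence numbers by
# `offset` on the way out and must subtract it again on the way back.
offset = 1_000_000
lan_highest_end = 50_000  # highest seq+len the Linux router saw from the LAN

# What the remote site sends, expressed in the translated sequence space.
remote_ack = 20_000 + offset
remote_sack = [(30_000 + offset, 40_000 + offset)]

# The ACK field is translated back in both cases.
ack_seen_by_linux = (remote_ack - offset) % MOD

# Correct behaviour: SACK blocks are translated back as well.
translated_sack = [((l - offset) % MOD, (r - offset) % MOD)
                   for l, r in remote_sack]
# Behaviour reported in this thread: SACK blocks are left untouched.
untranslated_sack = remote_sack

ok_translated = sack_plausible(ack_seen_by_linux, translated_sack,
                               lan_highest_end)
ok_untranslated = sack_plausible(ack_seen_by_linux, untranslated_sack,
                                 lan_highest_end)
print(ok_translated, ok_untranslated)
```

With these illustrative numbers the translated SACK block passes the check,
while the untranslated one points far beyond anything the LAN host has sent.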
The questions:
1. Does my explanation look plausible? Can an invalid sequence number in
a SACK option cause the packet not to be considered part of an established
connection?
2. Has anyone come across this issue with Cisco hardware before? Can it be
fixed by some configuration?
3. If not, can it be worked around on the Linux side, i.e. by somehow
ignoring invalid SACKs or disabling SACK on this particular connection? (I
am reluctant to turn it off altogether with a sysctl.)
Thanks a lot,
Leonid
* Re: Invalid SACK numbers in NAT'ed packets
2008-04-23 15:22 Invalid SACK numbers in NAT'ed packets Leonid Zeitlin
@ 2008-04-23 19:31 ` Jozsef Kadlecsik
2008-04-24 9:09 ` Leonid Zeitlin
0 siblings, 1 reply; 9+ messages in thread
From: Jozsef Kadlecsik @ 2008-04-23 19:31 UTC (permalink / raw)
To: Leonid Zeitlin; +Cc: netfilter
On Wed, 23 Apr 2008, Leonid Zeitlin wrote:
> Further analysis of tcp dumps reveals the following. The Cisco router
> changes TCP sequence numbers when it does NAT. So, the sequence numbers and
> ACKs that go between the LAN and the Cisco box are not the same as the ones
> on the other side, between the Cisco box and the remote site. The problem
> is, the Cisco box does NOT change the sequence numbers within SACK TCP
> option! So, whenever a packet is lost, and the remote site sends an ACK
> packet that contains a SACK option, the Linux router sees a SACK option
> referencing a packet it never saw before. My guess is that this is the
> reason why the packet is not considered as belonging to an established
> connection by netfilter.
>
> The questions:
> 1. Does my explanation look plausible? Can an invalid sequence number in
> SACK lead to the packet not being considered as belonging to an established
> connection?
Yes, TCP connection tracking in netfilter checks the SACK option values as
well.
> 2. Has anyone come across such issue with a Cisco hardware before? Can it be
> fixed by some configuration?
It has been reported a few times. It's a Cisco bug, so try upgrading the
IOS image,
> 3. If not, can it be worked around at the Linux side; i.e. somehow ignore
> invalid SACKs or prohibit SACKs on this particular connection (I am
> reluctant to turn them off altogether with a sysctl).
or use IPV4OPTSTRIP for the SYN packets sent/received in this direction as
a selective workaround for the problem.
Best regards,
Jozsef
-
E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
* Re: Invalid SACK numbers in NAT'ed packets
2008-04-23 19:31 ` Jozsef Kadlecsik
@ 2008-04-24 9:09 ` Leonid Zeitlin
2008-04-24 9:33 ` Jozsef Kadlecsik
0 siblings, 1 reply; 9+ messages in thread
From: Leonid Zeitlin @ 2008-04-24 9:09 UTC (permalink / raw)
To: Jozsef Kadlecsik; +Cc: netfilter
Hi Jozsef,
Thanks for your answer.
> or use IPV4OPTSTRIP for the SYN packets sent/received in this direction as
> a selective workaround for the problem.
What is IPV4OPTSTRIP? How can I get it? It's not in standard iptables (not
the one that I have anyway), and I can't find it at the netfilter site
either.
Thanks,
Leonid
* Re: Invalid SACK numbers in NAT'ed packets
2008-04-24 9:09 ` Leonid Zeitlin
@ 2008-04-24 9:33 ` Jozsef Kadlecsik
2008-04-25 8:50 ` Leonid Zeitlin
0 siblings, 1 reply; 9+ messages in thread
From: Jozsef Kadlecsik @ 2008-04-24 9:33 UTC (permalink / raw)
To: Leonid Zeitlin; +Cc: netfilter
On Thu, 24 Apr 2008, Leonid Zeitlin wrote:
> > or use IPV4OPTSTRIP for the SYN packets sent/received in this direction as
> > a selective workaround for the problem.
>
> What is IPV4OPTSTRIP? How can I get it? It's not in standard iptables (not the
> one that I have anyway), and I can't find it at the netfilter site either.
It's a target extension which can be found in patch-o-matic-ng. But sorry,
I mixed things up: it strips IPv4 options, not TCP options, so it would
not help.
Best regards,
Jozsef
* Re: Invalid SACK numbers in NAT'ed packets
2008-04-24 9:33 ` Jozsef Kadlecsik
@ 2008-04-25 8:50 ` Leonid Zeitlin
2008-04-25 9:02 ` Jozsef Kadlecsik
0 siblings, 1 reply; 9+ messages in thread
From: Leonid Zeitlin @ 2008-04-25 8:50 UTC (permalink / raw)
To: Jozsef Kadlecsik; +Cc: netfilter
Thanks, Jozsef, I see.
It appears that short of writing a custom netfilter extension, there's no
way to turn off SACKs on a particular connection. Is this right?
Thanks,
Leonid
* Re: Invalid SACK numbers in NAT'ed packets
2008-04-25 8:50 ` Leonid Zeitlin
@ 2008-04-25 9:02 ` Jozsef Kadlecsik
2008-04-25 10:57 ` Jan Engelhardt
0 siblings, 1 reply; 9+ messages in thread
From: Jozsef Kadlecsik @ 2008-04-25 9:02 UTC (permalink / raw)
To: Leonid Zeitlin; +Cc: netfilter, netfilter-devel
On Fri, 25 Apr 2008, Leonid Zeitlin wrote:
> It appears that short of writing a custom netfilter extension, there's no way
> to turn off SACKs on a particular connection. Is this right?
Yes, exactly. Actually, writing a new extension to erase any TCP option
isn't that hard: just overwrite the option with NOPs and recalculate the
checksum.
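The approach described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not kernel code: the
function names (strip_tcp_option, internet_checksum) and the sample option
bytes are made up for the example. The checksum helper computes the plain
16-bit one's-complement sum; a real implementation would run it over the
TCP pseudo-header, header and payload, or update the checksum incrementally.

```python
import struct

TCPOPT_EOL = 0             # end of option list
TCPOPT_NOP = 1             # no-operation, a single byte
TCPOPT_SACK_PERMITTED = 4  # sent in SYN segments to negotiate SACK


def strip_tcp_option(options: bytes, kind: int) -> bytes:
    """Return a copy of the TCP options area in which every option of the
    given kind is overwritten with NOP bytes, so the length is unchanged."""
    out = bytearray(options)
    i = 0
    while i < len(out):
        opt = out[i]
        if opt == TCPOPT_EOL:
            break
        if opt == TCPOPT_NOP:
            i += 1
            continue
        if i + 1 >= len(out):
            break  # truncated option, leave the rest alone
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break  # malformed length, leave the rest alone
        if opt == kind:
            out[i:i + length] = bytes([TCPOPT_NOP]) * length
        i += length
    return bytes(out)


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over data, padded to even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Illustrative SYN options: MSS 1460, NOP, NOP, SACK-permitted, NOP,
# window scale 7.
syn_options = bytes([2, 4, 0x05, 0xB4, 1, 1, 4, 2, 1, 3, 3, 7])
stripped = strip_tcp_option(syn_options, TCPOPT_SACK_PERMITTED)
new_sum = internet_checksum(stripped)
print(stripped.hex(), hex(new_sum))
```

Because the option is overwritten rather than removed, the header length
and data offset stay the same and only the checksum has to be updated.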
Best regards,
Jozsef
* Re: Invalid SACK numbers in NAT'ed packets
2008-04-25 9:02 ` Jozsef Kadlecsik
@ 2008-04-25 10:57 ` Jan Engelhardt
2008-04-25 11:12 ` Jozsef Kadlecsik
2008-04-25 11:24 ` Leonid Zeitlin
0 siblings, 2 replies; 9+ messages in thread
From: Jan Engelhardt @ 2008-04-25 10:57 UTC (permalink / raw)
To: Jozsef Kadlecsik; +Cc: Leonid Zeitlin, netfilter, netfilter-devel
On Friday 2008-04-25 11:02, Jozsef Kadlecsik wrote:
>On Fri, 25 Apr 2008, Leonid Zeitlin wrote:
>
>> It appears that short of writing a custom netfilter extension, there's no way
>> to turn off SACKs on a particular connection. Is this right?
>
>Yes, exactly. Actually, writing a new extension to erase any TCP option
>isn't that hard: just replace the option with noop and recalculate the
>checksum.
There is already a TCPOPTSTRIP target.
* Re: Invalid SACK numbers in NAT'ed packets
2008-04-25 10:57 ` Jan Engelhardt
@ 2008-04-25 11:12 ` Jozsef Kadlecsik
2008-04-25 11:24 ` Leonid Zeitlin
1 sibling, 0 replies; 9+ messages in thread
From: Jozsef Kadlecsik @ 2008-04-25 11:12 UTC (permalink / raw)
To: Jan Engelhardt; +Cc: Leonid Zeitlin, netfilter, netfilter-devel
On Fri, 25 Apr 2008, Jan Engelhardt wrote:
> On Friday 2008-04-25 11:02, Jozsef Kadlecsik wrote:
>
> >On Fri, 25 Apr 2008, Leonid Zeitlin wrote:
> >
> >> It appears that short of writing a custom netfilter extension, there's no way
> >> to turn off SACKs on a particular connection. Is this right?
> >
> >Yes, exactly. Actually, writing a new extension to erase any TCP option
> >isn't that hard: just replace the option with noop and recalculate the
> >checksum.
>
> There is already a TCPOPTSTRIP target.
True, it was added in 2.6.25. I should have checked the facts before
posting.
Best regards,
Jozsef
* Re: Invalid SACK numbers in NAT'ed packets
2008-04-25 10:57 ` Jan Engelhardt
2008-04-25 11:12 ` Jozsef Kadlecsik
@ 2008-04-25 11:24 ` Leonid Zeitlin
1 sibling, 0 replies; 9+ messages in thread
From: Leonid Zeitlin @ 2008-04-25 11:24 UTC (permalink / raw)
To: Jan Engelhardt, Jozsef Kadlecsik; +Cc: netfilter, netfilter-devel
If I am reading the news correctly, it is available in the latest 2.6.25
kernel, right?
Thanks,
Leonid