* masquerading failure for at least icmp and tcp+sack on amd64
From: Marc Lehmann @ 2005-09-06 17:29 UTC
To: linux-kernel
Hi!
I recently upgraded a 32 bit machine to a new amd64 board+cpu. I took the
same kernel (2.6.13-rc7) and just recompiled it for 64 bit, plus upgraded
userspace to 64 bit.
Firewall config stayed the same.
Problem: neither ping nor tcp was being masqueraded properly. I created
the following test-set-up:
iptables -t mangle -F
iptables -t filter -F
iptables -t nat -F
iptables -t nat -A POSTROUTING -p all -s 10.0.0.0/8 -d \! 10.0.0.0/8 -j MASQUERADE
i.e. the above masquerade rule should be the only firewall rule, and all
rules should have policy ACCEPT.
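To make explicit which packets the single rule is meant to catch, here is a minimal Python sketch of its match condition (`-s 10.0.0.0/8 -d ! 10.0.0.0/8`). The function name `rule_matches` is only illustrative and not part of iptables; the sketch models address matching only, not the MASQUERADE action itself.

```python
import ipaddress

# The private network named in the rule above.
LAN = ipaddress.ip_network("10.0.0.0/8")


def rule_matches(src: str, dst: str) -> bool:
    """Return True when a packet would match '-s 10.0.0.0/8 -d ! 10.0.0.0/8'.

    Illustrative helper: source must be inside 10.0.0.0/8 and the
    destination must be outside it.
    """
    return (ipaddress.ip_address(src) in LAN
            and ipaddress.ip_address(dst) not in LAN)
```

With the addresses from this report, `rule_matches("10.0.0.1", "129.13.162.95")` is true, while traffic that stays inside 10.0.0.0/8 (for example to 10.0.0.17) is left untranslated.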
The effect was that tcp packets and icmp packets coming from 10.0.0.1 on
interface eth0 were properly masqueraded on the outgoing "inet" interface
(ppp0 renamed):
eth0:
19:17:24.364351 IP 10.0.0.1.44320 > 129.13.162.95.80: S 3745828676:3745828676(0) win 5840 <mss 1460,nop,nop,sackOK>
inet:
19:17:24.364505 IP 84.56.237.68.44320 > 129.13.162.95.80: S 3745828676:3745828676(0) win 5840 <mss 1452,nop,nop,sackOK>
19:17:24.378029 IP 129.13.162.95.80 > 84.56.237.68.44320: S 3777391404:3777391404(0) ack 3745828677 win 5840 <mss 1460,nop,nop,sackOK>
19:17:24.378103 IP 84.56.237.68.44320 > 129.13.162.95.80: R 3745828677:3745828677(0) win 0
However, the reverse packets were rejected. ip_conntrack showed this:
tcp 6 52 SYN_SENT src=10.0.0.1 dst=129.13.162.95 sport=44320 dport=80 [UNREPLIED] src=129.13.162.95 dst=84.56.237.68 sport=80 dport=44320 mark=0 use=1
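As a cross-check of the entry above, the following Python sketch parses the two tuples out of the conntrack line and compares the expected reply tuple with the SYN-ACK seen on the "inet" interface. The names `FlowTuple` and `parse_tuples` are hypothetical helpers, not part of any conntrack tool. The tuples agree, which suggests (as an interpretation, not a confirmed diagnosis) that the reply was discarded for some reason other than a tuple mismatch.

```python
import re
from typing import List, NamedTuple


class FlowTuple(NamedTuple):
    """One direction of a tracked connection (illustrative type)."""
    src: str
    dst: str
    sport: int
    dport: int


def parse_tuples(entry: str) -> List[FlowTuple]:
    """Extract every src/dst/sport/dport group from a conntrack line."""
    pattern = re.compile(r"src=(\S+) dst=(\S+) sport=(\d+) dport=(\d+)")
    return [FlowTuple(s, d, int(sp), int(dp))
            for s, d, sp, dp in pattern.findall(entry)]


# The entry exactly as reported above.
ENTRY = ("tcp 6 52 SYN_SENT src=10.0.0.1 dst=129.13.162.95 sport=44320 "
         "dport=80 [UNREPLIED] src=129.13.162.95 dst=84.56.237.68 sport=80 "
         "dport=44320 mark=0 use=1")

original, reply = parse_tuples(ENTRY)

# Addresses and ports of the SYN-ACK from the tcpdump output on "inet".
synack = FlowTuple("129.13.162.95", "84.56.237.68", 80, 44320)
reply_tuple_matches = (reply == synack)
```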
ICMP echo requests were also masqueraded, but the replies were ignored.
Weird observation 1:
ip route del default
ip route add default via 10.0.0.17
Resulted in working masquerading, this time over device "vpn0", which is
a tuntap interface. Working means that outgoing packets were correctly
rewritten with source 10.0.0.5 (local address of vpn0) and replies were
correctly "un"-translated.
Weird observation 2:
Some sites could be connected to with TCP. It turned out that those
sites did not support TCP SACK. Indeed, turning off SACK either on the
remote side of a connection or on the originator side resulted in working
masquerading:
eth0:
19:23:29.928470 IP 10.0.0.1.45611 > 129.13.162.95.80: S 4113365634:4113365634(0) win 5840 <mss 1460>
19:23:29.942246 IP 129.13.162.95.80 > 10.0.0.1.45611: S 4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
19:23:29.942313 IP 10.0.0.1.45611 > 129.13.162.95.80: . ack 1 win 5840
inet:
19:23:29.928249 IP 84.56.237.68.45611 > 129.13.162.95.80: S 4113365634:4113365634(0) win 5840 <mss 1452>
19:23:29.942199 IP 129.13.162.95.80 > 84.56.237.68.45611: S 4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
19:23:29.942332 IP 84.56.237.68.45611 > 129.13.162.95.80: . ack 1 win 5840
However, ICMP still is not masqueraded.
Kernels that worked:
2.6.13-rc7, 2.6.12.5, 2.6.11 and lower, compiled for x86 with gcc-3.4
Kernels that don't work:
2.6.13-rc7 (compiled with gcc-3.4 and 4.0.2 debian), 2.6.13 (gcc-4.0.2)
Kernel configuration was exactly the same for the 2.6.13-rc7 kernels,
modulo the cpu and architecture selections.
I have a somewhat nontrivial source routing set-up on that machine that I
could document more if that could be a possible reason for that problem. I
am confident that this is not a configuration error, as the configuration
has worked basically unchanged since the 2.4 days, and I am confident it's not
an iptables setup problem either, as I can reproduce it with empty rules
except for the masquerading rule.
I did not mention UDP because I didn't test it, but it's likely that UDP
masquerading also fails.
Any idea what I could look at or try out to find out more about this
problem?
--
The choice of a
-----==- _GNU_
----==-- _ generation Marc Lehmann
---==---(_)__ __ ____ __ pcg@goof.com
--==---/ / _ \/ // /\ \/ / http://schmorp.de/
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE
* Re: masquerading failure for at least icmp and tcp+sack on amd64
From: Marc Lehmann @ 2005-09-07 12:44 UTC
To: linux-kernel
On Tue, Sep 06, 2005 at 07:29:30PM +0200, Marc Lehmann <schmorp@schmorp.de> wrote:
> Weird observation 2:
>
> Some sites could be connected to with TCP. It turned out that those
> sites did not support TCP SACK. Indeed, turning off SACK either on the
> remote side of a connection or on the originator side resulted in working
> masquerading:
Sorry for the follow-up, but this turned out to be slightly untrue: turning off
SACK makes the SYN handshake happen, but some packets further down the stream
the masquerading router sends a RST again.
> Kernels that don't work:
>
> 2.6.13-rc7 (compiled with gcc-3.4 and 4.0.2 debian), 2.6.13 (gcc-4.0.2)
>
I forgot to mention that the kernels that don't work are for amd64. In
the meantime, I also tried out 2.6.11 (as I had some troubles with
2.6.12..2.6.13-rc7 on other amd64 machines), with the same result (reply
packets are ignored/rejected).
* Re: masquerading failure for at least icmp and tcp+sack on amd64
From: Stephen Hemminger @ 2005-09-13 18:09 UTC
To: Patrick McHardy
Cc: Andrew Morton, netdev, Netfilter Development Mailinglist,
Marc Lehmann
On Sun, 11 Sep 2005 16:10:01 +0200
Patrick McHardy <kaber@trash.net> wrote:
> Marc Lehmann wrote:
> > On Fri, Sep 09, 2005 at 01:41:34PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> >
> >>What network driver are you using?
> >
> > Happens with both skge and sk98.
>
> Are you sure the same checksum error happens? sk98_lin never sets
> ip_summed to CHECKSUM_HW, it uses either CHECKSUM_UNNECESSARY for
> HW checksummed packets, in which case the check is skipped in
> ip_conntrack, or it uses CHECKSUM_NONE, in which case the checksum
> in the packet must be invalid if the check fails and the packet
> wouldn't be accepted by the final recipient anyway. skge uses
> CHECKSUM_HW for every packet, so they have nothing in common wrt.
> HW checksumming. This would normally mean we can rule out HW
> checksumming, but for some reason turning it off on skge seems
> to help in your case. Stephen, you're more familiar with the
> sk* drivers than me, anything I'm missing here?
>
Some background: the semantics of ip_summed are different on the
output path than on the input path. On input, CHECKSUM_HW means a
checksum is available in skb->csum; on output, it means the packet is
destined for a device that can do hardware checksumming.
I have gotten reports of receive checksum errors on
some systems, it may be related to certain revisions of
hardware. It would be useful to see the message printed out
by the skge driver that shows chip and revision.
Also, on the input path for TCP and UDP, the code does not
depend on the hardware being correct, and if the checksum
is incorrect, it just prints a warning and does a software
checksum before deciding to drop.
Perhaps netfilter code needs to handle that case?
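The fallback described above can be sketched in Python as follows. This is a simplified model, not kernel code: the `IpSummed` enum mirrors the three states discussed in this thread, `hw_csum_ok` is a hypothetical stand-in for "the sum supplied by the hardware verified", and the checksum is the plain RFC 1071 one's-complement sum over a buffer that is assumed to already include any pseudo-header.

```python
from enum import Enum


class IpSummed(Enum):
    """Illustrative model of the ip_summed states named in this thread."""
    NONE = 0          # no hardware help: verify in software
    HW = 1            # hardware supplied a sum (skb->csum on input)
    UNNECESSARY = 2   # hardware already verified: skip the check


def internet_checksum(data: bytes) -> int:
    """RFC 1071 16-bit one's-complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def accept(segment: bytes, ip_summed: IpSummed, hw_csum_ok: bool = True) -> bool:
    """Decide whether to accept a received segment.

    A segment whose checksum field is correct sums to zero. If the
    hardware-provided sum fails, fall back to a software checksum
    before deciding to drop, instead of trusting the hardware.
    """
    if ip_summed is IpSummed.UNNECESSARY:
        return True
    if ip_summed is IpSummed.HW and hw_csum_ok:
        return True
    # CHECKSUM_NONE, or the hardware sum did not verify: recompute.
    return internet_checksum(segment) == 0


# Build a small valid "segment": 4 payload bytes followed by its checksum.
body = bytes([0x12, 0x34, 0x56, 0x78])
packet = body + internet_checksum(body + b"\x00\x00").to_bytes(2, "big")
corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]
```

In this model, `accept(packet, IpSummed.HW, hw_csum_ok=False)` still accepts the valid packet, which is the behaviour the TCP and UDP input paths are described as having; a check that drops on the hardware result alone would reject it.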
* Re: masquerading failure for at least icmp and tcp+sack on amd64
From: David S. Miller @ 2005-09-13 20:59 UTC
To: shemminger; +Cc: akpm, netdev, netfilter-devel, kaber, schmorp
From: Stephen Hemminger <shemminger@osdl.org>
Date: Tue, 13 Sep 2005 11:09:02 -0700
> Also, on the input path for TCP and UDP, the code does not
> depend on the hardware being correct, and if the checksum
> is incorrect, it just prints a warning and does a software
> checksum before deciding to drop.
> Perhaps netfilter code needs to handle that case?
I personally think netfilter should do so.
* Re: masquerading failure for at least icmp and tcp+sack on amd64
From: Patrick McHardy @ 2005-09-14 1:10 UTC
To: Stephen Hemminger
Cc: Andrew Morton, netdev, Netfilter Development Mailinglist,
Marc Lehmann
Stephen Hemminger wrote:
> Some background: the semantics of ip_summed are different on the
> output path than on the input path. On input, CHECKSUM_HW means a
> checksum is available in skb->csum; on output, it means the packet is
> destined for a device that can do hardware checksumming.
Yep, so far the netfilter code should be correct since my batch
of HW checksum fixes.
> I have gotten reports of receive checksum errors on
> some systems, it may be related to certain revisions of
> hardware. It would be useful to see the message printed out
> by the skge driver that shows chip and revision.
That may be the reason. I've audited most relevant code-paths
for this case and they seem to be mostly OK. There are a couple
of cases in the ppp code I'm not sure about yet, but I couldn't
trigger any errors in my test-setup. Anyway they don't seem to
be related to this problem since it also happens with sk98_lin
which doesn't set CHECKSUM_HW.
> Also, on the input path for TCP and UDP, the code does not
> depend on the hardware being correct, and if the checksum
> is incorrect, it just prints a warning and does a software
> checksum before deciding to drop.
> Perhaps netfilter code needs to handle that case?
Yes, this is not handled so far. But it looks like there is
some other problem because it also happens with sk98_lin.
Thanks for your help Stephen.
* Re: masquerading failure for at least icmp and tcp+sack on amd64
From: Patrick McHardy @ 2005-09-14 1:13 UTC
To: David S. Miller; +Cc: akpm, netdev, netfilter-devel, schmorp, shemminger
David S. Miller wrote:
> From: Stephen Hemminger <shemminger@osdl.org>
> Date: Tue, 13 Sep 2005 11:09:02 -0700
>
>
>>Also, on the input path for TCP and UDP, the code does not
>>depend on the hardware being correct, and if the checksum
>>is incorrect, it just prints a warning and does a software
>>checksum before deciding to drop.
>>Perhaps netfilter code needs to handle that case?
>
> I personally think netfilter should do so.
I agree. One thing I've planned for some silent moment is
to clean up the entire netfilter checksumming code (there's
lots of small duplicated chunks). Probably at least some
of it will also be applicable for the remaining stack.
* Re: masquerading failure for at least icmp and tcp+sack on amd64
From: David S. Miller @ 2005-09-14 3:41 UTC
To: kaber; +Cc: akpm, netdev, netfilter-devel, schmorp, shemminger
From: Patrick McHardy <kaber@trash.net>
Date: Wed, 14 Sep 2005 03:13:39 +0200
> David S. Miller wrote:
> > I personally think netfilter should do so.
>
> I agree. One thing I've planned for some silent moment is
> to clean up the entire netfilter checksumming code (there's
> lots of small duplicated chunks). Probably at least some
> of it will also be applicable for the remaining stack.
There is another thing I thought about today, and that is
to automatically handle this CHECKSUM_HW stuff when the
skb->data area is COW'd via pskb_expand_head() or similar.
I don't know how well that would work, but if it did then
we could consolidate all of this stuff into one spot which
is always nice.
Thread overview: 7+ messages
2005-09-06 17:29 masquerading failure for at least icmp and tcp+sack on amd64 Marc Lehmann
2005-09-07 12:44 ` Marc Lehmann
[not found] <20050907052057.09714a4c.akpm@osdl.org>
2005-09-07 12:39 ` Fw: " Patrick McHardy
2005-09-07 20:59 ` Marc Lehmann
2005-09-07 21:34 ` Patrick McHardy
2005-09-07 21:52 ` Marc Lehmann
2005-09-09 11:41 ` Patrick McHardy
2005-09-11 13:19 ` Marc Lehmann
2005-09-11 14:10 ` Patrick McHardy
2005-09-13 18:09 ` Stephen Hemminger
2005-09-13 20:59 ` David S. Miller
2005-09-14 1:13 ` Patrick McHardy
2005-09-14 3:41 ` David S. Miller
2005-09-14 1:10 ` Patrick McHardy