netdev.vger.kernel.org archive mirror
* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
       [not found] <20050907052057.09714a4c.akpm@osdl.org>
@ 2005-09-07 12:39 ` Patrick McHardy
  2005-09-07 20:59   ` Marc Lehmann
  2005-09-07 21:34   ` Marc Lehmann
  0 siblings, 2 replies; 17+ messages in thread
From: Patrick McHardy @ 2005-09-07 12:39 UTC (permalink / raw)
  To: Marc Lehmann; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

Andrew Morton wrote:
> I recently upgraded a 32 bit machine to a new amd64 board+cpu. I took the
> same kernel (2.6.13-rc7) and just recompiled it for 64 bit, plus upgraded
> userspace to 64 bit.
> 
> Firewall config stayed the same.
> 
> Problem: neither ping nor tcp was being masqueraded properly. I created
> the following test-set-up:
> 
>    iptables -t mangle -F
>    iptables -t filter -F
>    iptables -t nat -F
>    iptables -t nat -A POSTROUTING -p all -s 10.0.0.0/8 -d \! 10.0.0.0/8 -j MASQUERADE
> 
> i.e. the above masquerade rule should be the only firewall rule, and all
> rules should have policy ACCEPT.
> 
> The effect was that tcp packets and icmp packets coming from 10.0.0.1 on
> interface eth0 were properly masqueraded on the outgoing "inet" interface
> (ppp0 renamed):
> 
> eth0:
>    19:17:24.364351 IP 10.0.0.1.44320 > 129.13.162.95.80: S 3745828676:3745828676(0) win 5840 <mss 1460,nop,nop,sackOK>
> 
> inet:
>    19:17:24.364505 IP 84.56.237.68.44320 > 129.13.162.95.80: S 3745828676:3745828676(0) win 5840 <mss 1452,nop,nop,sackOK>
>    19:17:24.378029 IP 129.13.162.95.80 > 84.56.237.68.44320: S 3777391404:3777391404(0) ack 3745828677 win 5840 <mss 1460,nop,nop,sackOK>
>    19:17:24.378103 IP 84.56.237.68.44320 > 129.13.162.95.80: R 3745828677:3745828677(0) win 0
> 
> However, the reverse packets were rejected. ip_conntrack showed this:
> 
>    tcp      6 52 SYN_SENT src=10.0.0.1 dst=129.13.162.95 sport=44320 dport=80 [UNREPLIED] src=129.13.162.95 dst=84.56.237.68 sport=80 dport=44320 mark=0 use=1

It seems ip_conntrack did not like the SYN/ACK and marked it as invalid,
NAT leaves the packet alone and the firewall resets the connection.
Please check whether loading the ipt_LOG module and executing
"echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid"
gives more information.

> Weird observation 2:
> 
> Some sites could be connected to with TCP. It turned out that those
> sites did not support TCP SACK. Indeed, turning off SACK either on the
> remote side of a connection or on the originator side resulted in working
> masquerading:
> 
> eth0:
>    19:23:29.928470 IP 10.0.0.1.45611 > 129.13.162.95.80: S 4113365634:4113365634(0) win 5840 <mss 1460>
>    19:23:29.942246 IP 129.13.162.95.80 > 10.0.0.1.45611: S 4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
>    19:23:29.942313 IP 10.0.0.1.45611 > 129.13.162.95.80: . ack 1 win 5840
> 
> inet:
>    19:23:29.928249 IP 84.56.237.68.45611 > 129.13.162.95.80: S 4113365634:4113365634(0) win 5840 <mss 1452>
>    19:23:29.942199 IP 129.13.162.95.80 > 84.56.237.68.45611: S 4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
>    19:23:29.942332 IP 84.56.237.68.45611 > 129.13.162.95.80: . ack 1 win 5840
> 
> However, ICMP still is not masqueraded.

Please also try this again with logging enabled.

> Kernels that worked:
> 
>    2.6.13-rc7, 2.6.12.5, 2.6.11 and lower, compiled for x86 with gcc-3.4
> 
> Kernels that don't work:
> 
>    2.6.13-rc7 (compiled with gcc-3.4 and 4.0.2 debian), 2.6.13 (gcc-4.02)

Can you retest with 2.6.12.5 on 64bit so we can see if it is a new
problem?

> Kernel configuration was exactly the same for the 2.6.13-rc7 kernels,
> modulo the cpu and architecture selections.
> 
> I have a somewhat nontrivial source routing set-up on that machine that I
> could document more if that could be a possible reason for that problem. I
> am confident that this is not a configuration error, as the configuration
> worked basically unchanged since the 2.4 days, and I am confident it's not
> a iptables setup problem either, as I can reproduce it with empty rules
> except for the masquerading rule.

So far I don't think it's related to routing.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-07 12:39 ` Fw: masquerading failure for at least icmp and tcp+sack on amd64 Patrick McHardy
@ 2005-09-07 20:59   ` Marc Lehmann
  2005-09-07 21:34     ` Patrick McHardy
  2005-09-07 21:34   ` Marc Lehmann
  1 sibling, 1 reply; 17+ messages in thread
From: Marc Lehmann @ 2005-09-07 20:59 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

On Wed, Sep 07, 2005 at 02:39:20PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> Andrew Morton wrote:

Thanks for your response!

> >   tcp      6 52 SYN_SENT src=10.0.0.1 dst=129.13.162.95 sport=44320 
> >   dport=80 [UNREPLIED] src=129.13.162.95 dst=84.56.237.68 sport=80 
> >   dport=44320 mark=0 use=1
> 
> It seems ip_conntrack did not like the SYN/ACK and marked it as invalid,
> NAT leaves the packet alone and the firewall resets the connection.
> Please try if loading the ipt_LOG module and executing
> "echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid"
> gives more information

I think I have the LOG target compiled into the kernel. After the echo, I got
this within a matter of seconds:

   printk: 614 messages suppressed.
   ip_ct_tcp: bad TCP checksum IN= OUT= SRC=xxxxxxxxxxxx DST=84.56.231.206 LEN=105 TOS=0x00 PREC=0x00 TTL=53 ID=33989 DF PROTO=TCP SPT=119 DPT=41349 SEQ=495763142 ACK=177548929 WINDOW=56677 RES=0x00 ACK PSH URGP=0 OPT (0101080A0986EF9D00E16123) 

This is interesting, as the connection in question seems to work fine: I
can download news at 32kb/s, which is the rate limit on the other side,
without much more than 32kb/s of traffic on my ppp link. So it is weird
that this many packets should have an invalid TCP checksum. Maybe this is
somehow related?

I then tried to create a masqueraded connection and got the expected
symptoms: correctly re-written packet leaves interface, return packet gets
RST.

During that time, I got more of the above messages, but none related to the
test connection.

I then stopped all traffic-generating programs to get an idle link and
retried. Still no log messages from the test connection.

> >eth0:
> >   19:23:29.928470 IP 10.0.0.1.45611 > 129.13.162.95.80: S 
> >   4113365634:4113365634(0) win 5840 <mss 1460>
> >   19:23:29.942246 IP 129.13.162.95.80 > 10.0.0.1.45611: S 
> >   4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
> >   19:23:29.942313 IP 10.0.0.1.45611 > 129.13.162.95.80: . ack 1 win 5840
> >
> >inet:
> >   19:23:29.928249 IP 84.56.237.68.45611 > 129.13.162.95.80: S 
> >   4113365634:4113365634(0) win 5840 <mss 1452>
> >   19:23:29.942199 IP 129.13.162.95.80 > 84.56.237.68.45611: S 
> >   4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
> >   19:23:29.942332 IP 84.56.237.68.45611 > 129.13.162.95.80: . ack 1 win 
> >   5840
> >
> >However, ICMP still is not masqueraded.
> 
> Please also try this again with logging enabled.

No messages, either.

(As I wrote in another mail), I also found in the meantime that switching
off SACK only results in a correct handshake, further packets might and
usually will cause a RST.

> >Kernels that don't work:
> >
> >   2.6.13-rc7 (compiled with gcc-3.4 and 4.0.2 debian), 2.6.13 (gcc-4.02)
> 
> Can you retest with 2.6.12.5 on 64bit so we can see if it is a new
> problem?

I hope that trying with 2.6.11, and getting the same problem (as I did in
the meantime), is even better than testing 2.6.12.5.

> So far I don't think it's related to routing.

The weird thing is that it works on tap, but not on ethernet/ppp. Maybe
the kernel code gets some offset wrong?

-- 
                The choice of a
      -----==-     _GNU_
      ----==-- _       generation     Marc Lehmann
      ---==---(_)__  __ ____  __      pcg@goof.com
      --==---/ / _ \/ // /\ \/ /      http://schmorp.de/
      -=====/_/_//_/\_,_/ /_/\_\      XX11-RIPE


* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-07 20:59   ` Marc Lehmann
@ 2005-09-07 21:34     ` Patrick McHardy
  2005-09-07 21:52       ` Marc Lehmann
  0 siblings, 1 reply; 17+ messages in thread
From: Patrick McHardy @ 2005-09-07 21:34 UTC (permalink / raw)
  To: Marc Lehmann; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

Marc Lehmann wrote:
> I think I have the LOG target compiled into the kernel. After the echo, I got
> this within a matter of seconds:
> 
>    printk: 614 messages suppressed.
>    ip_ct_tcp: bad TCP checksum IN= OUT= SRC=xxxxxxxxxxxx DST=84.56.231.206 LEN=105 TOS=0x00 PREC=0x00 TTL=53 ID=33989 DF PROTO=TCP SPT=119 DPT=41349 SEQ=495763142 ACK=177548929 WINDOW=56677 RES=0x00 ACK PSH URGP=0 OPT (0101080A0986EF9D00E16123) 

Interesting .. if this isn't real there is most likely some problem with
HW checksumming in netfilter. What does ethtool -k <dev> show?
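
For reference, a minimal sketch of what a "bad TCP checksum" verdict means at the protocol level, assuming plain IPv4 and the standard ones' complement Internet checksum. The function names are hypothetical and this is not the kernel's code path, only the arithmetic a software check has to reproduce:

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum of data, folded but not inverted."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src: str, dst: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus the TCP segment."""
    pseudo = (socket.inet_aton(src) + socket.inet_aton(dst)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment)))
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF


def tcp_checksum_ok(src: str, dst: str, segment: bytes) -> bool:
    """A received segment verifies when the result, computed with its
    checksum field left in place, comes out as zero."""
    return tcp_checksum(src, dst, segment) == 0
```

If the hardware-provided sum does not cover exactly the bytes that this calculation covers, a perfectly valid packet fails the check, which would fit connections that otherwise work fine.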

> This is interesting, as the connection in question seems to work fine (at
> least I can download news at 32kb/s, which is the rate limit on the other
> side without much more than 32kb/s on my ppp link, so it is weird that
> this many packets should have invalid tcp checksum. Maybe this is somehow
> related?)
> 
> I then tried to create a masqueraded connection and got the expected
> symptoms: correctly re-written packet leaves interface, return packet gets
> RST.
> 
> During that time, I got more of the above messages, but none related to the
> test connection.

Could be because of net_ratelimit() message suppression.

> (As I wrote in another mail), I also found in the meantime that switching
> off SACK only results in a correct handshake, further packets might and
> usually will cause a RST.

I'm not aware of any special handling for SACKs that would make it fail,
especially considering that ICMP also fails, but I'm going to look into
it.

>>>Kernels that don't work:
>>>
>>>  2.6.13-rc7 (compiled with gcc-3.4 and 4.0.2 debian), 2.6.13 (gcc-4.02)
>>
>>Can you retest with 2.6.12.5 on 64bit so we can see if it is a new
>>problem?
> 
> 
> I hope that trying with 2.6.11, and getting the same problem (as I did in
> the meantime), is even better than testing 2.6.12.5.

Thanks, this is even more evidence for HW checksumming problems; these
have existed for a long time.

>>So far I don't think it's related to routing.
> 
> The weird thing is that it works on tap, but not on ethernet/ppp. Maybe
> the kernel code gets some offset wrong?

Another sign pointing to HW checksumming ..


* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-07 12:39 ` Fw: masquerading failure for at least icmp and tcp+sack on amd64 Patrick McHardy
  2005-09-07 20:59   ` Marc Lehmann
@ 2005-09-07 21:34   ` Marc Lehmann
  2005-09-07 21:42     ` Patrick McHardy
  1 sibling, 1 reply; 17+ messages in thread
From: Marc Lehmann @ 2005-09-07 21:34 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

On Wed, Sep 07, 2005 at 02:39:20PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> Please try if loading the ipt_LOG module and executing
> "echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid"
> gives more information

Some more messages I get when logging is enabled:

printk: 1286 messages suppressed.
ip_ct_tcp: invalid state IN= OUT= SRC=84.56.231.206 DST=xxx.xxx.xxx.xxx LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=3260 DF PROTO=TCP SPT=41535 DPT=119 SEQ=3475818900 ACK=1819416201 WINDOW=12198 RES=0x00 ACK URGP=0 OPT (0101080A00F5DE260917B744) UID=0 
printk: 1166 messages suppressed.
ip_ct_tcp: bad TCP checksum IN= OUT= SRC=xxx.xxx.xxx.xxx DST=84.56.231.206 LEN=1492 TOS=0x00 PREC=0x00 TTL=53 ID=6652 DF PROTO=TCP SPT=119 DPT=41550 SEQ=686563106 ACK=3472571721 WINDOW=55741 RES=0x00 ACK URGP=0 OPT (0101080A091782AB00F5E2EC) 
printk: 1114 messages suppressed.
ip_ct_tcp: bad TCP checksum IN= OUT= SRC=xxx.xxx.xxx.xxx DST=84.56.231.206 LEN=52 TOS=0x00 PREC=0x00 TTL=53 ID=45484 DF PROTO=TCP SPT=119 DPT=41550 SEQ=686606959 ACK=3472571737 WINDOW=55725 RES=0x00 ACK URGP=0 OPT (0101080A0917849E00F5E7B4) 
printk: 1214 messages suppressed.
ip_ct_tcp: bad TCP checksum IN= OUT= SRC=xxx.xxx.xxx.xxx DST=84.56.231.206 LEN=1492 TOS=0x00 PREC=0x00 TTL=53 ID=39527 DF PROTO=TCP SPT=119 DPT=41552 SEQ=2432945453 ACK=3473246510 WINDOW=56283 RES=0x00 ACK URGP=0 OPT (0101080A09182B2000F5ECAC) 
printk: 1320 messages suppressed.
ip_ct_tcp: bad TCP checksum IN= OUT= SRC=xxx.xxx.xxx.xxx DST=84.56.231.206 LEN=1492 TOS=0x00 PREC=0x00 TTL=52 ID=4867 DF PROTO=TCP SPT=119 DPT=41561 SEQ=1077509261 ACK=3487524170 WINDOW=56319 RES=0x00 ACK URGP=0 OPT (0101080A0917ABCD00F5F18F) 
printk: 1190 messages suppressed.
ip_ct_tcp: bad TCP checksum IN= OUT= SRC=xxx.xxx.xxx.xxx DST=84.56.231.206 LEN=1492 TOS=0x00 PREC=0x00 TTL=53 ID=7628 DF PROTO=TCP SPT=119 DPT=41538 SEQ=163747835 ACK=3477529327 WINDOW=56170 RES=0x00 ACK URGP=0 OPT (0101080A098F2AB200F5F682) 
printk: 1172 messages suppressed.

The corresponding connections work just fine, though (and I think I get
more than a single message for every physical packet received).
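
To make lines like these easier to compare, here is a small sketch that splits one log line into its reason, its KEY=VALUE fields and the decoded TCP options. It assumes the exact line layout shown above; the helper names are made up for illustration and only the timestamp option is decoded specially:

```python
import re
import struct


def decode_tcp_options(raw: bytes) -> list:
    """Decode the raw TCP option bytes printed in the OPT (...) field."""
    opts = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:  # end of option list
            break
        if kind == 1:  # NOP, a single padding byte
            opts.append(("nop",))
            i += 1
            continue
        if i + 1 >= len(raw):
            break  # truncated option, stop decoding
        length = raw[i + 1]
        if length < 2 or i + length > len(raw):
            break  # malformed length, stop decoding
        body = raw[i + 2:i + length]
        if kind == 8 and length == 10:  # timestamp: TSval, TSecr
            tsval, tsecr = struct.unpack("!II", body)
            opts.append(("timestamp", tsval, tsecr))
        else:
            opts.append((kind, body.hex()))
        i += length
    return opts


def parse_log_line(line: str) -> dict:
    """Split an 'ip_ct_tcp: <reason> IN= ...' line into its parts."""
    reason = re.search(r"ip_ct_tcp: (.*?) IN=", line)
    opt = re.search(r"OPT \(([0-9A-Fa-f]+)\)", line)
    return {
        "reason": reason.group(1) if reason else None,
        "fields": dict(re.findall(r"(\w+)=(\S*)", line)),
        "options": decode_tcp_options(bytes.fromhex(opt.group(1))) if opt else [],
    }
```

Every OPT value quoted above starts with 0101080A, i.e. two NOPs followed by a timestamp option, so the options themselves look unremarkable.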

-- 
                The choice of a
      -----==-     _GNU_
      ----==-- _       generation     Marc Lehmann
      ---==---(_)__  __ ____  __      pcg@goof.com
      --==---/ / _ \/ // /\ \/ /      http://schmorp.de/
      -=====/_/_//_/\_,_/ /_/\_\      XX11-RIPE


* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-07 21:34   ` Marc Lehmann
@ 2005-09-07 21:42     ` Patrick McHardy
  2005-09-07 21:54       ` Marc Lehmann
  0 siblings, 1 reply; 17+ messages in thread
From: Patrick McHardy @ 2005-09-07 21:42 UTC (permalink / raw)
  To: Marc Lehmann; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

[-- Attachment #1: Type: text/plain, Size: 917 bytes --]

Marc Lehmann wrote:
> On Wed, Sep 07, 2005 at 02:39:20PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> 
>>Please try if loading the ipt_LOG module and executing
>>"echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid"
>>gives more information
> 
> Some more messages I get when logging is enabled:
> 
> printk: 1286 messages suppressed.
> ip_ct_tcp: invalid state IN= OUT= SRC=84.56.231.206 DST=xxx.xxx.xxx.xxx LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=3260 DF PROTO=TCP SPT=41535 DPT=119 SEQ=3475818900 ACK=1819416201 WINDOW=12198 RES=0x00 ACK URGP=0 OPT (0101080A00F5DE260917B744) UID=0 
> printk: 1166 messages suppressed.

This doesn't tell us much since we don't know what happened to the
connection before. You can disable message suppression by echoing
0 to /proc/sys/net/core/message_cost. I had a beer too much for
serious debugging, but just to give some hints: does this patch make
the problem go away?


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 443 bytes --]

diff --git a/net/ipv4/netfilter/ip_conntrack_proto_tcp.c b/net/ipv4/netfilter/ip_conntrack_proto_tcp.c
--- a/net/ipv4/netfilter/ip_conntrack_proto_tcp.c
+++ b/net/ipv4/netfilter/ip_conntrack_proto_tcp.c
@@ -843,7 +843,6 @@ static int tcp_error(struct sk_buff *skb
 		if (LOG_INVALID(IPPROTO_TCP))
 			nf_log_packet(PF_INET, 0, skb, NULL, NULL, NULL,
 				  "ip_ct_tcp: bad TCP checksum ");
-		return -NF_ACCEPT;
 	}
 
 	/* Check TCP flags. */


* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-07 21:34     ` Patrick McHardy
@ 2005-09-07 21:52       ` Marc Lehmann
  2005-09-09 11:41         ` Patrick McHardy
  0 siblings, 1 reply; 17+ messages in thread
From: Marc Lehmann @ 2005-09-07 21:52 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

On Wed, Sep 07, 2005 at 11:34:10PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> > I think I have the LOG target compiled into the kernel. After the echo, I got
> > this within a matter of seconds:
> > 
> >    printk: 614 messages suppressed.
> >    ip_ct_tcp: bad TCP checksum IN= OUT= SRC=xxxxxxxxxxxx DST=84.56.231.206 LEN=105 TOS=0x00 PREC=0x00 TTL=53 ID=33989 DF PROTO=TCP SPT=119 DPT=41349 SEQ=495763142 ACK=177548929 WINDOW=56677 RES=0x00 ACK PSH URGP=0 OPT (0101080A0986EF9D00E16123) 
> 
> Interesting .. if this isn't real there is most likely some problem with
> HW checksumming in netfilter. What does ethtool -k <dev> show?

It happens both with ppp(oe) as well as with ethernet, but the above messages
originate on the ppp interface.

In any case, the ethtool output for eth1 (which is used by the pppoe
connection):

   Offload parameters for eth1:
   Cannot get device tcp segmentation offload settings: Operation not supported
   rx-checksumming: on
   tx-checksumming: on
   scatter-gather: on
   tcp segmentation offload: off

> > The weird thing is that it works on tap, but not on ethernet/ppp. Maybe
> > the kernel code gets some offset wrong?
> 
> Another sign pointing to HW checksumming ..

It's also a 64-bit-only problem. To verify, I tried this:

ethtool -K eth1 rx off tx off sg off

Where eth1 is the interface where pppoe runs over.

ethtool -k eth1 then displayed:

   rx-checksumming: off
   tx-checksumming: off
   scatter-gather: off
   tcp segmentation offload: off

And ICMP, TCP etc. starts working again.
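
For comparing the before/after state across machines, a rough sketch that turns the `ethtool -k` text shown above into a dictionary. It assumes only the simple "name: on/off" lines of this output format; the function name is hypothetical, and lines with other values (such as the "Operation not supported" notice) are skipped:

```python
def parse_offload_settings(text: str) -> dict:
    """Map each 'name: on|off' line of ethtool -k output to a bool."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(": ")
        if sep and value in ("on", "off"):
            settings[name] = (value == "on")
    return settings
```

The text could come from something like `subprocess.run(["ethtool", "-k", "eth1"], capture_output=True, text=True).stdout`, and two such dictionaries can then be compared directly.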

Thanks for the analysis and the hint, I guess that verifies that it's hw
checksumming. (Weird that hw checksumming on the underlying device somehow
changes the ppp packets, but nevertheless).

If I can be of further assistance in tracking down this problem, please
rely on me. I can easily live with hw checksumming switched off on that
interface, though.

-- 
                The choice of a
      -----==-     _GNU_
      ----==-- _       generation     Marc Lehmann
      ---==---(_)__  __ ____  __      pcg@goof.com
      --==---/ / _ \/ // /\ \/ /      http://schmorp.de/
      -=====/_/_//_/\_,_/ /_/\_\      XX11-RIPE


* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-07 21:42     ` Patrick McHardy
@ 2005-09-07 21:54       ` Marc Lehmann
  2005-09-07 22:08         ` Patrick McHardy
  0 siblings, 1 reply; 17+ messages in thread
From: Marc Lehmann @ 2005-09-07 21:54 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

On Wed, Sep 07, 2005 at 11:42:01PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> > Some more messages I get when logging is enabled:
> > 
> > printk: 1286 messages suppressed.
> > ip_ct_tcp: invalid state IN= OUT= SRC=84.56.231.206 DST=xxx.xxx.xxx.xxx LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=3260 DF PROTO=TCP SPT=41535 DPT=119 SEQ=3475818900 ACK=1819416201 WINDOW=12198 RES=0x00 ACK URGP=0 OPT (0101080A00F5DE260917B744) UID=0 
> > printk: 1166 messages suppressed.
> 
> This doesn't tell much since we don't know what happened to the
> connection before. You can disable message suppression by echoing
> 0 to /proc/sys/net/core/message_cost. I had a beer too much for
> serious debugging,
   
Well, as long as you can think of simple patches, that's just fine :)

> just to give some hints, does this patch make the problem go away?

I guess now that hw checksumming is identified as the direct cause, this is
no longer necessary? If yes, just tell me and I'll try it out.

-- 
                The choice of a
      -----==-     _GNU_
      ----==-- _       generation     Marc Lehmann
      ---==---(_)__  __ ____  __      pcg@goof.com
      --==---/ / _ \/ // /\ \/ /      http://schmorp.de/
      -=====/_/_//_/\_,_/ /_/\_\      XX11-RIPE


* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-07 21:54       ` Marc Lehmann
@ 2005-09-07 22:08         ` Patrick McHardy
  0 siblings, 0 replies; 17+ messages in thread
From: Patrick McHardy @ 2005-09-07 22:08 UTC (permalink / raw)
  To: Marc Lehmann; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

Marc Lehmann wrote:
> On Wed, Sep 07, 2005 at 11:42:01PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> 
>>just to give some hints, does this patch make the problem go away?
> 
> I guess now that hw checksumming is identified as the direct cause, this is
> no longer necessary? If yes, just tell me and I'll try it out.

No, don't bother. I'm going to try to figure out what's wrong, your
information so far should be enough, thanks. I'll let you know if I find
something worth trying.


* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-07 21:52       ` Marc Lehmann
@ 2005-09-09 11:41         ` Patrick McHardy
  2005-09-11 13:19           ` Marc Lehmann
  0 siblings, 1 reply; 17+ messages in thread
From: Patrick McHardy @ 2005-09-09 11:41 UTC (permalink / raw)
  To: Marc Lehmann; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

Marc Lehmann wrote:
> It's also a 64-bit-only problem. To verify, I tried this:
> 
> ethtool -K eth1 rx off tx off sg off
> 
> Where eth1 is the interface where pppoe runs over.
> 
> ethtool -k eth1 then displayed:
> 
>    rx-checksumming: off
>    tx-checksumming: off
>    scatter-gather: off
>    tcp segmentation offload: off
> 
> And ICMP, TCP etc. starts working again.
> 
> Thanks for the analysis and the hint, I guess that verifies that it's hw
> checksumming. (Weird that hw checksumming on the underlying device somehow
> changes the ppp packets, but nevertheless).

I tried reproducing the problem without any luck. It's odd that it's
happening on both eth and ppp devices; if it was just ppp I would
suspect some missing checksum update/invalidation in the ppp driver.
What network driver are you using? Please also send a list of loaded
modules and iptables rules. Thanks.


* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-09 11:41         ` Patrick McHardy
@ 2005-09-11 13:19           ` Marc Lehmann
  2005-09-11 14:10             ` Patrick McHardy
  0 siblings, 1 reply; 17+ messages in thread
From: Marc Lehmann @ 2005-09-11 13:19 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Andrew Morton, netdev, Netfilter Development Mailinglist

On Fri, Sep 09, 2005 at 01:41:34PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> > And ICMP, TCP etc. starts working again.
> > 
> > Thanks for the analysis and the hint, I guess that verifies that it's hw
> > checksumming. (Weird that hw checksumming on the underlying device somehow
> > changes the ppp packets, but nevertheless).
> 
> I tried reproducing the problem without any luck. It's odd that it's
> happening on both eth and ppp devices; if it was just ppp I would
> suspect some missing checksum update/invalidation in the ppp driver.

> What network driver are you using?

Happens with both skge and sk98.

> Please also send a list of loaded
> modules and iptables rules. Thanks.

   sch_htb                16448  2 
   sch_ingress             4420  2 
   nvidia               4378188  12 
   mga                    60096  0 
   drm                    76136  1 mga
   agpgart                29864  2 nvidia,drm
   sch_sfq                 5568  4 
   tun                     9664  1 
   powernow_k8             9616  0 
   processor              20432  1 powernow_k8
   sco                    12552  2 
   ztdummy                 3360  0 
   zaptel                197760  5 ztdummy
   crc_ccitt               2112  1 zaptel
   lirc_i2c               10180  1 
   lirc_dev               14336  1 lirc_i2c
   budget                 10240  0 
   s5h1420                 8900  1 budget
   l64781                  7236  1 budget
   ves1820                 5892  1 budget
   budget_core             8516  1 budget
   saa7146                15624  2 budget,budget_core
   ttpci_eeprom            2432  1 budget_core
   stv0299                11272  1 budget
   tda8083                 5764  1 budget
   ves1x93                 6404  1 budget
   dvb_core               82812  2 budget,budget_core
   parport_pc             39472  1 
   lp                     10880  0 
   parport                37964  2 parport_pc,lp
   tuner                  24096  0 
   ivtv                  202388  2 
   i2c_algo_bit            9032  1 ivtv
   videodev                9792  1 ivtv
   saa7115                13200  0 
   saa7127                11668  0 
   msp3400                27980  0 
   tveeprom               14560  0 
   v4l1_compat            12804  0 
   loop                   57688  4 
   w83627hf               32616  0 
   i2c_sensor              3136  1 w83627hf
   i2c_isa                 2624  0 
   i2c_core               19800  19 lirc_i2c,budget,s5h1420,l64781,ves1820,budget_core,ttpci_eeprom,stv0299,tda8083,ves1x93,tuner,i2c_algo_bit,saa7115,saa7127,msp3400,tveeprom,w83627hf,i2c_sensor,i2c_isa
   snd_seq_oss            33316  0 
   snd_seq_midi            7616  0 
   snd_seq_midi_event      7488  2 snd_seq_oss,snd_seq_midi
   snd_seq                55128  5 snd_seq_oss,snd_seq_midi,snd_seq_midi_event
   snd_via82xx            25280  2 
   snd_ac97_codec         88900  1 snd_via82xx
   snd_mpu401_uart         7040  1 snd_via82xx
   snd_rawmidi            24224  2 snd_seq_midi,snd_mpu401_uart
   snd_seq_device          7824  4 snd_seq_oss,snd_seq_midi,snd_seq,snd_rawmidi
   fcusb2                683148  3 
   capidrv                30168  1 
   isdn                  109160  2 capidrv
   capi                   16112  8 
   kernelcapi             48736  3 fcusb2,capidrv,capi
   rfcomm                 35952  8 
   l2cap                  24520  7 rfcomm
   hci_usb                14728  3 
   bluetooth              46404  8 sco,rfcomm,l2cap,hci_usb
   cls_u32                 7624  8 
   skge                   34256  0 
   ipt_hashlimit           8024  1 
   ipv6                  254848  22 
   ehci_hcd               31816  0 
   uhci_hcd               31968  0 
   capifs                  4880  2 capi
   nfsd                  107656  17 
   lockd                  64912  2 nfsd
   nfs_acl                 3136  1 nfsd
   sunrpc                142632  13 nfsd,lockd,nfs_acl

However, it happens with init=/bin/bash and loading the single iptables
rule I sent in the original report, for either eth0/1 or the ppp
interface. I can send you my 278 other rules if you want, but they have
had no effect on the problem here.

-- 
                The choice of a
      -----==-     _GNU_
      ----==-- _       generation     Marc Lehmann
      ---==---(_)__  __ ____  __      pcg@goof.com
      --==---/ / _ \/ // /\ \/ /      http://schmorp.de/
      -=====/_/_//_/\_,_/ /_/\_\      XX11-RIPE


* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-11 13:19           ` Marc Lehmann
@ 2005-09-11 14:10             ` Patrick McHardy
  2005-09-13 18:09               ` Stephen Hemminger
  2005-09-14 19:09               ` Fw: " Marc Lehmann
  0 siblings, 2 replies; 17+ messages in thread
From: Patrick McHardy @ 2005-09-11 14:10 UTC (permalink / raw)
  To: Marc Lehmann
  Cc: Andrew Morton, netdev, Netfilter Development Mailinglist,
	Stephen Hemminger

Marc Lehmann wrote:
> On Fri, Sep 09, 2005 at 01:41:34PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> 
>>What network driver are you using?
> 
> Happens with both skge and sk98.

Are you sure the same checksum error happens? sk98_lin never sets
ip_summed to CHECKSUM_HW, it uses either CHECKSUM_UNNECESSARY for
HW checksummed packets, in which case the check is skipped in
ip_conntrack, or it uses CHECKSUM_NONE, in which case the checksum
in the packet must be invalid if the check fails and the packet
wouldn't be accepted by the final recipient anyway. skge uses
CHECKSUM_HW for every packet, so they have nothing in common wrt.
HW checksumming. This would normally mean we can rule out HW
checksumming, but for some reason turning it off on skge seems
to help in your case. Stephen, you're more familiar with the
sk* drivers than me, anything I'm missing here?
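
To make the three cases concrete, here is a simplified model of the receive-side decision described above. It is a sketch, not the ip_conntrack code: the constants are stand-in strings, the function names are hypothetical, and the last case (a hardware sum that also covers bytes outside the TCP segment) is an assumption for illustration rather than a diagnosis:

```python
import socket
import struct

# Stand-ins for the kernel's ip_summed values (not the real constants).
CHECKSUM_NONE, CHECKSUM_HW, CHECKSUM_UNNECESSARY = "none", "hw", "unnecessary"


def fold(total: int) -> int:
    """Fold carries of a ones' complement sum into 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum(data: bytes) -> int:
    """Folded 16-bit ones' complement sum over data."""
    if len(data) % 2:
        data += b"\x00"
    return fold(sum(struct.unpack("!%dH" % (len(data) // 2), data)))


def pseudo_header(src: str, dst: str, length: int) -> bytes:
    """IPv4 pseudo-header that is summed together with the TCP segment."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!BBH", 0, socket.IPPROTO_TCP, length))


def conntrack_checksum_ok(ip_summed, pseudo, segment, hw_csum=0):
    """Model of the check: trust, reuse the device's sum, or recompute."""
    if ip_summed == CHECKSUM_UNNECESSARY:
        return True  # device already verified the packet, check skipped
    if ip_summed == CHECKSUM_HW:
        partial = hw_csum  # sum supplied by the device (skb->csum)
    else:
        partial = csum(segment)  # CHECKSUM_NONE: sum in software
    return fold(partial + csum(pseudo)) == 0xFFFF
```

In this model a valid segment passes in all three modes as long as `hw_csum` covers exactly the TCP segment; if the supplied sum covers anything else, the same valid packet is reported as having a bad checksum.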

>>Please also send a list of loaded
>>modules and iptables rules. Thanks.
> 
> 
>    sch_htb                16448  2 
>    sch_ingress             4420  2 
>    nvidia               4378188  12 
>    mga                    60096  0 
>    drm                    76136  1 mga
>    agpgart                29864  2 nvidia,drm
>    sch_sfq                 5568  4 
>    tun                     9664  1 
>    powernow_k8             9616  0 
>    processor              20432  1 powernow_k8
>    sco                    12552  2 
>    ztdummy                 3360  0 
>    zaptel                197760  5 ztdummy
>    crc_ccitt               2112  1 zaptel
>    lirc_i2c               10180  1 
>    lirc_dev               14336  1 lirc_i2c
>    budget                 10240  0 
>    s5h1420                 8900  1 budget
>    l64781                  7236  1 budget
>    ves1820                 5892  1 budget
>    budget_core             8516  1 budget
>    saa7146                15624  2 budget,budget_core
>    ttpci_eeprom            2432  1 budget_core
>    stv0299                11272  1 budget
>    tda8083                 5764  1 budget
>    ves1x93                 6404  1 budget
>    dvb_core               82812  2 budget,budget_core
>    parport_pc             39472  1 
>    lp                     10880  0 
>    parport                37964  2 parport_pc,lp
>    tuner                  24096  0 
>    ivtv                  202388  2 
>    i2c_algo_bit            9032  1 ivtv
>    videodev                9792  1 ivtv
>    saa7115                13200  0 
>    saa7127                11668  0 
>    msp3400                27980  0 
>    tveeprom               14560  0 
>    v4l1_compat            12804  0 
>    loop                   57688  4 
>    w83627hf               32616  0 
>    i2c_sensor              3136  1 w83627hf
>    i2c_isa                 2624  0 
>    i2c_core               19800  19 lirc_i2c,budget,s5h1420,l64781,ves1820,budget_core,ttpci_eeprom,stv0299,tda8083,ves1x93,tuner,i2c_algo_bit,saa7115,saa7127,msp3400,tveeprom,w83627hf,i2c_sensor,i2c_isa
>    snd_seq_oss            33316  0 
>    snd_seq_midi            7616  0 
>    snd_seq_midi_event      7488  2 snd_seq_oss,snd_seq_midi
>    snd_seq                55128  5 snd_seq_oss,snd_seq_midi,snd_seq_midi_event
>    snd_via82xx            25280  2 
>    snd_ac97_codec         88900  1 snd_via82xx
>    snd_mpu401_uart         7040  1 snd_via82xx
>    snd_rawmidi            24224  2 snd_seq_midi,snd_mpu401_uart
>    snd_seq_device          7824  4 snd_seq_oss,snd_seq_midi,snd_seq,snd_rawmidi
>    fcusb2                683148  3 
>    capidrv                30168  1 
>    isdn                  109160  2 capidrv
>    capi                   16112  8 
>    kernelcapi             48736  3 fcusb2,capidrv,capi
>    rfcomm                 35952  8 
>    l2cap                  24520  7 rfcomm
>    hci_usb                14728  3 
>    bluetooth              46404  8 sco,rfcomm,l2cap,hci_usb
>    cls_u32                 7624  8 
>    skge                   34256  0 
>    ipt_hashlimit           8024  1 
>    ipv6                  254848  22 
>    ehci_hcd               31816  0 
>    uhci_hcd               31968  0 
>    capifs                  4880  2 capi
>    nfsd                  107656  17 
>    lockd                  64912  2 nfsd
>    nfs_acl                 3136  1 nfsd
>    sunrpc                142632  13 nfsd,lockd,nfs_acl
> 
> However, it happens with init=/bin/bash and loading the single iptables
> rule I sent in the original report, for either eth0/1 or the ppp
> interface. I can send you my 278 other rules if you want, but they have
> had no effect on the problem here.

No, thanks. But your entire .config might help ..

* Re: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-11 14:10             ` Patrick McHardy
@ 2005-09-13 18:09               ` Stephen Hemminger
  2005-09-13 20:59                 ` David S. Miller
  2005-09-14  1:10                 ` Patrick McHardy
  2005-09-14 19:09               ` Fw: " Marc Lehmann
  1 sibling, 2 replies; 17+ messages in thread
From: Stephen Hemminger @ 2005-09-13 18:09 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Andrew Morton, netdev, Netfilter Development Mailinglist,
	Marc Lehmann

On Sun, 11 Sep 2005 16:10:01 +0200
Patrick McHardy <kaber@trash.net> wrote:

> Marc Lehmann wrote:
> > On Fri, Sep 09, 2005 at 01:41:34PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> > 
> >>What network driver are you using?
> > 
> > Happens with both skge and sk98.
> 
> Are you sure the same checksum error happens? sk98_lin never sets
> ip_summed to CHECKSUM_HW, it uses either CHECKSUM_UNNECESSARY for
> HW checksummed packets, in which case the check is skipped in
> ip_conntrack, or it uses CHECKSUM_NONE, in which case the checksum
> in the packet must be invalid if the check fails and the packet
> wouldn't be accepted by the final recipient anyway. skge uses
> CHECKSUM_HW for every packet, so they have nothing in common wrt.
> HW checksumming. This would normally mean we can rule out HW
> checksumming, but for some reason turning it off on skge seems
> to help in your case. Stephen, you're more familiar with the
> sk* drivers than me, anything I'm missing here?
> 

Some background: the meaning of ip_summed differs between the
input and output paths. On input, CHECKSUM_HW means a checksum is
available in skb->csum; on output, it means the packet is
destined for a device that can do hardware checksumming.
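
As a rough illustration of the receive-side states discussed in this
thread, the sketch below models what a checker such as ip_conntrack has
to do for each ip_summed value. It is a userspace model only: the enum
and function names are hypothetical stand-ins, not the kernel's
definitions, and the mapping simply follows the description quoted
above (UNNECESSARY skips the check, HW uses the device-supplied sum,
NONE falls back to software).

```c
/* Hypothetical userspace model; these names are illustrative stand-ins
 * for CHECKSUM_NONE / CHECKSUM_HW / CHECKSUM_UNNECESSARY, not the
 * kernel's own definitions. */
enum rx_csum_state {
    RX_CSUM_NONE,        /* device gave no help: checksum in software */
    RX_CSUM_HW,          /* device left a sum over the packet (skb->csum) */
    RX_CSUM_UNNECESSARY  /* device already verified the checksum */
};

enum rx_csum_action {
    RX_ACTION_SKIP,        /* trust the device, do not check again */
    RX_ACTION_USE_HW_SUM,  /* validate using the device-supplied sum */
    RX_ACTION_SOFTWARE     /* compute the checksum over the data */
};

/* Map the receive-side state to the work the checker has to do. */
enum rx_csum_action rx_csum_action_for(enum rx_csum_state state)
{
    switch (state) {
    case RX_CSUM_UNNECESSARY:
        return RX_ACTION_SKIP;
    case RX_CSUM_HW:
        return RX_ACTION_USE_HW_SUM;
    case RX_CSUM_NONE:
    default:
        return RX_ACTION_SOFTWARE;
    }
}
```

This covers the input path only; on output the same field instead tells
the stack whether the device will fill in the checksum.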

I have gotten reports of receive checksum errors on
some systems; they may be related to certain hardware
revisions. It would be useful to see the message printed
by the skge driver that shows the chip and revision.

Also, on the input path for TCP and UDP, the code does not
depend on the hardware being correct, and if the checksum
is incorrect, it just prints a warning and does a software
checksum before deciding to drop.
Perhaps netfilter code needs to handle that case?
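
The fallback described above can be sketched in plain C. This is a
minimal userspace model, not the kernel's TCP/UDP input code: the
function names, the hw_was_wrong flag (standing in for the printed
warning), and the assumption that the buffer already contains
everything the checksum covers (pseudo-header handling is left out)
are all hypothetical. It relies only on the standard property of the
Internet checksum that a valid buffer sums to 0xffff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of the buffer taken as big-endian 16-bit words,
 * folded to 16 bits. Intended for packet-sized buffers (the 32-bit
 * accumulator is not guarded against overflow on very large inputs). */
static uint16_t ones_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)                      /* odd trailing byte, zero-padded */
        sum += (uint32_t)data[len - 1] << 8;
    while (sum >> 16)                 /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Returns true if the packet should be accepted. A hardware-supplied
 * sum is trusted only when it says "valid"; otherwise the checksum is
 * recomputed in software before deciding to drop. *hw_was_wrong is set
 * when software disagrees with the hardware verdict. */
bool rx_checksum_ok(const uint8_t *data, size_t len,
                    bool have_hw_sum, uint16_t hw_sum,
                    bool *hw_was_wrong)
{
    if (hw_was_wrong)
        *hw_was_wrong = false;

    if (have_hw_sum && hw_sum == 0xffff)
        return true;                  /* hardware says the packet is fine */

    if (ones_sum(data, len) == 0xffff) {
        if (have_hw_sum && hw_was_wrong)
            *hw_was_wrong = true;     /* hardware verdict was bogus */
        return true;
    }
    return false;                     /* bad in software too: drop */
}
```

With this shape, a device that reports a wrong sum costs one extra
software pass per packet instead of causing valid packets to be dropped.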

* Re: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-13 18:09               ` Stephen Hemminger
@ 2005-09-13 20:59                 ` David S. Miller
  2005-09-14  1:13                   ` Patrick McHardy
  2005-09-14  1:10                 ` Patrick McHardy
  1 sibling, 1 reply; 17+ messages in thread
From: David S. Miller @ 2005-09-13 20:59 UTC (permalink / raw)
  To: shemminger; +Cc: akpm, netdev, netfilter-devel, kaber, schmorp

From: Stephen Hemminger <shemminger@osdl.org>
Date: Tue, 13 Sep 2005 11:09:02 -0700

> Also, on the input path for TCP and UDP, the code does not
> depend on the hardware being correct, and if the checksum
> is incorrect, it just prints a warning and does a software
> checksum before deciding to drop.
> Perhaps netfilter code needs to handle that case?

I personally think netfilter should do so.

* Re: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-13 18:09               ` Stephen Hemminger
  2005-09-13 20:59                 ` David S. Miller
@ 2005-09-14  1:10                 ` Patrick McHardy
  1 sibling, 0 replies; 17+ messages in thread
From: Patrick McHardy @ 2005-09-14  1:10 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Andrew Morton, netdev, Netfilter Development Mailinglist,
	Marc Lehmann

Stephen Hemminger wrote:
> Some background, the semantic of ip_summed is different on the
> output than the input path. On input, it means a checksum is
> available in skb->csum; and on output it means the packet is
> destined for a device that can do hardware checksumming.

Yep, so far the netfilter code should be correct since my batch
of HW checksum fixes.

> I have gotten reports of receive checksum errors on
> some systems, it may be related to certain revisions of
> hardware. It would be useful to see the message printed out
> by the skge driver that shows chip and revision.

That may be the reason. I've audited most relevant code-paths
for this case and they seem to be mostly OK. There are a couple
of cases in the ppp code I'm not sure about yet, but I couldn't
trigger any errors in my test-setup. Anyway they don't seem to
be related to this problem since it also happens with sk98_lin
which doesn't set CHECKSUM_HW.

> Also, on the input path for TCP and UDP, the code does not
> depend on the hardware being correct, and if the checksum
> is incorrect, it just prints a warning and does a software
> checksum before deciding to drop.
> Perhaps netfilter code needs to handle that case?

Yes, this is not handled so far. But it looks like there is
some other problem because it also happens with sk98_lin.

Thanks for your help Stephen.

* Re: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-13 20:59                 ` David S. Miller
@ 2005-09-14  1:13                   ` Patrick McHardy
  2005-09-14  3:41                     ` David S. Miller
  0 siblings, 1 reply; 17+ messages in thread
From: Patrick McHardy @ 2005-09-14  1:13 UTC (permalink / raw)
  To: David S. Miller; +Cc: akpm, netdev, netfilter-devel, schmorp, shemminger

David S. Miller wrote:
> From: Stephen Hemminger <shemminger@osdl.org>
> Date: Tue, 13 Sep 2005 11:09:02 -0700
> 
> 
>>Also, on the input path for TCP and UDP, the code does not
>>depend on the hardware being correct, and if the checksum
>>is incorrect, it just prints a warning and does a software
>>checksum before deciding to drop.
>>Perhaps netfilter code needs to handle that case?
> 
> I personally think netfilter should do so.

I agree. One thing I've planned for a quiet moment is
to clean up the entire netfilter checksumming code (there are
lots of small duplicated chunks). Probably at least some
of it will also be applicable to the rest of the stack.

* Re: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-14  1:13                   ` Patrick McHardy
@ 2005-09-14  3:41                     ` David S. Miller
  0 siblings, 0 replies; 17+ messages in thread
From: David S. Miller @ 2005-09-14  3:41 UTC (permalink / raw)
  To: kaber; +Cc: akpm, netdev, netfilter-devel, schmorp, shemminger

From: Patrick McHardy <kaber@trash.net>
Date: Wed, 14 Sep 2005 03:13:39 +0200

> David S. Miller wrote:
> > I personally think netfilter should do so.
> 
> I agree. One thing I've planned for some silent moment is
> to clean up the entire netfilter checksumming code (there's
> lots of small duplicated chunks). Probably at least some
> of it will also be applicable for the remaining stack.

There is another thing I thought about today, and that is
to automatically handle this CHECKSUM_HW stuff when the
skb->data area is COW'd via pskb_expand_head() or similar.

I don't know how well that would work, but if it did then
we could consolidate all of this stuff into one spot which
is always nice.

* Re: Fw: masquerading failure for at least icmp and tcp+sack on amd64
  2005-09-11 14:10             ` Patrick McHardy
  2005-09-13 18:09               ` Stephen Hemminger
@ 2005-09-14 19:09               ` Marc Lehmann
  1 sibling, 0 replies; 17+ messages in thread
From: Marc Lehmann @ 2005-09-14 19:09 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Andrew Morton, netdev, Netfilter Development Mailinglist,
	Stephen Hemminger

On Sun, Sep 11, 2005 at 04:10:01PM +0200, Patrick McHardy <kaber@trash.net> wrote:
> >>What network driver are you using?
> >
> >Happens with both skge and sk98.
> 
> Are you sure the same checksum error happens?

No, of course not.

I have two similar network cards in that machine: a normal sk98 PCI card
and an onboard one that skge calls Yukon lite and the sk98lin driver calls
9521. The onboard one does not work with sk98lin, though.

What I did test was with skge and the onboard card, and sk98_lin with
the pci card, and in both cases I couldn't get masquerading to work, but
everything else worked fine.

> ip_summed to CHECKSUM_HW, it uses either CHECKSUM_UNNECESSARY for
> HW checksummed packets, in which case the check is skipped in
> ip_conntrack, or it uses CHECKSUM_NONE, in which case the checksum
> in the packet must be invalid if the check fails and the packet
> wouldn't be accepted by the final recipient anyway. skge uses
> CHECKSUM_HW for every packet, so they have nothing in common wrt.
> HW checksumming. This would normally mean we can rule out HW
> checksumming, but for some reason turning it off on skge seems
> to help in your case.

... and I have not tried turning it off with sk98lin (and would like to
avoid it, as it requires reconfiguration to test). If any additional tests
are required, I will do them, though.

> >   nfs_acl                 3136  1 nfsd
> >   sunrpc                142632  13 nfsd,lockd,nfs_acl
> >
> >However, it happens with init=/bin/bash and loading the single iptables
> >rule I sent in the original report, for either eth0/1 or the ppp
> >interface. I can send you my 278 other rules if you want, but they have
> >had no effect on the problem here.
> 
> No, thanks. But your entire .config might help ..

Sorry for the delay :( It's here:

http://data.plan9.de/config.doom

This is the currently running 2.6.13 config; when testing 2.6.11, I just
did "make oldconfig" on this one.

-- 
                The choice of a
      -----==-     _GNU_
      ----==-- _       generation     Marc Lehmann
      ---==---(_)__  __ ____  __      pcg@goof.com
      --==---/ / _ \/ // /\ \/ /      http://schmorp.de/
      -=====/_/_//_/\_,_/ /_/\_\      XX11-RIPE

Thread overview: 17+ messages
     [not found] <20050907052057.09714a4c.akpm@osdl.org>
2005-09-07 12:39 ` Fw: masquerading failure for at least icmp and tcp+sack on amd64 Patrick McHardy
2005-09-07 20:59   ` Marc Lehmann
2005-09-07 21:34     ` Patrick McHardy
2005-09-07 21:52       ` Marc Lehmann
2005-09-09 11:41         ` Patrick McHardy
2005-09-11 13:19           ` Marc Lehmann
2005-09-11 14:10             ` Patrick McHardy
2005-09-13 18:09               ` Stephen Hemminger
2005-09-13 20:59                 ` David S. Miller
2005-09-14  1:13                   ` Patrick McHardy
2005-09-14  3:41                     ` David S. Miller
2005-09-14  1:10                 ` Patrick McHardy
2005-09-14 19:09               ` Fw: " Marc Lehmann
2005-09-07 21:34   ` Marc Lehmann
2005-09-07 21:42     ` Patrick McHardy
2005-09-07 21:54       ` Marc Lehmann
2005-09-07 22:08         ` Patrick McHardy
