* Fw: [Bug 79891] New: Router causes TCP retransmits for windows hosts after "ip_forward: fix inverted local_df test"
From: Stephen Hemminger @ 2014-07-10 13:44 UTC
To: netdev

Begin forwarded message:

Date: Thu, 10 Jul 2014 06:20:14 -0700
From: "bugzilla-daemon@bugzilla.kernel.org" <bugzilla-daemon@bugzilla.kernel.org>
To: "stephen@networkplumber.org" <stephen@networkplumber.org>
Subject: [Bug 79891] New: Router causes TCP retransmits for windows hosts after "ip_forward: fix inverted local_df test"

https://bugzilla.kernel.org/show_bug.cgi?id=79891

            Bug ID: 79891
           Summary: Router causes TCP retransmits for windows hosts after
                    "ip_forward: fix inverted local_df test"
           Product: Networking
           Version: 2.5
    Kernel Version: 3.2.60
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: IPV4
          Assignee: shemminger@linux-foundation.org
          Reporter: tm@del.bg
        Regression: No

Created attachment 142651
  --> https://bugzilla.kernel.org/attachment.cgi?id=142651&action=edit
TCP retransmissions from a windows host

After upgrading a router to Linux 3.2.60 most windows machines behind it
started experiencing connection stalls. Downgrade to 3.2.59 resolved the
problem. Using git bisect I pinpointed it to
"59d9f389df3cdf72833d5ee17c3fe959b6bdc792 is the first bad commit", which
entered the kernel from here
http://marc.info/?l=linux-netdev&m=139949081418806&w=2

No connection problems for Linux hosts at all – only windows.

Please, find attached tcpdump packet capture demonstrating the bad behaviour.

-- 
You are receiving this mail because:
You are the assignee for the bug.

* Re: Fw: [Bug 79891] New: Router causes TCP retransmits for windows hosts after "ip_forward: fix inverted local_df test"
From: Cong Wang @ 2014-07-10 17:11 UTC
To: Stephen Hemminger; +Cc: netdev

On Thu, Jul 10, 2014 at 6:44 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> After upgrading a router to Linux 3.2.60 most windows machines behind it
> started experiencing connection stalls. Downgrade to 3.2.59 resolved the
> problem. Using git bisect I pinpointed it to
> "59d9f389df3cdf72833d5ee17c3fe959b6bdc792 is the first bad commit", which
> entered the kernel from here
> http://marc.info/?l=linux-netdev&m=139949081418806&w=2
>

This commit should have been reverted for older kernels like 3.2.y.

* Re: Fw: [Bug 79891] New: Router causes TCP retransmits for windows hosts after "ip_forward: fix inverted local_df test"
From: Ben Hutchings @ 2014-07-11 18:11 UTC
To: Cong Wang; +Cc: Stephen Hemminger, netdev, David Miller, stable

On Thu, 2014-07-10 at 10:11 -0700, Cong Wang wrote:
> On Thu, Jul 10, 2014 at 6:44 AM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > After upgrading a router to Linux 3.2.60 most windows machines behind it
> > started experiencing connection stalls. Downgrade to 3.2.59 resolved the
> > problem. Using git bisect I pinpointed it to
> > "59d9f389df3cdf72833d5ee17c3fe959b6bdc792 is the first bad commit", which
> > entered the kernel from here
> > http://marc.info/?l=linux-netdev&m=139949081418806&w=2
> >
>
> This commit should have been reverted for older kernels like 3.2.y.

Really? We already had fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in
forwarding path") backported in 3.2.57.

Ben.

-- 
Ben Hutchings
To err is human; to really foul things up requires a computer.

* Re: Fw: [Bug 79891] New: Router causes TCP retransmits for windows hosts after "ip_forward: fix inverted local_df test"
From: Cong Wang @ 2014-07-11 18:38 UTC
To: Ben Hutchings; +Cc: Stephen Hemminger, netdev, David Miller, stable

On Fri, Jul 11, 2014 at 11:11 AM, Ben Hutchings <ben@decadent.org.uk> wrote:
> On Thu, 2014-07-10 at 10:11 -0700, Cong Wang wrote:
>> On Thu, Jul 10, 2014 at 6:44 AM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>> >
>> > After upgrading a router to Linux 3.2.60 most windows machines behind it
>> > started experiencing connection stalls. Downgrade to 3.2.59 resolved the
>> > problem. Using git bisect I pinpointed it to
>> > "59d9f389df3cdf72833d5ee17c3fe959b6bdc792 is the first bad commit", which
>> > entered the kernel from here
>> > http://marc.info/?l=linux-netdev&m=139949081418806&w=2
>> >
>>
>> This commit should have been reverted for older kernels like 3.2.y.
>
> Really? We already had fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in
> forwarding path") backported in 3.2.57.

I haven't read the code, but according to a previous discussion it sounds
like that should be reverted:

http://lists.openwall.net/netdev/2014/06/11/67

* Re: Fw: [Bug 79891] New: Router causes TCP retransmits for windows hosts after "ip_forward: fix inverted local_df test"
From: Ben Hutchings @ 2014-07-11 19:01 UTC
To: Cong Wang; +Cc: Stephen Hemminger, netdev, David Miller, stable

On Fri, 2014-07-11 at 11:38 -0700, Cong Wang wrote:
> On Fri, Jul 11, 2014 at 11:11 AM, Ben Hutchings <ben@decadent.org.uk> wrote:
> > On Thu, 2014-07-10 at 10:11 -0700, Cong Wang wrote:
> >> On Thu, Jul 10, 2014 at 6:44 AM, Stephen Hemminger
> >> <stephen@networkplumber.org> wrote:
> >> >
> >> > After upgrading a router to Linux 3.2.60 most windows machines behind it
> >> > started experiencing connection stalls. Downgrade to 3.2.59 resolved the
> >> > problem. Using git bisect I pinpointed it to
> >> > "59d9f389df3cdf72833d5ee17c3fe959b6bdc792 is the first bad commit", which
> >> > entered the kernel from here
> >> > http://marc.info/?l=linux-netdev&m=139949081418806&w=2
> >> >
> >>
> >> This commit should have been reverted for older kernels like 3.2.y.
> >
> > Really? We already had fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in
> > forwarding path") backported in 3.2.57.
>
> I haven't read the code, but according to a previous discussion it sounds
> like that should be reverted:
>
> http://lists.openwall.net/netdev/2014/06/11/67

My reading of that is we need 895162b1101b ("netfilter: ipv4: defrag:
set local_df flag on defragmented skb") in 3.2.y and 3.4.y. But there
seem to be many other places that local_df should be set, that have only
recently been fixed. So maybe reverting is the safer option.

Ben.

-- 
Ben Hutchings
To err is human; to really foul things up requires a computer.

* Re: [Bug 79891] New: Router causes TCP retransmits for windows hosts after "ip_forward: fix inverted local_df test"
From: David Miller @ 2014-07-11 19:14 UTC
To: ben; +Cc: cwang, stephen, netdev, stable

From: Ben Hutchings <ben@decadent.org.uk>
Date: Fri, 11 Jul 2014 20:01:12 +0100

> On Fri, 2014-07-11 at 11:38 -0700, Cong Wang wrote:
>> On Fri, Jul 11, 2014 at 11:11 AM, Ben Hutchings <ben@decadent.org.uk> wrote:
>> > On Thu, 2014-07-10 at 10:11 -0700, Cong Wang wrote:
>> >> On Thu, Jul 10, 2014 at 6:44 AM, Stephen Hemminger
>> >> <stephen@networkplumber.org> wrote:
>> >> >
>> >> > After upgrading a router to Linux 3.2.60 most windows machines behind it
>> >> > started experiencing connection stalls. Downgrade to 3.2.59 resolved the
>> >> > problem. Using git bisect I pinpointed it to
>> >> > "59d9f389df3cdf72833d5ee17c3fe959b6bdc792 is the first bad commit", which
>> >> > entered the kernel from here
>> >> > http://marc.info/?l=linux-netdev&m=139949081418806&w=2
>> >> >
>> >>
>> >> This commit should have been reverted for older kernels like 3.2.y.
>> >
>> > Really? We already had fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in
>> > forwarding path") backported in 3.2.57.
>>
>> I haven't read the code, but according to a previous discussion it sounds
>> like that should be reverted:
>>
>> http://lists.openwall.net/netdev/2014/06/11/67
>
> My reading of that is we need 895162b1101b ("netfilter: ipv4: defrag:
> set local_df flag on defragmented skb") in 3.2.y and 3.4.y. But there
> seem to be many other places that local_df should be set, that have only
> recently been fixed. So maybe reverting is the safer option.

Reverting is indeed probably safer.

* Re: [Bug 79891] New: Router causes TCP retransmits for windows hosts after "ip_forward: fix inverted local_df test"
From: Florian Westphal @ 2014-07-11 22:07 UTC
To: David Miller; +Cc: ben, cwang, stephen, netdev, stable

David Miller <davem@davemloft.net> wrote:
> From: Ben Hutchings <ben@decadent.org.uk>
> Date: Fri, 11 Jul 2014 20:01:12 +0100
>
> > On Fri, 2014-07-11 at 11:38 -0700, Cong Wang wrote:

[..]

> >> >> > http://marc.info/?l=linux-netdev&m=139949081418806&w=2
> >> >> >
> >> >>
> >> >> This commit should have been reverted for older kernels like 3.2.y.
> >> >
> >> > Really? We already had fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in
> >> > forwarding path") backported in 3.2.57.
> >>
> >> I haven't read the code, but according to a previous discussion it sounds
> >> like that should be reverted:
> >>
> >> http://lists.openwall.net/netdev/2014/06/11/67
> >
> > My reading of that is we need 895162b1101b ("netfilter: ipv4: defrag:
> > set local_df flag on defragmented skb") in 3.2.y and 3.4.y. But there
> > seem to be many other places that local_df should be set, that have only
> > recently been fixed. So maybe reverting is the safer option.
>
> Reverting is indeed probably safer.

Right, I agree. Reverting is safer.

IMO there are two possible options for 3.2 / 3.4:

1. Revert fe6cc55f3a9 ("net: ip, ipv6: handle gso skbs in forwarding path")
2. Backport 21d1196a3 ("ipv4: set transport header earlier") to 3.2/3.4 -stable

[ The problem is that transport header is not yet set in 3.2/3.4 in forward
  path so skb_gso_network_seglen() returns bogus length ]

There is a 3rd alternative (I mention this for completeness only).

You could sort-of 'soft-revert' to the old behaviour to not care about
GRO packets in the forward path.
The minimum change is:

diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c
--- a/net/ipv4/ip_forward.c
+++ b/net/ipv4/ip_forward.c
@@ -50,7 +50,7 @@ static bool ip_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
 	if (skb->len <= mtu)
 		return false;
 
-	if (skb_is_gso(skb) && skb_gso_network_seglen(skb) <= mtu)
+	if (skb_is_gso(skb))
 		return false;
 
 	return true;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index cb9df0e..f05d6ef 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -354,7 +354,7 @@ static bool ip6_pkt_too_big(const struct sk_buff *skb, unsigned int mtu)
 	if (skb->ignore_df)
 		return false;
 
-	if (skb_is_gso(skb) && skb_gso_network_seglen(skb) <= mtu)
+	if (skb_is_gso(skb))
 		return false;
 
 	return true;

Dave/Greg, if this is what you prefer just let me know and I can submit
such a patch for the 3.2 and 3.4 stable series.

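[ Illustration, not part of the original message. A rough user-space sketch
  of why a per-segment length check can misfire when the transport header
  offset has not been set, and what the 'soft-revert' changes. Everything
  here is hypothetical: struct fake_skb, its fields and the fake_*/exceeds_*
  helpers are stand-ins invented for the example, and the assumption that the
  segment length is derived from the distance between the network and
  transport header offsets is a simplification of the kernel helper. ]

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct fake_skb {
	unsigned int len;           /* total length of the (possibly aggregated) packet */
	unsigned int gso_size;      /* payload bytes per segment; 0 means not GSO */
	unsigned int network_off;   /* offset of the IP header in the buffer */
	unsigned int transport_off; /* offset of the TCP header; stale if never set */
	unsigned int tcp_hdr_len;   /* TCP header length carried by each segment */
};

static bool fake_is_gso(const struct fake_skb *skb)
{
	return skb->gso_size != 0;
}

/*
 * Per-segment length at the network layer: IP header + TCP header + payload.
 * The IP header length is taken as the distance between the two header
 * offsets, so a stale transport_off yields a meaningless result.
 */
static unsigned int fake_gso_network_seglen(const struct fake_skb *skb)
{
	unsigned int ip_hdr_len = skb->transport_off - skb->network_off;

	return ip_hdr_len + skb->tcp_hdr_len + skb->gso_size;
}

/* Shape of the check before the soft-revert: GSO packets pass only if
 * each resulting segment fits the MTU. */
static bool exceeds_mtu_strict(const struct fake_skb *skb, unsigned int mtu)
{
	if (skb->len <= mtu)
		return false;

	if (fake_is_gso(skb) && fake_gso_network_seglen(skb) <= mtu)
		return false;

	return true;
}

/* Shape of the check after the soft-revert: any GSO packet is let through
 * without looking at the per-segment length. */
static bool exceeds_mtu_soft_revert(const struct fake_skb *skb, unsigned int mtu)
{
	if (skb->len <= mtu)
		return false;

	if (fake_is_gso(skb))
		return false;

	return true;
}
```

[ With an aggregate of two 1460-byte segments (len 2960, network_off 14,
  transport_off 34, tcp_hdr_len 20) and an MTU of 1500, the strict check lets
  the packet through. If transport_off is left at a stale value such as 4000,
  the strict check reports the same packet as too big, while the soft-revert
  variant still forwards it. The trade-off is that the soft-revert variant no
  longer checks GSO packets against the MTU at all. ]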
end of thread

Thread overview: 7+ messages
2014-07-10 13:44 Fw: [Bug 79891] New: Router causes TCP retransmits for windows hosts after "ip_forward: fix inverted local_df test" Stephen Hemminger
2014-07-10 17:11 ` Cong Wang
2014-07-11 18:11   ` Ben Hutchings
2014-07-11 18:38     ` Cong Wang
2014-07-11 19:01       ` Ben Hutchings
2014-07-11 19:14         ` David Miller
2014-07-11 22:07           ` Florian Westphal