* BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-06 0:39 UTC
To: netdev; +Cc: nhorman, ilpo.jarvinen

To make a long story short, there are still some Windows 2000
machines out there emitting BSD 4.2 style keepalives (one garbage
byte instead of an empty out-of-window probe frame).

We don't ACK these because of how tcp_sequence() sees ->end_seq
as being equal to ->rcv_wup.

But we can't change tcp_sequence() to reject these frames, because if
we do then we end up mishandling connection attempts (SYN, SYN+ACK)
and retransmits of such.

Neil has shown me a patch that does a by-hand special case of this
one-garbage-byte keepalive inside of tcp_rcv_established().

Anyone have suggestions for an alternative and perhaps cleaner
implementation of a fix?

Thanks!

* Re: BSD 4.2 style TCP keepalives
From: Neil Horman @ 2010-01-06 2:07 UTC
To: David Miller; +Cc: netdev, ilpo.jarvinen

On Tue, Jan 05, 2010 at 04:39:11PM -0800, David Miller wrote:
>
> To make a long story short, there are still some Windows 2000
> machines out there emitting BSD 4.2 style keepalives (one garbage
> byte instead of an empty out-of-window probe frame).
>
> We don't ACK these because of how tcp_sequence() sees ->end_seq
> as being equal to ->rcv_wup
>
> But we can't change tcp_sequence() to reject these frames, because if
> we do then we end up mishandling connection attempts (SYN, SYN+ACK)
> and retransmits of such.
>
> Neil has shown me a patch that does a by-hand special case of this
> one-garbage-byte keepalive inside of tcp_rcv_established().
>
> Anyone have suggestions for an alternative and perhaps cleaner
> implementation of a fix?
>

Dave, if that patch fixes the problem (waiting on test results now, but I
figure it will), what if we add a parameter to tcp_sequence (and
tcp_validate_incomming) that represents an offset to trim from end_seq (so
that we can effectively ignore the garbage byte)? It's not much cleaner, but
it consolidates the code a bit, and is probably a bit quicker. Then we can
just pass a 0 value to tcp_validate_incomming from tcp_rcv_state_process and
1 in tcp_rcv_established (or a boolean variable if we want to implement a
sysctl to tune whether or not we want to ack these old frames, if such a
knob is relevant).

I'll happily implement this if there's consensus on it.

Thoughts?

Thanks!
Neil

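The trim-offset idea above can be sketched in user-space C roughly as
follows. This is only an illustration under stated assumptions, not Neil's
patch: struct rx_state, its window field (standing in for the result of
tcp_receive_window()), the function name seq_acceptable_trimmed() and the
clamping of the trim to the segment length are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe sequence comparisons, modelled on the kernel's before()/after(). */
static bool seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static bool seq_after(uint32_t a, uint32_t b)  { return seq_before(b, a); }

/* Hypothetical stand-in for the receive-side fields of struct tcp_sock. */
struct rx_state {
	uint32_t rcv_nxt;	/* next sequence number expected */
	uint32_t rcv_wup;	/* rcv_nxt at the time of the last window update */
	uint32_t window;	/* stands in for tcp_receive_window(tp) */
};

/*
 * Same test as the current tcp_sequence(), but with up to `trim` bytes cut
 * from the end of the segment first. The trim is clamped to the segment
 * length (an assumption of this sketch) so that a bare ACK is never trimmed.
 */
static bool seq_acceptable_trimmed(const struct rx_state *rx,
				   uint32_t seq, uint32_t end_seq, uint32_t trim)
{
	uint32_t len = end_seq - seq;

	assert(rx);
	if (trim > len)
		trim = len;
	end_seq -= trim;

	return !seq_before(end_seq, rx->rcv_wup) &&
	       !seq_after(seq, rx->rcv_nxt + rx->window);
}
```

With trim == 0 the one-garbage-byte probe [rcv_nxt - 1, rcv_nxt] passes the
check; with trim == 1 it is classified as out of window, which is the path
that answers with an ACK, while a bare ACK at rcv_wup is still accepted.
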
* Re: BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-06 3:59 UTC
To: nhorman; +Cc: netdev, ilpo.jarvinen

From: Neil Horman <nhorman@tuxdriver.com>
Date: Tue, 5 Jan 2010 21:07:56 -0500

> Dave, If that patch fixes the problem (waiting on test results now,
> but I figure it will), what if we add a parameter to tcp_sequence
> (and tcp_validate_incomming), that represents an offset to trim from
> end_seq (so that we can effectively ignore the garbage byte)?

Sure, we could do that too, and it would be an improvement.

Let's first wait for test results and also give a bit for
others to potentially come up with implementation ideas.

* Re: BSD 4.2 style TCP keepalives
From: Rick Jones @ 2010-01-06 17:21 UTC
To: David Miller; +Cc: nhorman, netdev, ilpo.jarvinen

David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Tue, 5 Jan 2010 21:07:56 -0500
>
>
>>Dave, If that patch fixes the problem (waiting on test results now,
>>but I figure it will), what if we add a parameter to tcp_sequence
>>(and tcp_validate_incomming), that represents an offset to trim from
>>end_seq (so that we can effectively ignore the garbage byte)?
>
>
> Sure, we could do that too, and it would be an improvement.
>
> Let's first wait for test results and also give a bit for
> others to potentially come up with implementation ideas.

Might it suffice to simply enable TCP keepalives on the Linux end?
Or is that too big a kludge?

rick jones

* Re: BSD 4.2 style TCP keepalives
From: Neil Horman @ 2010-01-06 20:50 UTC
To: Rick Jones; +Cc: David Miller, netdev, ilpo.jarvinen

On Wed, Jan 06, 2010 at 09:21:49AM -0800, Rick Jones wrote:
> David Miller wrote:
> >From: Neil Horman <nhorman@tuxdriver.com>
> >Date: Tue, 5 Jan 2010 21:07:56 -0500
> >
> >
> >>Dave, If that patch fixes the problem (waiting on test results now,
> >>but I figure it will), what if we add a parameter to tcp_sequence
> >>(and tcp_validate_incomming), that represents an offset to trim from
> >>end_seq (so that we can effectively ignore the garbage byte)?
> >
> >
> >Sure, we could do that too, and it would be an improvement.
> >
> >Let's first wait for test results and also give a bit for
> >others to potentially come up with implementation ideas.
>
> Might it suffice to simply enable TCP keepalives on the Linux end?
> Or is that too big a kludge?
>
I imagine that would prevent the consequences of the problem (which is that,
after not responding to several of these older keep-alives, the connection
gets reset), but doing so requires that we know the keepalive interval
configured on the peer, and we don't normally know that. What we need to do
here is be able to respond to these old keep-alives (arguably).

Neil

> rick jones
>

* Re: BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-06 8:23 UTC
To: netdev; +Cc: nhorman, ilpo.jarvinen

From: David Miller <davem@davemloft.net>
Date: Tue, 05 Jan 2010 16:39:11 -0800 (PST)

> To make a long story short, there are still some Windows 2000
> machines out there emitting BSD 4.2 style keepalives (one garbage
> byte instead of an empty out-of-window probe frame).
>
> We don't ACK these because of how tcp_sequence() sees ->end_seq
> as being equal to ->rcv_wup
>
> But we can't change tcp_sequence() to reject these frames, because if
> we do then we end up mishandling connection attempts (SYN, SYN+ACK)
> and retransmits of such.

After some digging I found commit:

commit 585d51805443d6caf10d468366bc6567e12cf090
Author: davem <davem>
Date:   Tue Jan 30 01:38:22 2001 +0000

    Make tcp_sequence check more loose, so that we respect control flags
    in packets which are not out-of-window after truncating to the current
    window.  From Alexey.

in the netdev-vger-cvs tree, which is how we arrived at the
current implementation of tcp_sequence().

The old code looks like (ifdef'd sections removed and code reformatted
for brevity):

static int __tcp_sequence(struct tcp_opt *tp, u32 seq, u32 end_seq)
{
	u32 end_window = tp->rcv_wup + tp->rcv_wnd;
	u32 rcv_wnd = tcp_receive_window(tp);

	if (rcv_wnd &&
	    after(end_seq, tp->rcv_nxt) &&
	    before(seq, end_window))
		return 1;
	if (seq != end_window)
		return 0;
	return (seq == end_seq);
}

extern __inline__ int tcp_sequence(struct tcp_opt *tp, u32 seq, u32 end_seq, int rst)
{
	u32 rcv_wnd = tcp_receive_window(tp);

	if (seq == tp->rcv_nxt)
		return (rcv_wnd || (end_seq == seq) || rst);

	return __tcp_sequence(tp, seq, end_seq);
}

Whereas tcp_sequence() is now:

static inline int tcp_sequence(struct tcp_sock *tp, u32 seq, u32 end_seq)
{
	return	!before(end_seq, tp->rcv_wup) &&
		!after(seq, tp->rcv_nxt + tcp_receive_window(tp));
}

The key element (taken from FreeBSD, as per current comments) is to
allow sequences using tp->rcv_wup as the window edge so that we can
accept things like resets even when our delayed ACK has caused the
sender to not advance his snd.una yet.

That's fine.  So everywhere you see tp->rcv_nxt in the old code, it
should correspond to things using tp->rcv_wup in the new code.

Would the old code cause us to properly ACK the zero window probes
that have the garbage byte?  It seems it would.  Assuming we still
have a zero window being advertised when we receive the probe:

1) seq is one byte behind edge of the window so the
   seq == tp->rcv_nxt check will not pass, therefore we get to
   __tcp_sequence()

2) rcv_wnd is zero, so first check there does not pass

3) seq != end_window (it's something like tp->rcv_wup - 1) so we
   return 0 and thus end up in the code (two functions up) which
   emits the ACK.

However, we can't just change

	!before(end_seq, tp->rcv_wup) &&

to be:

	after(end_seq, tp->rcv_wup) &&

As that would even reject the first bare ACK we receive in the
SYN, SYN+ACK, ACK sequence (where seq == end_seq and
end_seq == tp->rcv_wup, which would be rejected by
after(end_seq, tp->rcv_wup)).

Special casing the seq == end_seq == tp->rcv_wup case using
something like:

	(after(end_seq, tp->rcv_wup) ||
	 (end_seq == tp->rcv_wup && seq == end_seq)) &&

might work, but I'm not confident that's exactly what we want at the
moment, as it partially defeats what this code is trying to do (let us
accept URG/FIN/RST after seq and end_seq are truncated to the window).

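The three candidate window checks discussed in the message above can be
compared side by side in a small user-space sketch. This is an illustration
under stated assumptions rather than kernel code: struct rx_state, its
window field (standing in for the result of tcp_receive_window()) and the
seq_ok_* names are hypothetical; only the comparison logic is taken from
the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe sequence comparisons, modelled on the kernel's before()/after(). */
static bool seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static bool seq_after(uint32_t a, uint32_t b)  { return seq_before(b, a); }

/* Hypothetical stand-in for the receive-side fields of struct tcp_sock. */
struct rx_state {
	uint32_t rcv_nxt;	/* next sequence number expected */
	uint32_t rcv_wup;	/* rcv_nxt at the time of the last window update */
	uint32_t window;	/* stands in for tcp_receive_window(tp) */
};

/* Right-hand half of the check, shared by all three variants. */
static bool within_right_edge(const struct rx_state *rx, uint32_t seq)
{
	assert(rx);
	return !seq_after(seq, rx->rcv_nxt + rx->window);
}

/* Current check: !before(end_seq, rcv_wup). */
static bool seq_ok_current(const struct rx_state *rx, uint32_t seq, uint32_t end_seq)
{
	return !seq_before(end_seq, rx->rcv_wup) && within_right_edge(rx, seq);
}

/* Strict variant: after(end_seq, rcv_wup). */
static bool seq_ok_strict(const struct rx_state *rx, uint32_t seq, uint32_t end_seq)
{
	return seq_after(end_seq, rx->rcv_wup) && within_right_edge(rx, seq);
}

/* Strict variant plus the special case for a bare ACK sitting at rcv_wup. */
static bool seq_ok_special(const struct rx_state *rx, uint32_t seq, uint32_t end_seq)
{
	return (seq_after(end_seq, rx->rcv_wup) ||
		(end_seq == rx->rcv_wup && seq == end_seq)) &&
	       within_right_edge(rx, seq);
}
```

With rcv_nxt == rcv_wup and a zero window, the garbage-byte probe
[rcv_wup - 1, rcv_wup] is accepted only by the current check, while the
bare ACK [rcv_wup, rcv_wup] is rejected only by the strict variant.
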
* Re: BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-06 23:04 UTC
To: netdev; +Cc: nhorman, ilpo.jarvinen

From: David Miller <davem@davemloft.net>
Date: Wed, 06 Jan 2010 00:23:28 -0800 (PST)

> Special casing the seq == end_seq == tp->rcv_wup case using
> something like:
>
> 	(after(end_seq, tp->rcv_wup) ||
> 	 (end_seq == tp->rcv_wup && seq == end_seq)) &&
>
> might work, but I'm not confident that's exactly what we want at the
> moment, as it partially defeats what this code is trying to do (let us
> accept URG/FIN/RST after seq and end_seq are truncated to the window).

I did some more research and everything I've said here turns
out to be moot.

We should be ACK'ing these things anyways.  Here is why:

1) if tcp_sequence() accepts the sequence we continue on in
   tcp_rcv_established()

2) We make it to tcp_data_queue() unless tcp_ack() finds that the
   ACK sequence is invalid (it covers data we never sent).

3) tcp_data_queue() should make it to, and hit, this conditional:

	if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {

   which will schedule an ACK the same exact way we would if
   tcp_sequence() rejected the sequence range.

So it's a mystery why we aren't responding to Windows 2000's
BSD 4.2 style zero window probes.

Can someone please validate my analysis?

Someone with access to a system exhibiting this will probably need to
do some diagnostics to figure out what's going on.

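The receive path described in steps 1) to 3) can be modelled as a small
classifier. This is a simplified sketch, not the kernel's control flow: the
enum, struct rx_state, its window field and classify_segment() are
hypothetical names, tcp_ack() processing is left out, and the early return
for segments without data is an assumption about how tcp_data_queue()
treats bare ACKs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe sequence comparisons, modelled on the kernel's before()/after(). */
static bool seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static bool seq_after(uint32_t a, uint32_t b)  { return seq_before(b, a); }

/* Hypothetical stand-in for the receive-side fields of struct tcp_sock. */
struct rx_state {
	uint32_t rcv_nxt;	/* next sequence number expected */
	uint32_t rcv_wup;	/* rcv_nxt at the time of the last window update */
	uint32_t window;	/* stands in for tcp_receive_window(tp) */
};

enum rx_outcome {
	RX_OUT_OF_WINDOW_ACK,	/* window check failed: answer with an ACK */
	RX_NO_DATA,		/* bare ACK: nothing to queue */
	RX_DUPLICATE_ACK,	/* !after(end_seq, rcv_nxt): old data, ACK it */
	RX_NEW_DATA		/* segment carries data beyond rcv_nxt */
};

static enum rx_outcome classify_segment(const struct rx_state *rx,
					uint32_t seq, uint32_t end_seq)
{
	assert(rx);

	/* Step 1: the tcp_sequence() style window check. */
	if (seq_before(end_seq, rx->rcv_wup) ||
	    seq_after(seq, rx->rcv_nxt + rx->window))
		return RX_OUT_OF_WINDOW_ACK;

	/* Assumed: segments without payload are not queued as data. */
	if (seq == end_seq)
		return RX_NO_DATA;

	/* Step 3: everything in the segment has already been received. */
	if (!seq_after(end_seq, rx->rcv_nxt))
		return RX_DUPLICATE_ACK;

	return RX_NEW_DATA;
}
```

For a garbage-byte probe [rcv_nxt - 1, rcv_nxt] the classifier returns
RX_DUPLICATE_ACK, so in this model an ACK is sent whether or not the window
check accepts the segment.
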
* Re: BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-07 0:14 UTC
To: netdev; +Cc: nhorman, ilpo.jarvinen

From: David Miller <davem@davemloft.net>
Date: Wed, 06 Jan 2010 15:04:53 -0800 (PST)

> Someone with access to a system exhibiting this will probably need to
> do some diagnostics to figure out what's going on.

To make this easier to diagnose, I cooked up a hack patch that
makes Linux emit BSD 4.2 style keepalives, and a quick test
shows that we do indeed not ACK these for some reason:

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 383ce23..e0db52e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2727,6 +2727,19 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent)
 	 * send it.
 	 */
 	tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPCB_FLAG_ACK);
+#if 1
+	/* Construct BSD 4.2 style zero-window probe with one
+	 * out-of-window garbage data byte.
+	 *
+	 * XXX this does the wrong thing when 'urgent' is true
+	 */
+	{
+		unsigned char *garbage_byte = skb_put(skb, 1);
+
+		*garbage_byte = 0xff;
+		TCP_SKB_CB(skb)->end_seq++;
+	}
+#endif
 	TCP_SKB_CB(skb)->when = tcp_time_stamp;
 	return tcp_transmit_skb(sk, skb, 0, GFP_ATOMIC);
 }

* Re: BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-07 3:21 UTC
To: netdev; +Cc: nhorman, ilpo.jarvinen

From: David Miller <davem@davemloft.net>
Date: Wed, 06 Jan 2010 16:14:54 -0800 (PST)

> +#if 1
> +	/* Construct BSD 4.2 style zero-window probe with one
> +	 * out-of-window garbage data byte.
> +	 *
> +	 * XXX this does the wrong thing when 'urgent' is true
> +	 */
> +	{
> +		unsigned char *garbage_byte = skb_put(skb, 1);
> +
> +		*garbage_byte = 0xff;
> +		TCP_SKB_CB(skb)->end_seq++;
> +	}
> +#endif

This doesn't work so well, because the checksum will be wrong.
The patch at the end of this email should work better.

And maybe that's a clue, because with this upstream Linux does ACK the
probe.

So I wonder if the Windows 2000 systems don't calculate the checksum
correctly?

More mysterious by the minute :-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 383ce23..ce0ae32 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2727,6 +2727,20 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent)
 	 * send it.
 	 */
 	tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPCB_FLAG_ACK);
+#if 1
+	/* Construct BSD 4.2 style zero-window probe with one
+	 * out-of-window garbage data byte.
+	 *
+	 * XXX this does the wrong thing when 'urgent' is true
+	 */
+	{
+		unsigned char *garbage_byte = skb_put(skb, 1);
+
+		*garbage_byte = 0xff;
+		TCP_SKB_CB(skb)->end_seq++;
+		skb->ip_summed = CHECKSUM_PARTIAL;
+	}
+#endif
 	TCP_SKB_CB(skb)->when = tcp_time_stamp;
 	return tcp_transmit_skb(sk, skb, 0, GFP_ATOMIC);
 }

* Re: BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-07 3:36 UTC
To: netdev; +Cc: nhorman, ilpo.jarvinen

From: David Miller <davem@davemloft.net>
Date: Wed, 06 Jan 2010 19:21:31 -0800 (PST)

> And maybe that's a clue, because with this upstream Linux does ACK the
> probe.

Here's an example trace, note the DSACK blocks:

19:26:51.966289 IP A.34376 > B.330: . ack 22738433 win 0 <nop,nop,timestamp 4294964458 375470>
19:26:52.185946 IP B.330 > A.34376: . 22738432:22738433(1) ack 10753 win 127 <nop,nop,timestamp 375492 4294964458>
19:26:52.207848 IP A.34376 > B.330: . ack 22738433 win 0 <nop,nop,timestamp 4294964480 375492,nop,nop,sack 1 {22738432:22738433}>
19:26:52.645946 IP B.330 > A.34376: . 22738432:22738433(1) ack 10753 win 127 <nop,nop,timestamp 375538 4294964480>
19:26:52.668386 IP A.34376 > B.330: . ack 22738433 win 0 <nop,nop,timestamp 4294964526 375538,nop,nop,sack 1 {22738432:22738433}>
19:26:53.545949 IP B.330 > A.34376: . 22738432:22738433(1) ack 10753 win 127 <nop,nop,timestamp 375628 4294964526>
19:26:53.568955 IP A.34376 > B.330: . ack 22738433 win 0 <nop,nop,timestamp 4294964616 375628,nop,nop,sack 1 {22738432:22738433}>
19:26:55.328448 IP B.330 > A.34376: . 22738432:22738433(1) ack 10753 win 127 <nop,nop,timestamp 375806 4294964616>
19:26:55.351998 IP A.34376 > B.330: . ack 22738433 win 0 <nop,nop,timestamp 4294964794 375806,nop,nop,sack 1 {22738432:22738433}>

I added diagnostics to tcp_data_queue() to validate which
conditional triggers:

[  662.808884] TCP: HIT SYN[0] FIN[0] ACK[1] SEQ[a018c50f:a018c510] rcv_nxt[a018c510] rcv_wup[a018c510] window[00000000]
[  662.822532] TCP: tcp_data_queue() !after(end_seq, rcv_nxt)

Which is what I said should trigger for these things.

* Re: BSD 4.2 style TCP keepalives
From: Ilpo Järvinen @ 2010-01-07 0:34 UTC
To: David Miller; +Cc: Netdev, nhorman

On Wed, 6 Jan 2010, David Miller wrote:

> From: David Miller <davem@davemloft.net>
> Date: Wed, 06 Jan 2010 00:23:28 -0800 (PST)
>
> > Special casing the seq == end_seq == tp->rcv_wup case using
> > something like:
> >
> > 	(after(end_seq, tp->rcv_wup) ||
> > 	 (end_seq == tp->rcv_wup && seq == end_seq)) &&
> >
> > might work, but I'm not confident that's exactly what we want at the
> > moment, as it partially defeats what this code is trying to do (let us
> > accept URG/FIN/RST after seq and end_seq are truncated to the window).
>
> I did some more research and everything I've said here turns
> out to be moot.
>
> We should be ACK'ing these things anyways.  Here is why:
>
> 1) if tcp_sequence() accepts the sequence we continue on in
>    tcp_rcv_established()
>
> 2) We make it to tcp_data_queue() unless tcp_ack() finds that the
>    ACK sequence is invalid (it covers data we never sent).
>
> 3) tcp_data_queue() should make it to, and hit, this conditional:
>
> 	if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {
>
>    which will schedule an ACK the same exact way we would if
>    tcp_sequence() rejected the sequence range.
>
> So it's a mystery why we aren't responding to Windows 2000's
> BSD 4.2 style zero window probes.
>
> Can someone please validate my analysis?

In 3) I don't see why we'd hit that one as peer's snd_una+1 would be
larger than rcv_nxt.

--
 i.

* Re: BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-07 0:59 UTC
To: ilpo.jarvinen; +Cc: netdev, nhorman

From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Thu, 7 Jan 2010 02:34:51 +0200 (EET)

> On Wed, 6 Jan 2010, David Miller wrote:
>
>> 3) tcp_data_queue() should make it to, and hit, this conditional:
>>
>> 	if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {
>>
>>    which will schedule an ACK the same exact way we would if
>>    tcp_sequence() rejected the sequence range.
>>
>> So it's a mystery why we aren't responding to Windows 2000's
>> BSD 4.2 style zero window probes.
>>
>> Can someone please validate my analysis?
>
> In 3) I don't see why we'd hit that one as peer's snd_una+1 would be
> larger than rcv_nxt.

Peer constructs keepalive packet using sequence [snd.una-1,snd.una],
both of which are <= rcv_nxt.

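The sequence range of such a probe can be worked through in a few lines of
C. This is a sketch under stated assumptions: struct seq_range and
probe_range() are hypothetical names; the start of the range follows the
snd_una - !urgent expression from the hack patch earlier in the thread, and
the receiver's rcv_nxt is assumed to equal the sender's snd_una.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe sequence comparisons, modelled on the kernel's before()/after(). */
static bool seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static bool seq_after(uint32_t a, uint32_t b)  { return seq_before(b, a); }

/* Hypothetical helper type: the [seq, end_seq] range a probe occupies. */
struct seq_range {
	uint32_t seq;
	uint32_t end_seq;
};

/*
 * A non-urgent probe starts one byte before snd_una. A BSD 4.2 style probe
 * carries one garbage byte, so its end_seq is seq + 1; the empty style
 * probe has end_seq == seq.
 */
static struct seq_range probe_range(uint32_t snd_una, bool urgent, bool garbage_byte)
{
	struct seq_range r;

	r.seq = snd_una - (urgent ? 0u : 1u);
	r.end_seq = r.seq + (garbage_byte ? 1u : 0u);
	assert(r.end_seq - r.seq <= 1u);	/* at most one byte of payload */
	return r;
}
```

For a non-urgent garbage-byte probe the range is [snd_una - 1, snd_una], so
!after(end_seq, rcv_nxt) holds even across sequence wraparound. With urgent
set the range becomes [snd_una, snd_una + 1], which does extend past
rcv_nxt; that is the case flagged by the XXX comment in the patch.
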
* Re: BSD 4.2 style TCP keepalives
From: Ilpo Järvinen @ 2010-01-07 7:55 UTC
To: David Miller; +Cc: Netdev, nhorman

On Wed, 6 Jan 2010, David Miller wrote:

> From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
> Date: Thu, 7 Jan 2010 02:34:51 +0200 (EET)
>
> > On Wed, 6 Jan 2010, David Miller wrote:
> >
> >> 3) tcp_data_queue() should make it to, and hit, this conditional:
> >>
> >> 	if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {
> >>
> >>    which will schedule an ACK the same exact way we would if
> >>    tcp_sequence() rejected the sequence range.
> >>
> >> So it's a mystery why we aren't responding to Windows 2000's
> >> BSD 4.2 style zero window probes.
> >>
> >> Can someone please validate my analysis?
> >
> > In 3) I don't see why we'd hit that one as peer's snd_una+1 would be
> > larger than rcv_nxt.
>
> Peer constructs keepalive packet using sequence [snd.una-1,snd.una],
> both of which are <= rcv_nxt

Right, I later realized that there was this !urgent but was already too
much heading to zzz to correct it.

--
 i.

* Re: BSD 4.2 style TCP keepalives
From: Neil Horman @ 2010-01-08 12:40 UTC
To: David Miller; +Cc: netdev, ilpo.jarvinen

On Tue, Jan 05, 2010 at 04:39:11PM -0800, David Miller wrote:
>
> To make a long story short, there are still some Windows 2000
> machines out there emitting BSD 4.2 style keepalives (one garbage
> byte instead of an empty out-of-window probe frame).
>
> We don't ACK these because of how tcp_sequence() sees ->end_seq
> as being equal to ->rcv_wup
>
> But we can't change tcp_sequence() to reject these frames, because if
> we do then we end up mishandling connection attempts (SYN, SYN+ACK)
> and retransmits of such.
>
> Neil has shown me a patch that does a by-hand special case of this
> one-garbage-byte keepalive inside of tcp_rcv_established().
>
> Anyone have suggestions for an alternative and perhaps cleaner
> implementation of a fix?
>
> Thanks!
>

Dave, sorry about this, but it looks like we can scrap this, I just looked
at the initial tcpdump this was reported in, and apparently w2k doesn't
compute the checksum properly on these old style keepalives. Wireshark
disables TCP checksum validation by default, so it wasn't clear to see, but
as soon as you enable it, the checksum is marked as bad in all of those
frames. So we've got nothing to do here, except maybe make a note of this in
case we hit it again in the future.

Thanks!
Neil

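For anyone wanting to check such frames by hand, the TCP checksum over an
IPv4 segment can be verified with a short user-space routine. This is a
minimal sketch of the standard ones'-complement Internet checksum over the
pseudo-header and segment; the function names are hypothetical, addresses
are passed as four bytes in network order, and it is not the kernel's
checksum code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add len bytes to a running sum as big-endian 16-bit words. */
static uint32_t csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;	/* pad odd byte with zero */
	return sum;
}

/* Fold the carries back in and return the ones' complement. */
static uint16_t csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Checksum over the IPv4 pseudo-header plus the TCP header and payload. */
static uint16_t tcp_segment_csum(const uint8_t src[4], const uint8_t dst[4],
				 const uint8_t *seg, size_t len)
{
	uint8_t pseudo[12];

	assert(len >= 20 && len <= 0xffff);	/* at least a bare TCP header */
	memcpy(pseudo, src, 4);
	memcpy(pseudo + 4, dst, 4);
	pseudo[8] = 0;
	pseudo[9] = 6;				/* IPPROTO_TCP */
	pseudo[10] = (uint8_t)(len >> 8);
	pseudo[11] = (uint8_t)(len & 0xff);

	return csum_fold(csum_add(csum_add(0, pseudo, sizeof(pseudo)), seg, len));
}

/* Write a correct checksum into bytes 16..17 of the TCP header. */
static void tcp_segment_fill_csum(const uint8_t src[4], const uint8_t dst[4],
				  uint8_t *seg, size_t len)
{
	uint16_t csum;

	seg[16] = 0;
	seg[17] = 0;
	csum = tcp_segment_csum(src, dst, seg, len);
	seg[16] = (uint8_t)(csum >> 8);
	seg[17] = (uint8_t)(csum & 0xff);
}

/* A received segment is valid when the sum, checksum field included, folds to 0. */
static bool tcp_segment_csum_ok(const uint8_t src[4], const uint8_t dst[4],
				const uint8_t *seg, size_t len)
{
	return tcp_segment_csum(src, dst, seg, len) == 0;
}
```

A 21-byte segment (20-byte header plus one garbage byte) is treated like any
other segment of odd length: the final byte is padded with a zero for the
sum, so a sender that leaves the checksum field stale after appending the
byte would fail this check.
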
* Re: BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-08 21:21 UTC
To: nhorman; +Cc: netdev, ilpo.jarvinen

From: Neil Horman <nhorman@tuxdriver.com>
Date: Fri, 8 Jan 2010 07:40:33 -0500

> Dave, sorry about this, but it looks like we can scrap this, I just
> looked at the initial tcpdump this was reported in, and apparently
> w2k doesn't compute the checksum properly on these old style
> keepalives.

Thanks for the update.  Nothing to be sorry about, this is actually a
relief and we learned a lot about keepalives and zero-window probes in
the process :-)

* Re: BSD 4.2 style TCP keepalives
From: Neil Horman @ 2010-01-09 1:22 UTC
To: David Miller; +Cc: netdev, ilpo.jarvinen

On Fri, Jan 08, 2010 at 01:21:07PM -0800, David Miller wrote:
> From: Neil Horman <nhorman@tuxdriver.com>
> Date: Fri, 8 Jan 2010 07:40:33 -0500
>
> > Dave, sorry about this, but it looks like we can scrap this, I just
> > looked at the initial tcpdump this was reported in, and apparently
> > w2k doesn't compute the checksum properly on these old style
> > keepalives.
>
> Thanks for the update.  Nothing to be sorry about, this is actually a
> relief and we learned a lot about keepalives and zero-window probes in
> the process :-)
>
I'm trying to do some independent computation on it, but it's looking like
this may have been a combination of old software (win2k) and a bad corner
case in hardware. I think the reporter has a NIC that doesn't do TCO (TCP
checksum offload) properly on these old style keepalives, which would
explain why it wasn't reproducible outside of the reporter's environment.
I'm trying to figure out which card/hw revision of NIC they were using. It
might be worth coding an errata check into the appropriate driver, if this
all turns out to be accurate, to disable TCO on affected hw.

Neil

* Re: BSD 4.2 style TCP keepalives
From: David Miller @ 2010-01-09 1:41 UTC
To: nhorman; +Cc: netdev, ilpo.jarvinen

From: Neil Horman <nhorman@tuxdriver.com>
Date: Fri, 8 Jan 2010 20:22:27 -0500

> I'm trying to do some independent computation on it, but it's looking
> like this may have been a combination of old software (win2k) and a
> bad corner case in hardware. I think the reporter has a NIC that
> doesn't do TCO properly on these old style keepalives, which would
> explain why it wasn't reproducible outside of the reporter's
> environment. I'm trying to figure out which card/hw revision of NIC
> they were using. It might be worth coding an errata check into the
> appropriate driver, if this all turns out to be accurate, to disable
> TCO on affected hw.

I think it's not a hardware bug, because there isn't anything
interesting about these packets from a checksumming perspective.
It's just a normal 1-byte TCP data frame as far as the card is
concerned.

Rather, I think when checksum offloading, win2k doesn't set the
internal packet state correctly for such probe packets such that the
card is told to checksum the frame.

I.e. a win2k TCP bug and nothing to do with the card or its driver.
