* Debugging TCP: Treason Uncloaked
@ 2008-05-08 2:29 Chris Bredesen
2008-05-08 15:01 ` John Heffner
2008-05-13 11:21 ` Ilpo Järvinen
0 siblings, 2 replies; 5+ messages in thread
From: Chris Bredesen @ 2008-05-08 2:29 UTC
To: netdev
Hello,
As per below thread, I'm forwarding to this list/alias. tcpdump output
is attached to the forum post.
Thanks!
-------- Original Message --------
Subject: Re: F8 Treason Uncloaked
Date: Tue, 6 May 2008 20:00:51 +0100
From: Alan Cox <alan@lxorguk.ukuu.org.uk>
Organization: Red Hat UK Cyf., Amberley Place, 107-111 Peascod Street,
Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and
Wales under registration number 3798903
To: For users of Fedora <fedora-list@redhat.com>
CC: cbredesen@redhat.com
References: <4820A59F.7010005@redhat.com>
On Tue, 06 May 2008 14:38:23 -0400
Chris Bredesen <cbredesen@redhat.com> wrote:
> Hello list,
>
> I'm trying to get this issue out to a wider audience. I posted it on
> the forum but got no responses:
>
> http://www.fedoraforum.org/forum/showthread.php?t=186331
>
> Sorry to cross post, but I'm not sure where else to turn...
netdev@linux.kernel.org
but the TCP messages really indicate either a Linux bug or that the NAS
isn't talking proper TCP. Turning off window scaling was a good test,
but beyond that tcpdump data will be needed.
Alan
* Re: Debugging TCP: Treason Uncloaked
From: John Heffner @ 2008-05-08 15:01 UTC
To: Chris Bredesen; +Cc: netdev
On Wed, May 7, 2008 at 10:29 PM, Chris Bredesen <cbredesen@redhat.com> wrote:
> Hello,
>
> As per below thread, I'm forwarding to this list/alias. tcpdump output
> is attached to the forum post.
Full binary tcpdumps (including the SYN packets) are more useful.
From the brief snippet you have, we can't see where the announced
window was reneged, but it's 192.168.1.130 doing it, not the nas box.
You're also getting DSACKs from the nas box, which is also a bit
strange.
Another thing to try is disabling TSO, though I can't think why that
might be causing a problem here.
-John
* Re: Debugging TCP: Treason Uncloaked
From: Ilpo Järvinen @ 2008-05-13 11:21 UTC
To: Chris Bredesen; +Cc: Netdev
On Wed, 7 May 2008, Chris Bredesen wrote:
> Hello,
>
> As per below thread, I'm forwarding to this list/alias. tcpdump output
> is attached to the forum post.
>
> Thanks!
>
> -------- Original Message --------
> Subject: Re: F8 Treason Uncloaked
> Date: Tue, 6 May 2008 20:00:51 +0100
> From: Alan Cox <alan@lxorguk.ukuu.org.uk>
> Organization: Red Hat UK Cyf., Amberley Place, 107-111 Peascod Street,
> Windsor, Berkshire, SL4 1TE, United Kingdom. Registered in England and
> Wales under registration number 3798903
> To: For users of Fedora <fedora-list@redhat.com>
> CC: cbredesen@redhat.com
> References: <4820A59F.7010005@redhat.com>
>
> On Tue, 06 May 2008 14:38:23 -0400
> Chris Bredesen <cbredesen@redhat.com> wrote:
>
> > Hello list,
> >
> > I'm trying to get this issue out to a wider audience. I posted it on the
> > forum but got no responses:
> >
> > http://www.fedoraforum.org/forum/showthread.php?t=186331
> >
> > Sorry to cross post, but I'm not sure where else to turn...
>
> netdev@linux.kernel.org
>
> but the TCP messages really indicate either a Linux bug or that the NAS
> isn't talking proper TCP. Turning off window scaling was a good test,
> but beyond that tcpdump data will be needed.
This report lacks the kernel version (no, I won't try to figure out what
F8 or whatever is running on your box, just tell us :-)). For example,
some 2.6.25-rc kernels had this problem.
The tcp_window_scaling sysctl has nothing to do with window resizing. It
only decides whether a scaling factor can be used. It won't guarantee
you a constant window!
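To illustrate the distinction, here is a minimal Python sketch of what
the scaling factor does. The function name and the example values are
made up for illustration; the only facts assumed are that the window
field in the TCP header is 16 bits and that the window scale option
allows shift counts from 0 to 14.

```python
MAX_SHIFT = 14  # largest shift count the window scale option allows

def effective_window(raw_window, shift):
    """Return the window in bytes for a 16-bit header field and a shift count."""
    if not 0 <= raw_window <= 0xFFFF:
        raise ValueError("window field is 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return raw_window << shift

# Scaling disabled (shift 0): the field is the window in bytes.
print(effective_window(1448, 0))

# Scaling enabled with a hypothetical shift of 2: same field, larger window.
print(effective_window(1448, 2))

# With or without scaling, the receiver may advertise a different value
# in every segment, so the window is never guaranteed to stay constant.
for raw in (5792, 2896, 1448, 0):
    print(raw, effective_window(raw, 2))
```

The shift only changes how the field is interpreted; whether the
advertised window grows or shrinks over time is decided elsewhere.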
What happened while the window was shrunk is hard to know because the log
snippet doesn't have the point where the window was reduced.
--
i.
* Re: Debugging TCP: Treason Uncloaked
From: Chris Bredesen @ 2008-05-19 16:40 UTC
To: Ilpo Järvinen; +Cc: Netdev, johnwheffner
Ilpo Järvinen wrote:
> This report lacks the kernel version (no, I won't try to figure out what
> F8 or whatever is running on your box, just tell us :-)). For example,
> some 2.6.25-rc kernels had this problem.
>
> The tcp_window_scaling sysctl has nothing to do with window resizing. It
> only decides whether a scaling factor can be used. It won't guarantee
> you a constant window!
>
> What happened while the window was shrunk is hard to know because the log
> snippet doesn't have the point where the window was reduced.
John - thanks for the explanation - I understand the relationship
between scaling and resizing now. If my notes are correct, it happened
with both these kernels on the client:
2.6.24.3-12.fc8.i686
2.6.24.3-34.fc8.i686
RHEL and CentOS guys are reporting this issue as well so I wonder if
it's something specific to a RH kernel (not sure how close they are to
upstream but my understanding is that Fedora kernels are pretty close,
but this is *clearly* not my area of expertise).
Kernel on the NAS device is 2.6.9 AFAIK but the distro has proprietary
bits in it so I'm not sure what's been done there. It's a Netgear
ReadyNAS appliance.
In any case, I'm attaching an archive of the whole tcpdump session so
you can have a look. Please let me know if you need more info. I
really *really* appreciate your help on this -- I'm paying the results
of our findings forward so others won't get tripped up on this issue.
Best,
Chris
[-- Attachment #2: tcp.dump.filtered.tar.gz --]
[-- Type: application/x-gzip, Size: 106507 bytes --]
* Re: Debugging TCP: Treason Uncloaked
From: Ilpo Järvinen @ 2008-05-20 10:17 UTC
To: Chris Bredesen; +Cc: Netdev, johnwheffner
On Mon, 19 May 2008, Chris Bredesen wrote:
> Kernel on the NAS device is 2.6.9 AFAIK but the distro has proprietary bits in
> it so I'm not sure what's been done there. It's a Netgear ReadyNAS appliance.
It could well be the NAS's fault too. In the recent case with the
2.6.25-rcs, TCP transmitted _past_ snd_nxt (i.e., too far, which of
course is not right either); the window was not actually shrunk as the
message suggests.
> In any case, I'm attaching an archive of the whole tcpdump session so you can
> have a look. Please let me know if you need more info.
Hmm, this actually seems to be a fault of that type in the NAS's TCP:
20:06:57.976848 ... > nas.rsync: . 23744483:23745931(1448) ack 130667 win 1448
20:06:57.977241 nas.rsync > ...: . 130667:132115(1448)
20:06:57.977294 nas.rsync > ...: . 132115:133563(1448)
20:06:57.977308 nas.rsync > ...: P 133563:134259(696)
How could it send up to 134259 when the advertised window ends at
130667+1448 = 132115, and assume that to work? The TCP at the NAS end
finally gives up later because a number of RTO retransmissions get only
a cumulative ACK in response while the window remains zero at 133563.
If the window opened from zero, the situation would resolve once an RTO
retransmission is received. But it doesn't open, which may be the
client-side user-space application's "fault", as it seems not to be too
eager to read from TCP(?) :-/. Nevertheless, the NAS violated the spec
and cannot cope with the results. And yes, the client didn't shrink the
window anywhere (I checked that too), so those transmissions are clearly
out of window by the spec.
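The arithmetic above can be checked with a small Python sketch. The
function names are hypothetical, the numbers are copied from the
tcpdump lines, the win value is taken as bytes as in the analysis
above, and sequence number wraparound is ignored for simplicity.

```python
def right_edge(ack, win):
    """First sequence number beyond what the receiver has offered to accept."""
    return ack + win

def overshoot(seg_start, seg_len, ack, win):
    """Bytes of the segment lying past the advertised window (0 if it fits)."""
    return max(0, seg_start + seg_len - right_edge(ack, win))

# Client's last advertisement: ack 130667 win 1448
ACK, WIN = 130667, 1448

# Segments the NAS sent afterwards, as (start, length)
segments = [(130667, 1448), (132115, 1448), (133563, 696)]

print("right edge:", right_edge(ACK, WIN))
for start, length in segments:
    end = start + length
    print(f"{start}:{end}({length}) overshoot={overshoot(start, length, ACK, WIN)}")
```

Only the first segment fits; the second and third end 1448 and 2144
bytes past the right edge of 132115.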
If some other client works, it may be just due to luck, e.g., user space
behaves differently or there is a subtle difference in TCP implementation
behavior. As a workaround, one could try a larger receive buffer at the
client.
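One way an application can ask for a larger receive buffer is the
SO_RCVBUF socket option. The Python sketch below is illustrative only:
the helper name and the 256 KiB value are assumptions, not anything
used in this thread, and whether the client application here exposes
such a setting is not known.

```python
import socket

def make_socket_with_rcvbuf(rcvbuf_bytes):
    """Create a TCP socket and request a receive buffer of the given size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request the size before connecting, so it is in place when the
    # connection (and any window scale factor) is negotiated.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    return sock

sock = make_socket_with_rcvbuf(256 * 1024)  # hypothetical size
# The kernel may adjust the request (Linux doubles it for bookkeeping
# and caps it at net.core.rmem_max), so read back what was granted.
effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("receive buffer granted:", effective)
sock.close()
```

On Linux, setting SO_RCVBUF explicitly also turns off receive buffer
autotuning for that socket, so a system-wide change through the
net.ipv4.tcp_rmem sysctl may be the less intrusive experiment.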
I don't think window scaling contributes to this problem as you suggested
earlier, except that there are some bugs related to it in 2.6.9 that are
fixed now (and even 2.6.24 might have the rounding bug unless somebody
sent that to stable, that is, commit
607bfbf2d55dd1cfe5368b41c2a81a8c9ccf4723; I don't remember whether that
happened).
--
i.