From: "Darryl L. Miles" <darryl-mailinglists@netbauds.net>
To: linux-kernel@vger.kernel.org
Cc: Netdev <netdev@vger.kernel.org>
Subject: Re: TCP SACK issue, hung connection, tcpdump included
Date: Tue, 31 Jul 2007 06:03:02 +0100
Message-ID: <46AEC286.2030302@netbauds.net>
In-Reply-To: <20070729160721.GA31276@1wt.eu>
I've been able to capture a tcpdump from both ends during the problem,
and it's my belief there is a bug in 2.6.20.1 (at the client side): it
issues a SACK option for an old sequence range that lies below the ack
it is advertising in the same packet. This is the most concerning issue,
as the integrity of the sequence numbers doesn't seem right (to my
limited understanding anyhow).
A second concern is why the SERVER performed a retransmission in the
first place, when the tcpdump shows that the ack covering that segment
had already been seen.
I have made available the full dumps at:
http://darrylmiles.org/snippets/lkml/20070731/
There are some changes in 2.6.22 that appear to affect TCP SACK handling;
do they fix a known issue?
This sequence is interesting from the client side:
03:58:56.419034 IP SERVER.ssh > CLIENT.43726: . 26016:27464(1448) ack
4239 win 2728 <nop,nop,timestamp 16345815 819458859> # S1
03:58:56.419100 IP CLIENT.43726 > SERVER.ssh: . ack 27464 win 501
<nop,nop,timestamp 819458884 16345815> # C1
03:58:56.422019 IP SERVER.ssh > CLIENT.43726: P 27464:28176(712) ack
4239 win 2728 <nop,nop,timestamp 16345815 819458859> # S2
03:58:56.422078 IP CLIENT.43726 > SERVER.ssh: . ack 28176 win 501
<nop,nop,timestamp 819458884 16345815> # C2
The above 4 packets look as expected to me. Then we suddenly see a
retransmission of 26016:27464.
03:58:56.731597 IP SERVER.ssh > CLIENT.43726: . 26016:27464(1448) ack
4239 win 2728 <nop,nop,timestamp 16346128 819458864> # S3
So the client, instead of discarding the retransmitted duplicate
segment, issues a SACK.
03:58:56.731637 IP CLIENT.43726 > SERVER.ssh: . ack 28176 win 501
<nop,nop,timestamp 819458962 16345815,nop,nop,sack sack 1 {26016:27464}
> # C3
In response to this the server seems confused. It responds to
sack{26016:27464}, but the client is also saying "ack 28176". Wouldn't
the server expect "ack < 26016" before deciding there is a segment to
retransmit?
03:58:57.322800 IP SERVER.ssh > CLIENT.43726: . 26016:27464(1448) ack
4239 win 2728 <nop,nop,timestamp 16346718 819458864> # S4
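The comparison being made on "# C3" (SACK block versus the ack in the
same packet) can be checked mechanically. The following is a minimal
Python sketch, not part of the original capture tooling; the function
names are made up for illustration, and it assumes the relative sequence
numbers tcpdump prints, so wraparound is ignored. For reference, RFC 2883
(D-SACK) defines a SACK block below the cumulative ack as a report of a
duplicate segment, which may or may not be what the client is doing here.

```python
import re

def parse_ack_and_sack(line):
    """Return (ack, [(left, right), ...]) from one tcpdump text line."""
    # \b keeps "ack" from matching inside the word "sack".
    ack = int(re.search(r"\back (\d+)", line).group(1))
    blocks = [(int(left), int(right))
              for left, right in re.findall(r"\{(\d+):(\d+)\}", line)]
    return ack, blocks

def blocks_below_ack(ack, blocks):
    """SACK blocks that lie entirely at or below the cumulative ack.

    Assumes tcpdump's relative sequence numbers (no wraparound handling).
    """
    return [(left, right) for left, right in blocks if right <= ack]

# The "# C3" packet from the client-side capture, joined onto one line.
C3 = ("03:58:56.731637 IP CLIENT.43726 > SERVER.ssh: . ack 28176 win 501 "
      "<nop,nop,timestamp 819458962 16345815,nop,nop,"
      "sack sack 1 {26016:27464} >")

ack, blocks = parse_ack_and_sack(C3)
print("ack:", ack)
print("sack blocks:", blocks)
print("blocks at or below ack:", blocks_below_ack(ack, blocks))
```

Run against "# C3", the single block {26016:27464} is reported as lying
below ack 28176, which is the oddity described above.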
Now viewed from the server side:
03:58:56.365655 IP SERVER.ssh > CLIENT.43726: . 26016:27464(1448) ack
4239 win 2728 <nop,nop,timestamp 16345815 819458859> # S1
03:58:56.365662 IP SERVER.ssh > CLIENT.43726: P 27464:28176(712) ack
4239 win 2728 <nop,nop,timestamp 16345815 819458859> # S2
03:58:56.374633 IP CLIENT.43726 > SERVER.ssh: . ack 24144 win 488
<nop,nop,timestamp 819458861 16345731> # propagation delay
03:58:56.381630 IP CLIENT.43726 > SERVER.ssh: . ack 25592 win 501
<nop,nop,timestamp 819458863 16345734> # propagation delay
03:58:56.384503 IP CLIENT.43726 > SERVER.ssh: . ack 26016 win 501
<nop,nop,timestamp 819458864 16345734> # propagation delay
03:58:56.462583 IP CLIENT.43726 > SERVER.ssh: . ack 27464 win 501
<nop,nop,timestamp 819458884 16345815> # C1
03:58:56.465707 IP CLIENT.43726 > SERVER.ssh: . ack 28176 win 501
<nop,nop,timestamp 819458884 16345815> # C2
The above packets are just as expected.
03:58:56.678546 IP SERVER.ssh > CLIENT.43726: . 26016:27464(1448) ack
4239 win 2728 <nop,nop,timestamp 16346128 819458864> # S3
I guess the above packet is indeed a retransmission of "# S1", but why
was it retransmitted when we can clearly see that "# C1" above acks this
segment? It is not even as if the retransmission escaped before the
kernel had time to process the ack, as roughly 200ms had elapsed. This
is CONCERN NUMBER TWO.
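The "200ms" figure can be read straight off the server-side timestamps.
A small Python sketch of that arithmetic follows; the constant names are
only labels for the packets quoted above, and it assumes all timestamps
fall on the same day.

```python
from datetime import datetime

def elapsed_us(earlier, later):
    """Microseconds between two tcpdump HH:MM:SS.ffffff timestamps.

    Assumes both timestamps are from the same day.
    """
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.seconds * 1_000_000 + delta.microseconds

# Server-side capture times taken from the packets above.
S1_SENT = "03:58:56.365655"  # original transmission of 26016:27464
C1_SEEN = "03:58:56.462583"  # ack 27464 arrives
C2_SEEN = "03:58:56.465707"  # ack 28176 arrives
S3_SENT = "03:58:56.678546"  # retransmission of 26016:27464

print("C1 seen -> S3 sent:", elapsed_us(C1_SEEN, S3_SENT) / 1000, "ms")
print("C2 seen -> S3 sent:", elapsed_us(C2_SEEN, S3_SENT) / 1000, "ms")
print("S1 sent -> S3 sent:", elapsed_us(S1_SENT, S3_SENT) / 1000, "ms")
```

The retransmission leaves a little over 210ms after both acks were
captured on the server, and about 313ms after the original "# S1".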
03:58:56.774778 IP CLIENT.43726 > SERVER.ssh: . ack 28176 win 501
<nop,nop,timestamp 819458962 16345815,nop,nop,sack sack 1 {26016:27464}
> # C3
CONCERN NUMBER ONE: why was a SACK the appropriate response to that
escaped retransmission? At the time the client sent the SACK it had
received all data up to 28176, a fact it continues to advertise in the
"# C3" packet above.
There is nothing wrong in the CLIENT seeing a retransmission of that
segment at this point in time; that is an expected circumstance.
03:58:57.269529 IP SERVER.ssh > CLIENT.43726: . 26016:27464(1448) ack
4239 win 2728 <nop,nop,timestamp 16346718 819458864> # S4
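One further cross-check the dumps allow is the timestamp option. The
Python sketch below is an illustration only: it assumes the server echoes
(as TSecr) the TSval of the most recent ack its TCP stack accepted, which
is the usual RFC 1323 behaviour, and it ignores timestamp wraparound. The
table and function names are invented for this example; the numbers are
copied from the server-side capture above.

```python
# (TSval, ack) of client ACKs as seen in the server-side capture.
CLIENT_ACKS = [
    (819458861, 24144),
    (819458863, 25592),
    (819458864, 26016),
    (819458884, 27464),  # C1
    (819458884, 28176),  # C2
    (819458962, 28176),  # C3
]

# label -> (first sequence number, TSecr) of the server's data segments.
SERVER_SEGMENTS = {
    "S1": (26016, 819458859),
    "S2": (27464, 819458859),
    "S3": (26016, 819458864),
    "S4": (26016, 819458864),
}

def highest_ack_echoed(tsecr, client_acks):
    """Highest ack among listed client ACKs with TSval <= tsecr.

    Returns None when the echoed timestamp predates every listed ACK.
    Assumption: TSecr reflects the last ACK the server's TCP processed.
    """
    acks = [ack for tsval, ack in client_acks if tsval <= tsecr]
    return max(acks) if acks else None

for label, (seq, tsecr) in SERVER_SEGMENTS.items():
    print(label, "seq", seq, "TSecr", tsecr,
          "-> highest ack echoed:", highest_ack_echoed(tsecr, CLIENT_ACKS))
```

Under that assumption, "# S3" and "# S4" both still echo the timestamp of
the "ack 26016" packet rather than those of "# C1", "# C2" or "# C3".
One possible reading is that the later acks were captured by tcpdump on
the server but not processed by its TCP stack; the dumps alone do not
prove this.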
Darryl