From: Eric Dumazet <eric.dumazet@gmail.com>
To: Carsten Wolff <carsten@wolffcarsten.de>
Cc: Yuchung Cheng <ycheng@google.com>,
"Esztermann, Ansgar" <Ansgar.Esztermann@mpi-bpc.mpg.de>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: TCP fast retransmit
Date: Thu, 15 Dec 2011 09:24:29 +0100
Message-ID: <1323937469.2631.31.camel@edumazet-laptop>
In-Reply-To: <201112150841.08087.carsten@wolffcarsten.de>
On Thursday 15 December 2011 at 08:41 +0100, Carsten Wolff wrote:
> On Wednesday 14 December 2011, Eric Dumazet wrote:
> > On Wednesday 14 December 2011 at 11:00 -0800, Yuchung Cheng wrote:
> > > I use tcptrace to check the time sequence and I am puzzled:
> > > I see a lot of OOO packets too but how can this happen at a sender-side
> > > trace? unless the trace is taken close to but not exactly at the sender.
> > > I would expect to see in-sequence packets, but lots of SACKs plus some
> > > spurious retransmits.
> >
> > I understood the trace was a receiver-side one (a linux machine if I am
> > not mistaken, while the sender is AIX powered)
> >
> > (Looking at timings of ACKS, coming a few us after corresponding data
> > packet arrival)
>
> Oh. Right. This also means that net.ipv4.tcp_reordering is only available at
> the receiver (Linux), which doesn't help, because the reordering robustness
> stuff happens on sender-side. So don't even bother changing that sysctl.
>
Oh well, reading Ansgar's mail, it seems it is the other way around:
Quote:
2.6.37.6 with openSUSE patches in the sender, some version of AIX in the
receiver. The latter seems to be critical: we've never encountered this
problem with any other combination of OSs but AIX & Linux.
The only thing I don't understand is how we can receive an ACK so fast
(6 us after the data packet it ACKs, and even 3 us a bit later). This does
not seem possible, even with 10Gb infrastructure. (A Cisco firewall was
mentioned.)
12:18:20.732998 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 284400:287136, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 2736
12:18:20.733004 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 287136, win 591, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
12:18:20.733048 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 287136:293976, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 6840
12:18:20.733073 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 293976, win 549, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
12:18:20.733104 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 293976:298080, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 4104
12:18:20.733120 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 298080, win 522, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
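The data-to-ACK gap in the lines above can be checked mechanically. The following is a minimal Python sketch, not part of the original mail: the regular expression is an assumption based only on the tcpdump text format shown in this thread, and the names `LINE_RE`, `to_us` and `ack_delays` are hypothetical helpers.

```python
import re

# Assumed line format: taken from the tcpdump text output quoted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[[^\]]*\],(?: seq (?P<start>\d+):(?P<end>\d+),)? ack (?P<ack>\d+),"
    r".* length (?P<length>\d+)$"
)

TRACE = """\
12:18:20.732998 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 284400:287136, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 2736
12:18:20.733004 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 287136, win 591, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
12:18:20.733048 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 287136:293976, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 6840
12:18:20.733073 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 293976, win 549, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
"""


def to_us(ts):
    """Convert an HH:MM:SS.uuuuuu timestamp to microseconds since midnight."""
    hms, frac = ts.split(".")
    h, m, s = (int(x) for x in hms.split(":"))
    return ((h * 60 + m) * 60 + s) * 1_000_000 + int(frac)


def ack_delays(lines):
    """Return (ack_number, delay_us) for each zero-length ACK whose cumulative
    ack number equals the end of a data segment captured earlier in the
    opposite direction."""
    seen = {}  # (data_src, data_dst, end_seq) -> capture time in microseconds
    delays = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        t = to_us(m["ts"])
        if m["start"] is not None:
            # Data segment: remember when its last byte was first captured.
            seen.setdefault((m["src"], m["dst"], int(m["end"])), t)
        elif int(m["length"]) == 0:
            # Pure ACK: look up the matching data segment of the reverse flow.
            key = (m["dst"], m["src"], int(m["ack"]))
            if key in seen:
                delays.append((key[2], t - seen[key]))
    return delays


for ack, delay in ack_delays(TRACE.splitlines()):
    print(f"ack {ack}: {delay} us after the data segment")
```

On the four lines embedded above this reports 6 us and 25 us. A delay that small only says the capture point is very close to the host generating the ACKs; it does not by itself identify which host that is.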
Here the next two packets we send are out of order.
12:18:20.733161 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 299448:300816, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733164 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 298080, win 522, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {299448:300816}], length 0
12:18:20.733166 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 298080:299448, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733169 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 300816:302184, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733171 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 303552:304920, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733173 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 302184, win 490, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {303552:304920}], length 0
12:18:20.733174 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 302184:303552, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733177 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 304920, win 469, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
12:18:20.733224 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 304920:310392, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 5472
12:18:20.733228 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 311760:313128, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733230 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 310392, win 427, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {311760:313128}], length 0
12:18:20.733272 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 313128:315864, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 2736
12:18:20.733276 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 310392, win 427, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {311760:315864}], length 0
12:18:20.733326 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 315864:319968, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 4104
12:18:20.733330 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 310392, win 427, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {311760:319968}], length 0
12:18:20.733332 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 310392:311760, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733333 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 321336:322704, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733335 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 319968, win 353, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {321336:322704}], length 0
12:18:20.733372 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 322704:324072, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733375 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 319968, win 353, options [nop,nop,TS val 627192022 ecr 1327509818,nop,nop,sack 1 {321336:324072}], length 0
12:18:20.733377 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 319968:321336, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733381 IP 10.208.9.87.35337 > 134.76.98.13.1500: Flags [.], ack 324072, win 327, options [nop,nop,TS val 627192022 ecr 1327509818], length 0
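The out-of-order segments in the trace above can also be flagged automatically. This is a rough Python sketch under stated assumptions: the regular expression again relies only on the tcpdump text format quoted in this thread, `late_segments` is a hypothetical helper, sequence-number wraparound is ignored, and a "late" segment may be either a reordered packet or a retransmission, since the text output alone cannot tell them apart.

```python
import re

# Assumed format: data lines as printed by tcpdump in the trace quoted above.
SEQ_RE = re.compile(
    r"^(\S+) IP (\S+) > (\S+): Flags \[[^\]]*\], seq (\d+):(\d+),"
)

TRACE = """\
12:18:20.733104 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 293976:298080, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 4104
12:18:20.733161 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 299448:300816, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733166 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 298080:299448, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
12:18:20.733169 IP 134.76.98.13.1500 > 10.208.9.87.35337: Flags [.], seq 300816:302184, ack 555, win 65280, options [nop,nop,TS val 1327509818 ecr 627192022], length 1368
"""


def late_segments(lines):
    """Return (timestamp, start, end) for every data segment that starts below
    the highest end sequence already captured for the same flow."""
    highest = {}  # (src, dst) -> highest end sequence seen so far
    late = []
    for line in lines:
        m = SEQ_RE.match(line.strip())
        if not m:
            continue  # pure ACKs and unrelated lines carry no seq range
        ts, src, dst = m.group(1), m.group(2), m.group(3)
        start, end = int(m.group(4)), int(m.group(5))
        flow = (src, dst)
        if flow in highest and start < highest[flow]:
            late.append((ts, start, end))
        highest[flow] = max(highest.get(flow, 0), end)
    return late


for ts, start, end in late_segments(TRACE.splitlines()):
    print(f"{ts}: segment {start}:{end} captured after later data")
```

On the four data lines embedded above, only the segment 298080:299448 captured at 12:18:20.733166 is flagged, which matches the first swapped pair in the trace.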
Really, my feeling is that this trace was taken on the receiver, and maybe
LRO/GRO is buggy?
Ansgar, please provide more details, such as the NIC you use (hardware,
driver versions...)