From: Josef Bacik <jbacik@fb.com>
To: Andrei Borzenkov <arvidjaar@gmail.com>
Cc: The development of GNU GRUB <grub-devel@gnu.org>, kernel-team@fb.com
Subject: Re: [PATCH] tcp: ack when we get an OOO/lost packet
Date: Tue, 18 Aug 2015 10:58:57 -0700
Message-ID: <55D37261.9090907@fb.com>
In-Reply-To: <CAA91j0Vy+rJc2rOTaPG9Z6vui=+4JdBtS726kENxD8T3ErV8+A@mail.gmail.com>

On 08/17/2015 05:38 AM, Andrei Borzenkov wrote:
> On Thu, Aug 13, 2015 at 4:59 PM, Josef Bacik <jbacik@fb.com> wrote:
>> On 08/13/2015 04:19 AM, Andrei Borzenkov wrote:
>>>
>>> On Wed, Aug 12, 2015 at 6:16 PM, Josef Bacik <jbacik@fb.com> wrote:
>>>>
>>>> While adding tcp window scaling support I was finding that I'd get some
>>>> packet
>>>> loss or reordering when transferring from large distances and grub would
>>>> just
>>>> timeout. This is because we weren't ack'ing when we got our OOO packet,
>>>> so the
>>>> sender didn't know it needed to retransmit anything, so eventually it
>>>> would fill
>>>> the window and stop transmitting, and we'd time out. Fix this by ACK'ing
>>>> when
>>>> we don't find our next sequence numbered packet. With this fix I no
>>>> longer time
>>>> out. Thanks,
>>>
>>>
>>> I have a feeling that your description is misleading. Patch simply
>>> sends duplicated ACK, but partner does not know what has been received
>>> and what has not, so it must wait for ACK timeout anyway before
>>> retransmitting. What this patch may fix would be lost ACK packet
>>> *from* GRUB, by increasing rate of ACK packets it sends. Do you have
>>> packet trace for timeout case, ideally from both sides simultaneously?
>>>
>>
>> The way linux works is that if you get <configurable amount> of DUP ack's it
>> triggers a retransmit. I only have traces from the server since tcpdump
>> doesn't work in grub (or if it does I don't know how to do it). The server
>> is definitely getting all of the ACK's,
>
(Sorry was traveling for Linux Plumbers.)
> In packet trace you sent me there was almost certain ACK loss for the
> segment 20801001- 20805881 (frame 19244). Note that here recovery was
> rather fast - server started retransmission after ~0.5sec. It is
> unlikely lost packet from server - next ACK from GRUB received by
> server was for 20803441, which means it actually got at least initial
> half of this segment. Unfortunately some packets are missing in
> capture (even packets *from* server), which makes it harder to
> interpret. After this server went down to 512 segment size and
> everything went more or less well, until frame 19949. Here the server
> behavior is rather interesting. It starts retransmission with initial
> timeout ~6sec, even though it received quite a lot of DUP ACKs; and
> doubling it every time until it hits GRUB timeout (~34 seconds).
>
Yeah, that's the normal retransmission timeout. This tcpdump was taken
against a non-patched grub. We only sent 3 dup ACKs, and we have the dup
ACK counter set to something like 13, so we have to get a lot of them
before it triggers the dup ACK retransmit logic.
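
To make the threshold behaviour concrete, here is a minimal sketch of
sender-side duplicate ACK counting. It is not the Linux implementation and
not GRUB code: the struct, the function name, and the threshold field are
hypothetical, and sequence number wraparound is ignored. It only shows why
3 duplicates do nothing when the threshold is around 13.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sender-side state; names are illustrative only.  */
struct dupack_state
{
  uint32_t last_ack;    /* highest cumulative ACK seen so far */
  unsigned dup_count;   /* consecutive duplicates of last_ack */
  unsigned threshold;   /* duplicates needed to trigger a retransmit */
};

/* Feed one incoming ACK number.  Returns true exactly when the duplicate
   count reaches the threshold, i.e. when a retransmit without waiting
   for the retransmission timer would be triggered.  */
static bool
dupack_on_ack (struct dupack_state *st, uint32_t ack)
{
  assert (st != NULL);
  if (ack != st->last_ack)
    {
      /* New data was acknowledged: start counting again.  */
      st->last_ack = ack;
      st->dup_count = 0;
      return false;
    }
  st->dup_count++;
  return st->dup_count == st->threshold;
}
```

With threshold set to 13, three calls with the same ACK number all return
false, so the sender falls back to the retransmission timer. Only the 13th
duplicate returns true.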
> Note the difference in behavior between the former and the latter. Did
> you try to ask on Linux networking list why they are so different?
>
I'll run it by our networking guys when they show up.
> OTOH GRUB probably times out too early. Initial TCP RFC suggests 5
> minutes general timeout and RFC1122 - at least 100 seconds. It would
> be interesting to increase connection timeout to see if it recovers.
> You could try bumping GRUB_NET_TRIES to 82 which result in timeout
> slightly over 101 sec.
>
> Also it seems that huge window may aggravate the issue. According to
> trace, 10K is enough to fill pipe and you set it to 1M. It would be
> interesting to see the same with default windows size.
>
Oh yeah, the problem doesn't happen with a normal window size; it's only
with the giant window size. I'm not sure where you are getting the 10k
number. Believe me, if I could have gotten around this by just jacking up
the normal window size I would have done it. When I set it to the max
(64k I think?) I get a transfer rate of around 200 kb/s, which is not
fast enough to pull down our 250mb image. With the 1mb window I get 5.5
mb/s, so there is a real benefit to the giant window. Thanks,
Josef
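
For reference, the reason the window size matters so much over a long
path: a sender can have at most one window of unacknowledged data in
flight per round trip, so throughput is bounded by window / RTT. Without
window scaling the 16-bit TCP window field caps the window at 65535
bytes. The sketch below just evaluates that bound; the function name is
made up, and the 0.25 s round-trip time in the usage note is an assumed
value, not one measured in this thread.

```c
#include <assert.h>

/* Upper bound on TCP throughput, in bytes per second, for a given
   receive window (bytes) and round-trip time (seconds).  */
static double
max_throughput_bytes_per_sec (double window_bytes, double rtt_sec)
{
  assert (rtt_sec > 0.0);
  return window_bytes / rtt_sec;
}
```

With an assumed RTT of 0.25 s, a 65535-byte window gives a bound of
262140 bytes per second, while a 1 MiB (1048576-byte) window gives
4194304 bytes per second.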