Message-ID: <55CCA2D3.1050306@fb.com>
Date: Thu, 13 Aug 2015 09:59:47 -0400
From: Josef Bacik
To: Andrei Borzenkov, The development of GNU GRUB
Cc: kernel-team@fb.com
Subject: Re: [PATCH] tcp: ack when we get an OOO/lost packet
References: <1439392582-3172342-1-git-send-email-jbacik@fb.com>

On 08/13/2015 04:19 AM, Andrei Borzenkov wrote:
> On Wed, Aug 12, 2015 at 6:16 PM, Josef Bacik wrote:
>> While adding tcp window scaling support I was finding that I'd get some packet
>> loss or reordering when transferring from large distances and grub would just
>> timeout. This is because we weren't ack'ing when we got our OOO packet, so the
>> sender didn't know it needed to retransmit anything, so eventually it would fill
>> the window and stop transmitting, and we'd time out. Fix this by ACK'ing when
>> we don't find our next sequence numbered packet. With this fix I no longer time
>> out. Thanks,
>
> I have a feeling that your description is misleading. Patch simply
> sends duplicated ACK, but partner does not know what has been received
> and what has not, so it must wait for ACK timeout anyway before
> retransmitting. What this patch may fix would be lost ACK packet
> *from* GRUB, by increasing rate of ACK packets it sends. Do you have
> packet trace for timeout case, ideally from both sides simultaneously?
>

The way Linux works is that if you get some number of DUP ACKs it triggers a
retransmit.
I only have traces from the server since tcpdump doesn't work in grub (or if
it does I don't know how to do it). The server is definitely getting all of
the ACKs, and from my printf()'ing in grub we are either getting re-ordered
packets (the most likely) or we are simply losing a packet here or there.
This is a pretty long distance and we have a lot of networking between
Sweden and California, so reordering or packet loss isn't out of the
question. Regardless, we definitely need to ACK incoming packets with the
last seq we had, as the spec says, so the sender knows the last bit we had;
otherwise we see timeouts once the window is full.

> Did you consider implementing receive side SACK BTW? You have the
> right environment to test it :)
>

So I found this bug while implementing SACK, and decided it was faster to
just do this rather than add SACK. This method is still exceedingly slow: I
only get around 800kb/s over the entire transfer, whereas I can sustain
around 5.5 mb/s before we start losing stuff. So I'm definitely going to go
back and try the timestamp echo stuff, since the timeout stuff takes like 6
seconds, and then if that doesn't work, bite the bullet and add SACK. But
first I want to get my ipv6 patches right ;). Thanks,

Josef
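
For illustration only, here is a minimal sketch of the duplicate-ACK exchange discussed above. It is not the GRUB patch and not Linux's implementation: the struct and function names (rx_state, tx_state, rx_segment, tx_ack) are hypothetical, the receiver simply discards out-of-order data instead of queueing it, and sequence-number wraparound is ignored. The threshold of three duplicate ACKs is the fast retransmit value from RFC 5681; a real sender's logic is considerably more involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fast retransmit threshold from RFC 5681. */
#define DUP_ACK_THRESHOLD 3

/* Hypothetical receiver state: the next in-order sequence number expected. */
struct rx_state {
    uint32_t next_seq;
};

/* Hypothetical sender state used to count duplicate ACKs. */
struct tx_state {
    uint32_t last_ack;  /* most recent cumulative ACK number seen */
    unsigned dup_acks;  /* consecutive duplicates of last_ack */
};

/*
 * Receiver side: handle one incoming segment and return the ACK number
 * to send back. An in-order segment advances next_seq. An out-of-order
 * segment is dropped here (assumption: no reassembly queue), but it is
 * still answered with an ACK for next_seq, which the sender sees as a
 * duplicate ACK.
 */
uint32_t rx_segment(struct rx_state *rx, uint32_t seq, uint32_t len)
{
    if (seq == rx->next_seq)
        rx->next_seq += len;
    return rx->next_seq;
}

/*
 * Sender side: process one incoming ACK number. Returns true exactly
 * when the duplicate count reaches the threshold, i.e. when the segment
 * starting at 'ack' should be retransmitted without waiting for the
 * retransmission timer.
 */
bool tx_ack(struct tx_state *tx, uint32_t ack)
{
    if (ack == tx->last_ack) {
        tx->dup_acks++;
        return tx->dup_acks == DUP_ACK_THRESHOLD;
    }
    tx->last_ack = ack;
    tx->dup_acks = 0;
    return false;
}
```

If the receiver stays silent on out-of-order segments, tx_ack() is never called with a repeated value and the sender can only recover via its timeout, which matches the stall described in the original patch description.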