From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH V2] tcp: add window scaling and RTTM support
To: Andrei Borzenkov, The development of GNU GRUB
References: <1454104105-1741742-1-git-send-email-jbacik@fb.com>
From: Josef Bacik
Message-ID: <56AF787D.9030200@fb.com>
Date: Mon, 1 Feb 2016 10:23:41 -0500
Cc: Kernel Team
List-Id: The development of GNU GRUB

On 02/01/2016 03:43 AM, Andrei Borzenkov wrote:
> On Sat, Jan 30, 2016 at 12:48 AM, Josef Bacik wrote:
>> Sometimes we have to provision boxes across regions, such as California to
>> Sweden. The http server has a 10 minute timeout, so if we can't get our 250mb
>> image transferred fast enough our provisioning fails, which is not ideal. So
>> add tcp window scaling on open connections and set the window size to 1mb. With
>> this change we're able to get higher sustained transfers between regions and can
>> transfer our image in well below 10 minutes. Without this patch we'd time out
>> every time halfway through the transfer.
>>
>> RTTM is needed in order to make window scaling work well under heavy congestion
>> or packet loss. In most cases grub could recover with just window scaling
>> enabled, but on some machines the congestion would be so high that it would
>> never recover and would time out.
>>
>> I've made the window size configurable with the grub env variable
>> "tcp_window_size". By default this is set to 1mb but can be configured to
>> whatever a user wants, and we will calculate the appropriate window size and
>> scale settings.
>> Thanks,
>>
>
> Please make it net_tcp_window_size to match other net_* variables.
>
> I'm still unsure about increasing it by default. GRUB network
> processing is not lightning fast and it looks like it enables partner
> on local network to send too much at once which can overflow hardware
> receiver buffers. Not sure if this is actually possible.
>
> I hoped DHCP defines standard option for window size but apparently not.
>
>> Signed-off-by: Josef Bacik
>> ---
>> V1->V2:
>> -Address Andrei's concerns about making the window size configurable.
>> -Also make the tcp option stuff more dynamic.
>> -Add RTTM support to make higher window sizes more stable.
>>
>>  grub-core/net/tcp.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 132 insertions(+), 9 deletions(-)
>>
>> diff --git a/grub-core/net/tcp.c b/grub-core/net/tcp.c
>> index 5da8b11..be5ef30 100644
>> --- a/grub-core/net/tcp.c
>> +++ b/grub-core/net/tcp.c
>> @@ -22,6 +22,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  #define TCP_SYN_RETRANSMISSION_TIMEOUT GRUB_NET_INTERVAL
>>  #define TCP_SYN_RETRANSMISSION_COUNT GRUB_NET_TRIES
>> @@ -65,6 +66,7 @@ struct grub_net_tcp_socket
>>    grub_uint32_t my_cur_seq;
>>    grub_uint32_t their_start_seq;
>>    grub_uint32_t their_cur_seq;
>> +  grub_uint32_t cur_tsecr;
>>    grub_uint16_t my_window;
>>    struct unacked *unack_first;
>>    struct unacked *unack_last;
>> @@ -94,6 +96,8 @@ struct grub_net_tcp_listen
>>    void *hook_data;
>>  };
>>
>> +#define ALIGNWORD(var) ((var) + 3) & (~3)
>> +
>
> Can we use ALIGN_UP here?
>
>>  struct tcphdr
>>  {
>>    grub_uint16_t src;
>> @@ -106,6 +110,25 @@ struct tcphdr
>>    grub_uint16_t urgent;
>>  } GRUB_PACKED;
>>
>> +struct tcp_opt
>
> This probably should be tcp_opt_hdr, it is not really complete option.
>
>> +{
>> +  grub_uint8_t kind;
>> +  grub_uint8_t length;
>> +} GRUB_PACKED;
>> +
>> +struct tcp_scale_opt
>> +{
>> +  struct tcp_opt opt;
>> +  grub_uint8_t scale;
>> +} GRUB_PACKED;
>> +
>> +struct tcp_timestamp_opt
>> +{
>> +  struct tcp_opt opt;
>> +  grub_uint32_t tsval;
>> +  grub_uint32_t tsecr;
>> +} GRUB_PACKED;
>> +
>>  struct tcp_pseudohdr
>>  {
>>    grub_uint32_t src;
>> @@ -299,9 +322,12 @@ ack_real (grub_net_tcp_socket_t sock, int res)
>>  {
>>    struct grub_net_buff *nb_ack;
>>    struct tcphdr *tcph_ack;
>> +  struct tcp_timestamp_opt *timestamp;
>> +  grub_size_t headersize;
>>    grub_err_t err;
>>
>> -  nb_ack = grub_netbuff_alloc (sizeof (*tcph_ack) + 128);
>> +  headersize = ALIGNWORD(sizeof (*tcph_ack) + sizeof (*timestamp));
>> +  nb_ack = grub_netbuff_alloc (headersize + 128);
>
> RFC 7323 recommends to use timestamp option only if mutually
> negotiated (i.e. both initial SYN and SYN+ACK contain timestamp
> option). So it probably should not be done unconditionally.
>
>>    if (!nb_ack)
>>      return;
>>    err = grub_netbuff_reserve (nb_ack, 128);
>> @@ -313,7 +339,7 @@ ack_real (grub_net_tcp_socket_t sock, int res)
>>        return;
>>      }
>>
>> -  err = grub_netbuff_put (nb_ack, sizeof (*tcph_ack));
>> +  err = grub_netbuff_put (nb_ack, headersize);
>>    if (err)
>>      {
>>        grub_netbuff_free (nb_ack);
>> @@ -322,22 +348,28 @@ ack_real (grub_net_tcp_socket_t sock, int res)
>>        return;
>>      }
>>    tcph_ack = (void *) nb_ack->data;
>> +  grub_memset (tcph_ack, 0, headersize);
>
> We did not do grub_memset before, why is it necessary now?

The timestamp option isn't 32-bit aligned, so we have to make sure the
tail end of the header is zeroed. I'll fix up the rest of these comments.
Thanks for the review,

Josef
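
For illustration only, here is a minimal standalone sketch of the two calculations discussed above: rounding the header size up to a 32-bit boundary, and splitting a requested window size into a 16-bit window field plus a shift count. It is not the code from the patch (that hunk is not quoted in this reply). The names align4, calc_window, struct window_params and TCP_MAX_WSCALE are hypothetical, and it uses plain C types rather than GRUB's. The limit of 14 on the shift count comes from RFC 7323. A function is used for the rounding because the quoted ALIGNWORD macro has no outer parentheses, so an expression such as ALIGNWORD(x) + 128 would not group the way it reads.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest shift count permitted for the window scale option (RFC 7323). */
#define TCP_MAX_WSCALE 14

/* Hypothetical helper: round n up to the next multiple of 4 bytes.
   Being a function, it cannot be mis-grouped inside a larger expression.  */
static size_t
align4 (size_t n)
{
  return (n + 3) & ~(size_t) 3;
}

/* Hypothetical result type: value for the 16-bit window field plus the
   shift count to advertise in the window scale option.  */
struct window_params
{
  uint16_t window;
  uint8_t scale;
};

/* Find the smallest shift for which the desired window fits in 16 bits.
   If even the maximum shift is not enough, clamp the field to 0xFFFF.  */
static struct window_params
calc_window (uint32_t desired)
{
  struct window_params p = { 0, 0 };
  uint32_t w = desired;

  while (w > 0xFFFF && p.scale < TCP_MAX_WSCALE)
    {
      w >>= 1;
      p.scale++;
    }
  if (w > 0xFFFF)
    w = 0xFFFF;
  p.window = (uint16_t) w;
  return p;
}
```

With a 20-byte TCP header and the 10-byte timestamp option, align4 (30) gives 32, which leaves the two trailing pad bytes that the grub_memset call zeroes. For the 1mb default, calc_window (1048576) yields a window field of 32768 with a shift of 5.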