From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pg0-f50.google.com ([74.125.83.50]:36854 "EHLO
  mail-pg0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
  with ESMTP id S1751756AbeBRVEc (ORCPT ); Sun, 18 Feb 2018 16:04:32 -0500
Received: by mail-pg0-f50.google.com with SMTP id j9so5139712pgv.3
  for ; Sun, 18 Feb 2018 13:04:31 -0800 (PST)
Message-ID: <1518987867.55655.15.camel@gmail.com>
Subject: Re: TCP and BBR: reproducibly low cwnd and bandwidth
From: Eric Dumazet
To: Oleksandr Natalenko, Eric Dumazet
Cc: Neal Cardwell, "David S. Miller", Netdev, Yuchung Cheng,
  Soheil Hassas Yeganeh, Jerry Chu, Dave Taht
Date: Sun, 18 Feb 2018 13:04:27 -0800
In-Reply-To: <1518893571.55655.12.camel@gmail.com>
References: <1697118.nv5eASg0nx@natalenko.name>
  <7409814.oObJlsYiIU@natalenko.name>
  <5668348.WVIY7FqTii@natalenko.name>
  <1518893571.55655.12.camel@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: netdev-owner@vger.kernel.org
List-ID:

On Sat, 2018-02-17 at 10:52 -0800, Eric Dumazet wrote:
>
> This must be some race condition in the code I added in TCP for
> self-pacing, when a short timeout is programmed.
>
> Disabling SG means TCP cooks 1-MSS packets.
>
> I will take a look, probably after the (long) weekend: Tuesday.

I was able to take a look today, and I believe this is the time to
switch TCP to GSO being always on.

As a bonus, we get a speed boost for cubic as well.

Today's high BDP and recent TCP improvements (rtx queue as rb-tree,
sack coalescing, TCP pacing...) were all developed/tested/maintained
with GSO/TSO being the norm.

Can you please test the following patch?

Note that some cleanups can be done later in the TCP stack, removing
lots of legacy stuff.

Also, TCP internal pacing could eventually benefit from something
similar to this fq patch, although there is no hurry.
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=fefa569a9d4bc4b7758c0fddd75bb0382c95da77

Of course, you have to consider why SG was disabled on your device,
this looks very pessimistic.

Thanks !

 include/net/sock.h  |    1 +
 net/core/sock.c     |    2 +-
 net/ipv4/tcp_ipv4.c |    1 +
 net/ipv6/tcp_ipv6.c |    1 +
 4 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 169c92afcafa3d548f8238e91606b87c187559f4..df4ac691870ff9f779f1782ded58140eb4d3a961 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -417,6 +417,7 @@ struct sock {
 	struct page_frag	sk_frag;
 	netdev_features_t	sk_route_caps;
 	netdev_features_t	sk_route_nocaps;
+	netdev_features_t	sk_route_forced_caps;
 	int			sk_gso_type;
 	unsigned int		sk_gso_max_size;
 	gfp_t			sk_allocation;
diff --git a/net/core/sock.c b/net/core/sock.c
index c501499a04fe973e80e18655b306d762d348ff44..b084acb3b3b96791663b731788a392041148416c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1773,7 +1773,7 @@ void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
 	u32 max_segs = 1;
 
 	sk_dst_set(sk, dst);
-	sk->sk_route_caps = dst->dev->features;
+	sk->sk_route_caps = dst->dev->features | sk->sk_route_forced_caps;
 	if (sk->sk_route_caps & NETIF_F_GSO)
 		sk->sk_route_caps |= NETIF_F_GSO_SOFTWARE;
 	sk->sk_route_caps &= ~sk->sk_route_nocaps;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f8ad397e285e9b8db0b04f8abc30a42f22294ef9..eaf1e30fc5af879442f5f33ed4bd69f89dff8cfb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -233,6 +233,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	}
 
 	/* OK, now commit destination to socket.  */
 	sk->sk_gso_type = SKB_GSO_TCPV4;
+	sk->sk_route_forced_caps = NETIF_F_GSO;
 	sk_setup_caps(sk, &rt->dst);
 	rt = NULL;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 412139f4eccd96923daaea064cd9fb8be13f5916..4a461e8e2d654aa341d525a0df609a294c2040df 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -269,6 +269,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 	inet->inet_rcv_saddr = LOOPBACK4_IPV6;
 
 	sk->sk_gso_type = SKB_GSO_TCPV6;
+	sk->sk_route_forced_caps = NETIF_F_GSO;
 	ip6_dst_store(sk, dst, NULL, NULL);
 
 	icsk->icsk_ext_hdr_len = 0;
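
For anyone who wants to poke at the mask arithmetic of the sock.c hunk
outside the kernel, here is a rough userspace sketch. It is not kernel
code: the demo_* names and the bit values are made up for illustration,
and the real NETIF_F_GSO_SOFTWARE is a set of feature bits rather than
the single stand-in bit used here. It only mirrors the three statements
of sk_setup_caps() visible in the diff above.

```c
#include <stdint.h>

/* Hypothetical stand-in for netdev_features_t. */
typedef uint64_t demo_features_t;

/* Hypothetical bit values; the real NETIF_F_* constants differ. */
#define DEMO_F_SG           (1ULL << 0)
#define DEMO_F_GSO          (1ULL << 1)
#define DEMO_F_GSO_SOFTWARE (1ULL << 2) /* one bit standing in for a mask */

/*
 * Mirrors the patched logic:
 *   caps  = device features | forced caps
 *   if GSO is set, also enable the software GSO bits
 *   caps &= ~nocaps
 */
static demo_features_t demo_route_caps(demo_features_t dev_features,
				       demo_features_t forced_caps,
				       demo_features_t nocaps)
{
	demo_features_t caps = dev_features | forced_caps;

	if (caps & DEMO_F_GSO)
		caps |= DEMO_F_GSO_SOFTWARE;
	caps &= ~nocaps;
	return caps;
}
```

With forced_caps set to the GSO bit, as tcp_v4_connect() and
tcp_v6_connect() do in the patch, a device advertising no features at
all still ends up with GSO enabled on the socket. Note that the nocaps
mask is applied last, so it can still clear the forced bit.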