Re: TCP and BBR: reproducibly low cwnd and bandwidth

From: Eric Dumazet <eric.dumazet@gmail.com>
To: Oleksandr Natalenko <oleksandr@natalenko.name>,
	Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	Netdev <netdev@vger.kernel.org>,
	Yuchung Cheng <ycheng@google.com>,
	Soheil Hassas Yeganeh <soheil@google.com>,
	Jerry Chu <hkchu@google.com>, Dave Taht <dave.taht@gmail.com>
Subject: Re: TCP and BBR: reproducibly low cwnd and bandwidth
Date: Sun, 18 Feb 2018 13:04:27 -0800
Message-ID: <1518987867.55655.15.camel@gmail.com>
In-Reply-To: <1518893571.55655.12.camel@gmail.com>

On Sat, 2018-02-17 at 10:52 -0800, Eric Dumazet wrote:
> 
> This must be some race condition in the code I added in TCP for self-
> pacing, when a short timeout is programmed.
> 
> Disabling SG means TCP cooks 1-MSS packets.
> 
> I will take a look, probably after the (long) week-end : Tuesday.

I was able to take a look today, and I believe this is the time to
switch TCP to GSO being always on.

As a bonus, we get a speed boost for cubic as well.

Today's high BDP and recent TCP improvements (rtx queue as rb-tree, sack
coalescing, TCP pacing...) were all developed/tested/maintained with
GSO/TSO being the norm.

Can you please test the following patch ?

Note that some cleanups can be done later in TCP stack, removing lots
of legacy stuff.

Also TCP internal-pacing could benefit from something similar to this
fq patch eventually, although there is no hurry.
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=fefa569a9d4bc4b7758c0fddd75bb0382c95da77  

Of course, you have to consider why SG was disabled on your device;
this looks very pessimistic.

Thanks !

 include/net/sock.h  |    1 +
 net/core/sock.c     |    2 +-
 net/ipv4/tcp_ipv4.c |    1 +
 net/ipv6/tcp_ipv6.c |    1 +
 4 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 169c92afcafa3d548f8238e91606b87c187559f4..df4ac691870ff9f779f1782ded58140eb4d3a961 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -417,6 +417,7 @@ struct sock {
 	struct page_frag	sk_frag;
 	netdev_features_t	sk_route_caps;
 	netdev_features_t	sk_route_nocaps;
+	netdev_features_t	sk_route_forced_caps;
 	int			sk_gso_type;
 	unsigned int		sk_gso_max_size;
 	gfp_t			sk_allocation;
diff --git a/net/core/sock.c b/net/core/sock.c
index c501499a04fe973e80e18655b306d762d348ff44..b084acb3b3b96791663b731788a392041148416c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1773,7 +1773,7 @@ void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
 	u32 max_segs = 1;
 
 	sk_dst_set(sk, dst);
-	sk->sk_route_caps = dst->dev->features;
+	sk->sk_route_caps = dst->dev->features | sk->sk_route_forced_caps;
 	if (sk->sk_route_caps & NETIF_F_GSO)
 		sk->sk_route_caps |= NETIF_F_GSO_SOFTWARE;
 	sk->sk_route_caps &= ~sk->sk_route_nocaps;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f8ad397e285e9b8db0b04f8abc30a42f22294ef9..eaf1e30fc5af879442f5f33ed4bd69f89dff8cfb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -233,6 +233,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	}
 	/* OK, now commit destination to socket.  */
 	sk->sk_gso_type = SKB_GSO_TCPV4;
+	sk->sk_route_forced_caps = NETIF_F_GSO;
 	sk_setup_caps(sk, &rt->dst);
 	rt = NULL;
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 412139f4eccd96923daaea064cd9fb8be13f5916..4a461e8e2d654aa341d525a0df609a294c2040df 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -269,6 +269,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
 	inet->inet_rcv_saddr = LOOPBACK4_IPV6;
 
 	sk->sk_gso_type = SKB_GSO_TCPV6;
+	sk->sk_route_forced_caps = NETIF_F_GSO;
 	ip6_dst_store(sk, dst, NULL, NULL);
 
 	icsk->icsk_ext_hdr_len = 0;
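
For readers who want to see the effect of the one-line change in
sk_setup_caps() in isolation, here is a minimal userspace sketch of the
mask computation. It is only a model under stated assumptions: the F_*
bit values, struct model_sock and model_setup_caps() are hypothetical
stand-ins, not the kernel's NETIF_F_* definitions or struct sock, and
the rest of what sk_setup_caps() does is left out.

```c
#include <stdint.h>

/* Stand-in for netdev_features_t (assumption for this sketch). */
typedef uint64_t features_t;

/* Illustrative bit values only; NOT the kernel's real NETIF_F_* bits. */
#define F_SG           (1ULL << 0)
#define F_GSO          (1ULL << 1)
#define F_TSO          (1ULL << 2)
#define F_TSO6         (1ULL << 3)
/* Hypothetical, reduced version of the software-GSO feature set. */
#define F_GSO_SOFTWARE (F_TSO | F_TSO6)

/* Only the three capability masks touched by the patch. */
struct model_sock {
	features_t route_caps;
	features_t route_nocaps;
	features_t route_forced_caps;
};

/*
 * Mirrors the three mask operations visible in the patched hunk:
 * 1. start from the device features OR-ed with the forced caps,
 * 2. if GSO is set, also enable the software GSO feature set,
 * 3. finally clear whatever the socket explicitly refuses.
 */
static features_t model_setup_caps(struct model_sock *sk,
				   features_t dev_features)
{
	sk->route_caps = dev_features | sk->route_forced_caps;
	if (sk->route_caps & F_GSO)
		sk->route_caps |= F_GSO_SOFTWARE;
	sk->route_caps &= ~sk->route_nocaps;
	return sk->route_caps;
}
```

In this model, a device advertising no features at all still ends up
with F_GSO and the software GSO bits once route_forced_caps holds
F_GSO, which is the behaviour the patch aims for on TCP sockets.
Without the forced bit the same device yields an empty mask, and
route_nocaps still has the last word in both cases.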

Thread overview: 26+ messages
2018-02-15 20:42 TCP and BBR: reproducibly low cwnd and bandwidth Oleksandr Natalenko
2018-02-16 15:15 ` Oleksandr Natalenko
2018-02-16 16:25   ` Eric Dumazet
2018-02-16 17:37     ` Oleksandr Natalenko
2018-02-16 16:26   ` Holger Hoffstätte
2018-02-16 16:56     ` Neal Cardwell
2018-02-16 17:13       ` Holger Hoffstätte
2018-02-16 17:35     ` Oleksandr Natalenko
2018-02-16 16:21 ` Eric Dumazet
     [not found]   ` <CADVnQymiswHBp32dcMvWd1WfYLpFqY4QTas8yABFQE7KKKc5ag@mail.gmail.com>
2018-02-16 16:43     ` Eric Dumazet
2018-02-16 16:45       ` Neal Cardwell
2018-02-16 17:00         ` Oleksandr Natalenko
2018-02-16 17:25     ` Oleksandr Natalenko
2018-02-16 17:56       ` Holger Hoffstätte
2018-02-16 19:54         ` Oleksandr Natalenko
2018-02-16 20:54       ` Eric Dumazet
2018-02-16 22:50         ` Eric Dumazet
2018-02-16 23:06           ` Oleksandr Natalenko
2018-02-16 22:50         ` Oleksandr Natalenko
2018-02-16 22:59           ` Eric Dumazet
2018-02-17 10:01             ` Oleksandr Natalenko
2018-02-17 18:52               ` Eric Dumazet
2018-02-18 21:04                 ` Eric Dumazet [this message]
2018-02-18 21:06                   ` Eric Dumazet
2018-02-18 21:49                   ` Oleksandr Natalenko
2018-02-18 22:24                     ` Eric Dumazet