From: Eric Dumazet <eric.dumazet@gmail.com>
To: Oleksandr Natalenko <oleksandr@natalenko.name>,
Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>,
"David S. Miller" <davem@davemloft.net>,
Netdev <netdev@vger.kernel.org>,
Yuchung Cheng <ycheng@google.com>,
Soheil Hassas Yeganeh <soheil@google.com>,
Jerry Chu <hkchu@google.com>, Dave Taht <dave.taht@gmail.com>
Subject: Re: TCP and BBR: reproducibly low cwnd and bandwidth
Date: Sun, 18 Feb 2018 13:04:27 -0800
Message-ID: <1518987867.55655.15.camel@gmail.com>
In-Reply-To: <1518893571.55655.12.camel@gmail.com>
On Sat, 2018-02-17 at 10:52 -0800, Eric Dumazet wrote:
>
> This must be some race condition in the code I added in TCP for
> self-pacing, when a short timeout is programmed.
>
> Disabling SG means TCP cooks 1-MSS packets.
>
> I will take a look, probably after the (long) week-end : Tuesday.
I was able to take a look today, and I believe this is the time to
switch TCP to GSO being always on.
As a bonus, we get a speed boost for cubic as well.
Today's high BDP and recent TCP improvements (rtx queue as rb-tree, sack
coalescing, TCP pacing...) all were developed/tested/maintained with
GSO/TSO being the norm.
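To put rough numbers on the difference between 1-MSS packets and GSO, here is a minimal sketch. The helper name skbs_needed() is hypothetical, and the figure of 45 segments per GSO skb is only an assumption (roughly 64 KB divided by a 1448-byte MSS); the real limit depends on sk_gso_max_size, header sizes and the congestion control in use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a kernel function: number of skbs needed to
 * carry `bytes` of payload when each skb holds at most `segs_per_skb`
 * segments of `mss` bytes each. */
static size_t skbs_needed(size_t bytes, size_t mss, size_t segs_per_skb)
{
	size_t per_skb;

	assert(mss > 0 && segs_per_skb > 0);
	per_skb = mss * segs_per_skb;
	/* Round up: a partially filled skb still has to be sent. */
	return (bytes + per_skb - 1) / per_skb;
}
```

Under these assumptions, 1 MiB of payload at MSS 1448 needs skbs_needed(1048576, 1448, 1) = 725 skbs without GSO, versus skbs_needed(1048576, 1448, 45) = 17 with it, so far fewer trips through the stack for the same data.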
Can you please test the following patch ?
Note that some cleanups can be done later in TCP stack, removing lots
of legacy stuff.
Also TCP internal-pacing could benefit from something similar to this
fq patch eventually, although there is no hurry.
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=fefa569a9d4bc4b7758c0fddd75bb0382c95da77
Of course, you have to consider why SG was disabled on your device;
this looks very pessimistic.
Thanks !
include/net/sock.h | 1 +
net/core/sock.c | 2 +-
net/ipv4/tcp_ipv4.c | 1 +
net/ipv6/tcp_ipv6.c | 1 +
4 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 169c92afcafa3d548f8238e91606b87c187559f4..df4ac691870ff9f779f1782ded58140eb4d3a961 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -417,6 +417,7 @@ struct sock {
struct page_frag sk_frag;
netdev_features_t sk_route_caps;
netdev_features_t sk_route_nocaps;
+ netdev_features_t sk_route_forced_caps;
int sk_gso_type;
unsigned int sk_gso_max_size;
gfp_t sk_allocation;
diff --git a/net/core/sock.c b/net/core/sock.c
index c501499a04fe973e80e18655b306d762d348ff44..b084acb3b3b96791663b731788a392041148416c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1773,7 +1773,7 @@ void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
u32 max_segs = 1;
sk_dst_set(sk, dst);
- sk->sk_route_caps = dst->dev->features;
+ sk->sk_route_caps = dst->dev->features | sk->sk_route_forced_caps;
if (sk->sk_route_caps & NETIF_F_GSO)
sk->sk_route_caps |= NETIF_F_GSO_SOFTWARE;
sk->sk_route_caps &= ~sk->sk_route_nocaps;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f8ad397e285e9b8db0b04f8abc30a42f22294ef9..eaf1e30fc5af879442f5f33ed4bd69f89dff8cfb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -233,6 +233,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
}
/* OK, now commit destination to socket. */
sk->sk_gso_type = SKB_GSO_TCPV4;
+ sk->sk_route_forced_caps = NETIF_F_GSO;
sk_setup_caps(sk, &rt->dst);
rt = NULL;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 412139f4eccd96923daaea064cd9fb8be13f5916..4a461e8e2d654aa341d525a0df609a294c2040df 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -269,6 +269,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
inet->inet_rcv_saddr = LOOPBACK4_IPV6;
sk->sk_gso_type = SKB_GSO_TCPV6;
+ sk->sk_route_forced_caps = NETIF_F_GSO;
ip6_dst_store(sk, dst, NULL, NULL);
icsk->icsk_ext_hdr_len = 0;
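For readers who want to see the effect of the sock.c hunk in isolation, here is a minimal user-space sketch of the mask computation. The FEAT_* bit values and the names features_t, compute_route_caps() and route_caps_example() are hypothetical stand-ins; the real netdev_features_t bits differ, and NETIF_F_GSO_SOFTWARE is a mask of several bits rather than a single one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for netdev_features_t and its bits. */
typedef uint64_t features_t;

#define FEAT_SG			(1ULL << 0)
#define FEAT_GSO		(1ULL << 1)
#define FEAT_GSO_SOFTWARE	(1ULL << 2)

/* Same order of operations as the patched sk_setup_caps():
 * 1. start from the device features and OR in the per-socket forced bits,
 * 2. if GSO is set, also enable the software GSO bits,
 * 3. finally clear everything the socket has vetoed (nocaps).
 */
static features_t compute_route_caps(features_t dev_features,
				     features_t forced_caps,
				     features_t nocaps)
{
	features_t caps = dev_features | forced_caps;

	if (caps & FEAT_GSO)
		caps |= FEAT_GSO_SOFTWARE;
	caps &= ~nocaps;
	return caps;
}

/* Usage example: a device advertising no features at all. */
void route_caps_example(void)
{
	/* Before the patch (nothing forced), the socket gets no GSO. */
	assert(compute_route_caps(0, 0, 0) == 0);
	/* With GSO forced by TCP, software GSO is enabled as well. */
	assert(compute_route_caps(0, FEAT_GSO, 0) ==
	       (FEAT_GSO | FEAT_GSO_SOFTWARE));
}
```

Note that the nocaps mask is applied last, so in this sketch a socket that has vetoed a bit still overrides the forced one, which matches the ordering of the lines in the hunk.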