From: Eric Dumazet <edumazet@google.com>
To: "David S . Miller" <davem@davemloft.net>
Cc: netdev <netdev@vger.kernel.org>,
Eric Dumazet <edumazet@google.com>,
Eric Dumazet <eric.dumazet@gmail.com>,
Soheil Hassas Yeganeh <soheil@google.com>,
Neal Cardwell <ncardwell@google.com>,
Yuchung Cheng <ycheng@google.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Tariq Toukan <tariqt@mellanox.com>
Subject: [PATCH net-next] tcp: force a PSH flag on TSO packets
Date: Tue, 10 Sep 2019 14:49:28 -0700 [thread overview]
Message-ID: <20190910214928.220727-1-edumazet@google.com> (raw)
When tcp sends a TSO packet, adding a PSH flag on it
reduces the sojourn time of GRO packet in GRO receivers.
This is particularly the case under pressure, since RX queues
receive packets for many concurrent flows.
A sender can give a hint to GRO engines when it is
appropriate to flush a super-packet, especially when pacing
is in the picture, since next packet is probably delayed by
one ms.
Having less packets in GRO engine reduces chance
of LRU eviction or inflated RTT, and reduces GRO cost.
We found recently that we must not set the PSH flag on
individual full-size MSS segments [1] :
Under pressure (CWR state), we better let the packet sit
for a small delay (depending on NAPI logic) so that the
ACK packet is delayed, and thus next packet we send is
also delayed a bit. Eventually the bottleneck queue can
be drained. DCTCP flows with CWND=1 have demonstrated
the issue.
This patch allows to slowdown the aggregate traffic without
involving high resolution timers on senders and/or
receivers.
It has been used at Google for about four years,
and has been discussed at various networking conferences.
[1] segments smaller than MSS already have PSH flag set
by tcp_sendmsg() / tcp_mark_push(), unless MSG_MORE
has been requested by the user.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tariq Toukan <tariqt@mellanox.com>
---
net/ipv4/tcp_output.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 42abc9bd687a5fea627cbc7cfa750d022f393d84..fec6d67bfd146dc78f0f25173fd71b8b8cc752fe 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1050,11 +1050,22 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
tcb = TCP_SKB_CB(skb);
memset(&opts, 0, sizeof(opts));
- if (unlikely(tcb->tcp_flags & TCPHDR_SYN))
+ if (unlikely(tcb->tcp_flags & TCPHDR_SYN)) {
tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5);
- else
+ } else {
tcp_options_size = tcp_established_options(sk, skb, &opts,
&md5);
+ /* Force a PSH flag on all (GSO) packets to expedite GRO flush
+ * at receiver : This slightly improve GRO performance.
+ * Note that we do not force the PSH flag for non GSO packets,
+ * because they might be sent under high congestion events,
+ * and in this case it is better to delay the delivery of 1-MSS
+ * packets and thus the corresponding ACK packet that would
+ * release the following packet.
+ */
+ if (tcp_skb_pcount(skb) > 1)
+ tcb->tcp_flags |= TCPHDR_PSH;
+ }
tcp_header_size = tcp_options_size + sizeof(struct tcphdr);
/* if no packet is in qdisc/device queue, then allow XPS to select
--
2.23.0.162.g0b9fbb3734-goog
next reply other threads:[~2019-09-10 21:49 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-10 21:49 Eric Dumazet [this message]
2019-09-10 21:52 ` [PATCH net-next] tcp: force a PSH flag on TSO packets Soheil Hassas Yeganeh
2019-09-11 0:55 ` Neal Cardwell
2019-09-16 1:32 ` [tcp] 51ce142de7: packetdrill.packetdrill/tests/linux/undo_undo-fr-ack-then-dsack-on-ack-below-snd_una.fail kernel test robot
2019-09-19 12:17 ` [PATCH net-next] tcp: force a PSH flag on TSO packets Or Gerlitz
2019-09-19 12:32 ` David Miller
2019-09-19 13:46 ` Eric Dumazet
2019-09-19 14:00 ` Or Gerlitz
2019-09-19 14:18 ` Eric Dumazet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190910214928.220727-1-edumazet@google.com \
--to=edumazet@google.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=ncardwell@google.com \
--cc=netdev@vger.kernel.org \
--cc=soheil@google.com \
--cc=tariqt@mellanox.com \
--cc=ycheng@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).