netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/3] tcp: tso defer improvements
@ 2018-11-11 14:41 Eric Dumazet
  2018-11-11 14:41 ` [PATCH net-next 1/3] tcp: do not try to defer skbs with eor mark (MSG_EOR) Eric Dumazet
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Eric Dumazet @ 2018-11-11 14:41 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Soheil Hassas Yeganeh, Eric Dumazet

This series makes tcp_tso_should_defer() a bit smarter :

1) MSG_EOR gives a hint to TCP to not defer some skbs

2) Second patch takes into account that head tstamp
   can be in the future.

3) Third patch uses existing high resolution state variables
   to have a more precise heuristic.

Eric Dumazet (3):
  tcp: do not try to defer skbs with eor mark (MSG_EOR)
  tcp: refine tcp_tso_should_defer() after EDT adoption
  tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies

 net/ipv4/tcp_output.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

-- 
2.19.1.930.g4563a0d9d0-goog


* [PATCH net-next 1/3] tcp: do not try to defer skbs with eor mark (MSG_EOR)
  2018-11-11 14:41 [PATCH net-next 0/3] tcp: tso defer improvements Eric Dumazet
@ 2018-11-11 14:41 ` Eric Dumazet
  2018-11-11 18:58   ` Neal Cardwell
  2018-11-11 14:41 ` [PATCH net-next 2/3] tcp: refine tcp_tso_should_defer() after EDT adoption Eric Dumazet
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2018-11-11 14:41 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Soheil Hassas Yeganeh, Eric Dumazet

Applications using MSG_EOR are giving a strong hint to TCP stack :

Subsequent sendmsg() can not append more bytes to skbs having
the EOR mark.

Do not try to TSO defer such skbs, there is really no hope.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
---
 net/ipv4/tcp_output.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 9c34b97d365d719ff76250bc9fe7fa20495a3ed2..35feadf480300cd061411d65257f06ee658daa3c 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1944,6 +1944,10 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 	if ((skb != tcp_write_queue_tail(sk)) && (limit >= skb->len))
 		goto send_now;
 
+	/* If this packet won't get more data, do not wait. */
+	if (TCP_SKB_CB(skb)->eor)
+		goto send_now;
+
 	win_divisor = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_tso_win_divisor);
 	if (win_divisor) {
 		u32 chunk = min(tp->snd_wnd, tp->snd_cwnd * tp->mss_cache);
-- 
2.19.1.930.g4563a0d9d0-goog
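
For readers less familiar with MSG_EOR, here is a minimal userspace sketch of how an application might mark the end of a record on a connected TCP socket. The helper names record_flags() and send_chunk() are hypothetical and are not part of this patch; only send(2) and the MSG_EOR flag are real API. The idea is that only the final chunk of a record carries MSG_EOR, which is the hint the new check in tcp_tso_should_defer() acts on.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: pick send() flags for one chunk of a record.
 * Only the last chunk is marked with MSG_EOR.
 */
int record_flags(bool last_chunk)
{
	return last_chunk ? MSG_EOR : 0;
}

/* Hypothetical helper: send one chunk on a connected socket, retrying
 * on short writes and EINTR. Returns len on success, -1 on error
 * (errno is left as set by send()).
 */
ssize_t send_chunk(int fd, const void *buf, size_t len, bool last_chunk)
{
	const char *p = buf;
	size_t left = len;

	while (left > 0) {
		ssize_t n = send(fd, p, left, record_flags(last_chunk));

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		left -= (size_t)n;
	}
	return (ssize_t)len;
}
```

A caller would use send_chunk(fd, hdr, hdr_len, false) for the leading parts of a message and send_chunk(fd, body, body_len, true) for the final part.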


* [PATCH net-next 2/3] tcp: refine tcp_tso_should_defer() after EDT adoption
  2018-11-11 14:41 [PATCH net-next 0/3] tcp: tso defer improvements Eric Dumazet
  2018-11-11 14:41 ` [PATCH net-next 1/3] tcp: do not try to defer skbs with eor mark (MSG_EOR) Eric Dumazet
@ 2018-11-11 14:41 ` Eric Dumazet
  2018-11-11 19:02   ` Neal Cardwell
  2018-11-11 14:41 ` [PATCH net-next 3/3] tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies Eric Dumazet
  2018-11-11 21:55 ` [PATCH net-next 0/3] tcp: tso defer improvements David Miller
  3 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2018-11-11 14:41 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Soheil Hassas Yeganeh, Eric Dumazet

The last step of tcp_tso_should_defer() checks whether the probable
next ACK packet is coming in less than half an rtt.

The problem is that head->tstamp might be in the future,
so we need to use signed arithmetic to avoid overflows.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
---
 net/ipv4/tcp_output.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 35feadf480300cd061411d65257f06ee658daa3c..78a56cef7e397c65ad18897d550f3800e8fe8f41 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1907,10 +1907,11 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 				 bool *is_cwnd_limited, u32 max_segs)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
-	u32 age, send_win, cong_win, limit, in_flight;
+	u32 send_win, cong_win, limit, in_flight;
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *head;
 	int win_divisor;
+	s64 delta;
 
 	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
 		goto send_now;
@@ -1972,9 +1973,9 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 	head = tcp_rtx_queue_head(sk);
 	if (!head)
 		goto send_now;
-	age = tcp_stamp_us_delta(tp->tcp_mstamp, tcp_skb_timestamp_us(head));
+	delta = tp->tcp_clock_cache - head->tstamp;
 	/* If next ACK is likely to come too late (half srtt), do not defer */
-	if (age < (tp->srtt_us >> 4))
+	if ((s64)(delta - (u64)NSEC_PER_USEC * (tp->srtt_us >> 4)) < 0)
 		goto send_now;
 
 	/* Ok, it looks like it is advisable to defer. */
-- 
2.19.1.930.g4563a0d9d0-goog
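
The signed comparison above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the function names are hypothetical, timestamps are plain nanosecond counters, and srtt_us_x8 mirrors the tp->srtt_us convention of storing the smoothed RTT shifted left by 3 (so ">> 4" yields half the srtt in microseconds). The naive unsigned variant is shown for contrast only and is not the code this patch replaces.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000LL

/* Model of the patched test: true means "send now, do not defer".
 * The u64 difference is reinterpreted as signed, so a head timestamp
 * in the future gives a negative delta instead of a huge value
 * (assumes the usual two's complement conversion done by gcc/clang).
 */
bool ack_too_far_away(uint64_t now_ns, uint64_t head_tstamp_ns,
		      uint32_t srtt_us_x8)
{
	int64_t delta = (int64_t)(now_ns - head_tstamp_ns);
	int64_t half_srtt_ns = NSEC_PER_USEC * (int64_t)(srtt_us_x8 >> 4);

	return delta - half_srtt_ns < 0;
}

/* Naive unsigned variant, for contrast: when head_tstamp_ns is in the
 * future the subtraction wraps to a very large value, the comparison
 * is false, and the caller would wrongly decide to defer.
 */
bool ack_too_far_away_unsigned(uint64_t now_ns, uint64_t head_tstamp_ns,
			       uint32_t srtt_us_x8)
{
	uint64_t age_ns = now_ns - head_tstamp_ns;
	uint64_t half_srtt_ns = 1000ULL * (srtt_us_x8 >> 4);

	return age_ns < half_srtt_ns;
}
```

With an srtt of 1000 usec (srtt_us_x8 = 8000), a head skb stamped 1 ms in the future makes the signed version return true (send now), while the naive unsigned version returns false.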


* [PATCH net-next 3/3] tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies
  2018-11-11 14:41 [PATCH net-next 0/3] tcp: tso defer improvements Eric Dumazet
  2018-11-11 14:41 ` [PATCH net-next 1/3] tcp: do not try to defer skbs with eor mark (MSG_EOR) Eric Dumazet
  2018-11-11 14:41 ` [PATCH net-next 2/3] tcp: refine tcp_tso_should_defer() after EDT adoption Eric Dumazet
@ 2018-11-11 14:41 ` Eric Dumazet
  2018-11-11 19:06   ` Neal Cardwell
  2018-11-11 21:55 ` [PATCH net-next 0/3] tcp: tso defer improvements David Miller
  3 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2018-11-11 14:41 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Eric Dumazet, Soheil Hassas Yeganeh, Eric Dumazet

tcp_tso_should_defer() first heuristic is to not defer
if last send is "old enough".

Its current implementation uses jiffies and its low granularity.

TSO autodefer performance should not rely on kernel HZ :/

After EDT conversion, we have state variables in nanoseconds that
can allow us to properly implement the heuristic.

This patch increases TSO chunk sizes on medium rate flows,
especially when receivers do not use GRO or similar aggregation.

It also reduces bursts for HZ=100 or HZ=250 kernels, making TCP
behavior more uniform.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
---
 net/ipv4/tcp_output.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 78a56cef7e397c65ad18897d550f3800e8fe8f41..75dcf4daca724a6819e9ecc9d0f3e6dc6df72e9b 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1920,9 +1920,12 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb,
 		goto send_now;
 
 	/* Avoid bursty behavior by allowing defer
-	 * only if the last write was recent.
+	 * only if the last write was recent (1 ms).
+	 * Note that tp->tcp_wstamp_ns can be in the future if we have
+	 * packets waiting in a qdisc or device for EDT delivery.
 	 */
-	if ((s32)(tcp_jiffies32 - tp->lsndtime) > 0)
+	delta = tp->tcp_clock_cache - tp->tcp_wstamp_ns - NSEC_PER_MSEC;
+	if (delta > 0)
 		goto send_now;
 
 	in_flight = tcp_packets_in_flight(tp);
-- 
2.19.1.930.g4563a0d9d0-goog
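
To make the granularity argument concrete, here is a rough userspace model comparing the two checks. It is a sketch, not kernel code: the function names are hypothetical, and jiffies are approximated as a nanosecond clock divided by the tick length, which ignores the real jiffies initial offset. In both functions, true means "last write is old enough, send now".

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000LL
#define NSEC_PER_SEC  1000000000ULL

/* Model of the old jiffies-based test: any tick boundary crossed since
 * the last write counts as "old", regardless of the real elapsed time.
 */
bool old_write_is_old(uint64_t now_ns, uint64_t last_ns, uint32_t hz)
{
	uint64_t tick_ns = NSEC_PER_SEC / hz;
	uint32_t j_now = (uint32_t)(now_ns / tick_ns);
	uint32_t j_last = (uint32_t)(last_ns / tick_ns);

	return (int32_t)(j_now - j_last) > 0;
}

/* Model of the new test: more than 1 ms since the last write.
 * wstamp_ns may be in the future (EDT), which gives a negative delta.
 */
bool new_write_is_old(uint64_t now_ns, uint64_t wstamp_ns)
{
	int64_t delta = (int64_t)(now_ns - wstamp_ns) - NSEC_PER_MSEC;

	return delta > 0;
}
```

With hz = 100, a write at 9.9 ms followed by a check at 10.0 ms is "old" for the jiffies model after only 0.1 ms, while a write at 0.1 ms checked at 9.9 ms is still "recent" after 9.8 ms. The nanosecond model gives the opposite, HZ-independent answers in both cases.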


* Re: [PATCH net-next 1/3] tcp: do not try to defer skbs with eor mark (MSG_EOR)
  2018-11-11 14:41 ` [PATCH net-next 1/3] tcp: do not try to defer skbs with eor mark (MSG_EOR) Eric Dumazet
@ 2018-11-11 18:58   ` Neal Cardwell
  0 siblings, 0 replies; 9+ messages in thread
From: Neal Cardwell @ 2018-11-11 18:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, Netdev, Soheil Hassas Yeganeh, Eric Dumazet

On Sun, Nov 11, 2018 at 9:41 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Applications using MSG_EOR are giving a strong hint to TCP stack :
>
> Subsequent sendmsg() can not append more bytes to skbs having
> the EOR mark.
>
> Do not try to TSO defer such skbs, there is really no hope.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> ---
>  net/ipv4/tcp_output.c | 4 ++++
>  1 file changed, 4 insertions(+)

Thanks!

Acked-by: Neal Cardwell <ncardwell@google.com>

neal


* Re: [PATCH net-next 2/3] tcp: refine tcp_tso_should_defer() after EDT adoption
  2018-11-11 14:41 ` [PATCH net-next 2/3] tcp: refine tcp_tso_should_defer() after EDT adoption Eric Dumazet
@ 2018-11-11 19:02   ` Neal Cardwell
  0 siblings, 0 replies; 9+ messages in thread
From: Neal Cardwell @ 2018-11-11 19:02 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, Netdev, Soheil Hassas Yeganeh, Eric Dumazet

On Sun, Nov 11, 2018 at 9:41 AM Eric Dumazet <edumazet@google.com> wrote:
>
> tcp_tso_should_defer() last step tries to check if the probable
> next ACK packet is coming in less than half rtt.
>
> Problem is that the head->tstamp might be in the future,
> so we need to use signed arithmetic to avoid overflows.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> ---
>  net/ipv4/tcp_output.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Thanks!

Acked-by: Neal Cardwell <ncardwell@google.com>

neal


* Re: [PATCH net-next 3/3] tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies
  2018-11-11 14:41 ` [PATCH net-next 3/3] tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies Eric Dumazet
@ 2018-11-11 19:06   ` Neal Cardwell
  2018-11-12 16:52     ` Yuchung Cheng
  0 siblings, 1 reply; 9+ messages in thread
From: Neal Cardwell @ 2018-11-11 19:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, Netdev, Soheil Hassas Yeganeh, Eric Dumazet

On Sun, Nov 11, 2018 at 9:41 AM Eric Dumazet <edumazet@google.com> wrote:
>
> tcp_tso_should_defer() first heuristic is to not defer
> if last send is "old enough".
>
> Its current implementation uses jiffies and its low granularity.
>
> TSO autodefer performance should not rely on kernel HZ :/
>
> After EDT conversion, we have state variables in nanoseconds that
> can allow us to properly implement the heuristic.
>
> This patch increases TSO chunk sizes on medium rate flows,
> especially when receivers do not use GRO or similar aggregation.
>
> It also reduces bursts for HZ=100 or HZ=250 kernels, making TCP
> behavior more uniform.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> ---

Nice. Thanks!

Acked-by: Neal Cardwell <ncardwell@google.com>

neal


* Re: [PATCH net-next 0/3] tcp: tso defer improvements
  2018-11-11 14:41 [PATCH net-next 0/3] tcp: tso defer improvements Eric Dumazet
                   ` (2 preceding siblings ...)
  2018-11-11 14:41 ` [PATCH net-next 3/3] tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies Eric Dumazet
@ 2018-11-11 21:55 ` David Miller
  3 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2018-11-11 21:55 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, soheil, eric.dumazet

From: Eric Dumazet <edumazet@google.com>
Date: Sun, 11 Nov 2018 06:41:28 -0800

> This series makes tcp_tso_should_defer() a bit smarter :
> 
> 1) MSG_EOR gives a hint to TCP to not defer some skbs
> 
> 2) Second patch takes into account that head tstamp
>    can be in the future.
> 
> 3) Third patch uses existing high resolution state variables
>    to have a more precise heuristic.

This stuff is fantastic, series applied.


* Re: [PATCH net-next 3/3] tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies
  2018-11-11 19:06   ` Neal Cardwell
@ 2018-11-12 16:52     ` Yuchung Cheng
  0 siblings, 0 replies; 9+ messages in thread
From: Yuchung Cheng @ 2018-11-12 16:52 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: Eric Dumazet, David Miller, Netdev, Soheil Hassas Yeganeh,
	Eric Dumazet

On Sun, Nov 11, 2018 at 11:06 AM, Neal Cardwell <ncardwell@google.com> wrote:
> On Sun, Nov 11, 2018 at 9:41 AM Eric Dumazet <edumazet@google.com> wrote:
>>
>> tcp_tso_should_defer() first heuristic is to not defer
>> if last send is "old enough".
>>
>> Its current implementation uses jiffies and its low granularity.
>>
>> TSO autodefer performance should not rely on kernel HZ :/
>>
>> After EDT conversion, we have state variables in nanoseconds that
>> can allow us to properly implement the heuristic.
>>
>> This patch increases TSO chunk sizes on medium rate flows,
>> especially when receivers do not use GRO or similar aggregation.
>>
>> It also reduces bursts for HZ=100 or HZ=250 kernels, making TCP
>> behavior more uniform.
>>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>> Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
>> ---
>
> Nice. Thanks!
>
> Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>

Love it
>
> neal

