netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/3] tcp: add tcp_delack_max()
@ 2023-09-20 17:29 Eric Dumazet
  2023-09-20 17:29 ` [PATCH net-next 1/3] net: constify sk_dst_get() and __sk_dst_get() argument Eric Dumazet
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Eric Dumazet @ 2023-09-20 17:29 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng, netdev,
	eric.dumazet, Eric Dumazet

First patches are adding const qualifiers to four existing helpers.

Third patch adds a much needed companion feature to RTAX_RTO_MIN.

Eric Dumazet (3):
  net: constify sk_dst_get() and __sk_dst_get() argument
  tcp: constify tcp_rto_min() and tcp_rto_min_us() argument
  tcp: derive delack_max from rto_min

 include/net/sock.h    |  4 ++--
 include/net/tcp.h     |  6 ++++--
 net/ipv4/tcp.c        |  3 ++-
 net/ipv4/tcp_output.c | 16 +++++++++++++++-
 4 files changed, 23 insertions(+), 6 deletions(-)

-- 
2.42.0.459.ge4e396fd5e-goog


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH net-next 1/3] net: constify sk_dst_get() and __sk_dst_get() argument
  2023-09-20 17:29 [PATCH net-next 0/3] tcp: add tcp_delack_max() Eric Dumazet
@ 2023-09-20 17:29 ` Eric Dumazet
  2023-09-20 17:29 ` [PATCH net-next 2/3] tcp: constify tcp_rto_min() and tcp_rto_min_us() argument Eric Dumazet
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2023-09-20 17:29 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng, netdev,
	eric.dumazet, Eric Dumazet

Both helpers only read fields from their socket argument.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/sock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 56ac1abadea59e6734396a7ef2e22518a0ba80a1..2800d587b08ddc93d7d190ea71dd96a1a3591c1a 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2141,14 +2141,14 @@ static inline bool sk_rethink_txhash(struct sock *sk)
 }
 
 static inline struct dst_entry *
-__sk_dst_get(struct sock *sk)
+__sk_dst_get(const struct sock *sk)
 {
 	return rcu_dereference_check(sk->sk_dst_cache,
 				     lockdep_sock_is_held(sk));
 }
 
 static inline struct dst_entry *
-sk_dst_get(struct sock *sk)
+sk_dst_get(const struct sock *sk)
 {
 	struct dst_entry *dst;
 
-- 
2.42.0.459.ge4e396fd5e-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next 2/3] tcp: constify tcp_rto_min() and tcp_rto_min_us() argument
  2023-09-20 17:29 [PATCH net-next 0/3] tcp: add tcp_delack_max() Eric Dumazet
  2023-09-20 17:29 ` [PATCH net-next 1/3] net: constify sk_dst_get() and __sk_dst_get() argument Eric Dumazet
@ 2023-09-20 17:29 ` Eric Dumazet
  2023-09-20 17:29 ` [PATCH net-next 3/3] tcp: derive delack_max from rto_min Eric Dumazet
  2023-10-01 12:20 ` [PATCH net-next 0/3] tcp: add tcp_delack_max() patchwork-bot+netdevbpf
  3 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2023-09-20 17:29 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng, netdev,
	eric.dumazet, Eric Dumazet

Make clear these functions do not change any field from TCP socket.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 91688d0dadcd6f72144aac747178de8d85f15bf7..a8db7d43fb6215197af4a80e270b8c82070d55cb 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -719,7 +719,7 @@ static inline void tcp_fast_path_check(struct sock *sk)
 }
 
 /* Compute the actual rto_min value */
-static inline u32 tcp_rto_min(struct sock *sk)
+static inline u32 tcp_rto_min(const struct sock *sk)
 {
 	const struct dst_entry *dst = __sk_dst_get(sk);
 	u32 rto_min = inet_csk(sk)->icsk_rto_min;
@@ -729,7 +729,7 @@ static inline u32 tcp_rto_min(struct sock *sk)
 	return rto_min;
 }
 
-static inline u32 tcp_rto_min_us(struct sock *sk)
+static inline u32 tcp_rto_min_us(const struct sock *sk)
 {
 	return jiffies_to_usecs(tcp_rto_min(sk));
 }
-- 
2.42.0.459.ge4e396fd5e-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-20 17:29 [PATCH net-next 0/3] tcp: add tcp_delack_max() Eric Dumazet
  2023-09-20 17:29 ` [PATCH net-next 1/3] net: constify sk_dst_get() and __sk_dst_get() argument Eric Dumazet
  2023-09-20 17:29 ` [PATCH net-next 2/3] tcp: constify tcp_rto_min() and tcp_rto_min_us() argument Eric Dumazet
@ 2023-09-20 17:29 ` Eric Dumazet
  2023-09-20 17:34   ` Soheil Hassas Yeganeh
                     ` (2 more replies)
  2023-10-01 12:20 ` [PATCH net-next 0/3] tcp: add tcp_delack_max() patchwork-bot+netdevbpf
  3 siblings, 3 replies; 14+ messages in thread
From: Eric Dumazet @ 2023-09-20 17:29 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng, netdev,
	eric.dumazet, Eric Dumazet

While BPF allows to set icsk->->icsk_delack_max
and/or icsk->icsk_rto_min, we have an ip route
attribute (RTAX_RTO_MIN) to be able to tune rto_min,
but nothing to consequently adjust max delayed ack,
which vary from 40ms to 200 ms (TCP_DELACK_{MIN|MAX}).

This makes RTAX_RTO_MIN of almost no practical use,
unless customers are in big trouble.

Modern days datacenter communications want to set
rto_min to ~5 ms, and the max delayed ack one jiffie
smaller to avoid spurious retransmits.

After this patch, an "rto_min 5" route attribute will
effectively lower max delayed ack timers to 4 ms.

Note in the following ss output, "rto:6 ... ato:4"

$ ss -temoi dst XXXXXX
State Recv-Q Send-Q           Local Address:Port       Peer Address:Port  Process
ESTAB 0      0        [2002:a05:6608:295::]:52950   [2002:a05:6608:297::]:41597
     ino:255134 sk:1001 <->
         skmem:(r0,rb1707063,t872,tb262144,f0,w0,o0,bl0,d0) ts sack
 cubic wscale:8,8 rto:6 rtt:0.02/0.002 ato:4 mss:4096 pmtu:4500
 rcvmss:536 advmss:4096 cwnd:10 bytes_sent:54823160 bytes_acked:54823121
 bytes_received:54823120 segs_out:1370582 segs_in:1370580
 data_segs_out:1370579 data_segs_in:1370578 send 16.4Gbps
 pacing_rate 32.6Gbps delivery_rate 1.72Gbps delivered:1370579
 busy:26920ms unacked:1 rcv_rtt:34.615 rcv_space:65920
 rcv_ssthresh:65535 minrtt:0.015 snd_wnd:65536

While we could argue this patch fixes a bug with RTAX_RTO_MIN,
I do not add a Fixes: tag, so that we can soak it a bit before
asking backports to stable branches.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/tcp.h     |  2 ++
 net/ipv4/tcp.c        |  3 ++-
 net/ipv4/tcp_output.c | 16 +++++++++++++++-
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index a8db7d43fb6215197af4a80e270b8c82070d55cb..af9cb37fbe53ec60b4e545e8aa0740bbf30da7b6 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -718,6 +718,8 @@ static inline void tcp_fast_path_check(struct sock *sk)
 		tcp_fast_path_on(tp);
 }
 
+u32 tcp_delack_max(const struct sock *sk);
+
 /* Compute the actual rto_min value */
 static inline u32 tcp_rto_min(const struct sock *sk)
 {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 69b8d707370844020770438cc4f31aeda4830b3d..e54f91eb943b2f09f303951cc72cbea61ada519d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3762,7 +3762,8 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
 		info->tcpi_options |= TCPI_OPT_SYN_DATA;
 
 	info->tcpi_rto = jiffies_to_usecs(icsk->icsk_rto);
-	info->tcpi_ato = jiffies_to_usecs(icsk->icsk_ack.ato);
+	info->tcpi_ato = jiffies_to_usecs(min(icsk->icsk_ack.ato,
+					      tcp_delack_max(sk)));
 	info->tcpi_snd_mss = tp->mss_cache;
 	info->tcpi_rcv_mss = icsk->icsk_ack.rcv_mss;
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 1fc1f879cfd6c28cd655bb8f02eff6624eec2ffc..2d1e4b5ac1ca41ff3db8dc58458d4e922a2c4999 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3977,6 +3977,20 @@ int tcp_connect(struct sock *sk)
 }
 EXPORT_SYMBOL(tcp_connect);
 
+u32 tcp_delack_max(const struct sock *sk)
+{
+	const struct dst_entry *dst = __sk_dst_get(sk);
+	u32 delack_max = inet_csk(sk)->icsk_delack_max;
+
+	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN)) {
+		u32 rto_min = dst_metric_rtt(dst, RTAX_RTO_MIN);
+		u32 delack_from_rto_min = max_t(int, 1, rto_min - 1);
+
+		delack_max = min_t(u32, delack_max, delack_from_rto_min);
+	}
+	return delack_max;
+}
+
 /* Send out a delayed ack, the caller does the policy checking
  * to see if we should even be here.  See tcp_input.c:tcp_ack_snd_check()
  * for details.
@@ -4012,7 +4026,7 @@ void tcp_send_delayed_ack(struct sock *sk)
 		ato = min(ato, max_ato);
 	}
 
-	ato = min_t(u32, ato, inet_csk(sk)->icsk_delack_max);
+	ato = min_t(u32, ato, tcp_delack_max(sk));
 
 	/* Stay within the limit we were given */
 	timeout = jiffies + ato;
-- 
2.42.0.459.ge4e396fd5e-goog


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-20 17:29 ` [PATCH net-next 3/3] tcp: derive delack_max from rto_min Eric Dumazet
@ 2023-09-20 17:34   ` Soheil Hassas Yeganeh
  2023-09-20 19:06   ` Neal Cardwell
  2023-09-20 21:57   ` David Ahern
  2 siblings, 0 replies; 14+ messages in thread
From: Soheil Hassas Yeganeh @ 2023-09-20 17:34 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Neal Cardwell,
	Yuchung Cheng, netdev, eric.dumazet

On Wed, Sep 20, 2023 at 1:29 PM Eric Dumazet <edumazet@google.com> wrote:
>
> While BPF allows to set icsk->->icsk_delack_max

nit pick in the commit message:  redundant ->->.

> and/or icsk->icsk_rto_min, we have an ip route
> attribute (RTAX_RTO_MIN) to be able to tune rto_min,
> but nothing to consequently adjust max delayed ack,
> which vary from 40ms to 200 ms (TCP_DELACK_{MIN|MAX}).
>
> This makes RTAX_RTO_MIN of almost no practical use,
> unless customers are in big trouble.
>
> Modern days datacenter communications want to set
> rto_min to ~5 ms, and the max delayed ack one jiffie
> smaller to avoid spurious retransmits.
>
> After this patch, an "rto_min 5" route attribute will
> effectively lower max delayed ack timers to 4 ms.
>
> Note in the following ss output, "rto:6 ... ato:4"
>
> $ ss -temoi dst XXXXXX
> State Recv-Q Send-Q           Local Address:Port       Peer Address:Port  Process
> ESTAB 0      0        [2002:a05:6608:295::]:52950   [2002:a05:6608:297::]:41597
>      ino:255134 sk:1001 <->
>          skmem:(r0,rb1707063,t872,tb262144,f0,w0,o0,bl0,d0) ts sack
>  cubic wscale:8,8 rto:6 rtt:0.02/0.002 ato:4 mss:4096 pmtu:4500
>  rcvmss:536 advmss:4096 cwnd:10 bytes_sent:54823160 bytes_acked:54823121
>  bytes_received:54823120 segs_out:1370582 segs_in:1370580
>  data_segs_out:1370579 data_segs_in:1370578 send 16.4Gbps
>  pacing_rate 32.6Gbps delivery_rate 1.72Gbps delivered:1370579
>  busy:26920ms unacked:1 rcv_rtt:34.615 rcv_space:65920
>  rcv_ssthresh:65535 minrtt:0.015 snd_wnd:65536
>
> While we could argue this patch fixes a bug with RTAX_RTO_MIN,
> I do not add a Fixes: tag, so that we can soak it a bit before
> asking backports to stable branches.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

So great that this is now upstream :-) Thank you!

> ---
>  include/net/tcp.h     |  2 ++
>  net/ipv4/tcp.c        |  3 ++-
>  net/ipv4/tcp_output.c | 16 +++++++++++++++-
>  3 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index a8db7d43fb6215197af4a80e270b8c82070d55cb..af9cb37fbe53ec60b4e545e8aa0740bbf30da7b6 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -718,6 +718,8 @@ static inline void tcp_fast_path_check(struct sock *sk)
>                 tcp_fast_path_on(tp);
>  }
>
> +u32 tcp_delack_max(const struct sock *sk);
> +
>  /* Compute the actual rto_min value */
>  static inline u32 tcp_rto_min(const struct sock *sk)
>  {
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 69b8d707370844020770438cc4f31aeda4830b3d..e54f91eb943b2f09f303951cc72cbea61ada519d 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -3762,7 +3762,8 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
>                 info->tcpi_options |= TCPI_OPT_SYN_DATA;
>
>         info->tcpi_rto = jiffies_to_usecs(icsk->icsk_rto);
> -       info->tcpi_ato = jiffies_to_usecs(icsk->icsk_ack.ato);
> +       info->tcpi_ato = jiffies_to_usecs(min(icsk->icsk_ack.ato,
> +                                             tcp_delack_max(sk)));
>         info->tcpi_snd_mss = tp->mss_cache;
>         info->tcpi_rcv_mss = icsk->icsk_ack.rcv_mss;
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 1fc1f879cfd6c28cd655bb8f02eff6624eec2ffc..2d1e4b5ac1ca41ff3db8dc58458d4e922a2c4999 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -3977,6 +3977,20 @@ int tcp_connect(struct sock *sk)
>  }
>  EXPORT_SYMBOL(tcp_connect);
>
> +u32 tcp_delack_max(const struct sock *sk)
> +{
> +       const struct dst_entry *dst = __sk_dst_get(sk);
> +       u32 delack_max = inet_csk(sk)->icsk_delack_max;
> +
> +       if (dst && dst_metric_locked(dst, RTAX_RTO_MIN)) {
> +               u32 rto_min = dst_metric_rtt(dst, RTAX_RTO_MIN);
> +               u32 delack_from_rto_min = max_t(int, 1, rto_min - 1);
> +
> +               delack_max = min_t(u32, delack_max, delack_from_rto_min);
> +       }
> +       return delack_max;
> +}
> +
>  /* Send out a delayed ack, the caller does the policy checking
>   * to see if we should even be here.  See tcp_input.c:tcp_ack_snd_check()
>   * for details.
> @@ -4012,7 +4026,7 @@ void tcp_send_delayed_ack(struct sock *sk)
>                 ato = min(ato, max_ato);
>         }
>
> -       ato = min_t(u32, ato, inet_csk(sk)->icsk_delack_max);
> +       ato = min_t(u32, ato, tcp_delack_max(sk));
>
>         /* Stay within the limit we were given */
>         timeout = jiffies + ato;
> --
> 2.42.0.459.ge4e396fd5e-goog
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-20 17:29 ` [PATCH net-next 3/3] tcp: derive delack_max from rto_min Eric Dumazet
  2023-09-20 17:34   ` Soheil Hassas Yeganeh
@ 2023-09-20 19:06   ` Neal Cardwell
  2023-09-20 21:57   ` David Ahern
  2 siblings, 0 replies; 14+ messages in thread
From: Neal Cardwell @ 2023-09-20 19:06 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni,
	Soheil Hassas Yeganeh, Yuchung Cheng, netdev, eric.dumazet

On Wed, Sep 20, 2023 at 1:29 PM Eric Dumazet <edumazet@google.com> wrote:
>
> While BPF allows to set icsk->->icsk_delack_max
> and/or icsk->icsk_rto_min, we have an ip route
> attribute (RTAX_RTO_MIN) to be able to tune rto_min,
> but nothing to consequently adjust max delayed ack,
> which vary from 40ms to 200 ms (TCP_DELACK_{MIN|MAX}).
>
> This makes RTAX_RTO_MIN of almost no practical use,
> unless customers are in big trouble.
>
> Modern days datacenter communications want to set
> rto_min to ~5 ms, and the max delayed ack one jiffie
> smaller to avoid spurious retransmits.
>
> After this patch, an "rto_min 5" route attribute will
> effectively lower max delayed ack timers to 4 ms.
>
> Note in the following ss output, "rto:6 ... ato:4"
>
> $ ss -temoi dst XXXXXX
> State Recv-Q Send-Q           Local Address:Port       Peer Address:Port  Process
> ESTAB 0      0        [2002:a05:6608:295::]:52950   [2002:a05:6608:297::]:41597
>      ino:255134 sk:1001 <->
>          skmem:(r0,rb1707063,t872,tb262144,f0,w0,o0,bl0,d0) ts sack
>  cubic wscale:8,8 rto:6 rtt:0.02/0.002 ato:4 mss:4096 pmtu:4500
>  rcvmss:536 advmss:4096 cwnd:10 bytes_sent:54823160 bytes_acked:54823121
>  bytes_received:54823120 segs_out:1370582 segs_in:1370580
>  data_segs_out:1370579 data_segs_in:1370578 send 16.4Gbps
>  pacing_rate 32.6Gbps delivery_rate 1.72Gbps delivered:1370579
>  busy:26920ms unacked:1 rcv_rtt:34.615 rcv_space:65920
>  rcv_ssthresh:65535 minrtt:0.015 snd_wnd:65536
>
> While we could argue this patch fixes a bug with RTAX_RTO_MIN,
> I do not add a Fixes: tag, so that we can soak it a bit before
> asking backports to stable branches.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---

Acked-by: Neal Cardwell <ncardwell@google.com>

Thanks, Eric!

neal

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-20 17:29 ` [PATCH net-next 3/3] tcp: derive delack_max from rto_min Eric Dumazet
  2023-09-20 17:34   ` Soheil Hassas Yeganeh
  2023-09-20 19:06   ` Neal Cardwell
@ 2023-09-20 21:57   ` David Ahern
  2023-09-21  2:16     ` Eric Dumazet
  2 siblings, 1 reply; 14+ messages in thread
From: David Ahern @ 2023-09-20 21:57 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng, netdev,
	eric.dumazet

On 9/20/23 11:29 AM, Eric Dumazet wrote:
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 1fc1f879cfd6c28cd655bb8f02eff6624eec2ffc..2d1e4b5ac1ca41ff3db8dc58458d4e922a2c4999 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -3977,6 +3977,20 @@ int tcp_connect(struct sock *sk)
>  }
>  EXPORT_SYMBOL(tcp_connect);
>  
> +u32 tcp_delack_max(const struct sock *sk)
> +{
> +	const struct dst_entry *dst = __sk_dst_get(sk);
> +	u32 delack_max = inet_csk(sk)->icsk_delack_max;
> +
> +	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN)) {
> +		u32 rto_min = dst_metric_rtt(dst, RTAX_RTO_MIN);
> +		u32 delack_from_rto_min = max_t(int, 1, rto_min - 1);

`u32` type with max_t type set as `int`

> +
> +		delack_max = min_t(u32, delack_max, delack_from_rto_min);
> +	}
> +	return delack_max;
> +}
> +
>  /* Send out a delayed ack, the caller does the policy checking
>   * to see if we should even be here.  See tcp_input.c:tcp_ack_snd_check()
>   * for details.
> @@ -4012,7 +4026,7 @@ void tcp_send_delayed_ack(struct sock *sk)
>  		ato = min(ato, max_ato);
>  	}
>  
> -	ato = min_t(u32, ato, inet_csk(sk)->icsk_delack_max);
> +	ato = min_t(u32, ato, tcp_delack_max(sk));

and then here ato is an `int`.
>  
>  	/* Stay within the limit we were given */
>  	timeout = jiffies + ato;


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-20 21:57   ` David Ahern
@ 2023-09-21  2:16     ` Eric Dumazet
  2023-09-21 12:37       ` David Ahern
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2023-09-21  2:16 UTC (permalink / raw)
  To: David Ahern
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni,
	Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng, netdev,
	eric.dumazet

On Wed, Sep 20, 2023 at 11:57 PM David Ahern <dsahern@kernel.org> wrote:
>
> On 9/20/23 11:29 AM, Eric Dumazet wrote:
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > index 1fc1f879cfd6c28cd655bb8f02eff6624eec2ffc..2d1e4b5ac1ca41ff3db8dc58458d4e922a2c4999 100644
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -3977,6 +3977,20 @@ int tcp_connect(struct sock *sk)
> >  }
> >  EXPORT_SYMBOL(tcp_connect);
> >
> > +u32 tcp_delack_max(const struct sock *sk)
> > +{
> > +     const struct dst_entry *dst = __sk_dst_get(sk);
> > +     u32 delack_max = inet_csk(sk)->icsk_delack_max;
> > +
> > +     if (dst && dst_metric_locked(dst, RTAX_RTO_MIN)) {
> > +             u32 rto_min = dst_metric_rtt(dst, RTAX_RTO_MIN);
> > +             u32 delack_from_rto_min = max_t(int, 1, rto_min - 1);
>
> `u32` type with max_t type set as `int`

That is because we allow "rto_min 0" in ip route ...

rto_min - 1 is then 0xFFFFFFFF

We could argue that "rto_min 0" would be illegal, but this is orthogonal.


>
> > +
> > +             delack_max = min_t(u32, delack_max, delack_from_rto_min);
> > +     }
> > +     return delack_max;
> > +}
> > +
> >  /* Send out a delayed ack, the caller does the policy checking
> >   * to see if we should even be here.  See tcp_input.c:tcp_ack_snd_check()
> >   * for details.
> > @@ -4012,7 +4026,7 @@ void tcp_send_delayed_ack(struct sock *sk)
> >               ato = min(ato, max_ato);
> >       }
> >
> > -     ato = min_t(u32, ato, inet_csk(sk)->icsk_delack_max);
> > +     ato = min_t(u32, ato, tcp_delack_max(sk));
>
> and then here ato is an `int`.


> >
> >       /* Stay within the limit we were given */
> >       timeout = jiffies + ato;
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-21  2:16     ` Eric Dumazet
@ 2023-09-21 12:37       ` David Ahern
  2023-09-21 12:58         ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: David Ahern @ 2023-09-21 12:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni,
	Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng, netdev,
	eric.dumazet

On 9/20/23 8:16 PM, Eric Dumazet wrote:
> On Wed, Sep 20, 2023 at 11:57 PM David Ahern <dsahern@kernel.org> wrote:
>>
>> On 9/20/23 11:29 AM, Eric Dumazet wrote:
>>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>>> index 1fc1f879cfd6c28cd655bb8f02eff6624eec2ffc..2d1e4b5ac1ca41ff3db8dc58458d4e922a2c4999 100644
>>> --- a/net/ipv4/tcp_output.c
>>> +++ b/net/ipv4/tcp_output.c
>>> @@ -3977,6 +3977,20 @@ int tcp_connect(struct sock *sk)
>>>  }
>>>  EXPORT_SYMBOL(tcp_connect);
>>>
>>> +u32 tcp_delack_max(const struct sock *sk)
>>> +{
>>> +     const struct dst_entry *dst = __sk_dst_get(sk);
>>> +     u32 delack_max = inet_csk(sk)->icsk_delack_max;
>>> +
>>> +     if (dst && dst_metric_locked(dst, RTAX_RTO_MIN)) {
>>> +             u32 rto_min = dst_metric_rtt(dst, RTAX_RTO_MIN);
>>> +             u32 delack_from_rto_min = max_t(int, 1, rto_min - 1);
>>
>> `u32` type with max_t type set as `int`
> 
> That is because we allow "rto_min 0" in ip route ...
> 
> rto_min - 1 is then 0xFFFFFFFF
> 
> We could argue that "rto_min 0" would be illegal, but this is orthogonal.

My comment is solely about mismatch on data types. I am surprised use of
max_t with mixed data types does not throw a compiler warning.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-21 12:37       ` David Ahern
@ 2023-09-21 12:58         ` Eric Dumazet
  2023-09-22  9:59           ` David Laight
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2023-09-21 12:58 UTC (permalink / raw)
  To: David Ahern
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni,
	Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng, netdev,
	eric.dumazet

On Thu, Sep 21, 2023 at 2:37 PM David Ahern <dsahern@kernel.org> wrote:
>

> My comment is solely about mismatch on data types. I am surprised use of
> max_t with mixed data types does not throw a compiler warning.

This was intentional.

This is max_t() purpose really.

Perhaps you are thinking about max() ?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-21 12:58         ` Eric Dumazet
@ 2023-09-22  9:59           ` David Laight
  2023-09-22 10:53             ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: David Laight @ 2023-09-22  9:59 UTC (permalink / raw)
  To: 'Eric Dumazet', David Ahern
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni,
	Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng,
	netdev@vger.kernel.org, eric.dumazet@gmail.com

From: Eric Dumazet
> Sent: 21 September 2023 13:58
> 
> On Thu, Sep 21, 2023 at 2:37 PM David Ahern <dsahern@kernel.org> wrote:
> >
> 
> > My comment is solely about mismatch on data types. I am surprised use of
> > max_t with mixed data types does not throw a compiler warning.
> 
> This was intentional.
> 
> This is max_t() purpose really.

Apart from when it gets used to accidentally mask high bits :-)
(Although hat is usually consigned to min_t()).

Here
	u32 delack_from_rto_min = max(rto_min, 2u) - 1;
would probably be safer (as in have no casts that might have
unwanted side effects).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-22  9:59           ` David Laight
@ 2023-09-22 10:53             ` Eric Dumazet
  2023-09-22 16:51               ` David Laight
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2023-09-22 10:53 UTC (permalink / raw)
  To: David Laight
  Cc: David Ahern, David S . Miller, Jakub Kicinski, Paolo Abeni,
	Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng,
	netdev@vger.kernel.org, eric.dumazet@gmail.com

On Fri, Sep 22, 2023 at 11:59 AM David Laight <David.Laight@aculab.com> wrote:
>
> From: Eric Dumazet
> > Sent: 21 September 2023 13:58
> >
> > On Thu, Sep 21, 2023 at 2:37 PM David Ahern <dsahern@kernel.org> wrote:
> > >
> >
> > > My comment is solely about mismatch on data types. I am surprised use of
> > > max_t with mixed data types does not throw a compiler warning.
> >
> > This was intentional.
> >
> > This is max_t() purpose really.
>
> Apart from when it gets used to accidentally mask high bits :-)
> (Although hat is usually consigned to min_t()).

As explained, this is not an accident, but a conscious decision I made.

>
> Here
>         u32 delack_from_rto_min = max(rto_min, 2u) - 1;
> would probably be safer (as in have no casts that might have
> unwanted side effects).
>

I find my solution more readable.

max(-1, 1) is 1.

If this was not the case, many things would be broken in the kernel.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH net-next 3/3] tcp: derive delack_max from rto_min
  2023-09-22 10:53             ` Eric Dumazet
@ 2023-09-22 16:51               ` David Laight
  0 siblings, 0 replies; 14+ messages in thread
From: David Laight @ 2023-09-22 16:51 UTC (permalink / raw)
  To: 'Eric Dumazet'
  Cc: David Ahern, David S . Miller, Jakub Kicinski, Paolo Abeni,
	Soheil Hassas Yeganeh, Neal Cardwell, Yuchung Cheng,
	netdev@vger.kernel.org, eric.dumazet@gmail.com

From: Eric Dumazet
> Sent: 22 September 2023 11:53
> 
> On Fri, Sep 22, 2023 at 11:59 AM David Laight <David.Laight@aculab.com> wrote:
> >
> > From: Eric Dumazet
> > > Sent: 21 September 2023 13:58
> > >
> > > On Thu, Sep 21, 2023 at 2:37 PM David Ahern <dsahern@kernel.org> wrote:
> > > >
> > >
> > > > My comment is solely about mismatch on data types. I am surprised use of
> > > > max_t with mixed data types does not throw a compiler warning.
> > >
> > > This was intentional.
> > >
> > > This is max_t() purpose really.
> >
> > Apart from when it gets used to accidentally mask high bits :-)
> > (Although hat is usually consigned to min_t()).
> 
> As explained, this is not an accident, but a conscious decision I made.
> 
> >
> > Here
> >         u32 delack_from_rto_min = max(rto_min, 2u) - 1;
> > would probably be safer (as in have no casts that might have
> > unwanted side effects).
> >
> 
> I find my solution more readable.

It has to be said I didn't really like mine either :-)
A better alternative would be:
	max((int)rto_min - 1, 1)
to make it absolutely clear what is going on.

Far too many of the min_t() and max_t() are just used to silence
the over-enthusiastic type checking of min() and max().
So code doing something different might be best making it more obvious.

Does 'ip route' stop very large values for rto_min?
Unsigned values with the top bit set might cause 'interesting' behaviour!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next 0/3] tcp: add tcp_delack_max()
  2023-09-20 17:29 [PATCH net-next 0/3] tcp: add tcp_delack_max() Eric Dumazet
                   ` (2 preceding siblings ...)
  2023-09-20 17:29 ` [PATCH net-next 3/3] tcp: derive delack_max from rto_min Eric Dumazet
@ 2023-10-01 12:20 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-10-01 12:20 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, kuba, pabeni, soheil, ncardwell, ycheng, netdev,
	eric.dumazet

Hello:

This series was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Wed, 20 Sep 2023 17:29:40 +0000 you wrote:
> First patches are adding const qualifiers to four existing helpers.
> 
> Third patch adds a much needed companion feature to RTAX_RTO_MIN.
> 
> Eric Dumazet (3):
>   net: constify sk_dst_get() and __sk_dst_get() argument
>   tcp: constify tcp_rto_min() and tcp_rto_min_us() argument
>   tcp: derive delack_max from rto_min
> 
> [...]

Here is the summary with links:
  - [net-next,1/3] net: constify sk_dst_get() and __sk_dst_get() argument
    https://git.kernel.org/netdev/net-next/c/5033f58d5fee
  - [net-next,2/3] tcp: constify tcp_rto_min() and tcp_rto_min_us() argument
    https://git.kernel.org/netdev/net-next/c/f68a181fcd3b
  - [net-next,3/3] tcp: derive delack_max from rto_min
    https://git.kernel.org/netdev/net-next/c/bbf80d713fe7

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-10-01 12:20 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-20 17:29 [PATCH net-next 0/3] tcp: add tcp_delack_max() Eric Dumazet
2023-09-20 17:29 ` [PATCH net-next 1/3] net: constify sk_dst_get() and __sk_dst_get() argument Eric Dumazet
2023-09-20 17:29 ` [PATCH net-next 2/3] tcp: constify tcp_rto_min() and tcp_rto_min_us() argument Eric Dumazet
2023-09-20 17:29 ` [PATCH net-next 3/3] tcp: derive delack_max from rto_min Eric Dumazet
2023-09-20 17:34   ` Soheil Hassas Yeganeh
2023-09-20 19:06   ` Neal Cardwell
2023-09-20 21:57   ` David Ahern
2023-09-21  2:16     ` Eric Dumazet
2023-09-21 12:37       ` David Ahern
2023-09-21 12:58         ` Eric Dumazet
2023-09-22  9:59           ` David Laight
2023-09-22 10:53             ` Eric Dumazet
2023-09-22 16:51               ` David Laight
2023-10-01 12:20 ` [PATCH net-next 0/3] tcp: add tcp_delack_max() patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).