* [PATCH net-next 1/2] tcp: use tp->total_rto to track number of linear timeouts in SYN_SENT state
2023-11-14 17:23 [PATCH net-next 0/2] tcp: change reaction to ICMP messages Eric Dumazet
@ 2023-11-14 17:23 ` Eric Dumazet
2023-11-14 17:23 ` [PATCH net-next 2/2] tcp: no longer abort SYN_SENT when receiving some ICMP Eric Dumazet
2023-11-16 23:40 ` [PATCH net-next 0/2] tcp: change reaction to ICMP messages patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2023-11-14 17:23 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Neal Cardwell, Yuchung Cheng, David Morley, netdev, eric.dumazet,
Eric Dumazet
In commit ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear")
David used icsk->icsk_backoff field to track the number of linear timeouts.
Since then, tp->total_rto has been added.
This commit uses tp->total_rto instead of icsk->icsk_backoff
so that tcp_ld_RTO_revert() no longer can trigger an overflow
in inet_csk_rto_backoff(). Other than the potential UBSAN
report, there was no issue because receiving an ICMP message
currently aborts the connect().
In the following patch, we want to adhere to RFC 6069
and RFC 1122 4.2.3.9.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Morley <morleyd@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
net/ipv4/tcp_timer.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 1f9f6c1c196b2de35b0bc2f734484f09ba90541a..d1ad20ce1c8c7c013b8b0f26d71c0b0bc4800354 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -626,7 +626,6 @@ void tcp_retransmit_timer(struct sock *sk)
* implemented ftp to mars will work nicely. We will have to fix
* the 120 second clamps though!
*/
- icsk->icsk_backoff++;
out_reset_timer:
/* If stream is thin, use linear timeouts. Since 'icsk_backoff' is
@@ -647,11 +646,12 @@ void tcp_retransmit_timer(struct sock *sk)
tcp_rto_min(sk),
TCP_RTO_MAX);
} else if (sk->sk_state != TCP_SYN_SENT ||
- icsk->icsk_backoff >
+ tp->total_rto >
READ_ONCE(net->ipv4.sysctl_tcp_syn_linear_timeouts)) {
/* Use normal (exponential) backoff unless linear timeouts are
* activated.
*/
+ icsk->icsk_backoff++;
icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
}
inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
--
2.42.0.869.gea05f2083d-goog
^ permalink raw reply related [flat|nested] 4+ messages in thread* [PATCH net-next 2/2] tcp: no longer abort SYN_SENT when receiving some ICMP
2023-11-14 17:23 [PATCH net-next 0/2] tcp: change reaction to ICMP messages Eric Dumazet
2023-11-14 17:23 ` [PATCH net-next 1/2] tcp: use tp->total_rto to track number of linear timeouts in SYN_SENT state Eric Dumazet
@ 2023-11-14 17:23 ` Eric Dumazet
2023-11-16 23:40 ` [PATCH net-next 0/2] tcp: change reaction to ICMP messages patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2023-11-14 17:23 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Neal Cardwell, Yuchung Cheng, David Morley, netdev, eric.dumazet,
Eric Dumazet
Currently, non fatal ICMP messages received on behalf
of SYN_SENT sockets do call tcp_ld_RTO_revert()
to implement RFC 6069, but immediately call tcp_done(),
thus aborting the connect() attempt.
This violates RFC 1122 following requirement:
4.2.3.9 ICMP Messages
...
o Destination Unreachable -- codes 0, 1, 5
Since these Unreachable messages indicate soft error
conditions, TCP MUST NOT abort the connection, and it
SHOULD make the information available to the
application.
This patch makes sure non 'fatal' ICMP[v6] messages do not
abort the connection attempt.
It enables RFC 6069 for SYN_SENT sockets as a result.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Morley <morleyd@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
---
net/ipv4/tcp_ipv4.c | 6 ++++++
net/ipv6/tcp_ipv6.c | 9 ++++++---
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 5f693bbd578d2261b78aa0be6bf69499bbd5117e..86cc6d36f8188ec6a761c3a949382c03e74c91f8 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -482,6 +482,7 @@ int tcp_v4_err(struct sk_buff *skb, u32 info)
const int code = icmp_hdr(skb)->code;
struct sock *sk;
struct request_sock *fastopen;
+ bool harderr = false;
u32 seq, snd_una;
int err;
struct net *net = dev_net(skb->dev);
@@ -555,6 +556,7 @@ int tcp_v4_err(struct sk_buff *skb, u32 info)
goto out;
case ICMP_PARAMETERPROB:
err = EPROTO;
+ harderr = true;
break;
case ICMP_DEST_UNREACH:
if (code > NR_ICMP_UNREACH)
@@ -579,6 +581,7 @@ int tcp_v4_err(struct sk_buff *skb, u32 info)
}
err = icmp_err_convert[code].errno;
+ harderr = icmp_err_convert[code].fatal;
/* check if this ICMP message allows revert of backoff.
* (see RFC 6069)
*/
@@ -604,6 +607,9 @@ int tcp_v4_err(struct sk_buff *skb, u32 info)
ip_icmp_error(sk, skb, err, th->dest, info, (u8 *)th);
+ if (!harderr)
+ break;
+
if (!sock_owned_by_user(sk)) {
WRITE_ONCE(sk->sk_err, err);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 937a02c2e5345390ed592b19faa661cd703a23f0..43deda49cc521278f0bd869135e70bbe7aed6d35 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -381,7 +381,7 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
struct tcp_sock *tp;
__u32 seq, snd_una;
struct sock *sk;
- bool fatal;
+ bool harderr;
int err;
sk = __inet6_lookup_established(net, net->ipv4.tcp_death_row.hashinfo,
@@ -402,9 +402,9 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
return 0;
}
seq = ntohl(th->seq);
- fatal = icmpv6_err_convert(type, code, &err);
+ harderr = icmpv6_err_convert(type, code, &err);
if (sk->sk_state == TCP_NEW_SYN_RECV) {
- tcp_req_err(sk, seq, fatal);
+ tcp_req_err(sk, seq, harderr);
return 0;
}
@@ -489,6 +489,9 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
ipv6_icmp_error(sk, skb, err, th->dest, ntohl(info), (u8 *)th);
+ if (!harderr)
+ break;
+
if (!sock_owned_by_user(sk)) {
WRITE_ONCE(sk->sk_err, err);
sk_error_report(sk); /* Wake people up to see the error (see connect in sock.c) */
--
2.42.0.869.gea05f2083d-goog
^ permalink raw reply related [flat|nested] 4+ messages in thread