linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v3 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
@ 2025-08-30  4:31 Dmitry Safonov via B4 Relay
  2025-08-30  4:31 ` [PATCH net-next v3 1/2] " Dmitry Safonov via B4 Relay
  2025-08-30  4:31 ` [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay
  0 siblings, 2 replies; 8+ messages in thread
From: Dmitry Safonov via B4 Relay @ 2025-08-30  4:31 UTC (permalink / raw)
  To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev,
	linux-kernel, Dmitry Safonov

On one side a minor/cosmetic issue, especially nowadays when
TCP-AO/TCP-MD5 signature verification failures aren't logged to dmesg.

Yet, I think worth addressing for two reasons:
- unsigned RST gets ignored by the peer and the connection is alive for
  longer (keep-alive interval)
- netstat counters increase and trace events report that trusted BGP peer
  is sending unsigned/incorrectly signed segments, which can ring alarm
  on monitoring.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
Changes in v3:
- Assign NULL to md5sig_info before call_rcu() (catched by Eric Dumazet)
- Link to v2: https://lore.kernel.org/r/20250828-b4-tcp-ao-md5-rst-finwait2-v2-0-653099bea5c1@arista.com

Changes in v2:
- Fixed TCP-MD5 ifdeffery (Reported-by: Victor Nogueira)
- Call proper destructor for inet_ipv6 (Reported-by: syzbot@syzkaller.appspotmail.com)
- Link to v1: https://lore.kernel.org/r/20250822-b4-tcp-ao-md5-rst-finwait2-v1-0-25825d085dcb@arista.com

---
Dmitry Safonov (2):
      tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
      tcp: Free TCP-AO/TCP-MD5 info/keys without RCU

 include/net/tcp.h        |  4 ++++
 net/ipv4/tcp.c           | 16 ++++++++++++++++
 net/ipv4/tcp_ao.c        |  5 ++---
 net/ipv4/tcp_ipv4.c      | 37 ++++++++++---------------------------
 net/ipv4/tcp_minisocks.c | 19 +++++--------------
 net/ipv6/tcp_ipv6.c      |  8 ++++++++
 6 files changed, 45 insertions(+), 44 deletions(-)
---
base-commit: 864ecc4a6dade82d3f70eab43dad0e277aa6fc78
change-id: 20250822-b4-tcp-ao-md5-rst-finwait2-e632b4d8f58d

Best regards,
-- 
Dmitry Safonov <dima@arista.com>



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH net-next v3 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
  2025-08-30  4:31 [PATCH net-next v3 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay
@ 2025-08-30  4:31 ` Dmitry Safonov via B4 Relay
  2025-08-30  4:31 ` [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay
  1 sibling, 0 replies; 8+ messages in thread
From: Dmitry Safonov via B4 Relay @ 2025-08-30  4:31 UTC (permalink / raw)
  To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev,
	linux-kernel, Dmitry Safonov

From: Dmitry Safonov <dima@arista.com>

Currently there are a couple of minor issues with destroying the keys
tcp_v4_destroy_sock():

1. The socket is yet in TCP bind buckets, making it reachable for
   incoming segments [on another CPU core], potentially available to send
   late FIN/ACK/RST replies.

2. There is at least one code path, where tcp_done() is called before
   sending RST [kudos to Bob for investigation]. This is a case of
   a server, that finished sending its data and just called close().

   The socket is in TCP_FIN_WAIT2 and has RCV_SHUTDOWN (set by
   __tcp_close())

   tcp_v4_do_rcv()/tcp_v6_do_rcv()
     tcp_rcv_state_process()            /* LINUX_MIB_TCPABORTONDATA */
       tcp_reset()
         tcp_done_with_error()
           tcp_done()
             inet_csk_destroy_sock()    /* Destroys AO/MD5 keys */
     /* tcp_rcv_state_process() returns SKB_DROP_REASON_TCP_ABORT_ON_DATA */
   tcp_v4_send_reset()                  /* Sends an unsigned RST segment */

   tcpdump:
> 22:53:15.399377 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 33929, offset 0, flags [DF], proto TCP (6), length 60)
>     1.0.0.1.34567 > 1.0.0.2.49848: Flags [F.], seq 2185658590, ack 3969644355, win 502, options [nop,nop,md5 valid], length 0
> 22:53:15.399396 00:00:01:01:00:00 > 00:00:b2:1f:00:00, ethertype IPv4 (0x0800), length 86: (tos 0x0, ttl 64, id 51951, offset 0, flags [DF], proto TCP (6), length 72)
>     1.0.0.2.49848 > 1.0.0.1.34567: Flags [.], seq 3969644375, ack 2185658591, win 128, options [nop,nop,md5 valid,nop,nop,sack 1 {2185658590:2185658591}], length 0
> 22:53:16.429588 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>     1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658590, win 0, length 0
> 22:53:16.664725 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>     1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0
> 22:53:17.289832 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>     1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0

  Note the signed RSTs later in the dump - those are sent by the server
  when the fin-wait socket gets removed from hash buckets, by
  the listener socket.

Instead of destroying AO/MD5 info and their keys in inet_csk_destroy_sock(),
slightly delay it until the actual socket .sk_destruct(). As shutdown'ed
socket can yet send non-data replies, they should be signed in order for
the peer to process them. Now it also matches how AO/MD5 gets destructed
for TIME-WAIT sockets (in tcp_twsk_destructor()).

This seems optimal for TCP-MD5, while for TCP-AO it seems to have an
open problem: once RST get sent and socket gets actually destructed,
there is no information on the initial sequence numbers. So, in case
this last RST gets lost in the network, the server's listener socket
won't be able to properly sign another RST. Nothing in RFC 1122
prescribes keeping any local state after non-graceful reset.
Luckily, BGP are known to use keep alive(s).

While the issue is quite minor/cosmetic, these days monitoring network
counters is a common practice and getting invalid signed segments from
a trusted BGP peer can get customers worried.

Investigated-by: Bob Gilligan <gilligan@arista.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 include/net/tcp.h   |  4 ++++
 net/ipv4/tcp.c      | 27 +++++++++++++++++++++++++++
 net/ipv4/tcp_ipv4.c | 33 ++++++++-------------------------
 net/ipv6/tcp_ipv6.c |  8 ++++++++
 4 files changed, 47 insertions(+), 25 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 16dc9cebb9d25832eac7a6ad590a9e9e47e85142..2141f0eab53452764c6eb04f924f9cc61aa8a30e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1931,6 +1931,7 @@ tcp_md5_do_lookup_any_l3index(const struct sock *sk,
 }
 
 #define tcp_twsk_md5_key(twsk)	((twsk)->tw_md5_key)
+void tcp_md5_destruct_sock(struct sock *sk);
 #else
 static inline struct tcp_md5sig_key *
 tcp_md5_do_lookup(const struct sock *sk, int l3index,
@@ -1947,6 +1948,9 @@ tcp_md5_do_lookup_any_l3index(const struct sock *sk,
 }
 
 #define tcp_twsk_md5_key(twsk)	NULL
+static inline void tcp_md5_destruct_sock(struct sock *sk)
+{
+}
 #endif
 
 int tcp_md5_alloc_sigpool(void);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9bc8317e92b7952871f07ae11a9c2eaa7d3a9e65..e2ec4ee0ff4a640e9e5501a0d93fc0ed312d488d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -412,6 +412,33 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp)
 	return rate64;
 }
 
+#ifdef CONFIG_TCP_MD5SIG
+static void tcp_md5sig_info_free_rcu(struct rcu_head *head)
+{
+	struct tcp_md5sig_info *md5sig;
+
+	md5sig = container_of(head, struct tcp_md5sig_info, rcu);
+	kfree(md5sig);
+	static_branch_slow_dec_deferred(&tcp_md5_needed);
+	tcp_md5_release_sigpool();
+}
+
+void tcp_md5_destruct_sock(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (tp->md5sig_info) {
+		struct tcp_md5sig_info *md5sig;
+
+		md5sig = rcu_dereference_protected(tp->md5sig_info, 1);
+		tcp_clear_md5_list(sk);
+		rcu_assign_pointer(tp->md5sig_info, NULL);
+		call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu);
+	}
+}
+EXPORT_SYMBOL_GPL(tcp_md5_destruct_sock);
+#endif
+
 /* Address-family independent initialization for a tcp_sock.
  *
  * NOTE: A lot of things set to zero explicitly by call to
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7c1d612afca18b424b32ee5e97b99a68062d8436..68bb75bd419cdbfce17048252919996d764ddc1a 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2494,6 +2494,13 @@ static const struct tcp_sock_af_ops tcp_sock_ipv4_specific = {
 	.ao_calc_key_sk		= tcp_v4_ao_calc_key_sk,
 #endif
 };
+
+static void tcp4_destruct_sock(struct sock *sk)
+{
+	tcp_md5_destruct_sock(sk);
+	tcp_ao_destroy_sock(sk, false);
+	inet_sock_destruct(sk);
+}
 #endif
 
 /* NOTE: A lot of things set to zero explicitly by call to
@@ -2509,23 +2516,12 @@ static int tcp_v4_init_sock(struct sock *sk)
 
 #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
 	tcp_sk(sk)->af_specific = &tcp_sock_ipv4_specific;
+	sk->sk_destruct = tcp4_destruct_sock;
 #endif
 
 	return 0;
 }
 
-#ifdef CONFIG_TCP_MD5SIG
-static void tcp_md5sig_info_free_rcu(struct rcu_head *head)
-{
-	struct tcp_md5sig_info *md5sig;
-
-	md5sig = container_of(head, struct tcp_md5sig_info, rcu);
-	kfree(md5sig);
-	static_branch_slow_dec_deferred(&tcp_md5_needed);
-	tcp_md5_release_sigpool();
-}
-#endif
-
 static void tcp_release_user_frags(struct sock *sk)
 {
 #ifdef CONFIG_PAGE_POOL
@@ -2562,19 +2558,6 @@ void tcp_v4_destroy_sock(struct sock *sk)
 	/* Cleans up our, hopefully empty, out_of_order_queue. */
 	skb_rbtree_purge(&tp->out_of_order_queue);
 
-#ifdef CONFIG_TCP_MD5SIG
-	/* Clean up the MD5 key list, if any */
-	if (tp->md5sig_info) {
-		struct tcp_md5sig_info *md5sig;
-
-		md5sig = rcu_dereference_protected(tp->md5sig_info, 1);
-		tcp_clear_md5_list(sk);
-		call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu);
-		rcu_assign_pointer(tp->md5sig_info, NULL);
-	}
-#endif
-	tcp_ao_destroy_sock(sk, false);
-
 	/* Clean up a referenced TCP bind bucket. */
 	if (inet_csk(sk)->icsk_bind_hash)
 		inet_put_port(sk);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index b4e56b8772730579cb85f10b147a15acce03f8e4..838171bb3212c1e8bbf809a2e10ea25a275518ff 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -2112,6 +2112,13 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = {
 	.ao_calc_key_sk	=	tcp_v4_ao_calc_key_sk,
 #endif
 };
+
+static void tcp6_destruct_sock(struct sock *sk)
+{
+	tcp_md5_destruct_sock(sk);
+	tcp_ao_destroy_sock(sk, false);
+	inet6_sock_destruct(sk);
+}
 #endif
 
 /* NOTE: A lot of things set to zero explicitly by call to
@@ -2127,6 +2134,7 @@ static int tcp_v6_init_sock(struct sock *sk)
 
 #if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
 	tcp_sk(sk)->af_specific = &tcp_sock_ipv6_specific;
+	sk->sk_destruct = tcp6_destruct_sock;
 #endif
 
 	return 0;

-- 
2.42.2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU
  2025-08-30  4:31 [PATCH net-next v3 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay
  2025-08-30  4:31 ` [PATCH net-next v3 1/2] " Dmitry Safonov via B4 Relay
@ 2025-08-30  4:31 ` Dmitry Safonov via B4 Relay
  2025-09-02 23:08   ` Jakub Kicinski
  1 sibling, 1 reply; 8+ messages in thread
From: Dmitry Safonov via B4 Relay @ 2025-08-30  4:31 UTC (permalink / raw)
  To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev,
	linux-kernel, Dmitry Safonov

From: Dmitry Safonov <dima@arista.com>

Now that the destruction of info/keys is delayed until the socket
destructor, it's safe to use kfree() without an RCU callback.
As either socket was yet in TCP_CLOSE state or the socket refcounter is
zero and no one can discover it anymore, it's safe to release memory
straight away.
Similar thing was possible for twsk already.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 net/ipv4/tcp.c           | 17 +++--------------
 net/ipv4/tcp_ao.c        |  5 ++---
 net/ipv4/tcp_ipv4.c      |  4 ++--
 net/ipv4/tcp_minisocks.c | 19 +++++--------------
 4 files changed, 12 insertions(+), 33 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e2ec4ee0ff4a640e9e5501a0d93fc0ed312d488d..254ca95d0c3c5c44029be0e84120c5e9fb9d4514 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -413,27 +413,16 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp)
 }
 
 #ifdef CONFIG_TCP_MD5SIG
-static void tcp_md5sig_info_free_rcu(struct rcu_head *head)
-{
-	struct tcp_md5sig_info *md5sig;
-
-	md5sig = container_of(head, struct tcp_md5sig_info, rcu);
-	kfree(md5sig);
-	static_branch_slow_dec_deferred(&tcp_md5_needed);
-	tcp_md5_release_sigpool();
-}
-
 void tcp_md5_destruct_sock(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
 	if (tp->md5sig_info) {
-		struct tcp_md5sig_info *md5sig;
 
-		md5sig = rcu_dereference_protected(tp->md5sig_info, 1);
 		tcp_clear_md5_list(sk);
-		rcu_assign_pointer(tp->md5sig_info, NULL);
-		call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu);
+		kfree(rcu_replace_pointer(tp->md5sig_info, NULL, 1));
+		static_branch_slow_dec_deferred(&tcp_md5_needed);
+		tcp_md5_release_sigpool();
 	}
 }
 EXPORT_SYMBOL_GPL(tcp_md5_destruct_sock);
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index bbb8d5f0eae7d3d8887da3fa4d68e248af9060ad..31302be78bc4450b56fa23a390b6d03b2262741d 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -268,9 +268,8 @@ static void tcp_ao_key_free_rcu(struct rcu_head *head)
 	kfree_sensitive(key);
 }
 
-static void tcp_ao_info_free_rcu(struct rcu_head *head)
+static void tcp_ao_info_free(struct tcp_ao_info *ao)
 {
-	struct tcp_ao_info *ao = container_of(head, struct tcp_ao_info, rcu);
 	struct tcp_ao_key *key;
 	struct hlist_node *n;
 
@@ -310,7 +309,7 @@ void tcp_ao_destroy_sock(struct sock *sk, bool twsk)
 
 	if (!twsk)
 		tcp_ao_sk_omem_free(sk, ao);
-	call_rcu(&ao->rcu, tcp_ao_info_free_rcu);
+	tcp_ao_info_free(ao);
 }
 
 void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 68bb75bd419cdbfce17048252919996d764ddc1a..f914bda25d8f5170395157b707d3bd2ef04267a1 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1503,9 +1503,9 @@ void tcp_clear_md5_list(struct sock *sk)
 	md5sig = rcu_dereference_protected(tp->md5sig_info, 1);
 
 	hlist_for_each_entry_safe(key, n, &md5sig->head, node) {
-		hlist_del_rcu(&key->node);
+		hlist_del(&key->node);
 		atomic_sub(sizeof(*key), &sk->sk_omem_alloc);
-		kfree_rcu(key, rcu);
+		kfree(key);
 	}
 }
 
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index d1c9e40886463ca308f9f3682c4039f491e7555f..7c2ae07d8d5d2a18d6ce3210cc09ee5d9850ea29 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -377,26 +377,17 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
 }
 EXPORT_SYMBOL(tcp_time_wait);
 
-#ifdef CONFIG_TCP_MD5SIG
-static void tcp_md5_twsk_free_rcu(struct rcu_head *head)
-{
-	struct tcp_md5sig_key *key;
-
-	key = container_of(head, struct tcp_md5sig_key, rcu);
-	kfree(key);
-	static_branch_slow_dec_deferred(&tcp_md5_needed);
-	tcp_md5_release_sigpool();
-}
-#endif
-
 void tcp_twsk_destructor(struct sock *sk)
 {
 #ifdef CONFIG_TCP_MD5SIG
 	if (static_branch_unlikely(&tcp_md5_needed.key)) {
 		struct tcp_timewait_sock *twsk = tcp_twsk(sk);
 
-		if (twsk->tw_md5_key)
-			call_rcu(&twsk->tw_md5_key->rcu, tcp_md5_twsk_free_rcu);
+		if (twsk->tw_md5_key) {
+			kfree(twsk->tw_md5_key);
+			static_branch_slow_dec_deferred(&tcp_md5_needed);
+			tcp_md5_release_sigpool();
+		}
 	}
 #endif
 	tcp_ao_destroy_sock(sk, true);

-- 
2.42.2



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU
  2025-08-30  4:31 ` [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay
@ 2025-09-02 23:08   ` Jakub Kicinski
  2025-09-03 17:41     ` Dmitry Safonov
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2025-09-02 23:08 UTC (permalink / raw)
  To: Dmitry Safonov via B4 Relay
  Cc: dima, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima,
	David S. Miller, David Ahern, Paolo Abeni, Simon Horman,
	Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev,
	linux-kernel

On Sat, 30 Aug 2025 05:31:47 +0100 Dmitry Safonov via B4 Relay wrote:
> Now that the destruction of info/keys is delayed until the socket
> destructor, it's safe to use kfree() without an RCU callback.
> As either socket was yet in TCP_CLOSE state or the socket refcounter is
> zero and no one can discover it anymore, it's safe to release memory
> straight away.
> Similar thing was possible for twsk already.

After this patch the rcu members of struct tcp_ao* seem to no longer 
be used?
-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU
  2025-09-02 23:08   ` Jakub Kicinski
@ 2025-09-03 17:41     ` Dmitry Safonov
  2025-09-03 22:23       ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Dmitry Safonov @ 2025-09-03 17:41 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Dmitry Safonov via B4 Relay, Eric Dumazet, Neal Cardwell,
	Kuniyuki Iwashima, David S. Miller, David Ahern, Paolo Abeni,
	Simon Horman, Bob Gilligan, Salam Noureddine, Dmitry Safonov,
	netdev, linux-kernel

On Wed, Sep 3, 2025 at 12:09 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat, 30 Aug 2025 05:31:47 +0100 Dmitry Safonov via B4 Relay wrote:
> > Now that the destruction of info/keys is delayed until the socket
> > destructor, it's safe to use kfree() without an RCU callback.
> > As either socket was yet in TCP_CLOSE state or the socket refcounter is
> > zero and no one can discover it anymore, it's safe to release memory
> > straight away.
> > Similar thing was possible for twsk already.
>
> After this patch the rcu members of struct tcp_ao* seem to no longer
> be used?

Right. I'll remove tcp_ao_info::rcu in v4.
For tcp_ao_key it's needed for the regular key rotation, as well as
for tcp_md5sig_key.

Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU
  2025-09-03 17:41     ` Dmitry Safonov
@ 2025-09-03 22:23       ` Jakub Kicinski
  2025-09-03 23:17         ` Dmitry Safonov
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2025-09-03 22:23 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Dmitry Safonov via B4 Relay, Eric Dumazet, Neal Cardwell,
	Kuniyuki Iwashima, David S. Miller, David Ahern, Paolo Abeni,
	Simon Horman, Bob Gilligan, Salam Noureddine, Dmitry Safonov,
	netdev, linux-kernel

On Wed, 3 Sep 2025 18:41:39 +0100 Dmitry Safonov wrote:
> > On Sat, 30 Aug 2025 05:31:47 +0100 Dmitry Safonov via B4 Relay wrote:  
> > > Now that the destruction of info/keys is delayed until the socket
> > > destructor, it's safe to use kfree() without an RCU callback.
> > > As either socket was yet in TCP_CLOSE state or the socket refcounter is
> > > zero and no one can discover it anymore, it's safe to release memory
> > > straight away.
> > > Similar thing was possible for twsk already.  
> >
> > After this patch the rcu members of struct tcp_ao* seem to no longer
> > be used?  
> 
> Right. I'll remove tcp_ao_info::rcu in v4.
> For tcp_ao_key it's needed for the regular key rotation, as well as
> for tcp_md5sig_key.

Hm, maybe I missed something. I did a test allmodconfig build yesterday
and while the md5sig_key rcu was still needed, removing the ao_key
didn't cause issues. But it was just a quick test I didn't even config
kconfig is sane.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU
  2025-09-03 22:23       ` Jakub Kicinski
@ 2025-09-03 23:17         ` Dmitry Safonov
  2025-09-03 23:27           ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Dmitry Safonov @ 2025-09-03 23:17 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Dmitry Safonov via B4 Relay, Eric Dumazet, Neal Cardwell,
	Kuniyuki Iwashima, David S. Miller, David Ahern, Paolo Abeni,
	Simon Horman, Bob Gilligan, Salam Noureddine, Dmitry Safonov,
	netdev, linux-kernel

On Wed, Sep 3, 2025 at 11:23 PM Jakub Kicinski <kuba@kernel.org> wrote:
> On Wed, 3 Sep 2025 18:41:39 +0100 Dmitry Safonov wrote:
> > > On Sat, 30 Aug 2025 05:31:47 +0100 Dmitry Safonov via B4 Relay wrote:
> > > > Now that the destruction of info/keys is delayed until the socket
> > > > destructor, it's safe to use kfree() without an RCU callback.
> > > > As either socket was yet in TCP_CLOSE state or the socket refcounter is
> > > > zero and no one can discover it anymore, it's safe to release memory
> > > > straight away.
> > > > Similar thing was possible for twsk already.
> > >
> > > After this patch the rcu members of struct tcp_ao* seem to no longer
> > > be used?
> >
> > Right. I'll remove tcp_ao_info::rcu in v4.
> > For tcp_ao_key it's needed for the regular key rotation, as well as
> > for tcp_md5sig_key.
>
> Hm, maybe I missed something. I did a test allmodconfig build yesterday
> and while the md5sig_key rcu was still needed, removing the ao_key
> didn't cause issues. But it was just a quick test I didn't even config
> kconfig is sane.

Hmm, probably CONFIG_TCP_AO was off?
tcp_ao_delete_key() does call_rcu(&key->rcu, tcp_ao_key_free_rcu).

Looking at the code now, I guess what I could have done even more is
migrating tcp_sock::ao_info (and tcp_timewait_sock::ao_info) from
rcu_*() helpers to acquire/release ones. Somewhat feeling uneasy about
going that far just yet. Should I do it with another cleanup on the
top, what do you think?

Thanks,
           Dmitry

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU
  2025-09-03 23:17         ` Dmitry Safonov
@ 2025-09-03 23:27           ` Jakub Kicinski
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2025-09-03 23:27 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Dmitry Safonov via B4 Relay, Eric Dumazet, Neal Cardwell,
	Kuniyuki Iwashima, David S. Miller, David Ahern, Paolo Abeni,
	Simon Horman, Bob Gilligan, Salam Noureddine, Dmitry Safonov,
	netdev, linux-kernel

On Thu, 4 Sep 2025 00:17:34 +0100 Dmitry Safonov wrote:
> > > Right. I'll remove tcp_ao_info::rcu in v4.
> > > For tcp_ao_key it's needed for the regular key rotation, as well as
> > > for tcp_md5sig_key.  
> >
> > Hm, maybe I missed something. I did a test allmodconfig build yesterday
> > and while the md5sig_key rcu was still needed, removing the ao_key
> > didn't cause issues. But it was just a quick test I didn't even config
> > kconfig is sane.  
> 
> Hmm, probably CONFIG_TCP_AO was off?
> tcp_ao_delete_key() does call_rcu(&key->rcu, tcp_ao_key_free_rcu).
> 
> Looking at the code now, I guess what I could have done even more is
> migrating tcp_sock::ao_info (and tcp_timewait_sock::ao_info) from
> rcu_*() helpers to acquire/release ones. Somewhat feeling uneasy about
> going that far just yet. Should I do it with another cleanup on the
> top, what do you think?

No preference :)

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-09-03 23:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-30  4:31 [PATCH net-next v3 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay
2025-08-30  4:31 ` [PATCH net-next v3 1/2] " Dmitry Safonov via B4 Relay
2025-08-30  4:31 ` [PATCH net-next v3 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay
2025-09-02 23:08   ` Jakub Kicinski
2025-09-03 17:41     ` Dmitry Safonov
2025-09-03 22:23       ` Jakub Kicinski
2025-09-03 23:17         ` Dmitry Safonov
2025-09-03 23:27           ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).