* [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
@ 2025-08-22 4:55 Dmitry Safonov via B4 Relay
2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Dmitry Safonov via B4 Relay @ 2025-08-22 4:55 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev,
linux-kernel, Dmitry Safonov
On one side a minor/cosmetic issue, especially nowadays when
TCP-AO/TCP-MD5 signature verification failures aren't logged to dmesg.
Yet, I think worth addressing for two reasons:
- unsigned RST gets ignored by the peer and the connection is alive for
longer (keep-alive interval)
- netstat counters increase and trace events report that trusted BGP peer
is sending unsigned/incorrectly signed segments, which can ring alarm
on monitoring.
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
Dmitry Safonov (2):
tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
tcp: Free TCP-AO/TCP-MD5 info/keys without RCU
net/ipv4/tcp.c | 18 ++++++++++++++++++
net/ipv4/tcp_ao.c | 5 ++---
net/ipv4/tcp_ipv4.c | 29 ++---------------------------
net/ipv4/tcp_minisocks.c | 19 +++++--------------
4 files changed, 27 insertions(+), 44 deletions(-)
---
base-commit: a7bd72158063740212344fad5d99dcef45bc70d6
change-id: 20250822-b4-tcp-ao-md5-rst-finwait2-e632b4d8f58d
Best regards,
--
Dmitry Safonov <dima@arista.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
2025-08-22 4:55 [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay
@ 2025-08-22 4:55 ` Dmitry Safonov via B4 Relay
2025-08-22 12:40 ` Victor Nogueira
2025-08-22 19:24 ` kernel test robot
2025-08-22 4:55 ` [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay
2025-08-22 9:08 ` [syzbot ci] Re: tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() syzbot ci
2 siblings, 2 replies; 7+ messages in thread
From: Dmitry Safonov via B4 Relay @ 2025-08-22 4:55 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev,
linux-kernel, Dmitry Safonov
From: Dmitry Safonov <dima@arista.com>
Currently there are a couple of minor issues with destroying the keys
tcp_v4_destroy_sock():
1. The socket is yet in TCP bind buckets, making it reachable for
incoming segments [on another CPU core], potentially available to send
late FIN/ACK/RST replies.
2. There is at least one code path, where tcp_done() is called before
sending RST [kudos to Bob for investigation]. This is a case of
a server, that finished sending its data and just called close().
The socket is in TCP_FIN_WAIT2 and has RCV_SHUTDOWN (set by
__tcp_close())
tcp_v4_do_rcv()/tcp_v6_do_rcv()
tcp_rcv_state_process() /* LINUX_MIB_TCPABORTONDATA */
tcp_reset()
tcp_done_with_error()
tcp_done()
inet_csk_destroy_sock() /* Destroys AO/MD5 keys */
/* tcp_rcv_state_process() returns SKB_DROP_REASON_TCP_ABORT_ON_DATA */
tcp_v4_send_reset() /* Sends an unsigned RST segment */
tcpdump:
> 22:53:15.399377 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 33929, offset 0, flags [DF], proto TCP (6), length 60)
> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [F.], seq 2185658590, ack 3969644355, win 502, options [nop,nop,md5 valid], length 0
> 22:53:15.399396 00:00:01:01:00:00 > 00:00:b2:1f:00:00, ethertype IPv4 (0x0800), length 86: (tos 0x0, ttl 64, id 51951, offset 0, flags [DF], proto TCP (6), length 72)
> 1.0.0.2.49848 > 1.0.0.1.34567: Flags [.], seq 3969644375, ack 2185658591, win 128, options [nop,nop,md5 valid,nop,nop,sack 1 {2185658590:2185658591}], length 0
> 22:53:16.429588 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658590, win 0, length 0
> 22:53:16.664725 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0
> 22:53:17.289832 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0
Note the signed RSTs later in the dump - those are sent by the server
when the fin-wait socket gets removed from hash buckets, by
the listener socket.
Instead of destroying AO/MD5 info and their keys in inet_csk_destroy_sock(),
slightly delay it until the actual socket .sk_destruct(). As shutdown'ed
socket can yet send non-data replies, they should be signed in order for
the peer to process them. Now it also matches how AO/MD5 gets destructed
for TIME-WAIT sockets (in tcp_twsk_destructor()).
This seems optimal for TCP-MD5, while for TCP-AO it seems to have an
open problem: once RST get sent and socket gets actually destructed,
there is no information on the initial sequence numbers. So, in case
this last RST gets lost in the network, the server's listener socket
won't be able to properly sign another RST. Nothing in RFC 1122
prescribes keeping any local state after non-graceful reset.
Luckily, BGP are known to use keep alive(s).
While the issue is quite minor/cosmetic, these days monitoring network
counters is a common practice and getting invalid signed segments from
a trusted BGP peer can get customers worried.
Investigated-by: Bob Gilligan <gilligan@arista.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
net/ipv4/tcp.c | 31 +++++++++++++++++++++++++++++++
net/ipv4/tcp_ipv4.c | 25 -------------------------
2 files changed, 31 insertions(+), 25 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 71a956fbfc5533224ee00e792de2cfdccd4d40aa..4e996e937e8e5f0e75764caa24240e25006deece 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -412,6 +412,36 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp)
return rate64;
}
+#ifdef CONFIG_TCP_MD5SIG
+static void tcp_md5sig_info_free_rcu(struct rcu_head *head)
+{
+ struct tcp_md5sig_info *md5sig;
+
+ md5sig = container_of(head, struct tcp_md5sig_info, rcu);
+ kfree(md5sig);
+ static_branch_slow_dec_deferred(&tcp_md5_needed);
+ tcp_md5_release_sigpool();
+}
+#endif
+
+static void tcp_destruct_sock(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+
+#ifdef CONFIG_TCP_MD5SIG
+ if (tp->md5sig_info) {
+ struct tcp_md5sig_info *md5sig;
+
+ md5sig = rcu_dereference_protected(tp->md5sig_info, 1);
+ tcp_clear_md5_list(sk);
+ call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu);
+ rcu_assign_pointer(tp->md5sig_info, NULL);
+ }
+#endif
+ tcp_ao_destroy_sock(sk, false);
+ inet_sock_destruct(sk);
+}
+
/* Address-family independent initialization for a tcp_sock.
*
* NOTE: A lot of things set to zero explicitly by call to
@@ -464,6 +494,7 @@ void tcp_init_sock(struct sock *sk)
tp->tsoffset = 0;
tp->rack.reo_wnd_steps = 1;
+ sk->sk_destruct = tcp_destruct_sock;
sk->sk_write_space = sk_stream_write_space;
sock_set_flag(sk, SOCK_USE_WRITE_QUEUE);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 84d3d556ed8062d07fe7019bc0dadd90d3b80d96..32814f205fdfdcbd4be4765a4e127c8f175d3b14 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2521,18 +2521,6 @@ static int tcp_v4_init_sock(struct sock *sk)
return 0;
}
-#ifdef CONFIG_TCP_MD5SIG
-static void tcp_md5sig_info_free_rcu(struct rcu_head *head)
-{
- struct tcp_md5sig_info *md5sig;
-
- md5sig = container_of(head, struct tcp_md5sig_info, rcu);
- kfree(md5sig);
- static_branch_slow_dec_deferred(&tcp_md5_needed);
- tcp_md5_release_sigpool();
-}
-#endif
-
static void tcp_release_user_frags(struct sock *sk)
{
#ifdef CONFIG_PAGE_POOL
@@ -2569,19 +2557,6 @@ void tcp_v4_destroy_sock(struct sock *sk)
/* Cleans up our, hopefully empty, out_of_order_queue. */
skb_rbtree_purge(&tp->out_of_order_queue);
-#ifdef CONFIG_TCP_MD5SIG
- /* Clean up the MD5 key list, if any */
- if (tp->md5sig_info) {
- struct tcp_md5sig_info *md5sig;
-
- md5sig = rcu_dereference_protected(tp->md5sig_info, 1);
- tcp_clear_md5_list(sk);
- call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu);
- rcu_assign_pointer(tp->md5sig_info, NULL);
- }
-#endif
- tcp_ao_destroy_sock(sk, false);
-
/* Clean up a referenced TCP bind bucket. */
if (inet_csk(sk)->icsk_bind_hash)
inet_put_port(sk);
--
2.42.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU
2025-08-22 4:55 [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay
2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay
@ 2025-08-22 4:55 ` Dmitry Safonov via B4 Relay
2025-08-22 9:08 ` [syzbot ci] Re: tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() syzbot ci
2 siblings, 0 replies; 7+ messages in thread
From: Dmitry Safonov via B4 Relay @ 2025-08-22 4:55 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev,
linux-kernel, Dmitry Safonov
From: Dmitry Safonov <dima@arista.com>
Now that the destruction of info/keys is delayed until the socket
destructor, it's safe to use kfree() without an RCU callback.
As either socket was yet in TCP_CLOSE state or the socket refcounter is
zero and no one can discover it anymore, it's safe to release memory
straight away.
Similar thing was possible for twsk already.
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
net/ipv4/tcp.c | 19 +++----------------
net/ipv4/tcp_ao.c | 5 ++---
net/ipv4/tcp_ipv4.c | 4 ++--
net/ipv4/tcp_minisocks.c | 19 +++++--------------
4 files changed, 12 insertions(+), 35 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4e996e937e8e5f0e75764caa24240e25006deece..de10d38116a205863c290470fa0cbbddeb8d709e 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -412,30 +412,17 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp)
return rate64;
}
-#ifdef CONFIG_TCP_MD5SIG
-static void tcp_md5sig_info_free_rcu(struct rcu_head *head)
-{
- struct tcp_md5sig_info *md5sig;
-
- md5sig = container_of(head, struct tcp_md5sig_info, rcu);
- kfree(md5sig);
- static_branch_slow_dec_deferred(&tcp_md5_needed);
- tcp_md5_release_sigpool();
-}
-#endif
-
static void tcp_destruct_sock(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
#ifdef CONFIG_TCP_MD5SIG
if (tp->md5sig_info) {
- struct tcp_md5sig_info *md5sig;
- md5sig = rcu_dereference_protected(tp->md5sig_info, 1);
tcp_clear_md5_list(sk);
- call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu);
- rcu_assign_pointer(tp->md5sig_info, NULL);
+ kfree(rcu_replace_pointer(tp->md5sig_info, NULL, 1));
+ static_branch_slow_dec_deferred(&tcp_md5_needed);
+ tcp_md5_release_sigpool();
}
#endif
tcp_ao_destroy_sock(sk, false);
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index bbb8d5f0eae7d3d8887da3fa4d68e248af9060ad..31302be78bc4450b56fa23a390b6d03b2262741d 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -268,9 +268,8 @@ static void tcp_ao_key_free_rcu(struct rcu_head *head)
kfree_sensitive(key);
}
-static void tcp_ao_info_free_rcu(struct rcu_head *head)
+static void tcp_ao_info_free(struct tcp_ao_info *ao)
{
- struct tcp_ao_info *ao = container_of(head, struct tcp_ao_info, rcu);
struct tcp_ao_key *key;
struct hlist_node *n;
@@ -310,7 +309,7 @@ void tcp_ao_destroy_sock(struct sock *sk, bool twsk)
if (!twsk)
tcp_ao_sk_omem_free(sk, ao);
- call_rcu(&ao->rcu, tcp_ao_info_free_rcu);
+ tcp_ao_info_free(ao);
}
void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 32814f205fdfdcbd4be4765a4e127c8f175d3b14..185d16ff2d0ad7b20e2404ceff71cccaff4ed4bc 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1505,9 +1505,9 @@ void tcp_clear_md5_list(struct sock *sk)
md5sig = rcu_dereference_protected(tp->md5sig_info, 1);
hlist_for_each_entry_safe(key, n, &md5sig->head, node) {
- hlist_del_rcu(&key->node);
+ hlist_del(&key->node);
atomic_sub(sizeof(*key), &sk->sk_omem_alloc);
- kfree_rcu(key, rcu);
+ kfree(key);
}
}
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 2994c9222c9cb5ee86b60bdb553f92130e52c70e..c93812b19893742b071b0f7a49c4293a781e8de4 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -377,26 +377,17 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
}
EXPORT_SYMBOL(tcp_time_wait);
-#ifdef CONFIG_TCP_MD5SIG
-static void tcp_md5_twsk_free_rcu(struct rcu_head *head)
-{
- struct tcp_md5sig_key *key;
-
- key = container_of(head, struct tcp_md5sig_key, rcu);
- kfree(key);
- static_branch_slow_dec_deferred(&tcp_md5_needed);
- tcp_md5_release_sigpool();
-}
-#endif
-
void tcp_twsk_destructor(struct sock *sk)
{
#ifdef CONFIG_TCP_MD5SIG
if (static_branch_unlikely(&tcp_md5_needed.key)) {
struct tcp_timewait_sock *twsk = tcp_twsk(sk);
- if (twsk->tw_md5_key)
- call_rcu(&twsk->tw_md5_key->rcu, tcp_md5_twsk_free_rcu);
+ if (twsk->tw_md5_key) {
+ kfree(twsk->tw_md5_key);
+ static_branch_slow_dec_deferred(&tcp_md5_needed);
+ tcp_md5_release_sigpool();
+ }
}
#endif
tcp_ao_destroy_sock(sk, true);
--
2.42.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [syzbot ci] Re: tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
2025-08-22 4:55 [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay
2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay
2025-08-22 4:55 ` [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay
@ 2025-08-22 9:08 ` syzbot ci
2 siblings, 0 replies; 7+ messages in thread
From: syzbot ci @ 2025-08-22 9:08 UTC (permalink / raw)
To: 0x7f454c46, davem, devnull, dima, dsahern, edumazet, gilligan,
horms, kuba, kuniyu, linux-kernel, ncardwell, netdev, noureddine,
pabeni
Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v1] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
https://lore.kernel.org/all/20250822-b4-tcp-ao-md5-rst-finwait2-v1-0-25825d085dcb@arista.com
* [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
* [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU
and found the following issue:
WARNING in inet_sock_destruct
Full report is available here:
https://ci.syzbot.org/series/4e53fc18-79b4-47d6-87c4-89e499e12879
***
WARNING in inet_sock_destruct
tree: net-next
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base: a9af709fda7edafa17e072bffe610d9e7ed7a5df
arch: amd64
compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7
config: https://ci.syzbot.org/builds/d6e1c7cd-df1b-4192-bc2d-a2db69987793/config
C repro: https://ci.syzbot.org/findings/62fd81c8-8c8c-49fe-aa87-8e1418bcc053/c_repro
syz repro: https://ci.syzbot.org/findings/62fd81c8-8c8c-49fe-aa87-8e1418bcc053/syz_repro
------------[ cut here ]------------
WARNING: CPU: 0 PID: 6012 at net/ipv4/af_inet.c:153 inet_sock_destruct+0x5f9/0x730 net/ipv4/af_inet.c:153
Modules linked in:
CPU: 0 UID: 0 PID: 6012 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:inet_sock_destruct+0x5f9/0x730 net/ipv4/af_inet.c:153
Code: 00 41 0f b6 74 24 12 48 c7 c7 20 91 9e 8c 4c 89 e2 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d e9 dd 38 24 f7 e8 68 44 bc f7 90 <0f> 0b 90 e9 62 fe ff ff e8 5a 44 bc f7 90 0f 0b 90 e9 95 fe ff ff
RSP: 0018:ffffc90003a0fc58 EFLAGS: 00010293
RAX: ffffffff8a0366c8 RBX: dffffc0000000000 RCX: ffff888106891cc0
RDX: 0000000000000000 RSI: 00000000000003c0 RDI: 0000000000000000
RBP: 00000000000003c0 R08: ffff88803bb429c3 R09: 1ffff11007768538
R10: dffffc0000000000 R11: ffffed1007768539 R12: ffff88803bb42880
R13: dffffc0000000000 R14: ffff88803bb429c0 R15: 1ffff11007768512
FS: 00005555690ff500(0000) GS:ffff8880b861a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000f65000 CR3: 0000000106d3a000 CR4: 00000000000006f0
Call Trace:
<TASK>
__sk_destruct+0x89/0x660 net/core/sock.c:2339
inet_release+0x144/0x190 net/ipv4/af_inet.c:435
__sock_release net/socket.c:649 [inline]
sock_close+0xc3/0x240 net/socket.c:1439
__fput+0x44c/0xa70 fs/file_table.c:468
task_work_run+0x1d4/0x260 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xec/0x110 kernel/entry/common.c:43
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x2bd/0x3b0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fa54ab8ebe9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc0978c268 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
RAX: 0000000000000000 RBX: 000000000000ce2e RCX: 00007fa54ab8ebe9
RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000001 R09: 000000040978c55f
R10: 0000001b30220000 R11: 0000000000000246 R12: 00007fa54adb5fac
R13: 00007fa54adb5fa0 R14: ffffffffffffffff R15: 0000000000000006
</TASK>
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay
@ 2025-08-22 12:40 ` Victor Nogueira
2025-08-22 18:05 ` Dmitry Safonov
2025-08-22 19:24 ` kernel test robot
1 sibling, 1 reply; 7+ messages in thread
From: Victor Nogueira @ 2025-08-22 12:40 UTC (permalink / raw)
To: dima, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima,
David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
Simon Horman
Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev,
linux-kernel
On 8/22/25 01:55, Dmitry Safonov via B4 Relay wrote:
> From: Dmitry Safonov <dima@arista.com>
>
> Currently there are a couple of minor issues with destroying the keys
> tcp_v4_destroy_sock():
>
> 1. The socket is yet in TCP bind buckets, making it reachable for
> incoming segments [on another CPU core], potentially available to send
> late FIN/ACK/RST replies.
>
> 2. There is at least one code path, where tcp_done() is called before
> sending RST [kudos to Bob for investigation]. This is a case of
> a server, that finished sending its data and just called close().
>
> The socket is in TCP_FIN_WAIT2 and has RCV_SHUTDOWN (set by
> __tcp_close())
>
> tcp_v4_do_rcv()/tcp_v6_do_rcv()
> tcp_rcv_state_process() /* LINUX_MIB_TCPABORTONDATA */
> tcp_reset()
> tcp_done_with_error()
> tcp_done()
> inet_csk_destroy_sock() /* Destroys AO/MD5 keys */
> /* tcp_rcv_state_process() returns SKB_DROP_REASON_TCP_ABORT_ON_DATA */
> tcp_v4_send_reset() /* Sends an unsigned RST segment */
>
> tcpdump:
>> 22:53:15.399377 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 33929, offset 0, flags [DF], proto TCP (6), length 60)
>> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [F.], seq 2185658590, ack 3969644355, win 502, options [nop,nop,md5 valid], length 0
>> 22:53:15.399396 00:00:01:01:00:00 > 00:00:b2:1f:00:00, ethertype IPv4 (0x0800), length 86: (tos 0x0, ttl 64, id 51951, offset 0, flags [DF], proto TCP (6), length 72)
>> 1.0.0.2.49848 > 1.0.0.1.34567: Flags [.], seq 3969644375, ack 2185658591, win 128, options [nop,nop,md5 valid,nop,nop,sack 1 {2185658590:2185658591}], length 0
>> 22:53:16.429588 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
>> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658590, win 0, length 0
>> 22:53:16.664725 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0
>> 22:53:17.289832 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
>> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0
>
> Note the signed RSTs later in the dump - those are sent by the server
> when the fin-wait socket gets removed from hash buckets, by
> the listener socket.
>
> Instead of destroying AO/MD5 info and their keys in inet_csk_destroy_sock(),
> slightly delay it until the actual socket .sk_destruct(). As shutdown'ed
> socket can yet send non-data replies, they should be signed in order for
> the peer to process them. Now it also matches how AO/MD5 gets destructed
> for TIME-WAIT sockets (in tcp_twsk_destructor()).
>
> This seems optimal for TCP-MD5, while for TCP-AO it seems to have an
> open problem: once RST get sent and socket gets actually destructed,
> there is no information on the initial sequence numbers. So, in case
> this last RST gets lost in the network, the server's listener socket
> won't be able to properly sign another RST. Nothing in RFC 1122
> prescribes keeping any local state after non-graceful reset.
> Luckily, BGP are known to use keep alive(s).
>
> While the issue is quite minor/cosmetic, these days monitoring network
> counters is a common practice and getting invalid signed segments from
> a trusted BGP peer can get customers worried.
>
> Investigated-by: Bob Gilligan <gilligan@arista.com>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
> net/ipv4/tcp.c | 31 +++++++++++++++++++++++++++++++
> net/ipv4/tcp_ipv4.c | 25 -------------------------
> 2 files changed, 31 insertions(+), 25 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 71a956fbfc5533224ee00e792de2cfdccd4d40aa..4e996e937e8e5f0e75764caa24240e25006deece 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -412,6 +412,36 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp)
> return rate64;
> }
> [...]
> +
> +static void tcp_destruct_sock(struct sock *sk)
> +{
> + struct tcp_sock *tp = tcp_sk(sk);
It looks like this variable is unused when CONFIG_TCP_MD5SIG is not set
and this is causing the test CI build to fail.
net/ipv4/tcp.c: In function ‘tcp_destruct_sock’:
net/ipv4/tcp.c:417:26: error: unused variable ‘tp’ [-Werror=unused-variable]
417 | struct tcp_sock *tp = tcp_sk(sk);
| ^~
cc1: all warnings being treated as errors
make[4]: *** [scripts/Makefile.build:287: net/ipv4/tcp.
cheers,
Victor
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
2025-08-22 12:40 ` Victor Nogueira
@ 2025-08-22 18:05 ` Dmitry Safonov
0 siblings, 0 replies; 7+ messages in thread
From: Dmitry Safonov @ 2025-08-22 18:05 UTC (permalink / raw)
To: Victor Nogueira
Cc: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman,
Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev,
linux-kernel
Hi Victor,
Thanks, going to correct that in v2.
On Fri, Aug 22, 2025 at 1:40 PM Victor Nogueira <victor@mojatatu.com> wrote:
>
> On 8/22/25 01:55, Dmitry Safonov via B4 Relay wrote:
> > From: Dmitry Safonov <dima@arista.com>
> >
> > Currently there are a couple of minor issues with destroying the keys
> > tcp_v4_destroy_sock():
> >
> > 1. The socket is yet in TCP bind buckets, making it reachable for
> > incoming segments [on another CPU core], potentially available to send
> > late FIN/ACK/RST replies.
> >
> > 2. There is at least one code path, where tcp_done() is called before
> > sending RST [kudos to Bob for investigation]. This is a case of
> > a server, that finished sending its data and just called close().
> >
> > The socket is in TCP_FIN_WAIT2 and has RCV_SHUTDOWN (set by
> > __tcp_close())
> >
> > tcp_v4_do_rcv()/tcp_v6_do_rcv()
> > tcp_rcv_state_process() /* LINUX_MIB_TCPABORTONDATA */
> > tcp_reset()
> > tcp_done_with_error()
> > tcp_done()
> > inet_csk_destroy_sock() /* Destroys AO/MD5 keys */
> > /* tcp_rcv_state_process() returns SKB_DROP_REASON_TCP_ABORT_ON_DATA */
> > tcp_v4_send_reset() /* Sends an unsigned RST segment */
> >
> > tcpdump:
> >> 22:53:15.399377 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 33929, offset 0, flags [DF], proto TCP (6), length 60)
> >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [F.], seq 2185658590, ack 3969644355, win 502, options [nop,nop,md5 valid], length 0
> >> 22:53:15.399396 00:00:01:01:00:00 > 00:00:b2:1f:00:00, ethertype IPv4 (0x0800), length 86: (tos 0x0, ttl 64, id 51951, offset 0, flags [DF], proto TCP (6), length 72)
> >> 1.0.0.2.49848 > 1.0.0.1.34567: Flags [.], seq 3969644375, ack 2185658591, win 128, options [nop,nop,md5 valid,nop,nop,sack 1 {2185658590:2185658591}], length 0
> >> 22:53:16.429588 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
> >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658590, win 0, length 0
> >> 22:53:16.664725 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
> >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0
> >> 22:53:17.289832 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
> >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0
> >
> > Note the signed RSTs later in the dump - those are sent by the server
> > when the fin-wait socket gets removed from hash buckets, by
> > the listener socket.
> >
> > Instead of destroying AO/MD5 info and their keys in inet_csk_destroy_sock(),
> > slightly delay it until the actual socket .sk_destruct(). As shutdown'ed
> > socket can yet send non-data replies, they should be signed in order for
> > the peer to process them. Now it also matches how AO/MD5 gets destructed
> > for TIME-WAIT sockets (in tcp_twsk_destructor()).
> >
> > This seems optimal for TCP-MD5, while for TCP-AO it seems to have an
> > open problem: once RST get sent and socket gets actually destructed,
> > there is no information on the initial sequence numbers. So, in case
> > this last RST gets lost in the network, the server's listener socket
> > won't be able to properly sign another RST. Nothing in RFC 1122
> > prescribes keeping any local state after non-graceful reset.
> > Luckily, BGP are known to use keep alive(s).
> >
> > While the issue is quite minor/cosmetic, these days monitoring network
> > counters is a common practice and getting invalid signed segments from
> > a trusted BGP peer can get customers worried.
> >
> > Investigated-by: Bob Gilligan <gilligan@arista.com>
> > Signed-off-by: Dmitry Safonov <dima@arista.com>
> > ---
> > net/ipv4/tcp.c | 31 +++++++++++++++++++++++++++++++
> > net/ipv4/tcp_ipv4.c | 25 -------------------------
> > 2 files changed, 31 insertions(+), 25 deletions(-)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 71a956fbfc5533224ee00e792de2cfdccd4d40aa..4e996e937e8e5f0e75764caa24240e25006deece 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -412,6 +412,36 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp)
> > return rate64;
> > }
> > [...]
> > +
> > +static void tcp_destruct_sock(struct sock *sk)
> > +{
> > + struct tcp_sock *tp = tcp_sk(sk);
>
> It looks like this variable is unused when CONFIG_TCP_MD5SIG is not set
>
> and this is causing the test CI build to fail.
>
> net/ipv4/tcp.c: In function ‘tcp_destruct_sock’:
> net/ipv4/tcp.c:417:26: error: unused variable ‘tp’ [-Werror=unused-variable]
> 417 | struct tcp_sock *tp = tcp_sk(sk);
> | ^~
> cc1: all warnings being treated as errors
> make[4]: *** [scripts/Makefile.build:287: net/ipv4/tcp.
>
> cheers,
> Victor
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay
2025-08-22 12:40 ` Victor Nogueira
@ 2025-08-22 19:24 ` kernel test robot
1 sibling, 0 replies; 7+ messages in thread
From: kernel test robot @ 2025-08-22 19:24 UTC (permalink / raw)
To: Dmitry Safonov via B4 Relay, Eric Dumazet, Neal Cardwell,
Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: oe-kbuild-all, netdev, Bob Gilligan, Salam Noureddine,
Dmitry Safonov, linux-kernel
Hi Dmitry,
kernel test robot noticed the following build warnings:
[auto build test WARNING on a7bd72158063740212344fad5d99dcef45bc70d6]
url: https://github.com/intel-lab-lkp/linux/commits/Dmitry-Safonov-via-B4-Relay/tcp-Destroy-TCP-AO-TCP-MD5-keys-in-sk_destruct/20250822-125815
base: a7bd72158063740212344fad5d99dcef45bc70d6
patch link: https://lore.kernel.org/r/20250822-b4-tcp-ao-md5-rst-finwait2-v1-1-25825d085dcb%40arista.com
patch subject: [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct()
config: x86_64-buildonly-randconfig-001-20250823 (https://download.01.org/0day-ci/archive/20250823/202508230331.PEHBlmSk-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14+deb12u1) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250823/202508230331.PEHBlmSk-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508230331.PEHBlmSk-lkp@intel.com/
All warnings (new ones prefixed by >>):
net/ipv4/tcp.c: In function 'tcp_destruct_sock':
>> net/ipv4/tcp.c:429:26: warning: unused variable 'tp' [-Wunused-variable]
429 | struct tcp_sock *tp = tcp_sk(sk);
| ^~
vim +/tp +429 net/ipv4/tcp.c
426
427 static void tcp_destruct_sock(struct sock *sk)
428 {
> 429 struct tcp_sock *tp = tcp_sk(sk);
430
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-08-22 19:24 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-22 4:55 [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay
2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay
2025-08-22 12:40 ` Victor Nogueira
2025-08-22 18:05 ` Dmitry Safonov
2025-08-22 19:24 ` kernel test robot
2025-08-22 4:55 ` [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay
2025-08-22 9:08 ` [syzbot ci] Re: tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() syzbot ci
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).