* [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() @ 2025-08-22 4:55 Dmitry Safonov via B4 Relay 2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Dmitry Safonov via B4 Relay @ 2025-08-22 4:55 UTC (permalink / raw) To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev, linux-kernel, Dmitry Safonov On one side a minor/cosmetic issue, especially nowadays when TCP-AO/TCP-MD5 signature verification failures aren't logged to dmesg. Yet, I think worth addressing for two reasons: - unsigned RST gets ignored by the peer and the connection is alive for longer (keep-alive interval) - netstat counters increase and trace events report that trusted BGP peer is sending unsigned/incorrectly signed segments, which can ring alarm on monitoring. Signed-off-by: Dmitry Safonov <dima@arista.com> --- Dmitry Safonov (2): tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() tcp: Free TCP-AO/TCP-MD5 info/keys without RCU net/ipv4/tcp.c | 18 ++++++++++++++++++ net/ipv4/tcp_ao.c | 5 ++--- net/ipv4/tcp_ipv4.c | 29 ++--------------------------- net/ipv4/tcp_minisocks.c | 19 +++++-------------- 4 files changed, 27 insertions(+), 44 deletions(-) --- base-commit: a7bd72158063740212344fad5d99dcef45bc70d6 change-id: 20250822-b4-tcp-ao-md5-rst-finwait2-e632b4d8f58d Best regards, -- Dmitry Safonov <dima@arista.com> ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() 2025-08-22 4:55 [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay @ 2025-08-22 4:55 ` Dmitry Safonov via B4 Relay 2025-08-22 12:40 ` Victor Nogueira 2025-08-22 19:24 ` kernel test robot 2025-08-22 4:55 ` [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay 2025-08-22 9:08 ` [syzbot ci] Re: tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() syzbot ci 2 siblings, 2 replies; 7+ messages in thread From: Dmitry Safonov via B4 Relay @ 2025-08-22 4:55 UTC (permalink / raw) To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev, linux-kernel, Dmitry Safonov From: Dmitry Safonov <dima@arista.com> Currently there are a couple of minor issues with destroying the keys tcp_v4_destroy_sock(): 1. The socket is yet in TCP bind buckets, making it reachable for incoming segments [on another CPU core], potentially available to send late FIN/ACK/RST replies. 2. There is at least one code path, where tcp_done() is called before sending RST [kudos to Bob for investigation]. This is a case of a server, that finished sending its data and just called close(). The socket is in TCP_FIN_WAIT2 and has RCV_SHUTDOWN (set by __tcp_close()) tcp_v4_do_rcv()/tcp_v6_do_rcv() tcp_rcv_state_process() /* LINUX_MIB_TCPABORTONDATA */ tcp_reset() tcp_done_with_error() tcp_done() inet_csk_destroy_sock() /* Destroys AO/MD5 keys */ /* tcp_rcv_state_process() returns SKB_DROP_REASON_TCP_ABORT_ON_DATA */ tcp_v4_send_reset() /* Sends an unsigned RST segment */ tcpdump: > 22:53:15.399377 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 33929, offset 0, flags [DF], proto TCP (6), length 60) > 1.0.0.1.34567 > 1.0.0.2.49848: Flags [F.], seq 2185658590, ack 3969644355, win 502, options [nop,nop,md5 valid], length 0 > 22:53:15.399396 00:00:01:01:00:00 > 00:00:b2:1f:00:00, ethertype IPv4 (0x0800), length 86: (tos 0x0, ttl 64, id 51951, offset 0, flags [DF], proto TCP (6), length 72) > 1.0.0.2.49848 > 1.0.0.1.34567: Flags [.], seq 3969644375, ack 2185658591, win 128, options [nop,nop,md5 valid,nop,nop,sack 1 {2185658590:2185658591}], length 0 > 22:53:16.429588 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) > 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658590, win 0, length 0 > 22:53:16.664725 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) > 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0 > 22:53:17.289832 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) > 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0 Note the signed RSTs later in the dump - those are sent by the server when the fin-wait socket gets removed from hash buckets, by the listener socket. Instead of destroying AO/MD5 info and their keys in inet_csk_destroy_sock(), slightly delay it until the actual socket .sk_destruct(). As shutdown'ed socket can yet send non-data replies, they should be signed in order for the peer to process them. Now it also matches how AO/MD5 gets destructed for TIME-WAIT sockets (in tcp_twsk_destructor()). This seems optimal for TCP-MD5, while for TCP-AO it seems to have an open problem: once RST get sent and socket gets actually destructed, there is no information on the initial sequence numbers. So, in case this last RST gets lost in the network, the server's listener socket won't be able to properly sign another RST. Nothing in RFC 1122 prescribes keeping any local state after non-graceful reset. Luckily, BGP are known to use keep alive(s). While the issue is quite minor/cosmetic, these days monitoring network counters is a common practice and getting invalid signed segments from a trusted BGP peer can get customers worried. Investigated-by: Bob Gilligan <gilligan@arista.com> Signed-off-by: Dmitry Safonov <dima@arista.com> --- net/ipv4/tcp.c | 31 +++++++++++++++++++++++++++++++ net/ipv4/tcp_ipv4.c | 25 ------------------------- 2 files changed, 31 insertions(+), 25 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 71a956fbfc5533224ee00e792de2cfdccd4d40aa..4e996e937e8e5f0e75764caa24240e25006deece 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -412,6 +412,36 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp) return rate64; } +#ifdef CONFIG_TCP_MD5SIG +static void tcp_md5sig_info_free_rcu(struct rcu_head *head) +{ + struct tcp_md5sig_info *md5sig; + + md5sig = container_of(head, struct tcp_md5sig_info, rcu); + kfree(md5sig); + static_branch_slow_dec_deferred(&tcp_md5_needed); + tcp_md5_release_sigpool(); +} +#endif + +static void tcp_destruct_sock(struct sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + +#ifdef CONFIG_TCP_MD5SIG + if (tp->md5sig_info) { + struct tcp_md5sig_info *md5sig; + + md5sig = rcu_dereference_protected(tp->md5sig_info, 1); + tcp_clear_md5_list(sk); + call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu); + rcu_assign_pointer(tp->md5sig_info, NULL); + } +#endif + tcp_ao_destroy_sock(sk, false); + inet_sock_destruct(sk); +} + /* Address-family independent initialization for a tcp_sock. * * NOTE: A lot of things set to zero explicitly by call to @@ -464,6 +494,7 @@ void tcp_init_sock(struct sock *sk) tp->tsoffset = 0; tp->rack.reo_wnd_steps = 1; + sk->sk_destruct = tcp_destruct_sock; sk->sk_write_space = sk_stream_write_space; sock_set_flag(sk, SOCK_USE_WRITE_QUEUE); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 84d3d556ed8062d07fe7019bc0dadd90d3b80d96..32814f205fdfdcbd4be4765a4e127c8f175d3b14 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2521,18 +2521,6 @@ static int tcp_v4_init_sock(struct sock *sk) return 0; } -#ifdef CONFIG_TCP_MD5SIG -static void tcp_md5sig_info_free_rcu(struct rcu_head *head) -{ - struct tcp_md5sig_info *md5sig; - - md5sig = container_of(head, struct tcp_md5sig_info, rcu); - kfree(md5sig); - static_branch_slow_dec_deferred(&tcp_md5_needed); - tcp_md5_release_sigpool(); -} -#endif - static void tcp_release_user_frags(struct sock *sk) { #ifdef CONFIG_PAGE_POOL @@ -2569,19 +2557,6 @@ void tcp_v4_destroy_sock(struct sock *sk) /* Cleans up our, hopefully empty, out_of_order_queue. */ skb_rbtree_purge(&tp->out_of_order_queue); -#ifdef CONFIG_TCP_MD5SIG - /* Clean up the MD5 key list, if any */ - if (tp->md5sig_info) { - struct tcp_md5sig_info *md5sig; - - md5sig = rcu_dereference_protected(tp->md5sig_info, 1); - tcp_clear_md5_list(sk); - call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu); - rcu_assign_pointer(tp->md5sig_info, NULL); - } -#endif - tcp_ao_destroy_sock(sk, false); - /* Clean up a referenced TCP bind bucket. */ if (inet_csk(sk)->icsk_bind_hash) inet_put_port(sk); -- 2.42.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() 2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay @ 2025-08-22 12:40 ` Victor Nogueira 2025-08-22 18:05 ` Dmitry Safonov 2025-08-22 19:24 ` kernel test robot 1 sibling, 1 reply; 7+ messages in thread From: Victor Nogueira @ 2025-08-22 12:40 UTC (permalink / raw) To: dima, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev, linux-kernel On 8/22/25 01:55, Dmitry Safonov via B4 Relay wrote: > From: Dmitry Safonov <dima@arista.com> > > Currently there are a couple of minor issues with destroying the keys > tcp_v4_destroy_sock(): > > 1. The socket is yet in TCP bind buckets, making it reachable for > incoming segments [on another CPU core], potentially available to send > late FIN/ACK/RST replies. > > 2. There is at least one code path, where tcp_done() is called before > sending RST [kudos to Bob for investigation]. This is a case of > a server, that finished sending its data and just called close(). > > The socket is in TCP_FIN_WAIT2 and has RCV_SHUTDOWN (set by > __tcp_close()) > > tcp_v4_do_rcv()/tcp_v6_do_rcv() > tcp_rcv_state_process() /* LINUX_MIB_TCPABORTONDATA */ > tcp_reset() > tcp_done_with_error() > tcp_done() > inet_csk_destroy_sock() /* Destroys AO/MD5 keys */ > /* tcp_rcv_state_process() returns SKB_DROP_REASON_TCP_ABORT_ON_DATA */ > tcp_v4_send_reset() /* Sends an unsigned RST segment */ > > tcpdump: >> 22:53:15.399377 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 33929, offset 0, flags [DF], proto TCP (6), length 60) >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [F.], seq 2185658590, ack 3969644355, win 502, options [nop,nop,md5 valid], length 0 >> 22:53:15.399396 00:00:01:01:00:00 > 00:00:b2:1f:00:00, ethertype IPv4 (0x0800), length 86: (tos 0x0, ttl 64, id 51951, offset 0, flags [DF], proto TCP (6), length 72) >> 1.0.0.2.49848 > 1.0.0.1.34567: Flags [.], seq 3969644375, ack 2185658591, win 128, options [nop,nop,md5 valid,nop,nop,sack 1 {2185658590:2185658591}], length 0 >> 22:53:16.429588 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658590, win 0, length 0 >> 22:53:16.664725 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0 >> 22:53:17.289832 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0 > > Note the signed RSTs later in the dump - those are sent by the server > when the fin-wait socket gets removed from hash buckets, by > the listener socket. > > Instead of destroying AO/MD5 info and their keys in inet_csk_destroy_sock(), > slightly delay it until the actual socket .sk_destruct(). As shutdown'ed > socket can yet send non-data replies, they should be signed in order for > the peer to process them. Now it also matches how AO/MD5 gets destructed > for TIME-WAIT sockets (in tcp_twsk_destructor()). > > This seems optimal for TCP-MD5, while for TCP-AO it seems to have an > open problem: once RST get sent and socket gets actually destructed, > there is no information on the initial sequence numbers. So, in case > this last RST gets lost in the network, the server's listener socket > won't be able to properly sign another RST. Nothing in RFC 1122 > prescribes keeping any local state after non-graceful reset. > Luckily, BGP are known to use keep alive(s). > > While the issue is quite minor/cosmetic, these days monitoring network > counters is a common practice and getting invalid signed segments from > a trusted BGP peer can get customers worried. > > Investigated-by: Bob Gilligan <gilligan@arista.com> > Signed-off-by: Dmitry Safonov <dima@arista.com> > --- > net/ipv4/tcp.c | 31 +++++++++++++++++++++++++++++++ > net/ipv4/tcp_ipv4.c | 25 ------------------------- > 2 files changed, 31 insertions(+), 25 deletions(-) > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > index 71a956fbfc5533224ee00e792de2cfdccd4d40aa..4e996e937e8e5f0e75764caa24240e25006deece 100644 > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -412,6 +412,36 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp) > return rate64; > } > [...] > + > +static void tcp_destruct_sock(struct sock *sk) > +{ > + struct tcp_sock *tp = tcp_sk(sk); It looks like this variable is unused when CONFIG_TCP_MD5SIG is not set and this is causing the test CI build to fail. net/ipv4/tcp.c: In function ‘tcp_destruct_sock’: net/ipv4/tcp.c:417:26: error: unused variable ‘tp’ [-Werror=unused-variable] 417 | struct tcp_sock *tp = tcp_sk(sk); | ^~ cc1: all warnings being treated as errors make[4]: *** [scripts/Makefile.build:287: net/ipv4/tcp. cheers, Victor ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() 2025-08-22 12:40 ` Victor Nogueira @ 2025-08-22 18:05 ` Dmitry Safonov 0 siblings, 0 replies; 7+ messages in thread From: Dmitry Safonov @ 2025-08-22 18:05 UTC (permalink / raw) To: Victor Nogueira Cc: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman, Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev, linux-kernel Hi Victor, Thanks, going to correct that in v2. On Fri, Aug 22, 2025 at 1:40 PM Victor Nogueira <victor@mojatatu.com> wrote: > > On 8/22/25 01:55, Dmitry Safonov via B4 Relay wrote: > > From: Dmitry Safonov <dima@arista.com> > > > > Currently there are a couple of minor issues with destroying the keys > > tcp_v4_destroy_sock(): > > > > 1. The socket is yet in TCP bind buckets, making it reachable for > > incoming segments [on another CPU core], potentially available to send > > late FIN/ACK/RST replies. > > > > 2. There is at least one code path, where tcp_done() is called before > > sending RST [kudos to Bob for investigation]. This is a case of > > a server, that finished sending its data and just called close(). > > > > The socket is in TCP_FIN_WAIT2 and has RCV_SHUTDOWN (set by > > __tcp_close()) > > > > tcp_v4_do_rcv()/tcp_v6_do_rcv() > > tcp_rcv_state_process() /* LINUX_MIB_TCPABORTONDATA */ > > tcp_reset() > > tcp_done_with_error() > > tcp_done() > > inet_csk_destroy_sock() /* Destroys AO/MD5 keys */ > > /* tcp_rcv_state_process() returns SKB_DROP_REASON_TCP_ABORT_ON_DATA */ > > tcp_v4_send_reset() /* Sends an unsigned RST segment */ > > > > tcpdump: > >> 22:53:15.399377 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 33929, offset 0, flags [DF], proto TCP (6), length 60) > >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [F.], seq 2185658590, ack 3969644355, win 502, options [nop,nop,md5 valid], length 0 > >> 22:53:15.399396 00:00:01:01:00:00 > 00:00:b2:1f:00:00, ethertype IPv4 (0x0800), length 86: (tos 0x0, ttl 64, id 51951, offset 0, flags [DF], proto TCP (6), length 72) > >> 1.0.0.2.49848 > 1.0.0.1.34567: Flags [.], seq 3969644375, ack 2185658591, win 128, options [nop,nop,md5 valid,nop,nop,sack 1 {2185658590:2185658591}], length 0 > >> 22:53:16.429588 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) > >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658590, win 0, length 0 > >> 22:53:16.664725 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) > >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0 > >> 22:53:17.289832 00:00:b2:1f:00:00 > 00:00:01:01:00:00, ethertype IPv4 (0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60) > >> 1.0.0.1.34567 > 1.0.0.2.49848: Flags [R], seq 2185658591, win 0, options [nop,nop,md5 valid], length 0 > > > > Note the signed RSTs later in the dump - those are sent by the server > > when the fin-wait socket gets removed from hash buckets, by > > the listener socket. > > > > Instead of destroying AO/MD5 info and their keys in inet_csk_destroy_sock(), > > slightly delay it until the actual socket .sk_destruct(). As shutdown'ed > > socket can yet send non-data replies, they should be signed in order for > > the peer to process them. Now it also matches how AO/MD5 gets destructed > > for TIME-WAIT sockets (in tcp_twsk_destructor()). > > > > This seems optimal for TCP-MD5, while for TCP-AO it seems to have an > > open problem: once RST get sent and socket gets actually destructed, > > there is no information on the initial sequence numbers. So, in case > > this last RST gets lost in the network, the server's listener socket > > won't be able to properly sign another RST. Nothing in RFC 1122 > > prescribes keeping any local state after non-graceful reset. > > Luckily, BGP are known to use keep alive(s). > > > > While the issue is quite minor/cosmetic, these days monitoring network > > counters is a common practice and getting invalid signed segments from > > a trusted BGP peer can get customers worried. > > > > Investigated-by: Bob Gilligan <gilligan@arista.com> > > Signed-off-by: Dmitry Safonov <dima@arista.com> > > --- > > net/ipv4/tcp.c | 31 +++++++++++++++++++++++++++++++ > > net/ipv4/tcp_ipv4.c | 25 ------------------------- > > 2 files changed, 31 insertions(+), 25 deletions(-) > > > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > > index 71a956fbfc5533224ee00e792de2cfdccd4d40aa..4e996e937e8e5f0e75764caa24240e25006deece 100644 > > --- a/net/ipv4/tcp.c > > +++ b/net/ipv4/tcp.c > > @@ -412,6 +412,36 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp) > > return rate64; > > } > > [...] > > + > > +static void tcp_destruct_sock(struct sock *sk) > > +{ > > + struct tcp_sock *tp = tcp_sk(sk); > > It looks like this variable is unused when CONFIG_TCP_MD5SIG is not set > > and this is causing the test CI build to fail. > > net/ipv4/tcp.c: In function ‘tcp_destruct_sock’: > net/ipv4/tcp.c:417:26: error: unused variable ‘tp’ [-Werror=unused-variable] > 417 | struct tcp_sock *tp = tcp_sk(sk); > | ^~ > cc1: all warnings being treated as errors > make[4]: *** [scripts/Makefile.build:287: net/ipv4/tcp. > > cheers, > Victor ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() 2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay 2025-08-22 12:40 ` Victor Nogueira @ 2025-08-22 19:24 ` kernel test robot 1 sibling, 0 replies; 7+ messages in thread From: kernel test robot @ 2025-08-22 19:24 UTC (permalink / raw) To: Dmitry Safonov via B4 Relay, Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: oe-kbuild-all, netdev, Bob Gilligan, Salam Noureddine, Dmitry Safonov, linux-kernel Hi Dmitry, kernel test robot noticed the following build warnings: [auto build test WARNING on a7bd72158063740212344fad5d99dcef45bc70d6] url: https://github.com/intel-lab-lkp/linux/commits/Dmitry-Safonov-via-B4-Relay/tcp-Destroy-TCP-AO-TCP-MD5-keys-in-sk_destruct/20250822-125815 base: a7bd72158063740212344fad5d99dcef45bc70d6 patch link: https://lore.kernel.org/r/20250822-b4-tcp-ao-md5-rst-finwait2-v1-1-25825d085dcb%40arista.com patch subject: [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() config: x86_64-buildonly-randconfig-001-20250823 (https://download.01.org/0day-ci/archive/20250823/202508230331.PEHBlmSk-lkp@intel.com/config) compiler: gcc-12 (Debian 12.2.0-14+deb12u1) 12.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250823/202508230331.PEHBlmSk-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202508230331.PEHBlmSk-lkp@intel.com/ All warnings (new ones prefixed by >>): net/ipv4/tcp.c: In function 'tcp_destruct_sock': >> net/ipv4/tcp.c:429:26: warning: unused variable 'tp' [-Wunused-variable] 429 | struct tcp_sock *tp = tcp_sk(sk); | ^~ vim +/tp +429 net/ipv4/tcp.c 426 427 static void tcp_destruct_sock(struct sock *sk) 428 { > 429 struct tcp_sock *tp = tcp_sk(sk); 430 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU 2025-08-22 4:55 [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay 2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay @ 2025-08-22 4:55 ` Dmitry Safonov via B4 Relay 2025-08-22 9:08 ` [syzbot ci] Re: tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() syzbot ci 2 siblings, 0 replies; 7+ messages in thread From: Dmitry Safonov via B4 Relay @ 2025-08-22 4:55 UTC (permalink / raw) To: Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Bob Gilligan, Salam Noureddine, Dmitry Safonov, netdev, linux-kernel, Dmitry Safonov From: Dmitry Safonov <dima@arista.com> Now that the destruction of info/keys is delayed until the socket destructor, it's safe to use kfree() without an RCU callback. As either socket was yet in TCP_CLOSE state or the socket refcounter is zero and no one can discover it anymore, it's safe to release memory straight away. Similar thing was possible for twsk already. Signed-off-by: Dmitry Safonov <dima@arista.com> --- net/ipv4/tcp.c | 19 +++---------------- net/ipv4/tcp_ao.c | 5 ++--- net/ipv4/tcp_ipv4.c | 4 ++-- net/ipv4/tcp_minisocks.c | 19 +++++-------------- 4 files changed, 12 insertions(+), 35 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 4e996e937e8e5f0e75764caa24240e25006deece..de10d38116a205863c290470fa0cbbddeb8d709e 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -412,30 +412,17 @@ static u64 tcp_compute_delivery_rate(const struct tcp_sock *tp) return rate64; } -#ifdef CONFIG_TCP_MD5SIG -static void tcp_md5sig_info_free_rcu(struct rcu_head *head) -{ - struct tcp_md5sig_info *md5sig; - - md5sig = container_of(head, struct tcp_md5sig_info, rcu); - kfree(md5sig); - static_branch_slow_dec_deferred(&tcp_md5_needed); - tcp_md5_release_sigpool(); -} -#endif - static void tcp_destruct_sock(struct sock *sk) { struct tcp_sock *tp = tcp_sk(sk); #ifdef CONFIG_TCP_MD5SIG if (tp->md5sig_info) { - struct tcp_md5sig_info *md5sig; - md5sig = rcu_dereference_protected(tp->md5sig_info, 1); tcp_clear_md5_list(sk); - call_rcu(&md5sig->rcu, tcp_md5sig_info_free_rcu); - rcu_assign_pointer(tp->md5sig_info, NULL); + kfree(rcu_replace_pointer(tp->md5sig_info, NULL, 1)); + static_branch_slow_dec_deferred(&tcp_md5_needed); + tcp_md5_release_sigpool(); } #endif tcp_ao_destroy_sock(sk, false); diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index bbb8d5f0eae7d3d8887da3fa4d68e248af9060ad..31302be78bc4450b56fa23a390b6d03b2262741d 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -268,9 +268,8 @@ static void tcp_ao_key_free_rcu(struct rcu_head *head) kfree_sensitive(key); } -static void tcp_ao_info_free_rcu(struct rcu_head *head) +static void tcp_ao_info_free(struct tcp_ao_info *ao) { - struct tcp_ao_info *ao = container_of(head, struct tcp_ao_info, rcu); struct tcp_ao_key *key; struct hlist_node *n; @@ -310,7 +309,7 @@ void tcp_ao_destroy_sock(struct sock *sk, bool twsk) if (!twsk) tcp_ao_sk_omem_free(sk, ao); - call_rcu(&ao->rcu, tcp_ao_info_free_rcu); + tcp_ao_info_free(ao); } void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp) diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 32814f205fdfdcbd4be4765a4e127c8f175d3b14..185d16ff2d0ad7b20e2404ceff71cccaff4ed4bc 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1505,9 +1505,9 @@ void tcp_clear_md5_list(struct sock *sk) md5sig = rcu_dereference_protected(tp->md5sig_info, 1); hlist_for_each_entry_safe(key, n, &md5sig->head, node) { - hlist_del_rcu(&key->node); + hlist_del(&key->node); atomic_sub(sizeof(*key), &sk->sk_omem_alloc); - kfree_rcu(key, rcu); + kfree(key); } } diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 2994c9222c9cb5ee86b60bdb553f92130e52c70e..c93812b19893742b071b0f7a49c4293a781e8de4 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -377,26 +377,17 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) } EXPORT_SYMBOL(tcp_time_wait); -#ifdef CONFIG_TCP_MD5SIG -static void tcp_md5_twsk_free_rcu(struct rcu_head *head) -{ - struct tcp_md5sig_key *key; - - key = container_of(head, struct tcp_md5sig_key, rcu); - kfree(key); - static_branch_slow_dec_deferred(&tcp_md5_needed); - tcp_md5_release_sigpool(); -} -#endif - void tcp_twsk_destructor(struct sock *sk) { #ifdef CONFIG_TCP_MD5SIG if (static_branch_unlikely(&tcp_md5_needed.key)) { struct tcp_timewait_sock *twsk = tcp_twsk(sk); - if (twsk->tw_md5_key) - call_rcu(&twsk->tw_md5_key->rcu, tcp_md5_twsk_free_rcu); + if (twsk->tw_md5_key) { + kfree(twsk->tw_md5_key); + static_branch_slow_dec_deferred(&tcp_md5_needed); + tcp_md5_release_sigpool(); + } } #endif tcp_ao_destroy_sock(sk, true); -- 2.42.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [syzbot ci] Re: tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() 2025-08-22 4:55 [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay 2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay 2025-08-22 4:55 ` [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay @ 2025-08-22 9:08 ` syzbot ci 2 siblings, 0 replies; 7+ messages in thread From: syzbot ci @ 2025-08-22 9:08 UTC (permalink / raw) To: 0x7f454c46, davem, devnull, dima, dsahern, edumazet, gilligan, horms, kuba, kuniyu, linux-kernel, ncardwell, netdev, noureddine, pabeni Cc: syzbot, syzkaller-bugs syzbot ci has tested the following series [v1] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() https://lore.kernel.org/all/20250822-b4-tcp-ao-md5-rst-finwait2-v1-0-25825d085dcb@arista.com * [PATCH net-next 1/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() * [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU and found the following issue: WARNING in inet_sock_destruct Full report is available here: https://ci.syzbot.org/series/4e53fc18-79b4-47d6-87c4-89e499e12879 *** WARNING in inet_sock_destruct tree: net-next URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git base: a9af709fda7edafa17e072bffe610d9e7ed7a5df arch: amd64 compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7 config: https://ci.syzbot.org/builds/d6e1c7cd-df1b-4192-bc2d-a2db69987793/config C repro: https://ci.syzbot.org/findings/62fd81c8-8c8c-49fe-aa87-8e1418bcc053/c_repro syz repro: https://ci.syzbot.org/findings/62fd81c8-8c8c-49fe-aa87-8e1418bcc053/syz_repro ------------[ cut here ]------------ WARNING: CPU: 0 PID: 6012 at net/ipv4/af_inet.c:153 inet_sock_destruct+0x5f9/0x730 net/ipv4/af_inet.c:153 Modules linked in: CPU: 0 UID: 0 PID: 6012 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 RIP: 0010:inet_sock_destruct+0x5f9/0x730 net/ipv4/af_inet.c:153 Code: 00 41 0f b6 74 24 12 48 c7 c7 20 91 9e 8c 4c 89 e2 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d e9 dd 38 24 f7 e8 68 44 bc f7 90 <0f> 0b 90 e9 62 fe ff ff e8 5a 44 bc f7 90 0f 0b 90 e9 95 fe ff ff RSP: 0018:ffffc90003a0fc58 EFLAGS: 00010293 RAX: ffffffff8a0366c8 RBX: dffffc0000000000 RCX: ffff888106891cc0 RDX: 0000000000000000 RSI: 00000000000003c0 RDI: 0000000000000000 RBP: 00000000000003c0 R08: ffff88803bb429c3 R09: 1ffff11007768538 R10: dffffc0000000000 R11: ffffed1007768539 R12: ffff88803bb42880 R13: dffffc0000000000 R14: ffff88803bb429c0 R15: 1ffff11007768512 FS: 00005555690ff500(0000) GS:ffff8880b861a000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000200000f65000 CR3: 0000000106d3a000 CR4: 00000000000006f0 Call Trace: <TASK> __sk_destruct+0x89/0x660 net/core/sock.c:2339 inet_release+0x144/0x190 net/ipv4/af_inet.c:435 __sock_release net/socket.c:649 [inline] sock_close+0xc3/0x240 net/socket.c:1439 __fput+0x44c/0xa70 fs/file_table.c:468 task_work_run+0x1d4/0x260 kernel/task_work.c:227 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop+0xec/0x110 kernel/entry/common.c:43 exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline] syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline] do_syscall_64+0x2bd/0x3b0 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fa54ab8ebe9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc0978c268 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4 RAX: 0000000000000000 RBX: 000000000000ce2e RCX: 00007fa54ab8ebe9 RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000001 R09: 000000040978c55f R10: 0000001b30220000 R11: 0000000000000246 R12: 00007fa54adb5fac R13: 00007fa54adb5fa0 R14: ffffffffffffffff R15: 0000000000000006 </TASK> *** If these findings have caused you to resend the series or submit a separate fix, please add the following tag to your commit message: Tested-by: syzbot@syzkaller.appspotmail.com --- This report is generated by a bot. It may contain errors. syzbot ci engineers can be reached at syzkaller@googlegroups.com. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-08-22 19:24 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-08-22 4:55 [PATCH net-next 0/2] tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() Dmitry Safonov via B4 Relay 2025-08-22 4:55 ` [PATCH net-next 1/2] " Dmitry Safonov via B4 Relay 2025-08-22 12:40 ` Victor Nogueira 2025-08-22 18:05 ` Dmitry Safonov 2025-08-22 19:24 ` kernel test robot 2025-08-22 4:55 ` [PATCH net-next 2/2] tcp: Free TCP-AO/TCP-MD5 info/keys without RCU Dmitry Safonov via B4 Relay 2025-08-22 9:08 ` [syzbot ci] Re: tcp: Destroy TCP-AO, TCP-MD5 keys in .sk_destruct() syzbot ci
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).