From: Stanislav Fomichev <sdf@google.com>
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, Stanislav Fomichev <sdf@google.com>
Subject: [PATCH net] net: set SOCK_RCU_FREE before inserting socket into hashtable
Date: Wed, 8 Nov 2023 12:28:19 -0800 [thread overview]
Message-ID: <20231108202819.1932920-1-sdf@google.com> (raw)
We've started to see the following kernel traces:
WARNING: CPU: 83 PID: 0 at net/core/filter.c:6641 sk_lookup+0x1bd/0x1d0
Call Trace:
<IRQ>
__bpf_skc_lookup+0x10d/0x120
bpf_sk_lookup+0x48/0xd0
bpf_sk_lookup_tcp+0x19/0x20
bpf_prog_<redacted>+0x37c/0x16a3
cls_bpf_classify+0x205/0x2e0
tcf_classify+0x92/0x160
__netif_receive_skb_core+0xe52/0xf10
__netif_receive_skb_list_core+0x96/0x2b0
napi_complete_done+0x7b5/0xb70
<redacted>_poll+0x94/0xb0
net_rx_action+0x163/0x1d70
__do_softirq+0xdc/0x32e
asm_call_irq_on_stack+0x12/0x20
</IRQ>
do_softirq_own_stack+0x36/0x50
do_softirq+0x44/0x70
I'm not 100% what is causing them. It might be some kernel change or
new code path in the bpf program. But looking at the code,
I'm assuming the issue has been there for a while.
__inet_hash can race with lockless (rcu) readers on the other cpus:
__inet_hash
__sk_nulls_add_node_rcu
<- (bpf triggers here)
sock_set_flag(SOCK_RCU_FREE)
Let's move the SOCK_RCU_FREE part up a bit, before we are inserting
the socket into hashtables. Note, that the race is really harmless;
the bpf callers are handling this situation (where listener socket
doesn't have SOCK_RCU_FREE set) correctly, so the only
annoyance is a WARN_ONCE (so not 100% sure whether it should
wait until net-next instead).
For the fixes tag, I'm using the original commit which added the flag.
Fixes: 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
net/ipv4/inet_hashtables.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 598c1b114d2c..a532f749e477 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -751,12 +751,12 @@ int __inet_hash(struct sock *sk, struct sock *osk)
if (err)
goto unlock;
}
+ sock_set_flag(sk, SOCK_RCU_FREE);
if (IS_ENABLED(CONFIG_IPV6) && sk->sk_reuseport &&
sk->sk_family == AF_INET6)
__sk_nulls_add_node_tail_rcu(sk, &ilb2->nulls_head);
else
__sk_nulls_add_node_rcu(sk, &ilb2->nulls_head);
- sock_set_flag(sk, SOCK_RCU_FREE);
sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
unlock:
spin_unlock(&ilb2->lock);
--
2.42.0.869.gea05f2083d-goog
next reply other threads:[~2023-11-08 20:28 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-08 20:28 Stanislav Fomichev [this message]
2023-11-08 20:58 ` [PATCH net] net: set SOCK_RCU_FREE before inserting socket into hashtable Eric Dumazet
2023-11-08 21:01 ` Stanislav Fomichev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231108202819.1932920-1-sdf@google.com \
--to=sdf@google.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox