From: Eric Dumazet <eric.dumazet@gmail.com>
To: Martin Lau <kafai@fb.com>, Eric Dumazet <eric.dumazet@gmail.com>
Cc: "bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	David Miller <davem@davemloft.net>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [PATCH bpf] bpf: Fix a race in reuseport_array_free()
Date: Fri, 27 Sep 2019 13:47:32 -0700
Message-ID: <fc762c01-94da-7f72-4fc0-9b76d6bbe3dd@gmail.com>
In-Reply-To: <20190927181729.7ep3pp2hiy6l5ixk@kafai-mbp.dhcp.thefacebook.com>
On 9/27/19 11:17 AM, Martin Lau wrote:
> On Fri, Sep 27, 2019 at 10:24:49AM -0700, Eric Dumazet wrote:
>>
>>
>> On 9/27/19 9:52 AM, Martin KaFai Lau wrote:
>>> In reuseport_array_free(), the rcu_read_lock() cannot ensure sk is still
>>> valid. It is because bpf_sk_reuseport_detach() can be called from
>>> __sk_destruct() which is invoked through call_rcu(..., __sk_destruct).
>>
>> We could question why reuseport_detach_sock(sk) is called from __sk_destruct()
>> (after the rcu grace period) instead of sk_destruct() ?
> Agree. It is another way to fix it.
>
> In this patch, I chose to avoid the need to single out a special treatment for
> reuseport_detach_sock() in sk_destruct().
>
> I am happy either way. What do you think?

It seems that since we call reuseport_detach_sock() after the RCU grace period,
another CPU could catch the sk pointer in the reuse->socks[] array and use
it right before our CPU frees the socket.

RCU rules are not properly applied here, I think.

The rules for deletion are:

1) Unpublish the object from the various lists/arrays/hashes.
2) Wait for an RCU grace period.
3) Free the object.
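
To make the ordering concrete, here is a toy userspace model of those three
steps. It is only a sketch, not kernel code: every toy_* name is made up for
illustration, toy_socks[] merely stands in for reuse->socks[], and the grace
period is a counter rather than a real synchronize_rcu()/call_rcu() wait,
since this single-threaded model has no concurrent readers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model of RCU-style deletion order. NOT kernel code;
 * all names below are hypothetical. */
#define TOY_MAX_SOCKS 4

struct toy_sock {
	int id;
	bool published;	/* still reachable through toy_socks[] */
	bool freed;	/* set instead of calling free(), so callers can inspect it */
};

static struct toy_sock *toy_socks[TOY_MAX_SOCKS];	/* stands in for reuse->socks[] */
static int toy_grace_periods;				/* number of "grace periods" waited */

/* Make sk visible to readers through the array. */
static bool toy_publish(size_t slot, struct toy_sock *sk)
{
	if (slot >= TOY_MAX_SOCKS)
		return false;
	toy_socks[slot] = sk;
	sk->published = true;
	return true;
}

/* Reader side: returns whatever is currently published in the slot. */
static struct toy_sock *toy_lookup(size_t slot)
{
	return slot < TOY_MAX_SOCKS ? toy_socks[slot] : NULL;
}

/* Step 1: unpublish, so no new reader can find the object. */
static void toy_unpublish(struct toy_sock *sk)
{
	for (size_t i = 0; i < TOY_MAX_SOCKS; i++)
		if (toy_socks[i] == sk)
			toy_socks[i] = NULL;
	sk->published = false;
}

/* Step 2: placeholder for waiting until pre-existing readers are done.
 * This model has no readers, so it only counts the call. */
static void toy_wait_grace_period(void)
{
	toy_grace_periods++;
}

/* Step 3: free. Refuses while the object is still published, which is
 * the situation the three rules are meant to rule out. */
static bool toy_free(struct toy_sock *sk)
{
	if (sk->published)
		return false;
	sk->freed = true;
	return true;
}

/* The three steps, in the required order. */
static bool toy_destroy(struct toy_sock *sk)
{
	toy_unpublish(sk);
	toy_wait_grace_period();
	return toy_free(sk);
}
```

In this model, calling toy_free() on a socket that toy_lookup() can still
return is refused, while toy_destroy() succeeds because it unpublishes first.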

If we fix the unpublish step (we need to anyway, to make the data path safe),
then your patch is not needed?

What about this (totally untested, might be horribly wrong):

diff --git a/net/core/sock.c b/net/core/sock.c
index 07863edbe6fc4842e47ebebf00bc21bc406d9264..d31a4b094797f73ef89110c954aa0a164879362d 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1700,8 +1700,6 @@ static void __sk_destruct(struct rcu_head *head)
 		sk_filter_uncharge(sk, filter);
 		RCU_INIT_POINTER(sk->sk_filter, NULL);
 	}
-	if (rcu_access_pointer(sk->sk_reuseport_cb))
-		reuseport_detach_sock(sk);
 
 	sock_disable_timestamp(sk, SK_FLAGS_TIMESTAMP);
 
@@ -1728,7 +1726,13 @@ static void __sk_destruct(struct rcu_head *head)
 
 void sk_destruct(struct sock *sk)
 {
-	if (sock_flag(sk, SOCK_RCU_FREE))
+	bool use_call_rcu = sock_flag(sk, SOCK_RCU_FREE);
+
+	if (rcu_access_pointer(sk->sk_reuseport_cb)) {
+		reuseport_detach_sock(sk);
+		use_call_rcu = true;
+	}
+	if (use_call_rcu)
 		call_rcu(&sk->sk_rcu, __sk_destruct);
 	else
 		__sk_destruct(&sk->sk_rcu);

Thread overview: 5+ messages
2019-09-27 16:52 [PATCH bpf] bpf: Fix a race in reuseport_array_free() Martin KaFai Lau
2019-09-27 17:24 ` Eric Dumazet
2019-09-27 18:17 ` Martin Lau
2019-09-27 20:47 ` Eric Dumazet [this message]
2019-09-27 21:22 ` Martin Lau