From: Jakub Sitnicki <jakub@cloudflare.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: bpf@vger.kernel.org, vincent.whitchurch@datadoghq.com,
	daniel@iogearbox.net
Subject: Re: [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock
Date: Sat, 29 Jun 2024 17:34:37 +0200
Message-ID: <874j9bg3ua.fsf@cloudflare.com>
In-Reply-To: <20240625201632.49024-2-john.fastabend@gmail.com> (John Fastabend's message of "Tue, 25 Jun 2024 13:16:31 -0700")

On Tue, Jun 25, 2024 at 01:16 PM -07, John Fastabend wrote:
> Originally there was a race between removing a psock from the sock map and
> the receive path handling an skb and calling sk_psock_data_ready(). It was
> possible for the removal code to NULL/set the data_ready callback while the
> receive path was concurrently calling the hook. The fix was to wrap the
> access in sk_callback_lock to ensure the saved_data_ready pointer didn't
> change under us. There was some discussion around doing a larger change
> to ensure we could use READ_ONCE/WRITE_ONCE over the callback, but that
> was for *next kernels, not stable fixes.
>
> Unfortunately, we introduced a regression with that fix because there
> is another path into this code (one that didn't have a test case) through
> the stream parser. The stream parser runs with the lower lock held, which
> means we get the following splat and lockup.
>
> ============================================
> WARNING: possible recursive locking detected
> 6.10.0-rc2 #59 Not tainted
> --------------------------------------------
> test_sockmap/342 is trying to acquire lock:
> ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> sk_psock_skb_ingress_enqueue (./include/linux/skmsg.h:467
> net/core/skmsg.c:555)
>
> but task is already holding lock:
> ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> sk_psock_strp_data_ready (net/core/skmsg.c:1120)
>
> To fix, ensure we do not grab the lock when we reach this code through the
> strparser.
>
> Fixes: 6648e613226e1 ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
> Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
> include/linux/skmsg.h | 9 +++++++--
> net/core/skmsg.c | 5 ++++-
> 2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index c9efda9df285..3659e9b514d0 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -461,13 +461,18 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
>  		sk_psock_drop(sk, psock);
>  }
>  
> -static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +static inline void __sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
>  {
> -	read_lock_bh(&sk->sk_callback_lock);
>  	if (psock->saved_data_ready)
>  		psock->saved_data_ready(sk);
>  	else
>  		sk->sk_data_ready(sk);
> +}
> +
> +static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +{
> +	read_lock_bh(&sk->sk_callback_lock);
> +	__sk_psock_data_ready(sk, psock);
>  	read_unlock_bh(&sk->sk_callback_lock);
>  }
>
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index fd20aae30be2..8429daecbbb6 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -552,7 +552,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
>  	msg->skb = skb;
>  
>  	sk_psock_queue_msg(psock, msg);
> -	sk_psock_data_ready(sk, psock);
> +	if (skb_bpf_strparser(skb))
> +		__sk_psock_data_ready(sk, psock);
> +	else
> +		sk_psock_data_ready(sk, psock);
>  	return copied;
>  }

If I follow, this is the call chain that leads to the recursive lock:

sock::sk_data_ready → sk_psock_strp_data_ready
    write_lock_bh(&sk->sk_callback_lock)
    strp_data_ready
      strp_read_sock
        proto_ops::read_sock → tcp_read_sock
          strp_recv
            __strp_recv
              strp_callbacks::rcv_msg → sk_psock_strp_read
                sk_psock_verdict_apply(verdict=__SK_PASS)
                  sk_psock_skb_ingress_self
                    sk_psock_skb_ingress_enqueue
                      sk_psock_data_ready
                        read_lock_bh(&sk->sk_callback_lock) !!!

What I don't get, though, is why strp_data_ready has to be called with a
_writer_ lock? Maybe that should just be a reader lock, and then it can
be recursive.

Thread overview: 5+ messages
2024-06-25 20:16 [PATCH bpf 0/2] Fix reported sockmap splat John Fastabend
2024-06-25 20:16 ` [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock John Fastabend
2024-06-29 15:34   ` Jakub Sitnicki [this message]
2024-07-03  1:12     ` John Fastabend
2024-06-25 20:16 ` [PATCH bpf 2/2] bpf: sockmap, add test for ingress through strparser John Fastabend