From: Jakub Sitnicki <jakub@cloudflare.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: bpf@vger.kernel.org, vincent.whitchurch@datadoghq.com,
daniel@iogearbox.net
Subject: Re: [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock
Date: Sat, 29 Jun 2024 17:34:37 +0200
Message-ID: <874j9bg3ua.fsf@cloudflare.com>
In-Reply-To: <20240625201632.49024-2-john.fastabend@gmail.com> (John Fastabend's message of "Tue, 25 Jun 2024 13:16:31 -0700")

On Tue, Jun 25, 2024 at 01:16 PM -07, John Fastabend wrote:
> Originally there was a race between removing a psock from the sock map and
> receiving an skb that calls sk_psock_data_ready(). It was
> possible the removal code would NULL/set the data_ready callback while
> concurrently calling the hook from receive path. The fix was to wrap the
> access in sk_callback_lock to ensure the saved_data_ready pointer didn't
> change under us. There was some discussion around doing a larger change
> to ensure we could use READ_ONCE/WRITE_ONCE over the callback, but that
> was for *next kernels not stable fixes.
>
> But, we unfortunately introduced a regression with the fix because there
> is another path into this code (that didn't have a test case) through
> the stream parser. The stream parser runs with the lower lock which means
> we get the following splat and lock up.
>
>
> ============================================
> WARNING: possible recursive locking detected
> 6.10.0-rc2 #59 Not tainted
> --------------------------------------------
> test_sockmap/342 is trying to acquire lock:
> ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> sk_psock_skb_ingress_enqueue (./include/linux/skmsg.h:467
> net/core/skmsg.c:555)
>
> but task is already holding lock:
> ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> sk_psock_strp_data_ready (net/core/skmsg.c:1120)
>
> To fix, ensure we do not grab the lock when we reach this code through the
> strparser.
>
> Fixes: 6648e613226e1 ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
> Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
> include/linux/skmsg.h | 9 +++++++--
> net/core/skmsg.c | 5 ++++-
> 2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index c9efda9df285..3659e9b514d0 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -461,13 +461,18 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
>  		sk_psock_drop(sk, psock);
>  }
>  
> -static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +static inline void __sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
>  {
> -	read_lock_bh(&sk->sk_callback_lock);
>  	if (psock->saved_data_ready)
>  		psock->saved_data_ready(sk);
>  	else
>  		sk->sk_data_ready(sk);
> +}
> +
> +static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +{
> +	read_lock_bh(&sk->sk_callback_lock);
> +	__sk_psock_data_ready(sk, psock);
>  	read_unlock_bh(&sk->sk_callback_lock);
>  }
>
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index fd20aae30be2..8429daecbbb6 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -552,7 +552,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
>  	msg->skb = skb;
>  
>  	sk_psock_queue_msg(psock, msg);
> -	sk_psock_data_ready(sk, psock);
> +	if (skb_bpf_strparser(skb))
> +		__sk_psock_data_ready(sk, psock);
> +	else
> +		sk_psock_data_ready(sk, psock);
>  	return copied;
>  }

If I follow, this is the call chain that leads to the recursive lock:

sock::sk_data_ready → sk_psock_strp_data_ready
  write_lock_bh(&sk->sk_callback_lock)
  strp_data_ready
    strp_read_sock
      proto_ops::read_sock → tcp_read_sock
        strp_recv
          __strp_recv
            strp_callbacks::rcv_msg → sk_psock_strp_read
              sk_psock_verdict_apply(verdict=__SK_PASS)
                sk_psock_skb_ingress_self
                  sk_psock_skb_ingress_enqueue
                    sk_psock_data_ready
                      read_lock_bh(&sk->sk_callback_lock) !!!

What I don't get, though, is why strp_data_ready has to be called with a
_writer_ lock? Maybe that should just be a reader lock, and then it can
be recursive.