* [PATCH bpf 0/2] Fix reported sockmap splat
@ 2024-06-25 20:16 John Fastabend
2024-06-25 20:16 ` [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock John Fastabend
2024-06-25 20:16 ` [PATCH bpf 2/2] bpf: sockmap, add test for ingress through strparser John Fastabend
0 siblings, 2 replies; 5+ messages in thread
From: John Fastabend @ 2024-06-25 20:16 UTC (permalink / raw)
To: bpf, vincent.whitchurch; +Cc: daniel, john.fastabend
Fix a regression introduced in sockmap for the case where a strparser and a
verdict program are attached and the verdict program returns SK_PASS on a
skb. We did not have a test for this case, so add one here as well.
A remaining todo is to finish porting all tests into normal CI coverage to
make them easier to audit and run. That will go through bpf-next, though.
John Fastabend (2):
bpf: sockmap, fix introduced strparser recursive lock
bpf: sockmap, add test for ingress through strparser
include/linux/skmsg.h | 9 +++++++--
net/core/skmsg.c | 5 ++++-
tools/testing/selftests/bpf/test_sockmap.c | 13 ++++++++++++-
3 files changed, 23 insertions(+), 4 deletions(-)
--
2.33.0
* [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock
2024-06-25 20:16 [PATCH bpf 0/2] Fix reported sockmap splat John Fastabend
@ 2024-06-25 20:16 ` John Fastabend
2024-06-29 15:34 ` Jakub Sitnicki
2024-06-25 20:16 ` [PATCH bpf 2/2] bpf: sockmap, add test for ingress through strparser John Fastabend
1 sibling, 1 reply; 5+ messages in thread
From: John Fastabend @ 2024-06-25 20:16 UTC (permalink / raw)
To: bpf, vincent.whitchurch; +Cc: daniel, john.fastabend
Originally there was a race between removing a psock from the sock map and
the receive path calling sk_psock_data_ready() on an incoming skb. It was
possible for the removal code to NULL/set the data_ready callback while the
receive path was concurrently calling the hook. The fix was to wrap the
access in sk_callback_lock to ensure the saved_data_ready pointer didn't
change under us. There was some discussion around doing a larger change
to ensure we could use READ_ONCE/WRITE_ONCE over the callback, but that
was for *next kernels not stable fixes.
But the fix unfortunately introduced a regression, because there is another
path into this code (one that did not have a test case) through the stream
parser. The stream parser runs with the sk_callback_lock already held, which
means we get the following splat and lockup.
============================================
WARNING: possible recursive locking detected
6.10.0-rc2 #59 Not tainted
--------------------------------------------
test_sockmap/342 is trying to acquire lock:
ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
sk_psock_skb_ingress_enqueue (./include/linux/skmsg.h:467
net/core/skmsg.c:555)
but task is already holding lock:
ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
sk_psock_strp_data_ready (net/core/skmsg.c:1120)
To fix, ensure we do not grab the lock when we reach this code through the
strparser.
Fixes: 6648e613226e1 ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
include/linux/skmsg.h | 9 +++++++--
net/core/skmsg.c | 5 ++++-
2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index c9efda9df285..3659e9b514d0 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -461,13 +461,18 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
sk_psock_drop(sk, psock);
}
-static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
+static inline void __sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
{
- read_lock_bh(&sk->sk_callback_lock);
if (psock->saved_data_ready)
psock->saved_data_ready(sk);
else
sk->sk_data_ready(sk);
+}
+
+static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
+{
+ read_lock_bh(&sk->sk_callback_lock);
+ __sk_psock_data_ready(sk, psock);
read_unlock_bh(&sk->sk_callback_lock);
}
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index fd20aae30be2..8429daecbbb6 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -552,7 +552,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
msg->skb = skb;
sk_psock_queue_msg(psock, msg);
- sk_psock_data_ready(sk, psock);
+ if (skb_bpf_strparser(skb))
+ __sk_psock_data_ready(sk, psock);
+ else
+ sk_psock_data_ready(sk, psock);
return copied;
}
--
2.33.0
* [PATCH bpf 2/2] bpf: sockmap, add test for ingress through strparser
2024-06-25 20:16 [PATCH bpf 0/2] Fix reported sockmap splat John Fastabend
2024-06-25 20:16 ` [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock John Fastabend
@ 2024-06-25 20:16 ` John Fastabend
1 sibling, 0 replies; 5+ messages in thread
From: John Fastabend @ 2024-06-25 20:16 UTC (permalink / raw)
To: bpf, vincent.whitchurch; +Cc: daniel, john.fastabend
Add an option to always return SK_PASS in the verdict callback
instead of redirecting the skb. This allows testing cases
which are not currently covered by the test program.
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
tools/testing/selftests/bpf/test_sockmap.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index 9cba4ec844a5..cc3e4d96c8ac 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -78,6 +78,7 @@ int txmsg_end_push;
int txmsg_start_pop;
int txmsg_pop;
int txmsg_ingress;
+int txmsg_pass_skb;
int txmsg_redir_skb;
int txmsg_ktls_skb;
int txmsg_ktls_skb_drop;
@@ -108,6 +109,7 @@ static const struct option long_options[] = {
{"txmsg_start_pop", required_argument, NULL, 'w'},
{"txmsg_pop", required_argument, NULL, 'x'},
{"txmsg_ingress", no_argument, &txmsg_ingress, 1 },
+ {"txmsg_pass_skb", no_argument, &txmsg_pass_skb, 1 },
{"txmsg_redir_skb", no_argument, &txmsg_redir_skb, 1 },
{"ktls", no_argument, &ktls, 1 },
{"peek", no_argument, &peek_flag, 1 },
@@ -177,6 +179,7 @@ static void test_reset(void)
txmsg_pass = txmsg_drop = txmsg_redir = 0;
txmsg_apply = txmsg_cork = 0;
txmsg_ingress = txmsg_redir_skb = 0;
+ txmsg_pass_skb = 0;
txmsg_ktls_skb = txmsg_ktls_skb_drop = txmsg_ktls_skb_redir = 0;
txmsg_omit_skb_parser = 0;
skb_use_parser = 0;
@@ -956,6 +959,7 @@ static int run_options(struct sockmap_options *options, int cg_fd, int test)
{
int i, key, next_key, err, zero = 0;
struct bpf_program *tx_prog;
+ struct bpf_program *skb_verdict_prog;
/* If base test skip BPF setup */
if (test == BASE || test == BASE_SENDPAGE)
@@ -972,7 +976,12 @@ static int run_options(struct sockmap_options *options, int cg_fd, int test)
}
}
- links[1] = bpf_program__attach_sockmap(progs[1], map_fd[0]);
+ if (txmsg_pass_skb)
+ skb_verdict_prog = progs[2];
+ else
+ skb_verdict_prog = progs[1];
+
+ links[1] = bpf_program__attach_sockmap(skb_verdict_prog, map_fd[0]);
if (!links[1]) {
fprintf(stderr, "ERROR: bpf_program__attach_sockmap (sockmap): (%s)\n",
strerror(errno));
@@ -1361,6 +1370,8 @@ static void test_options(char *options)
}
if (txmsg_ingress)
append_str(options, "ingress,", OPTSTRING);
+ if (txmsg_pass_skb)
+ append_str(options, "pass_skb,", OPTSTRING);
if (txmsg_redir_skb)
append_str(options, "redir_skb,", OPTSTRING);
if (txmsg_ktls_skb)
--
2.33.0
* Re: [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock
2024-06-25 20:16 ` [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock John Fastabend
@ 2024-06-29 15:34 ` Jakub Sitnicki
2024-07-03 1:12 ` John Fastabend
0 siblings, 1 reply; 5+ messages in thread
From: Jakub Sitnicki @ 2024-06-29 15:34 UTC (permalink / raw)
To: John Fastabend; +Cc: bpf, vincent.whitchurch, daniel
On Tue, Jun 25, 2024 at 01:16 PM -07, John Fastabend wrote:
> Originally there was a race between removing a psock from the sock map and
> the receive path calling sk_psock_data_ready() on an incoming skb. It was
> possible for the removal code to NULL/set the data_ready callback while the
> receive path was concurrently calling the hook. The fix was to wrap the
> access in sk_callback_lock to ensure the saved_data_ready pointer didn't
> change under us. There was some discussion around doing a larger change
> to ensure we could use READ_ONCE/WRITE_ONCE over the callback, but that
> was for *next kernels not stable fixes.
>
> But the fix unfortunately introduced a regression, because there is another
> path into this code (one that did not have a test case) through the stream
> parser. The stream parser runs with the sk_callback_lock already held, which
> means we get the following splat and lockup.
>
>
> ============================================
> WARNING: possible recursive locking detected
> 6.10.0-rc2 #59 Not tainted
> --------------------------------------------
> test_sockmap/342 is trying to acquire lock:
> ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> sk_psock_skb_ingress_enqueue (./include/linux/skmsg.h:467
> net/core/skmsg.c:555)
>
> but task is already holding lock:
> ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> sk_psock_strp_data_ready (net/core/skmsg.c:1120)
>
> To fix, ensure we do not grab the lock when we reach this code through the
> strparser.
>
> Fixes: 6648e613226e1 ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
> Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
> include/linux/skmsg.h | 9 +++++++--
> net/core/skmsg.c | 5 ++++-
> 2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index c9efda9df285..3659e9b514d0 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -461,13 +461,18 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
> sk_psock_drop(sk, psock);
> }
>
> -static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +static inline void __sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> {
> - read_lock_bh(&sk->sk_callback_lock);
> if (psock->saved_data_ready)
> psock->saved_data_ready(sk);
> else
> sk->sk_data_ready(sk);
> +}
> +
> +static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +{
> + read_lock_bh(&sk->sk_callback_lock);
> + __sk_psock_data_ready(sk, psock);
> read_unlock_bh(&sk->sk_callback_lock);
> }
>
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index fd20aae30be2..8429daecbbb6 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -552,7 +552,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
> msg->skb = skb;
>
> sk_psock_queue_msg(psock, msg);
> - sk_psock_data_ready(sk, psock);
> + if (skb_bpf_strparser(skb))
> + __sk_psock_data_ready(sk, psock);
> + else
> + sk_psock_data_ready(sk, psock);
> return copied;
> }
If I follow, this is the call chain that leads to the recursive lock:
sock::sk_data_ready → sk_psock_strp_data_ready
write_lock_bh(&sk->sk_callback_lock)
strp_data_ready
strp_read_sock
proto_ops::read_sock → tcp_read_sock
strp_recv
__strp_recv
strp_callbacks::rcv_msg → sk_psock_strp_read
sk_psock_verdict_apply(verdict=__SK_PASS)
sk_psock_skb_ingress_self
sk_psock_skb_ingress_enqueue
sk_psock_data_ready
read_lock_bh(&sk->sk_callback_lock) !!!
What I don't get, though, is why strp_data_ready has to be called with a
_writer_ lock? Maybe that should just be a reader lock, and then it can
be recursive.
* Re: [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock
2024-06-29 15:34 ` Jakub Sitnicki
@ 2024-07-03 1:12 ` John Fastabend
0 siblings, 0 replies; 5+ messages in thread
From: John Fastabend @ 2024-07-03 1:12 UTC (permalink / raw)
To: Jakub Sitnicki, John Fastabend; +Cc: bpf, vincent.whitchurch, daniel
Jakub Sitnicki wrote:
> On Tue, Jun 25, 2024 at 01:16 PM -07, John Fastabend wrote:
> > Originally there was a race between removing a psock from the sock map and
> > the receive path calling sk_psock_data_ready() on an incoming skb. It was
> > possible for the removal code to NULL/set the data_ready callback while the
> > receive path was concurrently calling the hook. The fix was to wrap the
> > access in sk_callback_lock to ensure the saved_data_ready pointer didn't
> > change under us. There was some discussion around doing a larger change
> > to ensure we could use READ_ONCE/WRITE_ONCE over the callback, but that
> > was for *next kernels not stable fixes.
> >
> > But the fix unfortunately introduced a regression, because there is another
> > path into this code (one that did not have a test case) through the stream
> > parser. The stream parser runs with the sk_callback_lock already held, which
> > means we get the following splat and lockup.
> >
> >
> > ============================================
> > WARNING: possible recursive locking detected
> > 6.10.0-rc2 #59 Not tainted
> > --------------------------------------------
> > test_sockmap/342 is trying to acquire lock:
> > ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> > sk_psock_skb_ingress_enqueue (./include/linux/skmsg.h:467
> > net/core/skmsg.c:555)
> >
> > but task is already holding lock:
> > ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> > sk_psock_strp_data_ready (net/core/skmsg.c:1120)
> >
> > To fix, ensure we do not grab the lock when we reach this code through the
> > strparser.
> >
> > Fixes: 6648e613226e1 ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
> > Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
> > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> > ---
> > include/linux/skmsg.h | 9 +++++++--
> > net/core/skmsg.c | 5 ++++-
> > 2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> > index c9efda9df285..3659e9b514d0 100644
> > --- a/include/linux/skmsg.h
> > +++ b/include/linux/skmsg.h
> > @@ -461,13 +461,18 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
> > sk_psock_drop(sk, psock);
> > }
> >
> > -static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> > +static inline void __sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> > {
> > - read_lock_bh(&sk->sk_callback_lock);
> > if (psock->saved_data_ready)
> > psock->saved_data_ready(sk);
> > else
> > sk->sk_data_ready(sk);
> > +}
> > +
> > +static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> > +{
> > + read_lock_bh(&sk->sk_callback_lock);
> > + __sk_psock_data_ready(sk, psock);
> > read_unlock_bh(&sk->sk_callback_lock);
> > }
> >
> > diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> > index fd20aae30be2..8429daecbbb6 100644
> > --- a/net/core/skmsg.c
> > +++ b/net/core/skmsg.c
> > @@ -552,7 +552,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
> > msg->skb = skb;
> >
> > sk_psock_queue_msg(psock, msg);
> > - sk_psock_data_ready(sk, psock);
> > + if (skb_bpf_strparser(skb))
> > + __sk_psock_data_ready(sk, psock);
> > + else
> > + sk_psock_data_ready(sk, psock);
> > return copied;
> > }
>
> If I follow, this is the call chain that leads to the recursive lock:
>
> sock::sk_data_ready → sk_psock_strp_data_ready
> write_lock_bh(&sk->sk_callback_lock)
> strp_data_ready
> strp_read_sock
> proto_ops::read_sock → tcp_read_sock
> strp_recv
> __strp_recv
> strp_callbacks::rcv_msg → sk_psock_strp_read
> sk_psock_verdict_apply(verdict=__SK_PASS)
> sk_psock_skb_ingress_self
> sk_psock_skb_ingress_enqueue
> sk_psock_data_ready
> read_lock_bh(&sk->sk_callback_lock) !!!
>
> What I don't get, though, is why strp_data_ready has to be called with a
> _writer_ lock? Maybe that should just be a reader lock, and then it can
> be recursive.
Agree, a read lock should be fine; we just want to ensure the strp is not
changing during that call chain. Let me do that fix instead.
end of thread, other threads:[~2024-07-03 1:12 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-25 20:16 [PATCH bpf 0/2] Fix reported sockmap splat John Fastabend
2024-06-25 20:16 ` [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock John Fastabend
2024-06-29 15:34 ` Jakub Sitnicki
2024-07-03 1:12 ` John Fastabend
2024-06-25 20:16 ` [PATCH bpf 2/2] bpf: sockmap, add test for ingress through strparser John Fastabend