* [PATCH bpf 0/2] Fix reported sockmap splat

From: John Fastabend @ 2024-06-25 20:16 UTC
To: bpf, vincent.whitchurch; +Cc: daniel, john.fastabend

Fix a regression introduced in sockmap for the case where a strparser
and a verdict program are attached and the verdict program returns
SK_PASS on an skb. We did not have a test for this case, so one is
added here as well. A remaining todo is to finish porting all tests
into normal CI coverage to make them easier to audit and run; that
will go through bpf-next though.

John Fastabend (2):
  bpf: sockmap, fix introduced strparser recursive lock
  bpf: sockmap, add test for ingress through strparser

 include/linux/skmsg.h                      |  9 +++++++--
 net/core/skmsg.c                           |  5 ++++-
 tools/testing/selftests/bpf/test_sockmap.c | 13 ++++++++++++-
 3 files changed, 23 insertions(+), 4 deletions(-)

--
2.33.0
* [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock

From: John Fastabend @ 2024-06-25 20:16 UTC
To: bpf, vincent.whitchurch; +Cc: daniel, john.fastabend

Originally there was a race between removing a psock from the sock map
and the receive path calling sk_psock_data_ready(). It was possible for
the removal code to set/NULL the data_ready callback while the receive
path was concurrently calling the hook. The fix was to wrap the access
in sk_callback_lock to ensure the saved_data_ready pointer did not
change under us. There was some discussion around doing a larger change
so we could use READ_ONCE/WRITE_ONCE over the callback, but that was
for *next kernels, not stable fixes.

Unfortunately, the fix introduced a regression because there is another
path into this code (one that did not have a test case) through the
stream parser. The stream parser runs with the lock already held, which
means we get the following splat and lock up:

 ============================================
 WARNING: possible recursive locking detected
 6.10.0-rc2 #59 Not tainted
 --------------------------------------------
 test_sockmap/342 is trying to acquire lock:
 ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
 sk_psock_skb_ingress_enqueue (./include/linux/skmsg.h:467 net/core/skmsg.c:555)

 but task is already holding lock:
 ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
 sk_psock_strp_data_ready (net/core/skmsg.c:1120)

To fix, ensure we do not grab the lock when we reach this code through
the strparser.

Fixes: 6648e613226e1 ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/skmsg.h | 9 +++++++--
 net/core/skmsg.c      | 5 ++++-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index c9efda9df285..3659e9b514d0 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -461,13 +461,18 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
 	sk_psock_drop(sk, psock);
 }
 
-static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
+static inline void __sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
 {
-	read_lock_bh(&sk->sk_callback_lock);
 	if (psock->saved_data_ready)
 		psock->saved_data_ready(sk);
 	else
 		sk->sk_data_ready(sk);
+}
+
+static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
+{
+	read_lock_bh(&sk->sk_callback_lock);
+	__sk_psock_data_ready(sk, psock);
 	read_unlock_bh(&sk->sk_callback_lock);
 }
 
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index fd20aae30be2..8429daecbbb6 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -552,7 +552,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
 	msg->skb = skb;
 
 	sk_psock_queue_msg(psock, msg);
-	sk_psock_data_ready(sk, psock);
+	if (skb_bpf_strparser(skb))
+		__sk_psock_data_ready(sk, psock);
+	else
+		sk_psock_data_ready(sk, psock);
 	return copied;
 }
 
--
2.33.0
* Re: [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock

From: Jakub Sitnicki @ 2024-06-29 15:34 UTC
To: John Fastabend; +Cc: bpf, vincent.whitchurch, daniel

On Tue, Jun 25, 2024 at 01:16 PM -07, John Fastabend wrote:
> Originally there was a race between removing a psock from the sock map
> and the receive path calling sk_psock_data_ready(). It was possible for
> the removal code to set/NULL the data_ready callback while the receive
> path was concurrently calling the hook. The fix was to wrap the access
> in sk_callback_lock to ensure the saved_data_ready pointer did not
> change under us. There was some discussion around doing a larger change
> so we could use READ_ONCE/WRITE_ONCE over the callback, but that was
> for *next kernels, not stable fixes.
>
> Unfortunately, the fix introduced a regression because there is another
> path into this code (one that did not have a test case) through the
> stream parser. The stream parser runs with the lock already held, which
> means we get the following splat and lock up:
>
>  ============================================
>  WARNING: possible recursive locking detected
>  6.10.0-rc2 #59 Not tainted
>  --------------------------------------------
>  test_sockmap/342 is trying to acquire lock:
>  ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
>  sk_psock_skb_ingress_enqueue (./include/linux/skmsg.h:467 net/core/skmsg.c:555)
>
>  but task is already holding lock:
>  ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
>  sk_psock_strp_data_ready (net/core/skmsg.c:1120)
>
> To fix, ensure we do not grab the lock when we reach this code through
> the strparser.
>
> Fixes: 6648e613226e1 ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
> Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  include/linux/skmsg.h | 9 +++++++--
>  net/core/skmsg.c      | 5 ++++-
>  2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index c9efda9df285..3659e9b514d0 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -461,13 +461,18 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
>  	sk_psock_drop(sk, psock);
>  }
>  
> -static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +static inline void __sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
>  {
> -	read_lock_bh(&sk->sk_callback_lock);
>  	if (psock->saved_data_ready)
>  		psock->saved_data_ready(sk);
>  	else
>  		sk->sk_data_ready(sk);
> +}
> +
> +static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> +{
> +	read_lock_bh(&sk->sk_callback_lock);
> +	__sk_psock_data_ready(sk, psock);
>  	read_unlock_bh(&sk->sk_callback_lock);
>  }
>  
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index fd20aae30be2..8429daecbbb6 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -552,7 +552,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
>  	msg->skb = skb;
>  
>  	sk_psock_queue_msg(psock, msg);
> -	sk_psock_data_ready(sk, psock);
> +	if (skb_bpf_strparser(skb))
> +		__sk_psock_data_ready(sk, psock);
> +	else
> +		sk_psock_data_ready(sk, psock);
>  	return copied;
>  }

If I follow, this is the call chain that leads to the recursive lock:

  sock::sk_data_ready → sk_psock_strp_data_ready
    write_lock_bh(&sk->sk_callback_lock)
    strp_data_ready
      strp_read_sock
        proto_ops::read_sock → tcp_read_sock
          strp_recv
            __strp_recv
              strp_callbacks::rcv_msg → sk_psock_strp_read
                sk_psock_verdict_apply(verdict=__SK_PASS)
                  sk_psock_skb_ingress_self
                    sk_psock_skb_ingress_enqueue
                      sk_psock_data_ready
                        read_lock_bh(&sk->sk_callback_lock) !!!

What I don't get, though, is why strp_data_ready has to be called with a
_writer_ lock? Maybe that should just be a reader lock, and then it can
be recursive.
* Re: [PATCH bpf 1/2] bpf: sockmap, fix introduced strparser recursive lock

From: John Fastabend @ 2024-07-03 1:12 UTC
To: Jakub Sitnicki, John Fastabend; +Cc: bpf, vincent.whitchurch, daniel

Jakub Sitnicki wrote:
> On Tue, Jun 25, 2024 at 01:16 PM -07, John Fastabend wrote:
> > Originally there was a race between removing a psock from the sock map
> > and the receive path calling sk_psock_data_ready(). It was possible for
> > the removal code to set/NULL the data_ready callback while the receive
> > path was concurrently calling the hook. The fix was to wrap the access
> > in sk_callback_lock to ensure the saved_data_ready pointer did not
> > change under us. There was some discussion around doing a larger change
> > so we could use READ_ONCE/WRITE_ONCE over the callback, but that was
> > for *next kernels, not stable fixes.
> >
> > Unfortunately, the fix introduced a regression because there is another
> > path into this code (one that did not have a test case) through the
> > stream parser. The stream parser runs with the lock already held, which
> > means we get the following splat and lock up:
> >
> >  ============================================
> >  WARNING: possible recursive locking detected
> >  6.10.0-rc2 #59 Not tainted
> >  --------------------------------------------
> >  test_sockmap/342 is trying to acquire lock:
> >  ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> >  sk_psock_skb_ingress_enqueue (./include/linux/skmsg.h:467 net/core/skmsg.c:555)
> >
> >  but task is already holding lock:
> >  ffff888007a87228 (clock-AF_INET){++--}-{2:2}, at:
> >  sk_psock_strp_data_ready (net/core/skmsg.c:1120)
> >
> > To fix, ensure we do not grab the lock when we reach this code through
> > the strparser.
> >
> > Fixes: 6648e613226e1 ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
> > Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
> > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> > ---
> >  include/linux/skmsg.h | 9 +++++++--
> >  net/core/skmsg.c      | 5 ++++-
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> > index c9efda9df285..3659e9b514d0 100644
> > --- a/include/linux/skmsg.h
> > +++ b/include/linux/skmsg.h
> > @@ -461,13 +461,18 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock)
> >  	sk_psock_drop(sk, psock);
> >  }
> >  
> > -static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> > +static inline void __sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> >  {
> > -	read_lock_bh(&sk->sk_callback_lock);
> >  	if (psock->saved_data_ready)
> >  		psock->saved_data_ready(sk);
> >  	else
> >  		sk->sk_data_ready(sk);
> > +}
> > +
> > +static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock)
> > +{
> > +	read_lock_bh(&sk->sk_callback_lock);
> > +	__sk_psock_data_ready(sk, psock);
> >  	read_unlock_bh(&sk->sk_callback_lock);
> >  }
> >  
> > diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> > index fd20aae30be2..8429daecbbb6 100644
> > --- a/net/core/skmsg.c
> > +++ b/net/core/skmsg.c
> > @@ -552,7 +552,10 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
> >  	msg->skb = skb;
> >  
> >  	sk_psock_queue_msg(psock, msg);
> > -	sk_psock_data_ready(sk, psock);
> > +	if (skb_bpf_strparser(skb))
> > +		__sk_psock_data_ready(sk, psock);
> > +	else
> > +		sk_psock_data_ready(sk, psock);
> >  	return copied;
> >  }
>
> If I follow, this is the call chain that leads to the recursive lock:
>
>   sock::sk_data_ready → sk_psock_strp_data_ready
>     write_lock_bh(&sk->sk_callback_lock)
>     strp_data_ready
>       strp_read_sock
>         proto_ops::read_sock → tcp_read_sock
>           strp_recv
>             __strp_recv
>               strp_callbacks::rcv_msg → sk_psock_strp_read
>                 sk_psock_verdict_apply(verdict=__SK_PASS)
>                   sk_psock_skb_ingress_self
>                     sk_psock_skb_ingress_enqueue
>                       sk_psock_data_ready
>                         read_lock_bh(&sk->sk_callback_lock) !!!
>
> What I don't get, though, is why strp_data_ready has to be called with a
> _writer_ lock? Maybe that should just be a reader lock, and then it can
> be recursive.

Agreed, a read lock should be fine; we just want to ensure the strp is
not changing during that call chain. Let me do that fix instead.
* [PATCH bpf 2/2] bpf: sockmap, add test for ingress through strparser

From: John Fastabend @ 2024-06-25 20:16 UTC
To: bpf, vincent.whitchurch; +Cc: daniel, john.fastabend

Add an option to always return SK_PASS in the verdict callback instead
of redirecting the skb. This allows testing cases which are not covered
by the test program as of now.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 tools/testing/selftests/bpf/test_sockmap.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_sockmap.c b/tools/testing/selftests/bpf/test_sockmap.c
index 9cba4ec844a5..cc3e4d96c8ac 100644
--- a/tools/testing/selftests/bpf/test_sockmap.c
+++ b/tools/testing/selftests/bpf/test_sockmap.c
@@ -78,6 +78,7 @@ int txmsg_end_push;
 int txmsg_start_pop;
 int txmsg_pop;
 int txmsg_ingress;
+int txmsg_pass_skb;
 int txmsg_redir_skb;
 int txmsg_ktls_skb;
 int txmsg_ktls_skb_drop;
@@ -108,6 +109,7 @@ static const struct option long_options[] = {
 	{"txmsg_start_pop", required_argument, NULL, 'w'},
 	{"txmsg_pop", required_argument, NULL, 'x'},
 	{"txmsg_ingress", no_argument, &txmsg_ingress, 1 },
+	{"txmsg_pass_skb", no_argument, &txmsg_pass_skb, 1 },
 	{"txmsg_redir_skb", no_argument, &txmsg_redir_skb, 1 },
 	{"ktls", no_argument, &ktls, 1 },
 	{"peek", no_argument, &peek_flag, 1 },
@@ -177,6 +179,7 @@ static void test_reset(void)
 	txmsg_pass = txmsg_drop = txmsg_redir = 0;
 	txmsg_apply = txmsg_cork = 0;
 	txmsg_ingress = txmsg_redir_skb = 0;
+	txmsg_pass_skb = 0;
 	txmsg_ktls_skb = txmsg_ktls_skb_drop = txmsg_ktls_skb_redir = 0;
 	txmsg_omit_skb_parser = 0;
 	skb_use_parser = 0;
@@ -956,6 +959,7 @@ static int run_options(struct sockmap_options *options, int cg_fd, int test)
 {
 	int i, key, next_key, err, zero = 0;
 	struct bpf_program *tx_prog;
+	struct bpf_program *skb_verdict_prog;
 
 	/* If base test skip BPF setup */
 	if (test == BASE || test == BASE_SENDPAGE)
@@ -972,7 +976,12 @@ static int run_options(struct sockmap_options *options, int cg_fd, int test)
 		}
 	}
 
-	links[1] = bpf_program__attach_sockmap(progs[1], map_fd[0]);
+	if (txmsg_pass_skb)
+		skb_verdict_prog = progs[2];
+	else
+		skb_verdict_prog = progs[1];
+
+	links[1] = bpf_program__attach_sockmap(skb_verdict_prog, map_fd[0]);
 	if (!links[1]) {
 		fprintf(stderr, "ERROR: bpf_program__attach_sockmap (sockmap): (%s)\n",
 			strerror(errno));
@@ -1361,6 +1370,8 @@ static void test_options(char *options)
 	}
 	if (txmsg_ingress)
 		append_str(options, "ingress,", OPTSTRING);
+	if (txmsg_pass_skb)
+		append_str(options, "pass_skb,", OPTSTRING);
 	if (txmsg_redir_skb)
 		append_str(options, "redir_skb,", OPTSTRING);
 	if (txmsg_ktls_skb)
--
2.33.0