* [PATCH bpf v3 0/2] bpf: fix recursive lock and add test
@ 2024-11-10 8:24 Jiayuan Chen
2024-11-10 8:24 ` [PATCH bpf v3 1/2] bpf: fix recursive lock when verdict program return SK_PASS Jiayuan Chen
2024-11-10 8:24 ` [PATCH bpf v3 2/2] selftests/bpf: Add some tests with sockmap SK_PASS Jiayuan Chen
0 siblings, 2 replies; 3+ messages in thread
From: Jiayuan Chen @ 2024-11-10 8:24 UTC (permalink / raw)
To: martin.lau, edumazet, jakub, davem, dsahern, kuba, pabeni, netdev,
bpf, linux-kernel, horms, daniel
Cc: mykolal, ast, kpsingh, jolsa, eddyz87, shuah, sdf,
linux-kselftest, haoluo, song, john.fastabend, andrii, mhal,
yonghong.song, Jiayuan Chen
1. Fix a recursive lock when the eBPF verdict program returns SK_PASS.
2. Add a selftest that reproduces the recursive lock.
Note that if only the selftest is merged without the first
patch, the test case will fail, because the deadlock is
then inevitable.
---
v2->v3: Fix line lengths reported by patchwork.
(max_line_length is set to 80 in patchwork, but the default is 100 in the kernel tree.)
v1->v2: 1. Add a selftest to reproduce the issue, as suggested by martin.lau.
2. Follow the community rules for patches.
v1: https://lore.kernel.org/bpf/55fc6114-7e64-4b65-86d2-92cfd1e9e92f@linux.dev/T/#u
---
Jiayuan Chen (2):
bpf: fix recursive lock when verdict program return SK_PASS
selftests/bpf: Add some tests with sockmap SK_PASS
net/core/skmsg.c | 4 +-
.../selftests/bpf/prog_tests/sockmap_basic.c | 54 +++++++++++++++++++
.../bpf/progs/test_sockmap_pass_prog.c | 2 +-
3 files changed, 57 insertions(+), 3 deletions(-)
--
2.43.5
* [PATCH bpf v3 1/2] bpf: fix recursive lock when verdict program return SK_PASS
From: Jiayuan Chen @ 2024-11-10 8:24 UTC (permalink / raw)
To: martin.lau, edumazet, jakub, davem, dsahern, kuba, pabeni, netdev,
bpf, linux-kernel, horms, daniel
Cc: mykolal, ast, kpsingh, jolsa, eddyz87, shuah, sdf,
linux-kselftest, haoluo, song, john.fastabend, andrii, mhal,
yonghong.song, Jiayuan Chen, Vincent Whitchurch
When the stream_verdict program returns SK_PASS, it places the received skb
into its own receive queue, but a recursive lock eventually occurs, leading
to an operating system deadlock. This issue has been present since v6.9.
'''
sk_psock_strp_data_ready
    write_lock_bh(&sk->sk_callback_lock)
    strp_data_ready
      strp_read_sock
        read_sock -> tcp_read_sock
          strp_recv
            cb.rcv_msg -> sk_psock_strp_read
              # now stream_verdict returns SK_PASS without a peer sock assigned
              __SK_PASS = sk_psock_map_verd(SK_PASS, NULL)
              sk_psock_verdict_apply
                sk_psock_skb_ingress_self
                  sk_psock_skb_ingress_enqueue
                    sk_psock_data_ready
                      read_lock_bh(&sk->sk_callback_lock) <= deadlock
'''
This topic has been discussed before, but it has not been fixed.
Previous discussion:
https://lore.kernel.org/all/6684a5864ec86_403d20898@john.notmuch
Fixes: 6648e613226e ("bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue")
Reported-by: Vincent Whitchurch <vincent.whitchurch@datadoghq.com>
Signed-off-by: Jiayuan Chen <mrpre@163.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
net/core/skmsg.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index b1dcbd3be89e..e90fbab703b2 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1117,9 +1117,9 @@ static void sk_psock_strp_data_ready(struct sock *sk)
 		if (tls_sw_has_ctx_rx(sk)) {
 			psock->saved_data_ready(sk);
 		} else {
-			write_lock_bh(&sk->sk_callback_lock);
+			read_lock_bh(&sk->sk_callback_lock);
 			strp_data_ready(&psock->strp);
-			write_unlock_bh(&sk->sk_callback_lock);
+			read_unlock_bh(&sk->sk_callback_lock);
 		}
 	}
 	rcu_read_unlock();
--
2.43.5
* [PATCH bpf v3 2/2] selftests/bpf: Add some tests with sockmap SK_PASS
From: Jiayuan Chen @ 2024-11-10 8:24 UTC (permalink / raw)
To: martin.lau, edumazet, jakub, davem, dsahern, kuba, pabeni, netdev,
bpf, linux-kernel, horms, daniel
Cc: mykolal, ast, kpsingh, jolsa, eddyz87, shuah, sdf,
linux-kselftest, haoluo, song, john.fastabend, andrii, mhal,
yonghong.song, Jiayuan Chen
1. Add a new test in sockmap_basic.c to test SK_PASS for sockmap.
2. The return value of 'sk_skb/stream_parser' is used as a length, but
the current eBPF program returns SK_PASS, which is semantically
incorrect. This change modifies it to return skb->len. All tests
that use this eBPF program have been run
(currently they are only in sockmap_basic.c).
All tests pass.
Signed-off-by: Jiayuan Chen <mrpre@163.com>
---
test result
310/1 sockmap_basic/sockmap create_update_free:OK
310/2 sockmap_basic/sockhash create_update_free:OK
310/3 sockmap_basic/sockmap sk_msg load helpers:OK
310/4 sockmap_basic/sockhash sk_msg load helpers:OK
310/5 sockmap_basic/sockmap update:OK
310/6 sockmap_basic/sockhash update:OK
310/7 sockmap_basic/sockmap update in unsafe context:OK
310/8 sockmap_basic/sockmap copy:OK
310/9 sockmap_basic/sockhash copy:OK
310/10 sockmap_basic/sockmap skb_verdict attach:OK
310/11 sockmap_basic/sockmap skb_verdict attach_with_link:OK
310/12 sockmap_basic/sockmap msg_verdict progs query:OK
310/13 sockmap_basic/sockmap stream_parser progs query:OK
310/14 sockmap_basic/sockmap stream_verdict progs query:OK
310/15 sockmap_basic/sockmap skb_verdict progs query:OK
310/16 sockmap_basic/sockmap skb_verdict shutdown:OK
310/17 sockmap_basic/sockmap stream_parser and stream_verdict pass:OK
310/18 sockmap_basic/sockmap skb_verdict fionread:OK
310/19 sockmap_basic/sockmap skb_verdict fionread on drop:OK
310/20 sockmap_basic/sockmap skb_verdict msg_f_peek:OK
310/21 sockmap_basic/sockmap skb_verdict msg_f_peek with link:OK
310/22 sockmap_basic/sockmap unconnected af_unix:OK
310/23 sockmap_basic/sockmap one socket to many map entries:OK
310/24 sockmap_basic/sockmap one socket to many maps:OK
310/25 sockmap_basic/sockmap same socket replace:OK
310/26 sockmap_basic/sockmap sk_msg attach sockmap helpers with link:OK
310/27 sockmap_basic/sockhash sk_msg attach sockhash helpers with link:OK
310 sockmap_basic:OK
---
.../selftests/bpf/prog_tests/sockmap_basic.c | 54 +++++++++++++++++++
.../bpf/progs/test_sockmap_pass_prog.c | 2 +-
2 files changed, 55 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
index 82bfb266741c..59eafd0115df 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -501,6 +501,58 @@ static void test_sockmap_skb_verdict_shutdown(void)
 	test_sockmap_pass_prog__destroy(skel);
 }
 
+static void test_sockmap_stream_pass(void)
+{
+	int zero = 0, sent, recvd;
+	int verdict, parser;
+	int err, map;
+	int c = -1, p = -1;
+	struct test_sockmap_pass_prog *pass = NULL;
+	char snd[256] = "0123456789";
+	char rcv[256] = "0";
+
+	pass = test_sockmap_pass_prog__open_and_load();
+	verdict = bpf_program__fd(pass->progs.prog_skb_verdict);
+	parser = bpf_program__fd(pass->progs.prog_skb_parser);
+	map = bpf_map__fd(pass->maps.sock_map_rx);
+
+	err = bpf_prog_attach(parser, map, BPF_SK_SKB_STREAM_PARSER, 0);
+	if (!ASSERT_OK(err, "bpf_prog_attach stream parser"))
+		goto out;
+
+	err = bpf_prog_attach(verdict, map, BPF_SK_SKB_STREAM_VERDICT, 0);
+	if (!ASSERT_OK(err, "bpf_prog_attach stream verdict"))
+		goto out;
+
+	err = create_pair(AF_INET, SOCK_STREAM, &c, &p);
+	if (err)
+		goto out;
+
+	/* sk_data_ready of 'p' will be replaced by strparser handler */
+	err = bpf_map_update_elem(map, &zero, &p, BPF_NOEXIST);
+	if (!ASSERT_OK(err, "bpf_map_update_elem(p)"))
+		goto out_close;
+
+	/*
+	 * as 'prog_skb_parser' return the original skb len and
+	 * 'prog_skb_verdict' return SK_PASS, the kernel will just
+	 * pass it through to original socket 'p'
+	 */
+	sent = xsend(c, snd, sizeof(snd), 0);
+	ASSERT_EQ(sent, sizeof(snd), "xsend(c)");
+
+	recvd = recv_timeout(p, rcv, sizeof(rcv), SOCK_NONBLOCK,
+			     IO_TIMEOUT_SEC);
+	ASSERT_EQ(recvd, sizeof(rcv), "recv_timeout(p)");
+
+out_close:
+	close(c);
+	close(p);
+
+out:
+	test_sockmap_pass_prog__destroy(pass);
+}
+
 static void test_sockmap_skb_verdict_fionread(bool pass_prog)
 {
 	int err, map, verdict, c0 = -1, c1 = -1, p0 = -1, p1 = -1;
@@ -923,6 +975,8 @@ void test_sockmap_basic(void)
 		test_sockmap_progs_query(BPF_SK_SKB_VERDICT);
 	if (test__start_subtest("sockmap skb_verdict shutdown"))
 		test_sockmap_skb_verdict_shutdown();
+	if (test__start_subtest("sockmap stream_parser and stream_verdict pass"))
+		test_sockmap_stream_pass();
 	if (test__start_subtest("sockmap skb_verdict fionread"))
 		test_sockmap_skb_verdict_fionread(true);
 	if (test__start_subtest("sockmap skb_verdict fionread on drop"))
diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_pass_prog.c b/tools/testing/selftests/bpf/progs/test_sockmap_pass_prog.c
index 69aacc96db36..515a3869e56c 100644
--- a/tools/testing/selftests/bpf/progs/test_sockmap_pass_prog.c
+++ b/tools/testing/selftests/bpf/progs/test_sockmap_pass_prog.c
@@ -41,7 +41,7 @@ int prog_skb_verdict_clone(struct __sk_buff *skb)
 SEC("sk_skb/stream_parser")
 int prog_skb_parser(struct __sk_buff *skb)
 {
-	return SK_PASS;
+	return skb->len;
 }
 
 char _license[] SEC("license") = "GPL";
--
2.43.5