linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/5] bpf: Allow skb dynptr for tp_btf
@ 2024-09-05  7:56 Philo Lu
  2024-09-05  7:56 ` [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix " Philo Lu
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Philo Lu @ 2024-09-05  7:56 UTC (permalink / raw)
  To: bpf
  Cc: edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

This makes bpf_dynptr_from_skb usable for tp_btf, so that we can easily
parse skb in tracepoints. This has been discussed in [0], and Martin
suggested to use dynptr (instead of helpers like bpf_skb_load_bytes).

For safety, skb dynptr shouldn't be used in fentry/fexit. This is achieved
by add KF_TRUSTED_ARGS flag in bpf_dynptr_from_skb defination, because
pointers passed by tracepoint are trusted (PTR_TRUSTED) while those of
fentry/fexit are not.

Another problem raises that NULL pointers could be passed to tracepoint,
such as trace_tcp_send_reset, and we need to recognize them. This is done
by add a "__nullable" suffix in the func_proto of the tracepoint,
discussed in [1].

2 Test cases are added, one for "__nullable" suffix, and the other for
using skb dynptr in tp_btf.

changelog
v1 -> v2:
 - Add "__nullable" suffix support (Alexei Starovoitov)
 - Replace "struct __sk_buff*" with "void*" in test (Martin KaFai Lau)

[0]
https://lore.kernel.org/all/20240205121038.41344-1-lulie@linux.alibaba.com/T/
[1]
https://lore.kernel.org/all/20240430121805.104618-1-lulie@linux.alibaba.com/T/

Philo Lu (5):
  bpf: Support __nullable argument suffix for tp_btf
  selftests/bpf: Add test for __nullable suffix in tp_btf
  tcp: Use skb__nullable in trace_tcp_send_reset
  bpf: Allow bpf_dynptr_from_skb() for tp_btf
  selftests/bpf: Expand skb dynptr selftests for tp_btf

 include/trace/events/tcp.h                    | 12 +++----
 kernel/bpf/btf.c                              | 13 +++++++
 kernel/bpf/verifier.c                         | 36 +++++++++++++++++--
 net/core/filter.c                             |  3 +-
 .../bpf/bpf_testmod/bpf_testmod-events.h      |  6 ++++
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  2 ++
 .../testing/selftests/bpf/prog_tests/dynptr.c | 36 +++++++++++++++++--
 .../selftests/bpf/prog_tests/module_attach.c  | 14 +++++++-
 .../testing/selftests/bpf/progs/dynptr_fail.c | 25 +++++++++++++
 .../selftests/bpf/progs/dynptr_success.c      | 23 ++++++++++++
 .../bpf/progs/test_module_attach_fail.c       | 16 +++++++++
 11 files changed, 173 insertions(+), 13 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_module_attach_fail.c

--
2.32.0.3.g01195cf9f


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix for tp_btf
  2024-09-05  7:56 [PATCH bpf-next v2 0/5] bpf: Allow skb dynptr for tp_btf Philo Lu
@ 2024-09-05  7:56 ` Philo Lu
  2024-09-06 21:25   ` Andrii Nakryiko
  2024-09-07  1:14   ` Andrii Nakryiko
  2024-09-05  7:56 ` [PATCH bpf-next v2 2/5] selftests/bpf: Add test for __nullable suffix in tp_btf Philo Lu
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 17+ messages in thread
From: Philo Lu @ 2024-09-05  7:56 UTC (permalink / raw)
  To: bpf
  Cc: edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

Pointers passed to tp_btf were trusted to be valid, but some tracepoints
do take NULL pointer as input, such as trace_tcp_send_reset(). Then the
invalid memory access cannot be detected by verifier.

This patch fix it by add a suffix "__nullable" to the unreliable
argument. The suffix is shown in btf, and PTR_MAYBE_NULL will be added
to nullable arguments. Then users must check the pointer before use it.

A problem here is that we use "btf_trace_##call" to search func_proto.
As it is a typedef, argument names as well as the suffix are not
recorded. To solve this, I use bpf_raw_event_map to find
"__bpf_trace##template" from "btf_trace_##call", and then we can see the
suffix.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
---
 kernel/bpf/btf.c      | 13 +++++++++++++
 kernel/bpf/verifier.c | 36 +++++++++++++++++++++++++++++++++---
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 1e29281653c62..157f5e1247c81 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6385,6 +6385,16 @@ static bool prog_args_trusted(const struct bpf_prog *prog)
 	}
 }
 
+static bool prog_arg_maybe_null(const struct bpf_prog *prog, const struct btf *btf,
+				const struct btf_param *arg)
+{
+	if (prog->type != BPF_PROG_TYPE_TRACING ||
+	    prog->expected_attach_type != BPF_TRACE_RAW_TP)
+		return false;
+
+	return btf_param_match_suffix(btf, arg, "__nullable");
+}
+
 int btf_ctx_arg_offset(const struct btf *btf, const struct btf_type *func_proto,
 		       u32 arg_no)
 {
@@ -6554,6 +6564,9 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 	if (prog_args_trusted(prog))
 		info->reg_type |= PTR_TRUSTED;
 
+	if (prog_arg_maybe_null(prog, btf, &args[arg]))
+		info->reg_type |= PTR_MAYBE_NULL;
+
 	if (tgt_prog) {
 		enum bpf_prog_type tgt_type;
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 217eb0eafa2a6..9ee68ed915dfe 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -28,6 +28,7 @@
 #include <linux/cpumask.h>
 #include <linux/bpf_mem_alloc.h>
 #include <net/xdp.h>
+#include <linux/trace_events.h>
 
 #include "disasm.h"
 
@@ -21780,6 +21781,8 @@ static int check_non_sleepable_error_inject(u32 btf_id)
 	return btf_id_set_contains(&btf_non_sleepable_error_inject, btf_id);
 }
 
+#define BTF_MAX_NAME_SIZE 128
+
 int bpf_check_attach_target(struct bpf_verifier_log *log,
 			    const struct bpf_prog *prog,
 			    const struct bpf_prog *tgt_prog,
@@ -21788,6 +21791,8 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
 {
 	bool prog_extension = prog->type == BPF_PROG_TYPE_EXT;
 	bool prog_tracing = prog->type == BPF_PROG_TYPE_TRACING;
+	char trace_symbol[BTF_MAX_NAME_SIZE];
+	const struct bpf_raw_event_map *btp;
 	const char prefix[] = "btf_trace_";
 	int ret = 0, subprog = -1, i;
 	const struct btf_type *t;
@@ -21795,6 +21800,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
 	const char *tname;
 	struct btf *btf;
 	long addr = 0;
+	char *fname;
 	struct module *mod = NULL;
 
 	if (!btf_id) {
@@ -21923,10 +21929,34 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
 			return -EINVAL;
 		}
 		tname += sizeof(prefix) - 1;
-		t = btf_type_by_id(btf, t->type);
-		if (!btf_type_is_ptr(t))
-			/* should never happen in valid vmlinux build */
+
+		/* The func_proto of "btf_trace_##tname" is generated from typedef without argument
+		 * names. Thus using bpf_raw_event_map to get argument names. For module, the module
+		 * name is printed in "%ps" after the template function name, so use strsep to cut
+		 * it off.
+		 */
+		btp = bpf_get_raw_tracepoint(tname);
+		if (!btp)
 			return -EINVAL;
+		sprintf(trace_symbol, "%ps", btp->bpf_func);
+		fname = trace_symbol;
+		fname = strsep(&fname, " ");
+
+		ret = btf_find_by_name_kind(btf, fname, BTF_KIND_FUNC);
+		if (ret < 0) {
+			bpf_log(log, "Cannot find btf of template %s, fall back to %s%s.\n",
+				fname, prefix, tname);
+			t = btf_type_by_id(btf, t->type);
+			if (!btf_type_is_ptr(t))
+				/* should never happen in valid vmlinux build */
+				return -EINVAL;
+		} else {
+			t = btf_type_by_id(btf, ret);
+			if (!btf_type_is_func(t))
+				/* should never happen in valid vmlinux build */
+				return -EINVAL;
+		}
+
 		t = btf_type_by_id(btf, t->type);
 		if (!btf_type_is_func_proto(t))
 			/* should never happen in valid vmlinux build */
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 2/5] selftests/bpf: Add test for __nullable suffix in tp_btf
  2024-09-05  7:56 [PATCH bpf-next v2 0/5] bpf: Allow skb dynptr for tp_btf Philo Lu
  2024-09-05  7:56 ` [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix " Philo Lu
@ 2024-09-05  7:56 ` Philo Lu
  2024-09-06 21:25   ` Andrii Nakryiko
  2024-09-05  7:56 ` [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset Philo Lu
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Philo Lu @ 2024-09-05  7:56 UTC (permalink / raw)
  To: bpf
  Cc: edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

Add a tracepoint with __nullable suffix in bpf_testmod, and add a
failure load case:

$./test_progs -t "module_attach"
 #173/1   module_attach/handle_tp_btf_nullable_bare:OK
 #173     module_attach:OK
 Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
---
 .../bpf/bpf_testmod/bpf_testmod-events.h         |  6 ++++++
 .../selftests/bpf/bpf_testmod/bpf_testmod.c      |  2 ++
 .../selftests/bpf/prog_tests/module_attach.c     | 14 +++++++++++++-
 .../bpf/progs/test_module_attach_fail.c          | 16 ++++++++++++++++
 4 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_module_attach_fail.c

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h
index 11ee801e75e7e..6c3b4d4f173ac 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod-events.h
@@ -34,6 +34,12 @@ DECLARE_TRACE(bpf_testmod_test_write_bare,
 	TP_ARGS(task, ctx)
 );
 
+/* Used in bpf_testmod_test_read() to test __nullable suffix */
+DECLARE_TRACE(bpf_testmod_test_nullable_bare,
+	TP_PROTO(struct bpf_testmod_test_read_ctx *ctx__nullable),
+	TP_ARGS(ctx__nullable)
+);
+
 #undef BPF_TESTMOD_DECLARE_TRACE
 #ifdef DECLARE_TRACE_WRITABLE
 #define BPF_TESTMOD_DECLARE_TRACE(call, proto, args, size) \
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index c73d04bc9e9de..9649e7f09fc90 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -394,6 +394,8 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj,
 	if (bpf_testmod_loop_test(101) > 100)
 		trace_bpf_testmod_test_read(current, &ctx);
 
+	trace_bpf_testmod_test_nullable_bare(NULL);
+
 	/* Magic number to enable writable tp */
 	if (len == 64) {
 		struct bpf_testmod_test_writable_ctx writable = {
diff --git a/tools/testing/selftests/bpf/prog_tests/module_attach.c b/tools/testing/selftests/bpf/prog_tests/module_attach.c
index 6d391d95f96e0..961d8577d6fab 100644
--- a/tools/testing/selftests/bpf/prog_tests/module_attach.c
+++ b/tools/testing/selftests/bpf/prog_tests/module_attach.c
@@ -4,6 +4,7 @@
 #include <test_progs.h>
 #include <stdbool.h>
 #include "test_module_attach.skel.h"
+#include "test_module_attach_fail.skel.h"
 #include "testing_helpers.h"
 
 static int duration;
@@ -33,7 +34,7 @@ static int trigger_module_test_writable(int *val)
 	return 0;
 }
 
-void test_module_attach(void)
+static void module_attach_succ(void)
 {
 	const int READ_SZ = 456;
 	const int WRITE_SZ = 457;
@@ -115,3 +116,14 @@ void test_module_attach(void)
 cleanup:
 	test_module_attach__destroy(skel);
 }
+
+static void module_attach_fail(void)
+{
+	RUN_TESTS(test_module_attach_fail);
+}
+
+void test_module_attach(void)
+{
+	module_attach_succ();
+	module_attach_fail();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_module_attach_fail.c b/tools/testing/selftests/bpf/progs/test_module_attach_fail.c
new file mode 100644
index 0000000000000..0f848d8f2f5e8
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_module_attach_fail.c
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "../bpf_testmod/bpf_testmod.h"
+#include "bpf_misc.h"
+
+SEC("tp_btf/bpf_testmod_test_nullable_bare")
+__failure __msg("invalid mem access")
+int BPF_PROG(handle_tp_btf_nullable_bare, struct bpf_testmod_test_read_ctx *nullable_ctx)
+{
+	return nullable_ctx->len;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset
  2024-09-05  7:56 [PATCH bpf-next v2 0/5] bpf: Allow skb dynptr for tp_btf Philo Lu
  2024-09-05  7:56 ` [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix " Philo Lu
  2024-09-05  7:56 ` [PATCH bpf-next v2 2/5] selftests/bpf: Add test for __nullable suffix in tp_btf Philo Lu
@ 2024-09-05  7:56 ` Philo Lu
  2024-09-06  0:26   ` Alexei Starovoitov
  2024-09-05  7:56 ` [PATCH bpf-next v2 4/5] bpf: Allow bpf_dynptr_from_skb() for tp_btf Philo Lu
  2024-09-05  7:56 ` [PATCH bpf-next v2 5/5] selftests/bpf: Expand skb dynptr selftests " Philo Lu
  4 siblings, 1 reply; 17+ messages in thread
From: Philo Lu @ 2024-09-05  7:56 UTC (permalink / raw)
  To: bpf
  Cc: edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

Replace skb with skb__nullable as the argument name. The suffix tells
bpf verifier through btf that the arg could be NULL and should be
checked in tp_btf prog.

For now, this is the only nullable argument in tcp tracepoints.

Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
---
 include/trace/events/tcp.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 1c8bd8e186b89..a27c4b619dffd 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -91,10 +91,10 @@ DEFINE_RST_REASON(FN, FN)
 TRACE_EVENT(tcp_send_reset,
 
 	TP_PROTO(const struct sock *sk,
-		 const struct sk_buff *skb,
+		 const struct sk_buff *skb__nullable,
 		 const enum sk_rst_reason reason),
 
-	TP_ARGS(sk, skb, reason),
+	TP_ARGS(sk, skb__nullable, reason),
 
 	TP_STRUCT__entry(
 		__field(const void *, skbaddr)
@@ -106,7 +106,7 @@ TRACE_EVENT(tcp_send_reset,
 	),
 
 	TP_fast_assign(
-		__entry->skbaddr = skb;
+		__entry->skbaddr = skb__nullable;
 		__entry->skaddr = sk;
 		/* Zero means unknown state. */
 		__entry->state = sk ? sk->sk_state : 0;
@@ -118,13 +118,13 @@ TRACE_EVENT(tcp_send_reset,
 			const struct inet_sock *inet = inet_sk(sk);
 
 			TP_STORE_ADDR_PORTS(__entry, inet, sk);
-		} else if (skb) {
-			const struct tcphdr *th = (const struct tcphdr *)skb->data;
+		} else if (skb__nullable) {
+			const struct tcphdr *th = (const struct tcphdr *)skb__nullable->data;
 			/*
 			 * We should reverse the 4-tuple of skb, so later
 			 * it can print the right flow direction of rst.
 			 */
-			TP_STORE_ADDR_PORTS_SKB(skb, th, entry->daddr, entry->saddr);
+			TP_STORE_ADDR_PORTS_SKB(skb__nullable, th, entry->daddr, entry->saddr);
 		}
 		__entry->reason = reason;
 	),
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 4/5] bpf: Allow bpf_dynptr_from_skb() for tp_btf
  2024-09-05  7:56 [PATCH bpf-next v2 0/5] bpf: Allow skb dynptr for tp_btf Philo Lu
                   ` (2 preceding siblings ...)
  2024-09-05  7:56 ` [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset Philo Lu
@ 2024-09-05  7:56 ` Philo Lu
  2024-09-06  1:13   ` Martin KaFai Lau
  2024-09-05  7:56 ` [PATCH bpf-next v2 5/5] selftests/bpf: Expand skb dynptr selftests " Philo Lu
  4 siblings, 1 reply; 17+ messages in thread
From: Philo Lu @ 2024-09-05  7:56 UTC (permalink / raw)
  To: bpf
  Cc: edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

Making tp_btf able to use bpf_dynptr_from_skb(), which is useful for skb
parsing, especially for non-linear paged skb data. This is achieved by
adding KF_TRUSTED_ARGS flag to bpf_dynptr_from_skb and registering it
for TRACING progs. With KF_TRUSTED_ARGS, args from fentry/fexit are
excluded, so that unsafe progs like fexit/__kfree_skb are not allowed.

We also need the skb dynptr to be read-only in tp_btf. Because
may_access_direct_pkt_data() returns false by default when checking
bpf_dynptr_from_skb, there is no need to add BPF_PROG_TYPE_TRACING to it
explicitly.

Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
---
 net/core/filter.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 6d4fa9198b652..4c01f4756ddb5 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -12074,7 +12074,7 @@ int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags,
 }
 
 BTF_KFUNCS_START(bpf_kfunc_check_set_skb)
-BTF_ID_FLAGS(func, bpf_dynptr_from_skb)
+BTF_ID_FLAGS(func, bpf_dynptr_from_skb, KF_TRUSTED_ARGS)
 BTF_KFUNCS_END(bpf_kfunc_check_set_skb)
 
 BTF_KFUNCS_START(bpf_kfunc_check_set_xdp)
@@ -12123,6 +12123,7 @@ static int __init bpf_kfunc_init(void)
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_XMIT, &bpf_kfunc_set_skb);
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_LWT_SEG6LOCAL, &bpf_kfunc_set_skb);
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_NETFILTER, &bpf_kfunc_set_skb);
+	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_kfunc_set_skb);
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp);
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR,
 					       &bpf_kfunc_set_sock_addr);
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH bpf-next v2 5/5] selftests/bpf: Expand skb dynptr selftests for tp_btf
  2024-09-05  7:56 [PATCH bpf-next v2 0/5] bpf: Allow skb dynptr for tp_btf Philo Lu
                   ` (3 preceding siblings ...)
  2024-09-05  7:56 ` [PATCH bpf-next v2 4/5] bpf: Allow bpf_dynptr_from_skb() for tp_btf Philo Lu
@ 2024-09-05  7:56 ` Philo Lu
  4 siblings, 0 replies; 17+ messages in thread
From: Philo Lu @ 2024-09-05  7:56 UTC (permalink / raw)
  To: bpf
  Cc: edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau, ast,
	daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

Add 3 test cases for skb dynptr used in tp_btf:
- test_dynptr_skb_tp_btf: use skb dynptr in tp_btf and make sure it is
  read-only.
- skb_invalid_ctx_fentry/skb_invalid_ctx_fexit: bpf_dynptr_from_skb
  should fail in fentry/fexit.

In test_dynptr_skb_tp_btf, to trigger the tracepoint in kfree_skb,
test_pkt_access is used for its test_run, as in kfree_skb.c. Because the
test process is different from others, a new setup type is defined,
i.e., SETUP_SKB_PROG_TP.

The result is like:
$ ./test_progs -t 'dynptr/test_dynptr_skb_tp_btf'
  #84/14   dynptr/test_dynptr_skb_tp_btf:OK
  #84      dynptr:OK
  #127     kfunc_dynptr_param:OK
  Summary: 2/1 PASSED, 0 SKIPPED, 0 FAILED

$ ./test_progs -t 'dynptr/skb_invalid_ctx_f'
  #84/85   dynptr/skb_invalid_ctx_fentry:OK
  #84/86   dynptr/skb_invalid_ctx_fexit:OK
  #84      dynptr:OK
  #127     kfunc_dynptr_param:OK
  Summary: 2/2 PASSED, 0 SKIPPED, 0 FAILED

Also fix two coding style nits (change spaces to tabs).

Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
---
 .../testing/selftests/bpf/prog_tests/dynptr.c | 36 +++++++++++++++++--
 .../testing/selftests/bpf/progs/dynptr_fail.c | 25 +++++++++++++
 .../selftests/bpf/progs/dynptr_success.c      | 23 ++++++++++++
 3 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/dynptr.c b/tools/testing/selftests/bpf/prog_tests/dynptr.c
index 7cfac53c0d58d..ba40be8b1c4ef 100644
--- a/tools/testing/selftests/bpf/prog_tests/dynptr.c
+++ b/tools/testing/selftests/bpf/prog_tests/dynptr.c
@@ -9,6 +9,7 @@
 enum test_setup_type {
 	SETUP_SYSCALL_SLEEP,
 	SETUP_SKB_PROG,
+	SETUP_SKB_PROG_TP,
 };
 
 static struct {
@@ -28,6 +29,7 @@ static struct {
 	{"test_dynptr_clone", SETUP_SKB_PROG},
 	{"test_dynptr_skb_no_buff", SETUP_SKB_PROG},
 	{"test_dynptr_skb_strcmp", SETUP_SKB_PROG},
+	{"test_dynptr_skb_tp_btf", SETUP_SKB_PROG_TP},
 };
 
 static void verify_success(const char *prog_name, enum test_setup_type setup_type)
@@ -35,7 +37,7 @@ static void verify_success(const char *prog_name, enum test_setup_type setup_typ
 	struct dynptr_success *skel;
 	struct bpf_program *prog;
 	struct bpf_link *link;
-       int err;
+	int err;
 
 	skel = dynptr_success__open();
 	if (!ASSERT_OK_PTR(skel, "dynptr_success__open"))
@@ -47,7 +49,7 @@ static void verify_success(const char *prog_name, enum test_setup_type setup_typ
 	if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
 		goto cleanup;
 
-       bpf_program__set_autoload(prog, true);
+	bpf_program__set_autoload(prog, true);
 
 	err = dynptr_success__load(skel);
 	if (!ASSERT_OK(err, "dynptr_success__load"))
@@ -87,6 +89,36 @@ static void verify_success(const char *prog_name, enum test_setup_type setup_typ
 
 		break;
 	}
+	case SETUP_SKB_PROG_TP:
+	{
+		struct __sk_buff skb = {};
+		struct bpf_object *obj;
+		int aux_prog_fd;
+
+		/* Just use its test_run to trigger kfree_skb tracepoint */
+		err = bpf_prog_test_load("./test_pkt_access.bpf.o", BPF_PROG_TYPE_SCHED_CLS,
+					 &obj, &aux_prog_fd);
+		if (!ASSERT_OK(err, "prog_load sched cls"))
+			goto cleanup;
+
+		LIBBPF_OPTS(bpf_test_run_opts, topts,
+			    .data_in = &pkt_v4,
+			    .data_size_in = sizeof(pkt_v4),
+			    .ctx_in = &skb,
+			    .ctx_size_in = sizeof(skb),
+		);
+
+		link = bpf_program__attach(prog);
+		if (!ASSERT_OK_PTR(link, "bpf_program__attach"))
+			goto cleanup;
+
+		err = bpf_prog_test_run_opts(aux_prog_fd, &topts);
+
+		if (!ASSERT_OK(err, "test_run"))
+			goto cleanup;
+
+		break;
+	}
 	}
 
 	ASSERT_EQ(skel->bss->err, 0, "err");
diff --git a/tools/testing/selftests/bpf/progs/dynptr_fail.c b/tools/testing/selftests/bpf/progs/dynptr_fail.c
index 68b8c6eca5083..8f36c9de75915 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_fail.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_fail.c
@@ -6,6 +6,7 @@
 #include <stdbool.h>
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
 #include <linux/if_ether.h>
 #include "bpf_misc.h"
 #include "bpf_kfuncs.h"
@@ -1254,6 +1255,30 @@ int skb_invalid_ctx(void *ctx)
 	return 0;
 }
 
+SEC("fentry/skb_tx_error")
+__failure __msg("must be referenced or trusted")
+int BPF_PROG(skb_invalid_ctx_fentry, void *skb)
+{
+	struct bpf_dynptr ptr;
+
+	/* this should fail */
+	bpf_dynptr_from_skb(skb, 0, &ptr);
+
+	return 0;
+}
+
+SEC("fexit/skb_tx_error")
+__failure __msg("must be referenced or trusted")
+int BPF_PROG(skb_invalid_ctx_fexit, void *skb)
+{
+	struct bpf_dynptr ptr;
+
+	/* this should fail */
+	bpf_dynptr_from_skb(skb, 0, &ptr);
+
+	return 0;
+}
+
 /* Reject writes to dynptr slot for uninit arg */
 SEC("?raw_tp")
 __failure __msg("potential write to dynptr at off=-16")
diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
index 5985920d162e7..bfcc85686cf04 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_success.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
@@ -5,6 +5,7 @@
 #include <stdbool.h>
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
 #include "bpf_misc.h"
 #include "bpf_kfuncs.h"
 #include "errno.h"
@@ -544,3 +545,25 @@ int test_dynptr_skb_strcmp(struct __sk_buff *skb)
 
 	return 1;
 }
+
+SEC("tp_btf/kfree_skb")
+int BPF_PROG(test_dynptr_skb_tp_btf, void *skb, void *location)
+{
+	__u8 write_data[2] = {1, 2};
+	struct bpf_dynptr ptr;
+	int ret;
+
+	if (bpf_dynptr_from_skb(skb, 0, &ptr)) {
+		err = 1;
+		return 1;
+	}
+
+	/* since tp_btf skbs are read only, writes should fail */
+	ret = bpf_dynptr_write(&ptr, 0, write_data, sizeof(write_data), 0);
+	if (ret != -EINVAL) {
+		err = 2;
+		return 1;
+	}
+
+	return 1;
+}
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset
  2024-09-05  7:56 ` [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset Philo Lu
@ 2024-09-06  0:26   ` Alexei Starovoitov
  2024-09-06 22:23     ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2024-09-06  0:26 UTC (permalink / raw)
  To: Philo Lu
  Cc: bpf, Eric Dumazet, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eddy Z, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Jakub Kicinski, Paolo Abeni, Mykola Lysenko,
	Shuah Khan, Maxime Coquelin, Alexandre Torgue, Kui-Feng Lee,
	Juntong Deng, jrife, Alan Maguire, Dave Marchevsky, Daniel Xu,
	Viktor Malik, Cupertino Miranda, Matt Bobrowski, Xuan Zhuo,
	Network Development, linux-trace-kernel

On Thu, Sep 5, 2024 at 12:56 AM Philo Lu <lulie@linux.alibaba.com> wrote:
>
> Replace skb with skb__nullable as the argument name. The suffix tells
> bpf verifier through btf that the arg could be NULL and should be
> checked in tp_btf prog.
>
> For now, this is the only nullable argument in tcp tracepoints.
>
> Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
> ---
>  include/trace/events/tcp.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> index 1c8bd8e186b89..a27c4b619dffd 100644
> --- a/include/trace/events/tcp.h
> +++ b/include/trace/events/tcp.h
> @@ -91,10 +91,10 @@ DEFINE_RST_REASON(FN, FN)
>  TRACE_EVENT(tcp_send_reset,
>
>         TP_PROTO(const struct sock *sk,
> -                const struct sk_buff *skb,
> +                const struct sk_buff *skb__nullable,
>                  const enum sk_rst_reason reason),
>
> -       TP_ARGS(sk, skb, reason),
> +       TP_ARGS(sk, skb__nullable, reason),

netdev folks pls ack this patch.

Yes, it's a bit of a whack a mole and eventually we can get rid of it
with a smarter verifier (likely) or smarter objtool (unlikely).
Long term we should be able to analyze body of TP_fast_assign
automatically and conclude whether it's handling NULL for pointer
arguments or not. bpf verifier can easily do it for bpf code.
We just need to compile TP_fast_assign() as a tiny bpf snippet.
This is work in progress.

For now we have to use a suffix like __nullable.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 4/5] bpf: Allow bpf_dynptr_from_skb() for tp_btf
  2024-09-05  7:56 ` [PATCH bpf-next v2 4/5] bpf: Allow bpf_dynptr_from_skb() for tp_btf Philo Lu
@ 2024-09-06  1:13   ` Martin KaFai Lau
  0 siblings, 0 replies; 17+ messages in thread
From: Martin KaFai Lau @ 2024-09-06  1:13 UTC (permalink / raw)
  To: Philo Lu
  Cc: bpf, edumazet, rostedt, mhiramat, mathieu.desnoyers, ast, daniel,
	andrii, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
	sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

On 9/5/24 12:56 AM, Philo Lu wrote:
> Making tp_btf able to use bpf_dynptr_from_skb(), which is useful for skb
> parsing, especially for non-linear paged skb data. This is achieved by
> adding KF_TRUSTED_ARGS flag to bpf_dynptr_from_skb and registering it
> for TRACING progs. With KF_TRUSTED_ARGS, args from fentry/fexit are
> excluded, so that unsafe progs like fexit/__kfree_skb are not allowed.
> 
> We also need the skb dynptr to be read-only in tp_btf. Because
> may_access_direct_pkt_data() returns false by default when checking
> bpf_dynptr_from_skb, there is no need to add BPF_PROG_TYPE_TRACING to it
> explicitly.

Acked-by: Martin KaFai Lau <martin.lau@kernel.org>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix for tp_btf
  2024-09-05  7:56 ` [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix " Philo Lu
@ 2024-09-06 21:25   ` Andrii Nakryiko
  2024-09-07  3:26     ` Philo Lu
  2024-09-07  1:14   ` Andrii Nakryiko
  1 sibling, 1 reply; 17+ messages in thread
From: Andrii Nakryiko @ 2024-09-06 21:25 UTC (permalink / raw)
  To: Philo Lu
  Cc: bpf, edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau,
	ast, daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

On Thu, Sep 5, 2024 at 12:56 AM Philo Lu <lulie@linux.alibaba.com> wrote:
>
> Pointers passed to tp_btf were trusted to be valid, but some tracepoints
> do take NULL pointer as input, such as trace_tcp_send_reset(). Then the
> invalid memory access cannot be detected by verifier.
>
> This patch fix it by add a suffix "__nullable" to the unreliable
> argument. The suffix is shown in btf, and PTR_MAYBE_NULL will be added
> to nullable arguments. Then users must check the pointer before use it.
>
> A problem here is that we use "btf_trace_##call" to search func_proto.
> As it is a typedef, argument names as well as the suffix are not
> recorded. To solve this, I use bpf_raw_event_map to find
> "__bpf_trace##template" from "btf_trace_##call", and then we can see the
> suffix.
>
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
> ---
>  kernel/bpf/btf.c      | 13 +++++++++++++
>  kernel/bpf/verifier.c | 36 +++++++++++++++++++++++++++++++++---
>  2 files changed, 46 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 1e29281653c62..157f5e1247c81 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -6385,6 +6385,16 @@ static bool prog_args_trusted(const struct bpf_prog *prog)
>         }
>  }
>
> +static bool prog_arg_maybe_null(const struct bpf_prog *prog, const struct btf *btf,
> +                               const struct btf_param *arg)
> +{
> +       if (prog->type != BPF_PROG_TYPE_TRACING ||
> +           prog->expected_attach_type != BPF_TRACE_RAW_TP)
> +               return false;
> +
> +       return btf_param_match_suffix(btf, arg, "__nullable");

why does this need to be BPF_TRACE_RAW_TP-specific logic? Are we
afraid that there might be "some_arg__nullable" argument name?..

> +}
> +
>  int btf_ctx_arg_offset(const struct btf *btf, const struct btf_type *func_proto,
>                        u32 arg_no)
>  {

[...]

> @@ -21923,10 +21929,34 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
>                         return -EINVAL;
>                 }
>                 tname += sizeof(prefix) - 1;
> -               t = btf_type_by_id(btf, t->type);
> -               if (!btf_type_is_ptr(t))
> -                       /* should never happen in valid vmlinux build */
> +
> +               /* The func_proto of "btf_trace_##tname" is generated from typedef without argument
> +                * names. Thus using bpf_raw_event_map to get argument names. For module, the module
> +                * name is printed in "%ps" after the template function name, so use strsep to cut
> +                * it off.
> +                */
> +               btp = bpf_get_raw_tracepoint(tname);
> +               if (!btp)
>                         return -EINVAL;

there has to be bpf_put_raw_tracepoint() somewhere below

pw-bot: cr

> +               sprintf(trace_symbol, "%ps", btp->bpf_func);

snprintf, but this is really a very round-about way of doing kallsym
lookup, no? Why not call kallsyms_lookup() directly?


> +               fname = trace_symbol;
> +               fname = strsep(&fname, " ");
> +
> +               ret = btf_find_by_name_kind(btf, fname, BTF_KIND_FUNC);
> +               if (ret < 0) {
> +                       bpf_log(log, "Cannot find btf of template %s, fall back to %s%s.\n",
> +                               fname, prefix, tname);
> +                       t = btf_type_by_id(btf, t->type);
> +                       if (!btf_type_is_ptr(t))
> +                               /* should never happen in valid vmlinux build */
> +                               return -EINVAL;
> +               } else {
> +                       t = btf_type_by_id(btf, ret);
> +                       if (!btf_type_is_func(t))
> +                               /* should never happen in valid vmlinux build */
> +                               return -EINVAL;
> +               }
> +
>                 t = btf_type_by_id(btf, t->type);
>                 if (!btf_type_is_func_proto(t))
>                         /* should never happen in valid vmlinux build */
> --
> 2.32.0.3.g01195cf9f
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 2/5] selftests/bpf: Add test for __nullable suffix in tp_btf
  2024-09-05  7:56 ` [PATCH bpf-next v2 2/5] selftests/bpf: Add test for __nullable suffix in tp_btf Philo Lu
@ 2024-09-06 21:25   ` Andrii Nakryiko
  0 siblings, 0 replies; 17+ messages in thread
From: Andrii Nakryiko @ 2024-09-06 21:25 UTC (permalink / raw)
  To: Philo Lu
  Cc: bpf, edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau,
	ast, daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

On Thu, Sep 5, 2024 at 12:56 AM Philo Lu <lulie@linux.alibaba.com> wrote:
>
> Add a tracepoint with __nullable suffix in bpf_testmod, and add a
> failure load case:
>
> $./test_progs -t "module_attach"
>  #173/1   module_attach/handle_tp_btf_nullable_bare:OK
>  #173     module_attach:OK
>  Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
>
> Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
> ---
>  .../bpf/bpf_testmod/bpf_testmod-events.h         |  6 ++++++
>  .../selftests/bpf/bpf_testmod/bpf_testmod.c      |  2 ++
>  .../selftests/bpf/prog_tests/module_attach.c     | 14 +++++++++++++-
>  .../bpf/progs/test_module_attach_fail.c          | 16 ++++++++++++++++
>  4 files changed, 37 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/progs/test_module_attach_fail.c
>

[...]

> +
> +static void module_attach_fail(void)
> +{
> +       RUN_TESTS(test_module_attach_fail);
> +}
> +
> +void test_module_attach(void)
> +{
> +       module_attach_succ();
> +       module_attach_fail();
> +}

this is not a good idea to combine existing non-RUN_TESTS test with
RUN_TESTS. The latter is a subtest, while the former is a full test.
Keep them separate, just add a new file for
RUN_TESTS(test_module_attach_fail), or add them to existing
RUN_TESTS(), whatever makes most sense.

> diff --git a/tools/testing/selftests/bpf/progs/test_module_attach_fail.c b/tools/testing/selftests/bpf/progs/test_module_attach_fail.c
> new file mode 100644
> index 0000000000000..0f848d8f2f5e8
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_module_attach_fail.c
> @@ -0,0 +1,16 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include "vmlinux.h"
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include "../bpf_testmod/bpf_testmod.h"
> +#include "bpf_misc.h"
> +
> +SEC("tp_btf/bpf_testmod_test_nullable_bare")
> +__failure __msg("invalid mem access")

would be nice to confirm that register corresponding to nullable_ctx
actually has "_or_null" suffix


> +int BPF_PROG(handle_tp_btf_nullable_bare, struct bpf_testmod_test_read_ctx *nullable_ctx)
> +{
> +       return nullable_ctx->len;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> --
> 2.32.0.3.g01195cf9f
>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset
  2024-09-06  0:26   ` Alexei Starovoitov
@ 2024-09-06 22:23     ` Jakub Kicinski
  2024-09-06 22:41       ` Alexei Starovoitov
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2024-09-06 22:23 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Philo Lu, bpf, Eric Dumazet, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eddy Z, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Paolo Abeni, Mykola Lysenko, Shuah Khan,
	Maxime Coquelin, Alexandre Torgue, Kui-Feng Lee, Juntong Deng,
	jrife, Alan Maguire, Dave Marchevsky, Daniel Xu, Viktor Malik,
	Cupertino Miranda, Matt Bobrowski, Xuan Zhuo, Network Development,
	linux-trace-kernel

On Thu, 5 Sep 2024 17:26:42 -0700 Alexei Starovoitov wrote:
> Yes, it's a bit of a whack a mole and eventually we can get rid of it
> with a smarter verifier (likely) or smarter objtool (unlikely).
> Long term we should be able to analyze body of TP_fast_assign
> automatically and conclude whether it's handling NULL for pointer
> arguments or not. bpf verifier can easily do it for bpf code.
> We just need to compile TP_fast_assign() as a tiny bpf snippet.
> This is work in progress.

Can we not wait for that work to conclude, then? AFAIU this whole
patch set is just a minor quality of life improvement for BPF progs
at the expense of carrying questionable changes upstream.
I don't see the urgency.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset
  2024-09-06 22:23     ` Jakub Kicinski
@ 2024-09-06 22:41       ` Alexei Starovoitov
  2024-09-06 22:57         ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2024-09-06 22:41 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Philo Lu, bpf, Eric Dumazet, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eddy Z, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Paolo Abeni, Mykola Lysenko, Shuah Khan,
	Maxime Coquelin, Alexandre Torgue, Kui-Feng Lee, Juntong Deng,
	jrife, Alan Maguire, Dave Marchevsky, Daniel Xu, Viktor Malik,
	Cupertino Miranda, Matt Bobrowski, Xuan Zhuo, Network Development,
	linux-trace-kernel

On Fri, Sep 6, 2024 at 3:23 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 5 Sep 2024 17:26:42 -0700 Alexei Starovoitov wrote:
> > Yes, it's a bit of a whack a mole and eventually we can get rid of it
> > with a smarter verifier (likely) or smarter objtool (unlikely).
> > Long term we should be able to analyze body of TP_fast_assign
> > automatically and conclude whether it's handling NULL for pointer
> > arguments or not. bpf verifier can easily do it for bpf code.
> > We just need to compile TP_fast_assign() as a tiny bpf snippet.
> > This is work in progress.
>
> Can we not wait for that work to conclude, then? AFAIU this whole
> patch set is just a minor quality of life improvement for BPF progs
> at the expense of carrying questionable changes upstream.
> I don't see the urgency.

The urgency is now because the situation is dire.
The verifier assumes that skb is not null and will remove
if (!skb) check assuming that it's a dead code.
This patch set adds trusted stuff and fixes this issue too
which is the more important part.

Also it's not clear how long it will take to do 'dual compile'.
It's showing promise atm, but timelines are not certain.
If you recall I pushed for 'dual compile' for the last several years and
only now we found time to work on it.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset
  2024-09-06 22:41       ` Alexei Starovoitov
@ 2024-09-06 22:57         ` Jakub Kicinski
  2024-09-06 23:22           ` Alexei Starovoitov
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2024-09-06 22:57 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Philo Lu, bpf, Eric Dumazet, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eddy Z, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Paolo Abeni, Mykola Lysenko, Shuah Khan,
	Maxime Coquelin, Alexandre Torgue, Kui-Feng Lee, Juntong Deng,
	jrife, Alan Maguire, Dave Marchevsky, Daniel Xu, Viktor Malik,
	Cupertino Miranda, Matt Bobrowski, Xuan Zhuo, Network Development,
	linux-trace-kernel

On Fri, 6 Sep 2024 15:41:47 -0700 Alexei Starovoitov wrote:
> The urgency is now because the situation is dire.
> The verifier assumes that skb is not null and will remove
> if (!skb) check assuming that it's a dead code.

Meaning verifier currently isn't ready for patch 4?
Or we can crash 6.11-rc6 by attaching to a trace_tcp_send_reset()
and doing
	printf("%d\n", skb->len);
?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset
  2024-09-06 22:57         ` Jakub Kicinski
@ 2024-09-06 23:22           ` Alexei Starovoitov
  2024-09-07  0:17             ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2024-09-06 23:22 UTC (permalink / raw)
  To: Jakub Kicinski, Jiri Olsa
  Cc: Philo Lu, bpf, Eric Dumazet, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Martin KaFai Lau, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Eddy Z, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
	David S. Miller, Paolo Abeni, Mykola Lysenko, Shuah Khan,
	Maxime Coquelin, Alexandre Torgue, Kui-Feng Lee, Juntong Deng,
	jrife, Alan Maguire, Dave Marchevsky, Daniel Xu, Viktor Malik,
	Cupertino Miranda, Matt Bobrowski, Xuan Zhuo, Network Development,
	linux-trace-kernel

On Fri, Sep 6, 2024 at 3:57 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 6 Sep 2024 15:41:47 -0700 Alexei Starovoitov wrote:
> > The urgency is now because the situation is dire.
> > The verifier assumes that skb is not null and will remove
> > if (!skb) check assuming that it's a dead code.
>
> Meaning verifier currently isn't ready for patch 4?
> Or we can crash 6.11-rc6 by attaching to a trace_tcp_send_reset()
> and doing
>         printf("%d\n", skb->len);
> ?

depends on the prog type and how it's attached, but yes :(
Without Philo's patches.

It was reported here:
https://lore.kernel.org/bpf/ZrCZS6nisraEqehw@jlelli-thinkpadt14gen4.remote.csb/

Jiri did the analysis. These files would need to be annotated:
include/trace/events/afs.h
include/trace/events/cachefiles.h
include/trace/events/ext4.h
include/trace/events/fib.h
include/trace/events/filelock.h
include/trace/events/host1x.h
include/trace/events/huge_memory.h
include/trace/events/kmem.h
include/trace/events/netfs.h
include/trace/events/power.h
include/trace/events/qdisc.h
include/trace/events/rxrpc.h
include/trace/events/sched.h
include/trace/events/sunrpc.h
include/trace/events/tcp.h
include/trace/events/tegra_apb_dma.h
include/trace/events/timer_migration.h
include/trace/events/writeback.h

which is 18 out of 160.

All other options are worse.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset
  2024-09-06 23:22           ` Alexei Starovoitov
@ 2024-09-07  0:17             ` Jakub Kicinski
  0 siblings, 0 replies; 17+ messages in thread
From: Jakub Kicinski @ 2024-09-07  0:17 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Philo Lu, bpf, Eric Dumazet, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Martin KaFai Lau,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eddy Z,
	Song Liu, Yonghong Song, John Fastabend, KP Singh,
	Stanislav Fomichev, Hao Luo, David S. Miller, Paolo Abeni,
	Mykola Lysenko, Shuah Khan, Maxime Coquelin, Alexandre Torgue,
	Kui-Feng Lee, Juntong Deng, jrife, Alan Maguire, Dave Marchevsky,
	Daniel Xu, Viktor Malik, Cupertino Miranda, Matt Bobrowski,
	Xuan Zhuo, Network Development, linux-trace-kernel

On Fri, 6 Sep 2024 16:22:12 -0700 Alexei Starovoitov wrote:
> > On Fri, 6 Sep 2024 15:41:47 -0700 Alexei Starovoitov wrote:  
> > > The urgency is now because the situation is dire.
> > > The verifier assumes that skb is not null and will remove
> > > if (!skb) check assuming that it's a dead code.  
> >
> > Meaning verifier currently isn't ready for patch 4?
> > Or we can crash 6.11-rc6 by attaching to a trace_tcp_send_reset()
> > and doing
> >         printf("%d\n", skb->len);
> > ?  
> 
> depends on the prog type and how it's attached, but yes :(

I see :( Thought this is just needed for patch 4.
In this case no objections "from networking perspective":

Acked-by: Jakub Kicinski <kuba@kernel.org>

although it feels more like a general tracing question.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix for tp_btf
  2024-09-05  7:56 ` [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix " Philo Lu
  2024-09-06 21:25   ` Andrii Nakryiko
@ 2024-09-07  1:14   ` Andrii Nakryiko
  1 sibling, 0 replies; 17+ messages in thread
From: Andrii Nakryiko @ 2024-09-07  1:14 UTC (permalink / raw)
  To: Philo Lu
  Cc: bpf, edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau,
	ast, daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel

On Thu, Sep 5, 2024 at 12:56 AM Philo Lu <lulie@linux.alibaba.com> wrote:
>
> Pointers passed to tp_btf were trusted to be valid, but some tracepoints
> do take NULL pointer as input, such as trace_tcp_send_reset(). Then the
> invalid memory access cannot be detected by verifier.
>
> This patch fix it by add a suffix "__nullable" to the unreliable
> argument. The suffix is shown in btf, and PTR_MAYBE_NULL will be added
> to nullable arguments. Then users must check the pointer before use it.
>
> A problem here is that we use "btf_trace_##call" to search func_proto.
> As it is a typedef, argument names as well as the suffix are not
> recorded. To solve this, I use bpf_raw_event_map to find
> "__bpf_trace##template" from "btf_trace_##call", and then we can see the
> suffix.

BTW, just curious, is it a pure coincidence that I solved the same
problem in retsnoop with the same approach (see extensive comment in
[0]) about 2 weeks ago, or retsnoop's approach was an inspiration
here?

  [0] https://github.com/anakryiko/retsnoop/commit/7b253fc55b51d447e5ea91d99f60d9c34934f799

>
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
> ---
>  kernel/bpf/btf.c      | 13 +++++++++++++
>  kernel/bpf/verifier.c | 36 +++++++++++++++++++++++++++++++++---
>  2 files changed, 46 insertions(+), 3 deletions(-)
>

[...]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix for tp_btf
  2024-09-06 21:25   ` Andrii Nakryiko
@ 2024-09-07  3:26     ` Philo Lu
  0 siblings, 0 replies; 17+ messages in thread
From: Philo Lu @ 2024-09-07  3:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, edumazet, rostedt, mhiramat, mathieu.desnoyers, martin.lau,
	ast, daniel, andrii, eddyz87, song, yonghong.song, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, davem, kuba, pabeni, mykolal, shuah,
	mcoquelin.stm32, alexandre.torgue, thinker.li, juntong.deng,
	jrife, alan.maguire, davemarchevsky, dxu, vmalik,
	cupertino.miranda, mattbobrowski, xuanzhuo, netdev,
	linux-trace-kernel



On 2024/9/7 05:25, Andrii Nakryiko wrote:
> On Thu, Sep 5, 2024 at 12:56 AM Philo Lu <lulie@linux.alibaba.com> wrote:
>>
>> Pointers passed to tp_btf were trusted to be valid, but some tracepoints
>> do take NULL pointer as input, such as trace_tcp_send_reset(). Then the
>> invalid memory access cannot be detected by verifier.
>>
>> This patch fix it by add a suffix "__nullable" to the unreliable
>> argument. The suffix is shown in btf, and PTR_MAYBE_NULL will be added
>> to nullable arguments. Then users must check the pointer before use it.
>>
>> A problem here is that we use "btf_trace_##call" to search func_proto.
>> As it is a typedef, argument names as well as the suffix are not
>> recorded. To solve this, I use bpf_raw_event_map to find
>> "__bpf_trace##template" from "btf_trace_##call", and then we can see the
>> suffix.
>>
>> Suggested-by: Alexei Starovoitov <ast@kernel.org>
>> Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
>> ---
>>   kernel/bpf/btf.c      | 13 +++++++++++++
>>   kernel/bpf/verifier.c | 36 +++++++++++++++++++++++++++++++++---
>>   2 files changed, 46 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>> index 1e29281653c62..157f5e1247c81 100644
>> --- a/kernel/bpf/btf.c
>> +++ b/kernel/bpf/btf.c
>> @@ -6385,6 +6385,16 @@ static bool prog_args_trusted(const struct bpf_prog *prog)
>>          }
>>   }
>>
>> +static bool prog_arg_maybe_null(const struct bpf_prog *prog, const struct btf *btf,
>> +                               const struct btf_param *arg)
>> +{
>> +       if (prog->type != BPF_PROG_TYPE_TRACING ||
>> +           prog->expected_attach_type != BPF_TRACE_RAW_TP)
>> +               return false;
>> +
>> +       return btf_param_match_suffix(btf, arg, "__nullable");
> 
> why does this need to be BPF_TRACE_RAW_TP-specific logic? Are we
> afraid that there might be "some_arg__nullable" argument name?..
> 

Yes. I don't think the check is necessary but I'm not quite sure if it 
affects other prog/attach types. It's ok for me to remove the check. I 
added it just because this __nullable suffix only serves tp_btf now.

And thanks for your nice suggestions. I'll fix them in the next version.

Also, thanks for the information about retsnoop. Our solutions seem to 
be similar, and I'll look into it further. Honestly, it takes me some 
time to get argument names from tp_btf, and I cannot find a better 
solution but to use btp->bpf_func with kallsyms...

-- 
Philo


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-09-07  3:26 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-05  7:56 [PATCH bpf-next v2 0/5] bpf: Allow skb dynptr for tp_btf Philo Lu
2024-09-05  7:56 ` [PATCH bpf-next v2 1/5] bpf: Support __nullable argument suffix " Philo Lu
2024-09-06 21:25   ` Andrii Nakryiko
2024-09-07  3:26     ` Philo Lu
2024-09-07  1:14   ` Andrii Nakryiko
2024-09-05  7:56 ` [PATCH bpf-next v2 2/5] selftests/bpf: Add test for __nullable suffix in tp_btf Philo Lu
2024-09-06 21:25   ` Andrii Nakryiko
2024-09-05  7:56 ` [PATCH bpf-next v2 3/5] tcp: Use skb__nullable in trace_tcp_send_reset Philo Lu
2024-09-06  0:26   ` Alexei Starovoitov
2024-09-06 22:23     ` Jakub Kicinski
2024-09-06 22:41       ` Alexei Starovoitov
2024-09-06 22:57         ` Jakub Kicinski
2024-09-06 23:22           ` Alexei Starovoitov
2024-09-07  0:17             ` Jakub Kicinski
2024-09-05  7:56 ` [PATCH bpf-next v2 4/5] bpf: Allow bpf_dynptr_from_skb() for tp_btf Philo Lu
2024-09-06  1:13   ` Martin KaFai Lau
2024-09-05  7:56 ` [PATCH bpf-next v2 5/5] selftests/bpf: Expand skb dynptr selftests " Philo Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).