[PATCH bpf-next v2 0/5] bpf, x86: allow function arguments up to 14 for TRACING

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH bpf-next v2 0/5] bpf, x86: allow function arguments up to 14 for TRACING
@ 2023-06-02  6:59 menglong8.dong
  2023-06-02  6:59 ` [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14 menglong8.dong
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: menglong8.dong @ 2023-06-02  6:59 UTC (permalink / raw)
  To: olsajiri
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

From: Menglong Dong <imagedong@tencent.com>

For now, the BPF program of type BPF_PROG_TYPE_TRACING can only be used
on the kernel functions whose arguments count less than 6. This is not
friendly at all, as too many functions have arguments count more than 6.

Therefore, let's enhance it by increasing the function arguments count
allowed in arch_prepare_bpf_trampoline(), for now, only x86_64.

In the 1th patch, we make MAX_BPF_FUNC_ARGS 14, according to our
statistics.

In the 2th patch, we make arch_prepare_bpf_trampoline() support to copy
function arguments in stack for x86 arch. Therefore, the maximum
arguments can be up to MAX_BPF_FUNC_ARGS for FENTRY and FEXIT.

And the 3-5th patches are for the testcases of the 2th patch.

Changes since v1:
- change the maximun function arguments to 14 from 12
- add testcases (Jiri Olsa)
- instead EMIT4 with EMIT3_off32 for "lea" to prevent overflow

Menglong Dong (5):
  bpf: make MAX_BPF_FUNC_ARGS 14
  bpf, x86: allow function arguments up to 14 for TRACING
  libbpf: make BPF_PROG support 15 function arguments
  selftests/bpf: rename bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr*
  selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments

 arch/x86/net/bpf_jit_comp.c                   | 96 ++++++++++++++++---
 include/linux/bpf.h                           |  9 +-
 net/bpf/test_run.c                            | 40 ++++++--
 tools/lib/bpf/bpf_helpers.h                   |  9 +-
 tools/lib/bpf/bpf_tracing.h                   | 10 +-
 .../selftests/bpf/prog_tests/bpf_cookie.c     | 24 ++---
 .../bpf/prog_tests/kprobe_multi_test.c        | 16 ++--
 .../testing/selftests/bpf/progs/fentry_test.c | 50 ++++++++--
 .../testing/selftests/bpf/progs/fexit_test.c  | 51 ++++++++--
 .../selftests/bpf/progs/get_func_ip_test.c    |  2 +-
 .../selftests/bpf/progs/kprobe_multi.c        | 12 +--
 .../bpf/progs/verifier_btf_ctx_access.c       |  2 +-
 .../selftests/bpf/verifier/atomic_fetch_add.c |  4 +-
 13 files changed, 249 insertions(+), 76 deletions(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14
  2023-06-02  6:59 [PATCH bpf-next v2 0/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
@ 2023-06-02  6:59 ` menglong8.dong
  2023-06-02 18:17   ` Alexei Starovoitov
  2023-06-03 14:13   ` Simon Horman
  2023-06-02  6:59 ` [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 20+ messages in thread
From: menglong8.dong @ 2023-06-02  6:59 UTC (permalink / raw)
  To: olsajiri
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

From: Menglong Dong <imagedong@tencent.com>

According to the current kernel version, below is a statistics of the
function arguments count:

argument count | FUNC_PROTO count
7              | 367
8              | 196
9              | 71
10             | 43
11             | 22
12             | 10
13             | 15
14             | 4
15             | 0
16             | 1

It's hard to statisics the function count, so I use FUNC_PROTO in the btf
of vmlinux instead. The function with 16 arguments is ZSTD_buildCTable(),
which I think can be ignored.

Therefore, let's make the maximum of function arguments count 14. It used
to be 12, but it seems that there is no harm to make it big enough.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 include/linux/bpf.h | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f58895830ada..8b997779faf7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -961,10 +961,10 @@ enum bpf_cgroup_storage_type {
 
 #define MAX_BPF_CGROUP_STORAGE_TYPE __BPF_CGROUP_STORAGE_MAX
 
-/* The longest tracepoint has 12 args.
- * See include/trace/bpf_probe.h
+/* The maximun number of the kernel function arguments.
+ * Let's make it 14, for now.
  */
-#define MAX_BPF_FUNC_ARGS 12
+#define MAX_BPF_FUNC_ARGS 14
 
 /* The maximum number of arguments passed through registers
  * a single function may have.
@@ -2273,7 +2273,8 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 static inline bool bpf_tracing_ctx_access(int off, int size,
 					  enum bpf_access_type type)
 {
-	if (off < 0 || off >= sizeof(__u64) * MAX_BPF_FUNC_ARGS)
+	/* "+1" here is for FEXIT return value. */
+	if (off < 0 || off >= sizeof(__u64) * (MAX_BPF_FUNC_ARGS + 1))
 		return false;
 	if (type != BPF_READ)
 		return false;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING
  2023-06-02  6:59 [PATCH bpf-next v2 0/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
  2023-06-02  6:59 ` [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14 menglong8.dong
@ 2023-06-02  6:59 ` menglong8.dong
  2023-06-02  7:40   ` Menglong Dong
                     ` (2 more replies)
  2023-06-02  6:59 ` [PATCH bpf-next v2 3/5] libbpf: make BPF_PROG support 15 function arguments menglong8.dong
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 20+ messages in thread
From: menglong8.dong @ 2023-06-02  6:59 UTC (permalink / raw)
  To: olsajiri
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

From: Menglong Dong <imagedong@tencent.com>

For now, the BPF program of type BPF_PROG_TYPE_TRACING can only be used
on the kernel functions whose arguments count less than 6. This is not
friendly at all, as too many functions have arguments count more than 6.

Therefore, let's enhance it by increasing the function arguments count
allowed in arch_prepare_bpf_trampoline(), for now, only x86_64.

For the case that we don't need to call origin function, which means
without BPF_TRAMP_F_CALL_ORIG, we need only copy the function arguments
that stored in the frame of the caller to current frame. The arguments
of arg6-argN are stored in "$rbp + 0x18", we need copy them to
"$rbp - regs_off + (6 * 8)".

For the case with BPF_TRAMP_F_CALL_ORIG, we need prepare the arguments
in stack before call origin function, which means we need alloc extra
"8 * (arg_count - 6)" memory in the top of the stack. Note, there should
not be any data be pushed to the stack before call the origin function.
Then, we have to store rbx with 'mov' instead of 'push'.

It works well for the FENTRY and FEXIT, I'm not sure if there are other
complicated cases.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
v2:
- instead EMIT4 with EMIT3_off32 for "lea" to prevent overflow
- make MAX_BPF_FUNC_ARGS as the maximum argument count
---
 arch/x86/net/bpf_jit_comp.c | 96 +++++++++++++++++++++++++++++++------
 1 file changed, 81 insertions(+), 15 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 1056bbf55b17..0e247bb7d6f6 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1868,7 +1868,7 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
 	 * mov QWORD PTR [rbp-0x10],rdi
 	 * mov QWORD PTR [rbp-0x8],rsi
 	 */
-	for (i = 0, j = 0; i < min(nr_regs, 6); i++) {
+	for (i = 0, j = 0; i < min(nr_regs, MAX_BPF_FUNC_ARGS); i++) {
 		/* The arg_size is at most 16 bytes, enforced by the verifier. */
 		arg_size = m->arg_size[j];
 		if (arg_size > 8) {
@@ -1876,10 +1876,22 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
 			next_same_struct = !next_same_struct;
 		}
 
-		emit_stx(prog, bytes_to_bpf_size(arg_size),
-			 BPF_REG_FP,
-			 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
-			 -(stack_size - i * 8));
+		if (i <= 5) {
+			/* store function arguments in regs */
+			emit_stx(prog, bytes_to_bpf_size(arg_size),
+				 BPF_REG_FP,
+				 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
+				 -(stack_size - i * 8));
+		} else {
+			/* store function arguments in stack */
+			emit_ldx(prog, bytes_to_bpf_size(arg_size),
+				 BPF_REG_0, BPF_REG_FP,
+				 (i - 6) * 8 + 0x18);
+			emit_stx(prog, bytes_to_bpf_size(arg_size),
+				 BPF_REG_FP,
+				 BPF_REG_0,
+				 -(stack_size - i * 8));
+		}
 
 		j = next_same_struct ? j : j + 1;
 	}
@@ -1913,6 +1925,41 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
 	}
 }
 
+static void prepare_origin_stack(const struct btf_func_model *m, u8 **prog,
+				 int nr_regs, int stack_size)
+{
+	int i, j, arg_size;
+	bool next_same_struct = false;
+
+	if (nr_regs <= 6)
+		return;
+
+	/* Prepare the function arguments in stack before call origin
+	 * function. These arguments must be stored in the top of the
+	 * stack.
+	 */
+	for (i = 0, j = 0; i < min(nr_regs, MAX_BPF_FUNC_ARGS); i++) {
+		/* The arg_size is at most 16 bytes, enforced by the verifier. */
+		arg_size = m->arg_size[j];
+		if (arg_size > 8) {
+			arg_size = 8;
+			next_same_struct = !next_same_struct;
+		}
+
+		if (i > 5) {
+			emit_ldx(prog, bytes_to_bpf_size(arg_size),
+				 BPF_REG_0, BPF_REG_FP,
+				 (i - 6) * 8 + 0x18);
+			emit_stx(prog, bytes_to_bpf_size(arg_size),
+				 BPF_REG_FP,
+				 BPF_REG_0,
+				 -(stack_size - (i - 6) * 8));
+		}
+
+		j = next_same_struct ? j : j + 1;
+	}
+}
+
 static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 			   struct bpf_tramp_link *l, int stack_size,
 			   int run_ctx_off, bool save_ret)
@@ -1938,7 +1985,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	/* arg1: mov rdi, progs[i] */
 	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
 	/* arg2: lea rsi, [rbp - ctx_cookie_off] */
-	EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);
+	EMIT3_off32(0x48, 0x8D, 0xB5, -run_ctx_off);
 
 	if (emit_rsb_call(&prog, bpf_trampoline_enter(p), prog))
 		return -EINVAL;
@@ -1954,7 +2001,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	emit_nops(&prog, 2);
 
 	/* arg1: lea rdi, [rbp - stack_size] */
-	EMIT4(0x48, 0x8D, 0x7D, -stack_size);
+	EMIT3_off32(0x48, 0x8D, 0xBD, -stack_size);
 	/* arg2: progs[i]->insnsi for interpreter */
 	if (!p->jited)
 		emit_mov_imm64(&prog, BPF_REG_2,
@@ -1984,7 +2031,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
 	/* arg2: mov rsi, rbx <- start time in nsec */
 	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
 	/* arg3: lea rdx, [rbp - run_ctx_off] */
-	EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
+	EMIT3_off32(0x48, 0x8D, 0x95, -run_ctx_off);
 	if (emit_rsb_call(&prog, bpf_trampoline_exit(p), prog))
 		return -EINVAL;
 
@@ -2136,7 +2183,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 				void *func_addr)
 {
 	int i, ret, nr_regs = m->nr_args, stack_size = 0;
-	int regs_off, nregs_off, ip_off, run_ctx_off;
+	int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off;
 	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
 	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
 	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
@@ -2150,8 +2197,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
 			nr_regs += (m->arg_size[i] + 7) / 8 - 1;
 
-	/* x86-64 supports up to 6 arguments. 7+ can be added in the future */
-	if (nr_regs > 6)
+	/* x86-64 supports up to MAX_BPF_FUNC_ARGS arguments. 1-6
+	 * are passed through regs, the remains are through stack.
+	 */
+	if (nr_regs > MAX_BPF_FUNC_ARGS)
 		return -ENOTSUPP;
 
 	/* Generated trampoline stack layout:
@@ -2170,7 +2219,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	 *
 	 * RBP - ip_off    [ traced function ]  BPF_TRAMP_F_IP_ARG flag
 	 *
+	 * RBP - rbx_off   [ rbx value       ]  always
+	 *
 	 * RBP - run_ctx_off [ bpf_tramp_run_ctx ]
+	 *
+	 *                     [ stack_argN ]  BPF_TRAMP_F_CALL_ORIG
+	 *                     [ ...        ]
+	 *                     [ stack_arg2 ]
+	 * RBP - arg_stack_off [ stack_arg1 ]
 	 */
 
 	/* room for return value of orig_call or fentry prog */
@@ -2190,9 +2246,17 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 
 	ip_off = stack_size;
 
+	stack_size += 8;
+	rbx_off = stack_size;
+
 	stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
 	run_ctx_off = stack_size;
 
+	if (nr_regs > 6 && (flags & BPF_TRAMP_F_CALL_ORIG))
+		stack_size += (nr_regs - 6) * 8;
+
+	arg_stack_off = stack_size;
+
 	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
 		/* skip patched call instruction and point orig_call to actual
 		 * body of the kernel function.
@@ -2212,8 +2276,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	x86_call_depth_emit_accounting(&prog, NULL);
 	EMIT1(0x55);		 /* push rbp */
 	EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */
-	EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
-	EMIT1(0x53);		 /* push rbx */
+	EMIT3_off32(0x48, 0x81, 0xEC, stack_size); /* sub rsp, stack_size */
+	/* mov QWORD PTR [rbp - rbx_off], rbx */
+	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_6, -rbx_off);
 
 	/* Store number of argument registers of the traced function:
 	 *   mov rax, nr_regs
@@ -2262,6 +2327,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		restore_regs(m, &prog, nr_regs, regs_off);
+		prepare_origin_stack(m, &prog, nr_regs, arg_stack_off);
 
 		if (flags & BPF_TRAMP_F_ORIG_STACK) {
 			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
@@ -2321,14 +2387,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	if (save_ret)
 		emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8);
 
-	EMIT1(0x5B); /* pop rbx */
+	emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, -rbx_off);
 	EMIT1(0xC9); /* leave */
 	if (flags & BPF_TRAMP_F_SKIP_FRAME)
 		/* skip our return address and return to parent */
 		EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
 	emit_return(&prog, prog);
 	/* Make sure the trampoline generation logic doesn't overflow */
-	if (WARN_ON_ONCE(prog > (u8 *)image_end - BPF_INSN_SAFETY)) {
+	if (prog > (u8 *)image_end - BPF_INSN_SAFETY) {
 		ret = -EFAULT;
 		goto cleanup;
 	}
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v2 3/5] libbpf: make BPF_PROG support 15 function arguments
  2023-06-02  6:59 [PATCH bpf-next v2 0/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
  2023-06-02  6:59 ` [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14 menglong8.dong
  2023-06-02  6:59 ` [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
@ 2023-06-02  6:59 ` menglong8.dong
  2023-06-02  6:59 ` [PATCH bpf-next v2 4/5] selftests/bpf: rename bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr* menglong8.dong
  2023-06-02  6:59 ` [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments menglong8.dong
  4 siblings, 0 replies; 20+ messages in thread
From: menglong8.dong @ 2023-06-02  6:59 UTC (permalink / raw)
  To: olsajiri
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

From: Menglong Dong <imagedong@tencent.com>

Now that we changed MAX_BPF_FUNC_ARGS from 12 to 14, we should update
BPF_PROG() too.

Extend BPF_PROG() to make it support 15 arguments, 14 for kernel
function arguments and 1 for return value of FEXIT.

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 tools/lib/bpf/bpf_helpers.h |  9 ++++++---
 tools/lib/bpf/bpf_tracing.h | 10 ++++++++--
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
index bbab9ad9dc5a..d1574491cf16 100644
--- a/tools/lib/bpf/bpf_helpers.h
+++ b/tools/lib/bpf/bpf_helpers.h
@@ -194,11 +194,13 @@ enum libbpf_tristate {
 #define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
 #endif
 #ifndef ___bpf_nth
-#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N
+#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c,	\
+		   _d, _e, _f, N, ...) N
 #endif
 #ifndef ___bpf_narg
 #define ___bpf_narg(...) \
-	___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
+	___bpf_nth(_, ##__VA_ARGS__, 15, 14, 13, 12, 11, 10, 9, 8, 7,	\
+		   6, 5, 4, 3, 2, 1, 0)
 #endif
 
 #define ___bpf_fill0(arr, p, x) do {} while (0)
@@ -290,7 +292,8 @@ enum libbpf_tristate {
 #define ___bpf_pick_printk(...) \
 	___bpf_nth(_, ##__VA_ARGS__, __bpf_vprintk, __bpf_vprintk, __bpf_vprintk,	\
 		   __bpf_vprintk, __bpf_vprintk, __bpf_vprintk, __bpf_vprintk,		\
-		   __bpf_vprintk, __bpf_vprintk, __bpf_printk /*3*/, __bpf_printk /*2*/,\
+		   __bpf_vprintk, __bpf_vprintk, __bpf_vprintk, __bpf_vprintk,		\
+		   __bpf_vprintk, __bpf_printk /*3*/, __bpf_printk /*2*/,		\
 		   __bpf_printk /*1*/, __bpf_printk /*0*/)
 
 /* Helper macro to print out debug messages */
diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index be076a4041ab..4e940239f1c1 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -628,10 +628,13 @@ struct pt_regs;
 #define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
 #endif
 #ifndef ___bpf_nth
-#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N
+#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c,	\
+		   _d, _e, _f, N, ...) N
 #endif
 #ifndef ___bpf_narg
-#define ___bpf_narg(...) ___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
+#define ___bpf_narg(...) \
+	___bpf_nth(_, ##__VA_ARGS__, 15, 14, 13, 12, 11, 10, 9, 8, 7,	\
+		   6, 5, 4, 3, 2, 1, 0)
 #endif
 
 #define ___bpf_ctx_cast0()            ctx
@@ -647,6 +650,9 @@ struct pt_regs;
 #define ___bpf_ctx_cast10(x, args...) ___bpf_ctx_cast9(args), (void *)ctx[9]
 #define ___bpf_ctx_cast11(x, args...) ___bpf_ctx_cast10(args), (void *)ctx[10]
 #define ___bpf_ctx_cast12(x, args...) ___bpf_ctx_cast11(args), (void *)ctx[11]
+#define ___bpf_ctx_cast13(x, args...) ___bpf_ctx_cast12(args), (void *)ctx[12]
+#define ___bpf_ctx_cast14(x, args...) ___bpf_ctx_cast13(args), (void *)ctx[13]
+#define ___bpf_ctx_cast15(x, args...) ___bpf_ctx_cast14(args), (void *)ctx[14]
 #define ___bpf_ctx_cast(args...)      ___bpf_apply(___bpf_ctx_cast, ___bpf_narg(args))(args)
 
 /*
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v2 4/5] selftests/bpf: rename bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr*
  2023-06-02  6:59 [PATCH bpf-next v2 0/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
                   ` (2 preceding siblings ...)
  2023-06-02  6:59 ` [PATCH bpf-next v2 3/5] libbpf: make BPF_PROG support 15 function arguments menglong8.dong
@ 2023-06-02  6:59 ` menglong8.dong
  2023-06-02  9:44   ` Menglong Dong
  2023-06-02  6:59 ` [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments menglong8.dong
  4 siblings, 1 reply; 20+ messages in thread
From: menglong8.dong @ 2023-06-02  6:59 UTC (permalink / raw)
  To: olsajiri
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

From: Menglong Dong <imagedong@tencent.com>

To make it more clear, let's make the N in bpf_fentry_testN as the count
of target function arguments. Therefore, let's rename
bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr{1,2,3}.

Meanwhile, to stop the checkpatch complaining, move the "noinline" ahead
of "int".

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 net/bpf/test_run.c                            | 12 +++++-----
 .../selftests/bpf/prog_tests/bpf_cookie.c     | 24 +++++++++----------
 .../bpf/prog_tests/kprobe_multi_test.c        | 16 ++++++-------
 .../testing/selftests/bpf/progs/fentry_test.c | 16 ++++++-------
 .../testing/selftests/bpf/progs/fexit_test.c  | 16 ++++++-------
 .../selftests/bpf/progs/get_func_ip_test.c    |  2 +-
 .../selftests/bpf/progs/kprobe_multi.c        | 12 +++++-----
 .../bpf/progs/verifier_btf_ctx_access.c       |  2 +-
 .../selftests/bpf/verifier/atomic_fetch_add.c |  4 ++--
 9 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 2321bd2f9964..c73f246a706f 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -540,17 +540,17 @@ struct bpf_fentry_test_t {
 	struct bpf_fentry_test_t *a;
 };
 
-int noinline bpf_fentry_test7(struct bpf_fentry_test_t *arg)
+noinline int bpf_fentry_test_ptr1(struct bpf_fentry_test_t *arg)
 {
 	return (long)arg;
 }
 
-int noinline bpf_fentry_test8(struct bpf_fentry_test_t *arg)
+noinline int bpf_fentry_test_ptr2(struct bpf_fentry_test_t *arg)
 {
 	return (long)arg->a;
 }
 
-__bpf_kfunc u32 bpf_fentry_test9(u32 *a)
+__bpf_kfunc u32 bpf_fentry_test_ptr3(u32 *a)
 {
 	return *a;
 }
@@ -655,9 +655,9 @@ int bpf_prog_test_run_tracing(struct bpf_prog *prog,
 		    bpf_fentry_test4((void *)7, 8, 9, 10) != 34 ||
 		    bpf_fentry_test5(11, (void *)12, 13, 14, 15) != 65 ||
 		    bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111 ||
-		    bpf_fentry_test7((struct bpf_fentry_test_t *)0) != 0 ||
-		    bpf_fentry_test8(&arg) != 0 ||
-		    bpf_fentry_test9(&retval) != 0)
+		    bpf_fentry_test_ptr1((struct bpf_fentry_test_t *)0) != 0 ||
+		    bpf_fentry_test_ptr2(&arg) != 0 ||
+		    bpf_fentry_test_ptr3(&retval) != 0)
 			goto out;
 		break;
 	case BPF_MODIFY_RETURN:
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
index 26b2d1bffdfd..320003b96c86 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
@@ -125,9 +125,9 @@ static void kprobe_multi_link_api_subtest(void)
 	GET_ADDR("bpf_fentry_test4", addrs[2]);
 	GET_ADDR("bpf_fentry_test5", addrs[3]);
 	GET_ADDR("bpf_fentry_test6", addrs[4]);
-	GET_ADDR("bpf_fentry_test7", addrs[5]);
+	GET_ADDR("bpf_fentry_test_ptr1", addrs[5]);
 	GET_ADDR("bpf_fentry_test2", addrs[6]);
-	GET_ADDR("bpf_fentry_test8", addrs[7]);
+	GET_ADDR("bpf_fentry_test_ptr2", addrs[7]);
 
 #undef GET_ADDR
 
@@ -136,9 +136,9 @@ static void kprobe_multi_link_api_subtest(void)
 	cookies[2] = 3; /* bpf_fentry_test4 */
 	cookies[3] = 4; /* bpf_fentry_test5 */
 	cookies[4] = 5; /* bpf_fentry_test6 */
-	cookies[5] = 6; /* bpf_fentry_test7 */
+	cookies[5] = 6; /* bpf_fentry_test_ptr1 */
 	cookies[6] = 7; /* bpf_fentry_test2 */
-	cookies[7] = 8; /* bpf_fentry_test8 */
+	cookies[7] = 8; /* bpf_fentry_test_ptr2 */
 
 	opts.kprobe_multi.addrs = (const unsigned long *) &addrs;
 	opts.kprobe_multi.cnt = ARRAY_SIZE(addrs);
@@ -154,9 +154,9 @@ static void kprobe_multi_link_api_subtest(void)
 	cookies[2] = 6; /* bpf_fentry_test4 */
 	cookies[3] = 5; /* bpf_fentry_test5 */
 	cookies[4] = 4; /* bpf_fentry_test6 */
-	cookies[5] = 3; /* bpf_fentry_test7 */
+	cookies[5] = 3; /* bpf_fentry_test_ptr1 */
 	cookies[6] = 2; /* bpf_fentry_test2 */
-	cookies[7] = 1; /* bpf_fentry_test8 */
+	cookies[7] = 1; /* bpf_fentry_test_ptr2 */
 
 	opts.kprobe_multi.flags = BPF_F_KPROBE_MULTI_RETURN;
 	prog_fd = bpf_program__fd(skel->progs.test_kretprobe);
@@ -185,9 +185,9 @@ static void kprobe_multi_attach_api_subtest(void)
 		"bpf_fentry_test4",
 		"bpf_fentry_test5",
 		"bpf_fentry_test6",
-		"bpf_fentry_test7",
+		"bpf_fentry_test_ptr1",
 		"bpf_fentry_test2",
-		"bpf_fentry_test8",
+		"bpf_fentry_test_ptr2",
 	};
 	__u64 cookies[8];
 
@@ -203,9 +203,9 @@ static void kprobe_multi_attach_api_subtest(void)
 	cookies[2] = 3; /* bpf_fentry_test4 */
 	cookies[3] = 4; /* bpf_fentry_test5 */
 	cookies[4] = 5; /* bpf_fentry_test6 */
-	cookies[5] = 6; /* bpf_fentry_test7 */
+	cookies[5] = 6; /* bpf_fentry_test_ptr1 */
 	cookies[6] = 7; /* bpf_fentry_test2 */
-	cookies[7] = 8; /* bpf_fentry_test8 */
+	cookies[7] = 8; /* bpf_fentry_test_ptr2 */
 
 	opts.syms = syms;
 	opts.cnt = ARRAY_SIZE(syms);
@@ -221,9 +221,9 @@ static void kprobe_multi_attach_api_subtest(void)
 	cookies[2] = 6; /* bpf_fentry_test4 */
 	cookies[3] = 5; /* bpf_fentry_test5 */
 	cookies[4] = 4; /* bpf_fentry_test6 */
-	cookies[5] = 3; /* bpf_fentry_test7 */
+	cookies[5] = 3; /* bpf_fentry_test_ptr1 */
 	cookies[6] = 2; /* bpf_fentry_test2 */
-	cookies[7] = 1; /* bpf_fentry_test8 */
+	cookies[7] = 1; /* bpf_fentry_test_ptr2 */
 
 	opts.retprobe = true;
 
diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
index 2173c4bb555e..15c77fe9fc16 100644
--- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
@@ -105,8 +105,8 @@ static void test_link_api_addrs(void)
 	GET_ADDR("bpf_fentry_test4", addrs[3]);
 	GET_ADDR("bpf_fentry_test5", addrs[4]);
 	GET_ADDR("bpf_fentry_test6", addrs[5]);
-	GET_ADDR("bpf_fentry_test7", addrs[6]);
-	GET_ADDR("bpf_fentry_test8", addrs[7]);
+	GET_ADDR("bpf_fentry_test_ptr1", addrs[6]);
+	GET_ADDR("bpf_fentry_test_ptr2", addrs[7]);
 
 	opts.kprobe_multi.addrs = (const unsigned long*) addrs;
 	opts.kprobe_multi.cnt = ARRAY_SIZE(addrs);
@@ -123,8 +123,8 @@ static void test_link_api_syms(void)
 		"bpf_fentry_test4",
 		"bpf_fentry_test5",
 		"bpf_fentry_test6",
-		"bpf_fentry_test7",
-		"bpf_fentry_test8",
+		"bpf_fentry_test_ptr1",
+		"bpf_fentry_test_ptr2",
 	};
 
 	opts.kprobe_multi.syms = syms;
@@ -183,8 +183,8 @@ static void test_attach_api_addrs(void)
 	GET_ADDR("bpf_fentry_test4", addrs[3]);
 	GET_ADDR("bpf_fentry_test5", addrs[4]);
 	GET_ADDR("bpf_fentry_test6", addrs[5]);
-	GET_ADDR("bpf_fentry_test7", addrs[6]);
-	GET_ADDR("bpf_fentry_test8", addrs[7]);
+	GET_ADDR("bpf_fentry_test_ptr1", addrs[6]);
+	GET_ADDR("bpf_fentry_test_ptr2", addrs[7]);
 
 	opts.addrs = (const unsigned long *) addrs;
 	opts.cnt = ARRAY_SIZE(addrs);
@@ -201,8 +201,8 @@ static void test_attach_api_syms(void)
 		"bpf_fentry_test4",
 		"bpf_fentry_test5",
 		"bpf_fentry_test6",
-		"bpf_fentry_test7",
-		"bpf_fentry_test8",
+		"bpf_fentry_test_ptr1",
+		"bpf_fentry_test_ptr2",
 	};
 
 	opts.syms = syms;
diff --git a/tools/testing/selftests/bpf/progs/fentry_test.c b/tools/testing/selftests/bpf/progs/fentry_test.c
index 52a550d281d9..558a5f1d3d5c 100644
--- a/tools/testing/selftests/bpf/progs/fentry_test.c
+++ b/tools/testing/selftests/bpf/progs/fentry_test.c
@@ -60,20 +60,20 @@ struct bpf_fentry_test_t {
 	struct bpf_fentry_test_t *a;
 };
 
-__u64 test7_result = 0;
-SEC("fentry/bpf_fentry_test7")
-int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
+__u64 test_ptr1_result = 0;
+SEC("fentry/bpf_fentry_test_ptr1")
+int BPF_PROG(test_ptr1, struct bpf_fentry_test_t *arg)
 {
 	if (!arg)
-		test7_result = 1;
+		test_ptr1_result = 1;
 	return 0;
 }
 
-__u64 test8_result = 0;
-SEC("fentry/bpf_fentry_test8")
-int BPF_PROG(test8, struct bpf_fentry_test_t *arg)
+__u64 test_ptr2_result = 0;
+SEC("fentry/bpf_fentry_test_ptr2")
+int BPF_PROG(test_ptr2, struct bpf_fentry_test_t *arg)
 {
 	if (arg->a == 0)
-		test8_result = 1;
+		test_ptr2_result = 1;
 	return 0;
 }
diff --git a/tools/testing/selftests/bpf/progs/fexit_test.c b/tools/testing/selftests/bpf/progs/fexit_test.c
index 8f1ccb7302e1..f57886e6d918 100644
--- a/tools/testing/selftests/bpf/progs/fexit_test.c
+++ b/tools/testing/selftests/bpf/progs/fexit_test.c
@@ -61,20 +61,20 @@ struct bpf_fentry_test_t {
 	struct bpf_fentry_test *a;
 };
 
-__u64 test7_result = 0;
-SEC("fexit/bpf_fentry_test7")
-int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
+__u64 test_ptr1_result = 0;
+SEC("fexit/bpf_fentry_test_ptr1")
+int BPF_PROG(test_ptr1, struct bpf_fentry_test_t *arg)
 {
 	if (!arg)
-		test7_result = 1;
+		test_ptr1_result = 1;
 	return 0;
 }
 
-__u64 test8_result = 0;
-SEC("fexit/bpf_fentry_test8")
-int BPF_PROG(test8, struct bpf_fentry_test_t *arg)
+__u64 test_ptr2_result = 0;
+SEC("fexit/bpf_fentry_test_ptr2")
+int BPF_PROG(test_ptr2, struct bpf_fentry_test_t *arg)
 {
 	if (!arg->a)
-		test8_result = 1;
+		test_ptr2_result = 1;
 	return 0;
 }
diff --git a/tools/testing/selftests/bpf/progs/get_func_ip_test.c b/tools/testing/selftests/bpf/progs/get_func_ip_test.c
index 8559e698b40d..7fffe2b563fa 100644
--- a/tools/testing/selftests/bpf/progs/get_func_ip_test.c
+++ b/tools/testing/selftests/bpf/progs/get_func_ip_test.c
@@ -12,7 +12,7 @@ extern const void bpf_fentry_test3 __ksym;
 extern const void bpf_fentry_test4 __ksym;
 extern const void bpf_modify_return_test __ksym;
 extern const void bpf_fentry_test6 __ksym;
-extern const void bpf_fentry_test7 __ksym;
+extern const void bpf_fentry_test_ptr1 __ksym;
 
 extern bool CONFIG_X86_KERNEL_IBT __kconfig __weak;
 
diff --git a/tools/testing/selftests/bpf/progs/kprobe_multi.c b/tools/testing/selftests/bpf/progs/kprobe_multi.c
index 9e1ca8e34913..e676fb195595 100644
--- a/tools/testing/selftests/bpf/progs/kprobe_multi.c
+++ b/tools/testing/selftests/bpf/progs/kprobe_multi.c
@@ -12,8 +12,8 @@ extern const void bpf_fentry_test3 __ksym;
 extern const void bpf_fentry_test4 __ksym;
 extern const void bpf_fentry_test5 __ksym;
 extern const void bpf_fentry_test6 __ksym;
-extern const void bpf_fentry_test7 __ksym;
-extern const void bpf_fentry_test8 __ksym;
+extern const void bpf_fentry_test_ptr1 __ksym;
+extern const void bpf_fentry_test_ptr2 __ksym;
 
 int pid = 0;
 bool test_cookie = false;
@@ -57,8 +57,8 @@ static void kprobe_multi_check(void *ctx, bool is_return)
 		SET(kretprobe_test4_result, &bpf_fentry_test4, 6);
 		SET(kretprobe_test5_result, &bpf_fentry_test5, 5);
 		SET(kretprobe_test6_result, &bpf_fentry_test6, 4);
-		SET(kretprobe_test7_result, &bpf_fentry_test7, 3);
-		SET(kretprobe_test8_result, &bpf_fentry_test8, 1);
+		SET(kretprobe_test7_result, &bpf_fentry_test_ptr1, 3);
+		SET(kretprobe_test8_result, &bpf_fentry_test_ptr2, 1);
 	} else {
 		SET(kprobe_test1_result, &bpf_fentry_test1, 1);
 		SET(kprobe_test2_result, &bpf_fentry_test2, 7);
@@ -66,8 +66,8 @@ static void kprobe_multi_check(void *ctx, bool is_return)
 		SET(kprobe_test4_result, &bpf_fentry_test4, 3);
 		SET(kprobe_test5_result, &bpf_fentry_test5, 4);
 		SET(kprobe_test6_result, &bpf_fentry_test6, 5);
-		SET(kprobe_test7_result, &bpf_fentry_test7, 6);
-		SET(kprobe_test8_result, &bpf_fentry_test8, 8);
+		SET(kprobe_test7_result, &bpf_fentry_test_ptr1, 6);
+		SET(kprobe_test8_result, &bpf_fentry_test_ptr2, 8);
 	}
 
 #undef SET
diff --git a/tools/testing/selftests/bpf/progs/verifier_btf_ctx_access.c b/tools/testing/selftests/bpf/progs/verifier_btf_ctx_access.c
index a570e48b917a..b90b8a3bf169 100644
--- a/tools/testing/selftests/bpf/progs/verifier_btf_ctx_access.c
+++ b/tools/testing/selftests/bpf/progs/verifier_btf_ctx_access.c
@@ -17,7 +17,7 @@ __naked void btf_ctx_access_accept(void)
 "	::: __clobber_all);
 }
 
-SEC("fentry/bpf_fentry_test9")
+SEC("fentry/bpf_fentry_test_ptr3")
 __description("btf_ctx_access u32 pointer accept")
 __success __retval(0)
 __naked void ctx_access_u32_pointer_accept(void)
diff --git a/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c b/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
index a91de8cd9def..23d70b11ab80 100644
--- a/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
+++ b/tools/testing/selftests/bpf/verifier/atomic_fetch_add.c
@@ -88,7 +88,7 @@
 		 * kernel function being called. Load first arg into R2.
 		 */
 		BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 0),
-		/* First arg of bpf_fentry_test7 is a pointer to a struct.
+		/* First arg of bpf_fentry_test_ptr1 is a pointer to a struct.
 		 * Attempt to modify that struct. Verifier shouldn't let us
 		 * because it's kernel memory.
 		 */
@@ -100,7 +100,7 @@
 	},
 	.prog_type = BPF_PROG_TYPE_TRACING,
 	.expected_attach_type = BPF_TRACE_FENTRY,
-	.kfunc = "bpf_fentry_test7",
+	.kfunc = "bpf_fentry_test_ptr1",
 	.result = REJECT,
 	.errstr = "only read is supported",
 },
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments
  2023-06-02  6:59 [PATCH bpf-next v2 0/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
                   ` (3 preceding siblings ...)
  2023-06-02  6:59 ` [PATCH bpf-next v2 4/5] selftests/bpf: rename bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr* menglong8.dong
@ 2023-06-02  6:59 ` menglong8.dong
  2023-06-02  8:24   ` Ilya Leoshkevich
  2023-06-02 18:32   ` Alexei Starovoitov
  4 siblings, 2 replies; 20+ messages in thread
From: menglong8.dong @ 2023-06-02  6:59 UTC (permalink / raw)
  To: olsajiri
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

From: Menglong Dong <imagedong@tencent.com>

Add test7/test12/test14 in fexit_test.c and fentry_test.c to test the
fentry and fexit whose target function have 7/12/14 arguments.

And the testcases passed:

./test_progs -t fexit
$71      fentry_fexit:OK
$73/1    fexit_bpf2bpf/target_no_callees:OK
$73/2    fexit_bpf2bpf/target_yes_callees:OK
$73/3    fexit_bpf2bpf/func_replace:OK
$73/4    fexit_bpf2bpf/func_replace_verify:OK
$73/5    fexit_bpf2bpf/func_sockmap_update:OK
$73/6    fexit_bpf2bpf/func_replace_return_code:OK
$73/7    fexit_bpf2bpf/func_map_prog_compatibility:OK
$73/8    fexit_bpf2bpf/func_replace_multi:OK
$73/9    fexit_bpf2bpf/fmod_ret_freplace:OK
$73/10   fexit_bpf2bpf/func_replace_global_func:OK
$73/11   fexit_bpf2bpf/fentry_to_cgroup_bpf:OK
$73/12   fexit_bpf2bpf/func_replace_progmap:OK
$73      fexit_bpf2bpf:OK
$74      fexit_sleep:OK
$75      fexit_stress:OK
$76      fexit_test:OK
Summary: 5/12 PASSED, 0 SKIPPED, 0 FAILED

./test_progs -t fentry
$71      fentry_fexit:OK
$72      fentry_test:OK
$140     module_fentry_shadow:OK
Summary: 3/0 PASSED, 0 SKIPPED, 0 FAILED

Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 net/bpf/test_run.c                            | 30 +++++++++++++++-
 .../testing/selftests/bpf/progs/fentry_test.c | 34 ++++++++++++++++++
 .../testing/selftests/bpf/progs/fexit_test.c  | 35 +++++++++++++++++++
 3 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index c73f246a706f..e12a72311eca 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -536,6 +536,27 @@ int noinline bpf_fentry_test6(u64 a, void *b, short c, int d, void *e, u64 f)
 	return a + (long)b + c + d + (long)e + f;
 }
 
+noinline int bpf_fentry_test7(u64 a, void *b, short c, int d, void *e,
+			      u64 f, u64 g)
+{
+	return a + (long)b + c + d + (long)e + f + g;
+}
+
+noinline int bpf_fentry_test12(u64 a, void *b, short c, int d, void *e,
+			       u64 f, u64 g, u64 h, u64 i, u64 j,
+			       u64 k, u64 l)
+{
+	return a + (long)b + c + d + (long)e + f + g + h + i + j + k + l;
+}
+
+noinline int bpf_fentry_test14(u64 a, void *b, short c, int d, void *e,
+			       u64 f, u64 g, u64 h, u64 i, u64 j,
+			       u64 k, u64 l, u64 m, u64 n)
+{
+	return a + (long)b + c + d + (long)e + f + g + h + i + j + k + l +
+	       m + n;
+}
+
 struct bpf_fentry_test_t {
 	struct bpf_fentry_test_t *a;
 };
@@ -657,7 +678,14 @@ int bpf_prog_test_run_tracing(struct bpf_prog *prog,
 		    bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111 ||
 		    bpf_fentry_test_ptr1((struct bpf_fentry_test_t *)0) != 0 ||
 		    bpf_fentry_test_ptr2(&arg) != 0 ||
-		    bpf_fentry_test_ptr3(&retval) != 0)
+		    bpf_fentry_test_ptr3(&retval) != 0 ||
+		    bpf_fentry_test7(16, (void *)17, 18, 19, (void *)20,
+				     21, 22) != 133 ||
+		    bpf_fentry_test12(16, (void *)17, 18, 19, (void *)20,
+				      21, 22, 23, 24, 25, 26, 27) != 258 ||
+		    bpf_fentry_test14(16, (void *)17, 18, 19, (void *)20,
+				      21, 22, 23, 24, 25, 26, 27, 28,
+				      29) != 315)
 			goto out;
 		break;
 	case BPF_MODIFY_RETURN:
diff --git a/tools/testing/selftests/bpf/progs/fentry_test.c b/tools/testing/selftests/bpf/progs/fentry_test.c
index 558a5f1d3d5c..0666a907f7ea 100644
--- a/tools/testing/selftests/bpf/progs/fentry_test.c
+++ b/tools/testing/selftests/bpf/progs/fentry_test.c
@@ -56,6 +56,40 @@ int BPF_PROG(test6, __u64 a, void *b, short c, int d, void * e, __u64 f)
 	return 0;
 }
 
+__u64 test7_result = 0;
+SEC("fentry/bpf_fentry_test7")
+int BPF_PROG(test7, __u64 a, void *b, short c, int d, void *e, __u64 f,
+	     __u64 g)
+{
+	test7_result = a == 16 && b == (void *)17 && c == 18 && d == 19 &&
+		e == (void *)20 && f == 21 && g == 22;
+	return 0;
+}
+
+__u64 test12_result = 0;
+SEC("fentry/bpf_fentry_test12")
+int BPF_PROG(test12, __u64 a, void *b, short c, int d, void *e, __u64 f,
+	     __u64 g, __u64 h, __u64 i, __u64 j, __u64 k, __u64 l)
+{
+	test12_result = a == 16 && b == (void *)17 && c == 18 && d == 19 &&
+		e == (void *)20 && f == 21 && g == 22 && h == 23 &&
+		i == 24 && j == 25 && k == 26 && l == 27;
+	return 0;
+}
+
+__u64 test14_result = 0;
+SEC("fentry/bpf_fentry_test14")
+int BPF_PROG(test14, __u64 a, void *b, short c, int d, void *e, __u64 f,
+	     __u64 g, __u64 h, __u64 i, __u64 j, __u64 k, __u64 l,
+	     __u64 m, __u64 n)
+{
+	test14_result = a == 16 && b == (void *)17 && c == 18 && d == 19 &&
+		e == (void *)20 && f == 21 && g == 22 && h == 23 &&
+		i == 24 && j == 25 && k == 26 && l == 27 && m == 28 &&
+		n == 29;
+	return 0;
+}
+
 struct bpf_fentry_test_t {
 	struct bpf_fentry_test_t *a;
 };
diff --git a/tools/testing/selftests/bpf/progs/fexit_test.c b/tools/testing/selftests/bpf/progs/fexit_test.c
index f57886e6d918..1b9102ad1418 100644
--- a/tools/testing/selftests/bpf/progs/fexit_test.c
+++ b/tools/testing/selftests/bpf/progs/fexit_test.c
@@ -57,6 +57,41 @@ int BPF_PROG(test6, __u64 a, void *b, short c, int d, void *e, __u64 f, int ret)
 	return 0;
 }
 
+__u64 test7_result = 0;
+SEC("fexit/bpf_fentry_test7")
+int BPF_PROG(test7, __u64 a, void *b, short c, int d, void *e, __u64 f,
+	     __u64 g, int ret)
+{
+	test7_result = a == 16 && b == (void *)17 && c == 18 && d == 19 &&
+		e == (void *)20 && f == 21 && g == 22 && ret == 133;
+	return 0;
+}
+
+__u64 test12_result = 0;
+SEC("fexit/bpf_fentry_test12")
+int BPF_PROG(test12, __u64 a, void *b, short c, int d, void *e, __u64 f,
+	     __u64 g, __u64 h, __u64 i, __u64 j, __u64 k, __u64 l,
+	     int ret)
+{
+	test12_result = a == 16 && b == (void *)17 && c == 18 && d == 19 &&
+		e == (void *)20 && f == 21 && g == 22 && h == 23 &&
+		i == 24 && j == 25 && k == 26 && l == 27 && ret == 258;
+	return 0;
+}
+
+__u64 test14_result = 0;
+SEC("fexit/bpf_fentry_test14")
+int BPF_PROG(test14, __u64 a, void *b, short c, int d, void *e, __u64 f,
+	     __u64 g, __u64 h, __u64 i, __u64 j, __u64 k, __u64 l,
+	     __u64 m, __u64 n, int ret)
+{
+	test14_result = a == 16 && b == (void *)17 && c == 18 && d == 19 &&
+		e == (void *)20 && f == 21 && g == 22 && h == 23 &&
+		i == 24 && j == 25 && k == 26 && l == 27 && m == 28 &&
+		n == 29 && ret == 315;
+	return 0;
+}
+
 struct bpf_fentry_test_t {
 	struct bpf_fentry_test *a;
 };
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING
  2023-06-02  6:59 ` [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
@ 2023-06-02  7:40   ` Menglong Dong
  2023-06-02 18:31   ` Alexei Starovoitov
  2023-06-05 20:10   ` Jiri Olsa
  2 siblings, 0 replies; 20+ messages in thread
From: Menglong Dong @ 2023-06-02  7:40 UTC (permalink / raw)
  To: olsajiri
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

On Fri, Jun 2, 2023 at 3:01 PM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>
> @@ -2262,6 +2327,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>
>         if (flags & BPF_TRAMP_F_CALL_ORIG) {
>                 restore_regs(m, &prog, nr_regs, regs_off);
> +               prepare_origin_stack(m, &prog, nr_regs, arg_stack_off);
>
>                 if (flags & BPF_TRAMP_F_ORIG_STACK) {
>                         emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> @@ -2321,14 +2387,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>         if (save_ret)
>                 emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8);
>
> -       EMIT1(0x5B); /* pop rbx */
> +       emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, -rbx_off);
>         EMIT1(0xC9); /* leave */
>         if (flags & BPF_TRAMP_F_SKIP_FRAME)
>                 /* skip our return address and return to parent */
>                 EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
>         emit_return(&prog, prog);
>         /* Make sure the trampoline generation logic doesn't overflow */
> -       if (WARN_ON_ONCE(prog > (u8 *)image_end - BPF_INSN_SAFETY)) {
> +       if (prog > (u8 *)image_end - BPF_INSN_SAFETY) {

Oops, this line is a mistake, and I should keep it still.

>                 ret = -EFAULT;
>                 goto cleanup;
>         }
> --
> 2.40.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments
  2023-06-02  6:59 ` [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments menglong8.dong
@ 2023-06-02  8:24   ` Ilya Leoshkevich
  2023-06-02  8:47     ` Menglong Dong
  2023-06-02 18:32   ` Alexei Starovoitov
  1 sibling, 1 reply; 20+ messages in thread
From: Ilya Leoshkevich @ 2023-06-02  8:24 UTC (permalink / raw)
  To: menglong8.dong, olsajiri
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, imagedong, xukuohai, chantr4, zwisler, eddyz87, netdev,
	bpf, linux-kernel, linux-kselftest

On Fri, 2023-06-02 at 14:59 +0800, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <imagedong@tencent.com>
> 
> Add test7/test12/test14 in fexit_test.c and fentry_test.c to test the
> fentry and fexit whose target function have 7/12/14 arguments.
> 
> And the testcases passed:
> 
> ./test_progs -t fexit
> $71      fentry_fexit:OK
> $73/1    fexit_bpf2bpf/target_no_callees:OK
> $73/2    fexit_bpf2bpf/target_yes_callees:OK
> $73/3    fexit_bpf2bpf/func_replace:OK
> $73/4    fexit_bpf2bpf/func_replace_verify:OK
> $73/5    fexit_bpf2bpf/func_sockmap_update:OK
> $73/6    fexit_bpf2bpf/func_replace_return_code:OK
> $73/7    fexit_bpf2bpf/func_map_prog_compatibility:OK
> $73/8    fexit_bpf2bpf/func_replace_multi:OK
> $73/9    fexit_bpf2bpf/fmod_ret_freplace:OK
> $73/10   fexit_bpf2bpf/func_replace_global_func:OK
> $73/11   fexit_bpf2bpf/fentry_to_cgroup_bpf:OK
> $73/12   fexit_bpf2bpf/func_replace_progmap:OK
> $73      fexit_bpf2bpf:OK
> $74      fexit_sleep:OK
> $75      fexit_stress:OK
> $76      fexit_test:OK
> Summary: 5/12 PASSED, 0 SKIPPED, 0 FAILED
> 
> ./test_progs -t fentry
> $71      fentry_fexit:OK
> $72      fentry_test:OK
> $140     module_fentry_shadow:OK
> Summary: 3/0 PASSED, 0 SKIPPED, 0 FAILED
> 
> Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
>  net/bpf/test_run.c                            | 30 +++++++++++++++-
>  .../testing/selftests/bpf/progs/fentry_test.c | 34
> ++++++++++++++++++
>  .../testing/selftests/bpf/progs/fexit_test.c  | 35
> +++++++++++++++++++
>  3 files changed, 98 insertions(+), 1 deletion(-)

Don't you also need

--- a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
@@ -34,7 +34,7 @@ void test_fentry_fexit(void)
        fentry_res = (__u64 *)fentry_skel->bss;
        fexit_res = (__u64 *)fexit_skel->bss;
        printf("%lld\n", fentry_skel->bss->test1_result);
-       for (i = 0; i < 8; i++) {
+       for (i = 0; i < 11; i++) {
                ASSERT_EQ(fentry_res[i], 1, "fentry result");
                ASSERT_EQ(fexit_res[i], 1, "fexit result");
        }

to verify the results of the new tests?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments
  2023-06-02  8:24   ` Ilya Leoshkevich
@ 2023-06-02  8:47     ` Menglong Dong
  0 siblings, 0 replies; 20+ messages in thread
From: Menglong Dong @ 2023-06-02  8:47 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: olsajiri, davem, dsahern, ast, daniel, andrii, martin.lau, song,
	yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, imagedong, xukuohai, chantr4, zwisler, eddyz87, netdev,
	bpf, linux-kernel, linux-kselftest

On Fri, Jun 2, 2023 at 4:24 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> On Fri, 2023-06-02 at 14:59 +0800, menglong8.dong@gmail.com wrote:
> > From: Menglong Dong <imagedong@tencent.com>
> >
> > Add test7/test12/test14 in fexit_test.c and fentry_test.c to test the
> > fentry and fexit whose target function have 7/12/14 arguments.
> >
> > And the testcases passed:
> >
> > ./test_progs -t fexit
> > $71      fentry_fexit:OK
> > $73/1    fexit_bpf2bpf/target_no_callees:OK
> > $73/2    fexit_bpf2bpf/target_yes_callees:OK
> > $73/3    fexit_bpf2bpf/func_replace:OK
> > $73/4    fexit_bpf2bpf/func_replace_verify:OK
> > $73/5    fexit_bpf2bpf/func_sockmap_update:OK
> > $73/6    fexit_bpf2bpf/func_replace_return_code:OK
> > $73/7    fexit_bpf2bpf/func_map_prog_compatibility:OK
> > $73/8    fexit_bpf2bpf/func_replace_multi:OK
> > $73/9    fexit_bpf2bpf/fmod_ret_freplace:OK
> > $73/10   fexit_bpf2bpf/func_replace_global_func:OK
> > $73/11   fexit_bpf2bpf/fentry_to_cgroup_bpf:OK
> > $73/12   fexit_bpf2bpf/func_replace_progmap:OK
> > $73      fexit_bpf2bpf:OK
> > $74      fexit_sleep:OK
> > $75      fexit_stress:OK
> > $76      fexit_test:OK
> > Summary: 5/12 PASSED, 0 SKIPPED, 0 FAILED
> >
> > ./test_progs -t fentry
> > $71      fentry_fexit:OK
> > $72      fentry_test:OK
> > $140     module_fentry_shadow:OK
> > Summary: 3/0 PASSED, 0 SKIPPED, 0 FAILED
> >
> > Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> > Signed-off-by: Menglong Dong <imagedong@tencent.com>
> > ---
> >  net/bpf/test_run.c                            | 30 +++++++++++++++-
> >  .../testing/selftests/bpf/progs/fentry_test.c | 34
> > ++++++++++++++++++
> >  .../testing/selftests/bpf/progs/fexit_test.c  | 35
> > +++++++++++++++++++
> >  3 files changed, 98 insertions(+), 1 deletion(-)
>
> Don't you also need
>
> --- a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
> +++ b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
> @@ -34,7 +34,7 @@ void test_fentry_fexit(void)
>         fentry_res = (__u64 *)fentry_skel->bss;
>         fexit_res = (__u64 *)fexit_skel->bss;
>         printf("%lld\n", fentry_skel->bss->test1_result);
> -       for (i = 0; i < 8; i++) {
> +       for (i = 0; i < 11; i++) {
>                 ASSERT_EQ(fentry_res[i], 1, "fentry result");
>                 ASSERT_EQ(fexit_res[i], 1, "fexit result");
>         }
>
> to verify the results of the new tests?

Oops, I missed this part......Thank you for reminding,
and I'll fix it in V3.

Thanks!
Menglong Dong

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 4/5] selftests/bpf: rename bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr*
  2023-06-02  6:59 ` [PATCH bpf-next v2 4/5] selftests/bpf: rename bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr* menglong8.dong
@ 2023-06-02  9:44   ` Menglong Dong
  0 siblings, 0 replies; 20+ messages in thread
From: Menglong Dong @ 2023-06-02  9:44 UTC (permalink / raw)
  To: olsajiri
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

On Fri, Jun 2, 2023 at 3:03 PM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>
>
> To make it more clear, let's make the N in bpf_fentry_testN as the count
> of target function arguments. Therefore, let's rename
> bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr{1,2,3}.
>
> Meanwhile, to stop the checkpatch complaining, move the "noinline" ahead
> of "int".
>
> Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
>  net/bpf/test_run.c                            | 12 +++++-----
>  .../selftests/bpf/prog_tests/bpf_cookie.c     | 24 +++++++++----------
>  .../bpf/prog_tests/kprobe_multi_test.c        | 16 ++++++-------
>  .../testing/selftests/bpf/progs/fentry_test.c | 16 ++++++-------
>  .../testing/selftests/bpf/progs/fexit_test.c  | 16 ++++++-------
>  .../selftests/bpf/progs/get_func_ip_test.c    |  2 +-
>  .../selftests/bpf/progs/kprobe_multi.c        | 12 +++++-----
>  .../bpf/progs/verifier_btf_ctx_access.c       |  2 +-
>  .../selftests/bpf/verifier/atomic_fetch_add.c |  4 ++--
>  9 files changed, 52 insertions(+), 52 deletions(-)
>

Sadly, this patch breaks the "bpf_fentry_test?" pattern in
kprobe_multi.c and kprobe_multi_test.c.

I'm considering changing the "bpf_fentry_test?" to
"bpf_fentry_test*" to solve this problem.

Another option, we can remove kretprobe_test7_result
and kretprobe_test8_result and only check
bpf_fentry_test1~6 in kprobe_multi_check.

Or......maybe I shouldn't rename them?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14
  2023-06-02  6:59 ` [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14 menglong8.dong
@ 2023-06-02 18:17   ` Alexei Starovoitov
  2023-06-05  2:49     ` Menglong Dong
  2023-06-03 14:13   ` Simon Horman
  1 sibling, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2023-06-02 18:17 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Jiri Olsa, David S. Miller, David Ahern, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, X86 ML, H. Peter Anvin, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Mykola Lysenko, Shuah Khan, benbjiang,
	Ilya Leoshkevich, Menglong Dong, Xu Kuohai, Manu Bretelle,
	Ross Zwisler, Eddy Z, Network Development, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Fri, Jun 2, 2023 at 12:01 AM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>
>
> According to the current kernel version, below is a statistics of the
> function arguments count:
>
> argument count | FUNC_PROTO count
> 7              | 367
> 8              | 196
> 9              | 71
> 10             | 43
> 11             | 22
> 12             | 10
> 13             | 15
> 14             | 4
> 15             | 0
> 16             | 1
>
> It's hard to statisics the function count, so I use FUNC_PROTO in the btf
> of vmlinux instead. The function with 16 arguments is ZSTD_buildCTable(),
> which I think can be ignored.
>
> Therefore, let's make the maximum of function arguments count 14. It used
> to be 12, but it seems that there is no harm to make it big enough.

I think we're just fine at 12.
People need to fix their code. ZSTD_buildCTable should be first in line.
Passing arguments on the stack is not efficient from performance pov.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING
  2023-06-02  6:59 ` [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
  2023-06-02  7:40   ` Menglong Dong
@ 2023-06-02 18:31   ` Alexei Starovoitov
  2023-06-05  2:40     ` Menglong Dong
  2023-06-05 20:10   ` Jiri Olsa
  2 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2023-06-02 18:31 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Jiri Olsa, David S. Miller, David Ahern, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, X86 ML, H. Peter Anvin, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Mykola Lysenko, Shuah Khan, benbjiang,
	Ilya Leoshkevich, Menglong Dong, Xu Kuohai, Manu Bretelle,
	Ross Zwisler, Eddy Z, Network Development, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Fri, Jun 2, 2023 at 12:01 AM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>

Please trim your cc when you respin. It's unnecessary huge.

> For now, the BPF program of type BPF_PROG_TYPE_TRACING can only be used
> on the kernel functions whose arguments count less than 6. This is not
> friendly at all, as too many functions have arguments count more than 6.
>
> Therefore, let's enhance it by increasing the function arguments count
> allowed in arch_prepare_bpf_trampoline(), for now, only x86_64.
>
> For the case that we don't need to call origin function, which means
> without BPF_TRAMP_F_CALL_ORIG, we need only copy the function arguments
> that stored in the frame of the caller to current frame. The arguments
> of arg6-argN are stored in "$rbp + 0x18", we need copy them to
> "$rbp - regs_off + (6 * 8)".
>
> For the case with BPF_TRAMP_F_CALL_ORIG, we need prepare the arguments
> in stack before call origin function, which means we need alloc extra
> "8 * (arg_count - 6)" memory in the top of the stack. Note, there should
> not be any data be pushed to the stack before call the origin function.
> Then, we have to store rbx with 'mov' instead of 'push'.
>
> It works well for the FENTRY and FEXIT, I'm not sure if there are other
> complicated cases.
>
> Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
> v2:
> - instead EMIT4 with EMIT3_off32 for "lea" to prevent overflow
> - make MAX_BPF_FUNC_ARGS as the maximum argument count
> ---
>  arch/x86/net/bpf_jit_comp.c | 96 +++++++++++++++++++++++++++++++------
>  1 file changed, 81 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 1056bbf55b17..0e247bb7d6f6 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1868,7 +1868,7 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
>          * mov QWORD PTR [rbp-0x10],rdi
>          * mov QWORD PTR [rbp-0x8],rsi
>          */
> -       for (i = 0, j = 0; i < min(nr_regs, 6); i++) {
> +       for (i = 0, j = 0; i < min(nr_regs, MAX_BPF_FUNC_ARGS); i++) {
>                 /* The arg_size is at most 16 bytes, enforced by the verifier. */
>                 arg_size = m->arg_size[j];
>                 if (arg_size > 8) {
> @@ -1876,10 +1876,22 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
>                         next_same_struct = !next_same_struct;
>                 }
>
> -               emit_stx(prog, bytes_to_bpf_size(arg_size),
> -                        BPF_REG_FP,
> -                        i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> -                        -(stack_size - i * 8));
> +               if (i <= 5) {
> +                       /* store function arguments in regs */

The comment is confusing.
It's not storing arguments in regs.
It copies them from regs into stack.

> +                       emit_stx(prog, bytes_to_bpf_size(arg_size),
> +                                BPF_REG_FP,
> +                                i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> +                                -(stack_size - i * 8));
> +               } else {
> +                       /* store function arguments in stack */
> +                       emit_ldx(prog, bytes_to_bpf_size(arg_size),
> +                                BPF_REG_0, BPF_REG_FP,
> +                                (i - 6) * 8 + 0x18);
> +                       emit_stx(prog, bytes_to_bpf_size(arg_size),

and we will have garbage values in upper bytes.
Probably should fix both here and in regular copy from reg.

> +                                BPF_REG_FP,
> +                                BPF_REG_0,
> +                                -(stack_size - i * 8));
> +               }
>
>                 j = next_same_struct ? j : j + 1;
>         }
> @@ -1913,6 +1925,41 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
>         }
>  }
>
> +static void prepare_origin_stack(const struct btf_func_model *m, u8 **prog,
> +                                int nr_regs, int stack_size)
> +{
> +       int i, j, arg_size;
> +       bool next_same_struct = false;
> +
> +       if (nr_regs <= 6)
> +               return;
> +
> +       /* Prepare the function arguments in stack before call origin
> +        * function. These arguments must be stored in the top of the
> +        * stack.
> +        */
> +       for (i = 0, j = 0; i < min(nr_regs, MAX_BPF_FUNC_ARGS); i++) {
> +               /* The arg_size is at most 16 bytes, enforced by the verifier. */
> +               arg_size = m->arg_size[j];
> +               if (arg_size > 8) {
> +                       arg_size = 8;
> +                       next_same_struct = !next_same_struct;
> +               }
> +
> +               if (i > 5) {
> +                       emit_ldx(prog, bytes_to_bpf_size(arg_size),
> +                                BPF_REG_0, BPF_REG_FP,
> +                                (i - 6) * 8 + 0x18);
> +                       emit_stx(prog, bytes_to_bpf_size(arg_size),
> +                                BPF_REG_FP,
> +                                BPF_REG_0,
> +                                -(stack_size - (i - 6) * 8));
> +               }
> +
> +               j = next_same_struct ? j : j + 1;
> +       }
> +}
> +
>  static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>                            struct bpf_tramp_link *l, int stack_size,
>                            int run_ctx_off, bool save_ret)
> @@ -1938,7 +1985,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>         /* arg1: mov rdi, progs[i] */
>         emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
>         /* arg2: lea rsi, [rbp - ctx_cookie_off] */
> -       EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);
> +       EMIT3_off32(0x48, 0x8D, 0xB5, -run_ctx_off);
>
>         if (emit_rsb_call(&prog, bpf_trampoline_enter(p), prog))
>                 return -EINVAL;
> @@ -1954,7 +2001,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>         emit_nops(&prog, 2);
>
>         /* arg1: lea rdi, [rbp - stack_size] */
> -       EMIT4(0x48, 0x8D, 0x7D, -stack_size);
> +       EMIT3_off32(0x48, 0x8D, 0xBD, -stack_size);
>         /* arg2: progs[i]->insnsi for interpreter */
>         if (!p->jited)
>                 emit_mov_imm64(&prog, BPF_REG_2,
> @@ -1984,7 +2031,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>         /* arg2: mov rsi, rbx <- start time in nsec */
>         emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
>         /* arg3: lea rdx, [rbp - run_ctx_off] */
> -       EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
> +       EMIT3_off32(0x48, 0x8D, 0x95, -run_ctx_off);
>         if (emit_rsb_call(&prog, bpf_trampoline_exit(p), prog))
>                 return -EINVAL;
>
> @@ -2136,7 +2183,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>                                 void *func_addr)
>  {
>         int i, ret, nr_regs = m->nr_args, stack_size = 0;
> -       int regs_off, nregs_off, ip_off, run_ctx_off;
> +       int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off;
>         struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
>         struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
>         struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> @@ -2150,8 +2197,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>                 if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
>                         nr_regs += (m->arg_size[i] + 7) / 8 - 1;
>
> -       /* x86-64 supports up to 6 arguments. 7+ can be added in the future */
> -       if (nr_regs > 6)
> +       /* x86-64 supports up to MAX_BPF_FUNC_ARGS arguments. 1-6
> +        * are passed through regs, the remains are through stack.
> +        */
> +       if (nr_regs > MAX_BPF_FUNC_ARGS)
>                 return -ENOTSUPP;
>
>         /* Generated trampoline stack layout:
> @@ -2170,7 +2219,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>          *
>          * RBP - ip_off    [ traced function ]  BPF_TRAMP_F_IP_ARG flag
>          *
> +        * RBP - rbx_off   [ rbx value       ]  always
> +        *

That is the case already and we just didn't document it, right?

>          * RBP - run_ctx_off [ bpf_tramp_run_ctx ]
> +        *
> +        *                     [ stack_argN ]  BPF_TRAMP_F_CALL_ORIG
> +        *                     [ ...        ]
> +        *                     [ stack_arg2 ]
> +        * RBP - arg_stack_off [ stack_arg1 ]
>          */
>
>         /* room for return value of orig_call or fentry prog */
> @@ -2190,9 +2246,17 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>
>         ip_off = stack_size;
>
> +       stack_size += 8;
> +       rbx_off = stack_size;
> +
>         stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
>         run_ctx_off = stack_size;
>
> +       if (nr_regs > 6 && (flags & BPF_TRAMP_F_CALL_ORIG))
> +               stack_size += (nr_regs - 6) * 8;
> +
> +       arg_stack_off = stack_size;
> +
>         if (flags & BPF_TRAMP_F_SKIP_FRAME) {
>                 /* skip patched call instruction and point orig_call to actual
>                  * body of the kernel function.
> @@ -2212,8 +2276,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>         x86_call_depth_emit_accounting(&prog, NULL);
>         EMIT1(0x55);             /* push rbp */
>         EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */
> -       EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
> -       EMIT1(0x53);             /* push rbx */
> +       EMIT3_off32(0x48, 0x81, 0xEC, stack_size); /* sub rsp, stack_size */
> +       /* mov QWORD PTR [rbp - rbx_off], rbx */
> +       emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_6, -rbx_off);
>
>         /* Store number of argument registers of the traced function:
>          *   mov rax, nr_regs
> @@ -2262,6 +2327,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>
>         if (flags & BPF_TRAMP_F_CALL_ORIG) {
>                 restore_regs(m, &prog, nr_regs, regs_off);
> +               prepare_origin_stack(m, &prog, nr_regs, arg_stack_off);
>
>                 if (flags & BPF_TRAMP_F_ORIG_STACK) {
>                         emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> @@ -2321,14 +2387,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>         if (save_ret)
>                 emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8);
>
> -       EMIT1(0x5B); /* pop rbx */
> +       emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, -rbx_off);

It can stay as 'pop', no?

>         EMIT1(0xC9); /* leave */
>         if (flags & BPF_TRAMP_F_SKIP_FRAME)
>                 /* skip our return address and return to parent */
>                 EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
>         emit_return(&prog, prog);
>         /* Make sure the trampoline generation logic doesn't overflow */
> -       if (WARN_ON_ONCE(prog > (u8 *)image_end - BPF_INSN_SAFETY)) {
> +       if (prog > (u8 *)image_end - BPF_INSN_SAFETY) {
>                 ret = -EFAULT;
>                 goto cleanup;
>         }
> --
> 2.40.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments
  2023-06-02  6:59 ` [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments menglong8.dong
  2023-06-02  8:24   ` Ilya Leoshkevich
@ 2023-06-02 18:32   ` Alexei Starovoitov
  2023-06-05  2:55     ` Menglong Dong
  1 sibling, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2023-06-02 18:32 UTC (permalink / raw)
  To: Menglong Dong
  Cc: Jiri Olsa, David S. Miller, David Ahern, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, X86 ML, H. Peter Anvin, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Mykola Lysenko, Shuah Khan, benbjiang,
	Ilya Leoshkevich, Menglong Dong, Xu Kuohai, Manu Bretelle,
	Ross Zwisler, Eddy Z, Network Development, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Fri, Jun 2, 2023 at 12:03 AM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>
>
> Add test7/test12/test14 in fexit_test.c and fentry_test.c to test the
> fentry and fexit whose target function have 7/12/14 arguments.
>
> And the testcases passed:
>
> ./test_progs -t fexit
> $71      fentry_fexit:OK
> $73/1    fexit_bpf2bpf/target_no_callees:OK
> $73/2    fexit_bpf2bpf/target_yes_callees:OK
> $73/3    fexit_bpf2bpf/func_replace:OK
> $73/4    fexit_bpf2bpf/func_replace_verify:OK
> $73/5    fexit_bpf2bpf/func_sockmap_update:OK
> $73/6    fexit_bpf2bpf/func_replace_return_code:OK
> $73/7    fexit_bpf2bpf/func_map_prog_compatibility:OK
> $73/8    fexit_bpf2bpf/func_replace_multi:OK
> $73/9    fexit_bpf2bpf/fmod_ret_freplace:OK
> $73/10   fexit_bpf2bpf/func_replace_global_func:OK
> $73/11   fexit_bpf2bpf/fentry_to_cgroup_bpf:OK
> $73/12   fexit_bpf2bpf/func_replace_progmap:OK
> $73      fexit_bpf2bpf:OK
> $74      fexit_sleep:OK
> $75      fexit_stress:OK
> $76      fexit_test:OK
> Summary: 5/12 PASSED, 0 SKIPPED, 0 FAILED
>
> ./test_progs -t fentry
> $71      fentry_fexit:OK
> $72      fentry_test:OK
> $140     module_fentry_shadow:OK
> Summary: 3/0 PASSED, 0 SKIPPED, 0 FAILED
>
> Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
>  net/bpf/test_run.c                            | 30 +++++++++++++++-
>  .../testing/selftests/bpf/progs/fentry_test.c | 34 ++++++++++++++++++
>  .../testing/selftests/bpf/progs/fexit_test.c  | 35 +++++++++++++++++++
>  3 files changed, 98 insertions(+), 1 deletion(-)
>
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index c73f246a706f..e12a72311eca 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -536,6 +536,27 @@ int noinline bpf_fentry_test6(u64 a, void *b, short c, int d, void *e, u64 f)
>         return a + (long)b + c + d + (long)e + f;
>  }
>
> +noinline int bpf_fentry_test7(u64 a, void *b, short c, int d, void *e,
> +                             u64 f, u64 g)
> +{
> +       return a + (long)b + c + d + (long)e + f + g;
> +}
> +
> +noinline int bpf_fentry_test12(u64 a, void *b, short c, int d, void *e,
> +                              u64 f, u64 g, u64 h, u64 i, u64 j,
> +                              u64 k, u64 l)
> +{
> +       return a + (long)b + c + d + (long)e + f + g + h + i + j + k + l;
> +}
> +
> +noinline int bpf_fentry_test14(u64 a, void *b, short c, int d, void *e,
> +                              u64 f, u64 g, u64 h, u64 i, u64 j,
> +                              u64 k, u64 l, u64 m, u64 n)
> +{
> +       return a + (long)b + c + d + (long)e + f + g + h + i + j + k + l +
> +              m + n;
> +}

Please add test func to bpf_testmod instead of here.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14
  2023-06-02  6:59 ` [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14 menglong8.dong
  2023-06-02 18:17   ` Alexei Starovoitov
@ 2023-06-03 14:13   ` Simon Horman
  2023-06-05  2:54     ` Menglong Dong
  1 sibling, 1 reply; 20+ messages in thread
From: Simon Horman @ 2023-06-03 14:13 UTC (permalink / raw)
  To: menglong8.dong
  Cc: olsajiri, davem, dsahern, ast, daniel, andrii, martin.lau, song,
	yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

On Fri, Jun 02, 2023 at 02:59:54PM +0800, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <imagedong@tencent.com>
> 
> According to the current kernel version, below is a statistics of the
> function arguments count:
> 
> argument count | FUNC_PROTO count
> 7              | 367
> 8              | 196
> 9              | 71
> 10             | 43
> 11             | 22
> 12             | 10
> 13             | 15
> 14             | 4
> 15             | 0
> 16             | 1
> 
> It's hard to statisics the function count, so I use FUNC_PROTO in the btf
> of vmlinux instead. The function with 16 arguments is ZSTD_buildCTable(),
> which I think can be ignored.
> 
> Therefore, let's make the maximum of function arguments count 14. It used
> to be 12, but it seems that there is no harm to make it big enough.
> 
> Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
>  include/linux/bpf.h | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index f58895830ada..8b997779faf7 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -961,10 +961,10 @@ enum bpf_cgroup_storage_type {
>  
>  #define MAX_BPF_CGROUP_STORAGE_TYPE __BPF_CGROUP_STORAGE_MAX
>  
> -/* The longest tracepoint has 12 args.
> - * See include/trace/bpf_probe.h
> +/* The maximun number of the kernel function arguments.

Hi Menglong Dong,

as it looks like there will be a v3 anyway, please
consider correcting the spelling of maximum.

...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING
  2023-06-02 18:31   ` Alexei Starovoitov
@ 2023-06-05  2:40     ` Menglong Dong
  0 siblings, 0 replies; 20+ messages in thread
From: Menglong Dong @ 2023-06-05  2:40 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, David S. Miller, David Ahern, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Jiri Olsa, X86 ML, benbjiang,
	Network Development, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Sat, Jun 3, 2023 at 2:31 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Jun 2, 2023 at 12:01 AM <menglong8.dong@gmail.com> wrote:
> >
> > From: Menglong Dong <imagedong@tencent.com>
>
> Please trim your cc when you respin. It's unnecessary huge.

Sorry for bothering the unrelated people. The cc is generated
from ./scripts/get_maintainer.pl, and I'll keep it less than 15.

>
[...]
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 1056bbf55b17..0e247bb7d6f6 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1868,7 +1868,7 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
> >          * mov QWORD PTR [rbp-0x10],rdi
> >          * mov QWORD PTR [rbp-0x8],rsi
> >          */
> > -       for (i = 0, j = 0; i < min(nr_regs, 6); i++) {
> > +       for (i = 0, j = 0; i < min(nr_regs, MAX_BPF_FUNC_ARGS); i++) {
> >                 /* The arg_size is at most 16 bytes, enforced by the verifier. */
> >                 arg_size = m->arg_size[j];
> >                 if (arg_size > 8) {
> > @@ -1876,10 +1876,22 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
> >                         next_same_struct = !next_same_struct;
> >                 }
> >
> > -               emit_stx(prog, bytes_to_bpf_size(arg_size),
> > -                        BPF_REG_FP,
> > -                        i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> > -                        -(stack_size - i * 8));
> > +               if (i <= 5) {
> > +                       /* store function arguments in regs */
>
> The comment is confusing.
> It's not storing arguments in regs.
> It copies them from regs into stack.

Right, I'll use "copy arguments from regs into stack"
instead.

>
> > +                       emit_stx(prog, bytes_to_bpf_size(arg_size),
> > +                                BPF_REG_FP,
> > +                                i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> > +                                -(stack_size - i * 8));
> > +               } else {
> > +                       /* store function arguments in stack */
> > +                       emit_ldx(prog, bytes_to_bpf_size(arg_size),
> > +                                BPF_REG_0, BPF_REG_FP,
> > +                                (i - 6) * 8 + 0x18);
> > +                       emit_stx(prog, bytes_to_bpf_size(arg_size),
>
> and we will have garbage values in upper bytes.
> Probably should fix both here and in regular copy from reg.
>

I noticed it too......I'll dig it deeper to find a solution.

> > +                                BPF_REG_FP,
> > +                                BPF_REG_0,
> > +                                -(stack_size - i * 8));
> > +               }
> >
[......]
> >         /* Generated trampoline stack layout:
> > @@ -2170,7 +2219,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >          *
> >          * RBP - ip_off    [ traced function ]  BPF_TRAMP_F_IP_ARG flag
> >          *
> > +        * RBP - rbx_off   [ rbx value       ]  always
> > +        *
>
> That is the case already and we just didn't document it, right?
>

I'm afraid not anymore. In the origin logic, we use
"push rbx" after "sub rsp, stack_size". This will store
"rbx" into the top of the stack.

However, now we need to make sure the arguments,
which we copy from the stack frame of the caller into
current stack frame in prepare_origin_stack(), stay in
the top of the stack, to pass these arguments to the
orig_call through stack.

> >          * RBP - run_ctx_off [ bpf_tramp_run_ctx ]
> > +        *
> > +        *                     [ stack_argN ]  BPF_TRAMP_F_CALL_ORIG
> > +        *                     [ ...        ]
> > +        *                     [ stack_arg2 ]
> > +        * RBP - arg_stack_off [ stack_arg1 ]
> >          */
> >
> >         /* room for return value of orig_call or fentry prog */
> > @@ -2190,9 +2246,17 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >
> >         ip_off = stack_size;
> >
> > +       stack_size += 8;
> > +       rbx_off = stack_size;
> > +
> >         stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
> >         run_ctx_off = stack_size;
> >
> > +       if (nr_regs > 6 && (flags & BPF_TRAMP_F_CALL_ORIG))
> > +               stack_size += (nr_regs - 6) * 8;
> > +
> > +       arg_stack_off = stack_size;
> > +
> >         if (flags & BPF_TRAMP_F_SKIP_FRAME) {
> >                 /* skip patched call instruction and point orig_call to actual
> >                  * body of the kernel function.
> > @@ -2212,8 +2276,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >         x86_call_depth_emit_accounting(&prog, NULL);
> >         EMIT1(0x55);             /* push rbp */
> >         EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */
> > -       EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
> > -       EMIT1(0x53);             /* push rbx */
> > +       EMIT3_off32(0x48, 0x81, 0xEC, stack_size); /* sub rsp, stack_size */
> > +       /* mov QWORD PTR [rbp - rbx_off], rbx */
> > +       emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_6, -rbx_off);
> >
> >         /* Store number of argument registers of the traced function:
> >          *   mov rax, nr_regs
> > @@ -2262,6 +2327,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >
> >         if (flags & BPF_TRAMP_F_CALL_ORIG) {
> >                 restore_regs(m, &prog, nr_regs, regs_off);
> > +               prepare_origin_stack(m, &prog, nr_regs, arg_stack_off);
> >
> >                 if (flags & BPF_TRAMP_F_ORIG_STACK) {
> >                         emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > @@ -2321,14 +2387,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >         if (save_ret)
> >                 emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8);
> >
> > -       EMIT1(0x5B); /* pop rbx */
> > +       emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, -rbx_off);
>
> It can stay as 'pop', no?
>

As now we don't use "push rbx" anymore,
we can't use "pop" here either, as we store rbx in
the stack of a specific location.

Thanks!
Menglong Dong

> >         EMIT1(0xC9); /* leave */
> >         if (flags & BPF_TRAMP_F_SKIP_FRAME)
> >                 /* skip our return address and return to parent */
> >                 EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
> >         emit_return(&prog, prog);
> >         /* Make sure the trampoline generation logic doesn't overflow */
> > -       if (WARN_ON_ONCE(prog > (u8 *)image_end - BPF_INSN_SAFETY)) {
> > +       if (prog > (u8 *)image_end - BPF_INSN_SAFETY) {
> >                 ret = -EFAULT;
> >                 goto cleanup;
> >         }
> > --
> > 2.40.1
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14
  2023-06-02 18:17   ` Alexei Starovoitov
@ 2023-06-05  2:49     ` Menglong Dong
  0 siblings, 0 replies; 20+ messages in thread
From: Menglong Dong @ 2023-06-05  2:49 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, David S. Miller, David Ahern, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Jiri Olsa, X86 ML, Biao Jiang,
	Network Development, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Sat, Jun 3, 2023 at 2:17 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Jun 2, 2023 at 12:01 AM <menglong8.dong@gmail.com> wrote:
> >
> > From: Menglong Dong <imagedong@tencent.com>
> >
> > According to the current kernel version, below is a statistics of the
> > function arguments count:
> >
> > argument count | FUNC_PROTO count
> > 7              | 367
> > 8              | 196
> > 9              | 71
> > 10             | 43
> > 11             | 22
> > 12             | 10
> > 13             | 15
> > 14             | 4
> > 15             | 0
> > 16             | 1
> >
> > It's hard to statisics the function count, so I use FUNC_PROTO in the btf
> > of vmlinux instead. The function with 16 arguments is ZSTD_buildCTable(),
> > which I think can be ignored.
> >
> > Therefore, let's make the maximum of function arguments count 14. It used
> > to be 12, but it seems that there is no harm to make it big enough.
>
> I think we're just fine at 12.
> People need to fix their code. ZSTD_buildCTable should be first in line.
> Passing arguments on the stack is not efficient from performance pov.

But we still need to keep this part:

@@ -2273,7 +2273,8 @@ bool btf_ctx_access(int off, int size, enum
bpf_access_type type,
 static inline bool bpf_tracing_ctx_access(int off, int size,
                                          enum bpf_access_type type)
 {
-       if (off < 0 || off >= sizeof(__u64) * MAX_BPF_FUNC_ARGS)
+       /* "+1" here is for FEXIT return value. */
+       if (off < 0 || off >= sizeof(__u64) * (MAX_BPF_FUNC_ARGS + 1))
                return false;
        if (type != BPF_READ)
                return false;

Isn't it? Otherwise, it will make that the maximum arguments
is 12 for FENTRY, but 11 for FEXIT, as FEXIT needs to store
the return value in ctx.

How about that we change bpf_tracing_ctx_access() into:

static inline bool bpf_tracing_ctx_access(int off, int size,
                      enum bpf_access_type type,
                      int max_args)

And the caller can pass MAX_BPF_FUNC_ARGS to
it on no-FEXIT, and 'MAX_BPF_FUNC_ARGS + 1'
on FEXIT.

What do you think?

Thanks!
Menglong Dong

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14
  2023-06-03 14:13   ` Simon Horman
@ 2023-06-05  2:54     ` Menglong Dong
  0 siblings, 0 replies; 20+ messages in thread
From: Menglong Dong @ 2023-06-05  2:54 UTC (permalink / raw)
  To: Simon Horman
  Cc: Jiri Olsa, David S. Miller, David Ahern, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Jiri Olsa, X86 ML, Biao Jiang,
	Network Development, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Sat, Jun 3, 2023 at 10:13 PM Simon Horman <simon.horman@corigine.com> wrote:
>
> On Fri, Jun 02, 2023 at 02:59:54PM +0800, menglong8.dong@gmail.com wrote:
> > From: Menglong Dong <imagedong@tencent.com>
> >
> > According to the current kernel version, below is a statistics of the
> > function arguments count:
> >
> > argument count | FUNC_PROTO count
> > 7              | 367
> > 8              | 196
> > 9              | 71
> > 10             | 43
> > 11             | 22
> > 12             | 10
> > 13             | 15
> > 14             | 4
> > 15             | 0
> > 16             | 1
> >
> > It's hard to statisics the function count, so I use FUNC_PROTO in the btf
> > of vmlinux instead. The function with 16 arguments is ZSTD_buildCTable(),
> > which I think can be ignored.
> >
> > Therefore, let's make the maximum of function arguments count 14. It used
> > to be 12, but it seems that there is no harm to make it big enough.
> >
> > Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> > Signed-off-by: Menglong Dong <imagedong@tencent.com>
> > ---
> >  include/linux/bpf.h | 9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index f58895830ada..8b997779faf7 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -961,10 +961,10 @@ enum bpf_cgroup_storage_type {
> >
> >  #define MAX_BPF_CGROUP_STORAGE_TYPE __BPF_CGROUP_STORAGE_MAX
> >
> > -/* The longest tracepoint has 12 args.
> > - * See include/trace/bpf_probe.h
> > +/* The maximun number of the kernel function arguments.
>
> Hi Menglong Dong,
>
> as it looks like there will be a v3 anyway, please
> consider correcting the spelling of maximum.
>

According to the advice of Alexei Starovoitov, it seems
we don't need to modify it here anymore. Anyway,
Thank you for reminding me of this spelling mistake :)

> ...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments
  2023-06-02 18:32   ` Alexei Starovoitov
@ 2023-06-05  2:55     ` Menglong Dong
  0 siblings, 0 replies; 20+ messages in thread
From: Menglong Dong @ 2023-06-05  2:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, David S. Miller, David Ahern, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Jiri Olsa, X86 ML, Biao Jiang,
	Network Development, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Sat, Jun 3, 2023 at 2:32 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Jun 2, 2023 at 12:03 AM <menglong8.dong@gmail.com> wrote:
> >
> > From: Menglong Dong <imagedong@tencent.com>
> >
> > Add test7/test12/test14 in fexit_test.c and fentry_test.c to test the
> > fentry and fexit whose target function have 7/12/14 arguments.
> >
> > And the testcases passed:
> >
> > ./test_progs -t fexit
> > $71      fentry_fexit:OK
> > $73/1    fexit_bpf2bpf/target_no_callees:OK
> > $73/2    fexit_bpf2bpf/target_yes_callees:OK
> > $73/3    fexit_bpf2bpf/func_replace:OK
> > $73/4    fexit_bpf2bpf/func_replace_verify:OK
> > $73/5    fexit_bpf2bpf/func_sockmap_update:OK
> > $73/6    fexit_bpf2bpf/func_replace_return_code:OK
> > $73/7    fexit_bpf2bpf/func_map_prog_compatibility:OK
> > $73/8    fexit_bpf2bpf/func_replace_multi:OK
> > $73/9    fexit_bpf2bpf/fmod_ret_freplace:OK
> > $73/10   fexit_bpf2bpf/func_replace_global_func:OK
> > $73/11   fexit_bpf2bpf/fentry_to_cgroup_bpf:OK
> > $73/12   fexit_bpf2bpf/func_replace_progmap:OK
> > $73      fexit_bpf2bpf:OK
> > $74      fexit_sleep:OK
> > $75      fexit_stress:OK
> > $76      fexit_test:OK
> > Summary: 5/12 PASSED, 0 SKIPPED, 0 FAILED
> >
> > ./test_progs -t fentry
> > $71      fentry_fexit:OK
> > $72      fentry_test:OK
> > $140     module_fentry_shadow:OK
> > Summary: 3/0 PASSED, 0 SKIPPED, 0 FAILED
> >
> > Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> > Signed-off-by: Menglong Dong <imagedong@tencent.com>
> > ---
> >  net/bpf/test_run.c                            | 30 +++++++++++++++-
> >  .../testing/selftests/bpf/progs/fentry_test.c | 34 ++++++++++++++++++
> >  .../testing/selftests/bpf/progs/fexit_test.c  | 35 +++++++++++++++++++
> >  3 files changed, 98 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > index c73f246a706f..e12a72311eca 100644
> > --- a/net/bpf/test_run.c
> > +++ b/net/bpf/test_run.c
> > @@ -536,6 +536,27 @@ int noinline bpf_fentry_test6(u64 a, void *b, short c, int d, void *e, u64 f)
> >         return a + (long)b + c + d + (long)e + f;
> >  }
> >
> > +noinline int bpf_fentry_test7(u64 a, void *b, short c, int d, void *e,
> > +                             u64 f, u64 g)
> > +{
> > +       return a + (long)b + c + d + (long)e + f + g;
> > +}
> > +
> > +noinline int bpf_fentry_test12(u64 a, void *b, short c, int d, void *e,
> > +                              u64 f, u64 g, u64 h, u64 i, u64 j,
> > +                              u64 k, u64 l)
> > +{
> > +       return a + (long)b + c + d + (long)e + f + g + h + i + j + k + l;
> > +}
> > +
> > +noinline int bpf_fentry_test14(u64 a, void *b, short c, int d, void *e,
> > +                              u64 f, u64 g, u64 h, u64 i, u64 j,
> > +                              u64 k, u64 l, u64 m, u64 n)
> > +{
> > +       return a + (long)b + c + d + (long)e + f + g + h + i + j + k + l +
> > +              m + n;
> > +}
>
> Please add test func to bpf_testmod instead of here.

Okay!

Thanks!
Menglong Dong

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING
  2023-06-02  6:59 ` [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
  2023-06-02  7:40   ` Menglong Dong
  2023-06-02 18:31   ` Alexei Starovoitov
@ 2023-06-05 20:10   ` Jiri Olsa
  2023-06-06  2:02     ` Menglong Dong
  2 siblings, 1 reply; 20+ messages in thread
From: Jiri Olsa @ 2023-06-05 20:10 UTC (permalink / raw)
  To: menglong8.dong
  Cc: olsajiri, davem, dsahern, ast, daniel, andrii, martin.lau, song,
	yhs, john.fastabend, kpsingh, sdf, haoluo, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

On Fri, Jun 02, 2023 at 02:59:55PM +0800, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <imagedong@tencent.com>
> 
> For now, the BPF program of type BPF_PROG_TYPE_TRACING can only be used
> on the kernel functions whose arguments count less than 6. This is not
> friendly at all, as too many functions have arguments count more than 6.
> 
> Therefore, let's enhance it by increasing the function arguments count
> allowed in arch_prepare_bpf_trampoline(), for now, only x86_64.
> 
> For the case that we don't need to call origin function, which means
> without BPF_TRAMP_F_CALL_ORIG, we need only copy the function arguments
> that stored in the frame of the caller to current frame. The arguments
> of arg6-argN are stored in "$rbp + 0x18", we need copy them to
> "$rbp - regs_off + (6 * 8)".
> 
> For the case with BPF_TRAMP_F_CALL_ORIG, we need prepare the arguments
> in stack before call origin function, which means we need alloc extra
> "8 * (arg_count - 6)" memory in the top of the stack. Note, there should
> not be any data be pushed to the stack before call the origin function.
> Then, we have to store rbx with 'mov' instead of 'push'.
> 
> It works well for the FENTRY and FEXIT, I'm not sure if there are other
> complicated cases.
> 
> Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
> v2:
> - instead EMIT4 with EMIT3_off32 for "lea" to prevent overflow

could you please describe in more details what's the problem with that?
you also changed that for 'sub rsp, stack_size'

thanks
jirka


> - make MAX_BPF_FUNC_ARGS as the maximum argument count
> ---
>  arch/x86/net/bpf_jit_comp.c | 96 +++++++++++++++++++++++++++++++------
>  1 file changed, 81 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 1056bbf55b17..0e247bb7d6f6 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1868,7 +1868,7 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
>  	 * mov QWORD PTR [rbp-0x10],rdi
>  	 * mov QWORD PTR [rbp-0x8],rsi
>  	 */
> -	for (i = 0, j = 0; i < min(nr_regs, 6); i++) {
> +	for (i = 0, j = 0; i < min(nr_regs, MAX_BPF_FUNC_ARGS); i++) {
>  		/* The arg_size is at most 16 bytes, enforced by the verifier. */
>  		arg_size = m->arg_size[j];
>  		if (arg_size > 8) {
> @@ -1876,10 +1876,22 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
>  			next_same_struct = !next_same_struct;
>  		}
>  
> -		emit_stx(prog, bytes_to_bpf_size(arg_size),
> -			 BPF_REG_FP,
> -			 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> -			 -(stack_size - i * 8));
> +		if (i <= 5) {
> +			/* store function arguments in regs */
> +			emit_stx(prog, bytes_to_bpf_size(arg_size),
> +				 BPF_REG_FP,
> +				 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> +				 -(stack_size - i * 8));
> +		} else {
> +			/* store function arguments in stack */
> +			emit_ldx(prog, bytes_to_bpf_size(arg_size),
> +				 BPF_REG_0, BPF_REG_FP,
> +				 (i - 6) * 8 + 0x18);
> +			emit_stx(prog, bytes_to_bpf_size(arg_size),
> +				 BPF_REG_FP,
> +				 BPF_REG_0,
> +				 -(stack_size - i * 8));
> +		}
>  
>  		j = next_same_struct ? j : j + 1;
>  	}
> @@ -1913,6 +1925,41 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
>  	}
>  }
>  
> +static void prepare_origin_stack(const struct btf_func_model *m, u8 **prog,
> +				 int nr_regs, int stack_size)
> +{
> +	int i, j, arg_size;
> +	bool next_same_struct = false;
> +
> +	if (nr_regs <= 6)
> +		return;
> +
> +	/* Prepare the function arguments in stack before call origin
> +	 * function. These arguments must be stored in the top of the
> +	 * stack.
> +	 */
> +	for (i = 0, j = 0; i < min(nr_regs, MAX_BPF_FUNC_ARGS); i++) {
> +		/* The arg_size is at most 16 bytes, enforced by the verifier. */
> +		arg_size = m->arg_size[j];
> +		if (arg_size > 8) {
> +			arg_size = 8;
> +			next_same_struct = !next_same_struct;
> +		}
> +
> +		if (i > 5) {
> +			emit_ldx(prog, bytes_to_bpf_size(arg_size),
> +				 BPF_REG_0, BPF_REG_FP,
> +				 (i - 6) * 8 + 0x18);
> +			emit_stx(prog, bytes_to_bpf_size(arg_size),
> +				 BPF_REG_FP,
> +				 BPF_REG_0,
> +				 -(stack_size - (i - 6) * 8));
> +		}
> +
> +		j = next_same_struct ? j : j + 1;
> +	}
> +}
> +
>  static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>  			   struct bpf_tramp_link *l, int stack_size,
>  			   int run_ctx_off, bool save_ret)
> @@ -1938,7 +1985,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>  	/* arg1: mov rdi, progs[i] */
>  	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
>  	/* arg2: lea rsi, [rbp - ctx_cookie_off] */
> -	EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);
> +	EMIT3_off32(0x48, 0x8D, 0xB5, -run_ctx_off);
>  
>  	if (emit_rsb_call(&prog, bpf_trampoline_enter(p), prog))
>  		return -EINVAL;
> @@ -1954,7 +2001,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>  	emit_nops(&prog, 2);
>  
>  	/* arg1: lea rdi, [rbp - stack_size] */
> -	EMIT4(0x48, 0x8D, 0x7D, -stack_size);
> +	EMIT3_off32(0x48, 0x8D, 0xBD, -stack_size);
>  	/* arg2: progs[i]->insnsi for interpreter */
>  	if (!p->jited)
>  		emit_mov_imm64(&prog, BPF_REG_2,
> @@ -1984,7 +2031,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>  	/* arg2: mov rsi, rbx <- start time in nsec */
>  	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
>  	/* arg3: lea rdx, [rbp - run_ctx_off] */
> -	EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
> +	EMIT3_off32(0x48, 0x8D, 0x95, -run_ctx_off);
>  	if (emit_rsb_call(&prog, bpf_trampoline_exit(p), prog))
>  		return -EINVAL;
>  
> @@ -2136,7 +2183,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  				void *func_addr)
>  {
>  	int i, ret, nr_regs = m->nr_args, stack_size = 0;
> -	int regs_off, nregs_off, ip_off, run_ctx_off;
> +	int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off;
>  	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
>  	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
>  	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> @@ -2150,8 +2197,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
>  			nr_regs += (m->arg_size[i] + 7) / 8 - 1;
>  
> -	/* x86-64 supports up to 6 arguments. 7+ can be added in the future */
> -	if (nr_regs > 6)
> +	/* x86-64 supports up to MAX_BPF_FUNC_ARGS arguments. 1-6
> +	 * are passed through regs, the remains are through stack.
> +	 */
> +	if (nr_regs > MAX_BPF_FUNC_ARGS)
>  		return -ENOTSUPP;
>  
>  	/* Generated trampoline stack layout:
> @@ -2170,7 +2219,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  	 *
>  	 * RBP - ip_off    [ traced function ]  BPF_TRAMP_F_IP_ARG flag
>  	 *
> +	 * RBP - rbx_off   [ rbx value       ]  always
> +	 *
>  	 * RBP - run_ctx_off [ bpf_tramp_run_ctx ]
> +	 *
> +	 *                     [ stack_argN ]  BPF_TRAMP_F_CALL_ORIG
> +	 *                     [ ...        ]
> +	 *                     [ stack_arg2 ]
> +	 * RBP - arg_stack_off [ stack_arg1 ]
>  	 */
>  
>  	/* room for return value of orig_call or fentry prog */
> @@ -2190,9 +2246,17 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  
>  	ip_off = stack_size;
>  
> +	stack_size += 8;
> +	rbx_off = stack_size;
> +
>  	stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
>  	run_ctx_off = stack_size;
>  
> +	if (nr_regs > 6 && (flags & BPF_TRAMP_F_CALL_ORIG))
> +		stack_size += (nr_regs - 6) * 8;
> +
> +	arg_stack_off = stack_size;
> +
>  	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
>  		/* skip patched call instruction and point orig_call to actual
>  		 * body of the kernel function.
> @@ -2212,8 +2276,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  	x86_call_depth_emit_accounting(&prog, NULL);
>  	EMIT1(0x55);		 /* push rbp */
>  	EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */
> -	EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
> -	EMIT1(0x53);		 /* push rbx */
> +	EMIT3_off32(0x48, 0x81, 0xEC, stack_size); /* sub rsp, stack_size */
> +	/* mov QWORD PTR [rbp - rbx_off], rbx */
> +	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_6, -rbx_off);
>  
>  	/* Store number of argument registers of the traced function:
>  	 *   mov rax, nr_regs
> @@ -2262,6 +2327,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  
>  	if (flags & BPF_TRAMP_F_CALL_ORIG) {
>  		restore_regs(m, &prog, nr_regs, regs_off);
> +		prepare_origin_stack(m, &prog, nr_regs, arg_stack_off);
>  
>  		if (flags & BPF_TRAMP_F_ORIG_STACK) {
>  			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> @@ -2321,14 +2387,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  	if (save_ret)
>  		emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8);
>  
> -	EMIT1(0x5B); /* pop rbx */
> +	emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, -rbx_off);
>  	EMIT1(0xC9); /* leave */
>  	if (flags & BPF_TRAMP_F_SKIP_FRAME)
>  		/* skip our return address and return to parent */
>  		EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
>  	emit_return(&prog, prog);
>  	/* Make sure the trampoline generation logic doesn't overflow */
> -	if (WARN_ON_ONCE(prog > (u8 *)image_end - BPF_INSN_SAFETY)) {
> +	if (prog > (u8 *)image_end - BPF_INSN_SAFETY) {
>  		ret = -EFAULT;
>  		goto cleanup;
>  	}
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING
  2023-06-05 20:10   ` Jiri Olsa
@ 2023-06-06  2:02     ` Menglong Dong
  0 siblings, 0 replies; 20+ messages in thread
From: Menglong Dong @ 2023-06-06  2:02 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: davem, dsahern, ast, daniel, andrii, martin.lau, song, yhs,
	john.fastabend, kpsingh, sdf, haoluo, tglx, mingo, bp,
	dave.hansen, x86, hpa, edumazet, kuba, pabeni, mykolal, shuah,
	benbjiang, iii, imagedong, xukuohai, chantr4, zwisler, eddyz87,
	netdev, bpf, linux-kernel, linux-kselftest

On Tue, Jun 6, 2023 at 4:11 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Fri, Jun 02, 2023 at 02:59:55PM +0800, menglong8.dong@gmail.com wrote:
> > From: Menglong Dong <imagedong@tencent.com>
> >
> > For now, the BPF program of type BPF_PROG_TYPE_TRACING can only be used
> > on the kernel functions whose arguments count less than 6. This is not
> > friendly at all, as too many functions have arguments count more than 6.
> >
> > Therefore, let's enhance it by increasing the function arguments count
> > allowed in arch_prepare_bpf_trampoline(), for now, only x86_64.
> >
> > For the case that we don't need to call origin function, which means
> > without BPF_TRAMP_F_CALL_ORIG, we need only copy the function arguments
> > that stored in the frame of the caller to current frame. The arguments
> > of arg6-argN are stored in "$rbp + 0x18", we need copy them to
> > "$rbp - regs_off + (6 * 8)".
> >
> > For the case with BPF_TRAMP_F_CALL_ORIG, we need prepare the arguments
> > in stack before call origin function, which means we need alloc extra
> > "8 * (arg_count - 6)" memory in the top of the stack. Note, there should
> > not be any data be pushed to the stack before call the origin function.
> > Then, we have to store rbx with 'mov' instead of 'push'.
> >
> > It works well for the FENTRY and FEXIT, I'm not sure if there are other
> > complicated cases.
> >
> > Reviewed-by: Jiang Biao <benbjiang@tencent.com>
> > Signed-off-by: Menglong Dong <imagedong@tencent.com>
> > ---
> > v2:
> > - instead EMIT4 with EMIT3_off32 for "lea" to prevent overflow
>
> could you please describe in more details what's the problem with that?
> you also changed that for 'sub rsp, stack_size'
>

Sorry for the confusion. Take 'sub rsp, stack_size' for example,
in the origin logic, which is:

  EMIT4(0x48, 0x83, 0xEC, stack_size)

the imm in the instruction is a signed char. So the maximum
of the imm is 127.

However, now the stack_size is more than 127 if
the count of the function arguments is more than 8.

Therefore, I use:

  EMIT3_off32(0x48, 0x81, 0xEC, stack_size)

And the imm in this instruction is signed int.

The same reason for "lea" instruction.

Thanks!
Menglong Dong

> thanks
> jirka
>
>
> > - make MAX_BPF_FUNC_ARGS as the maximum argument count
> > ---
> >  arch/x86/net/bpf_jit_comp.c | 96 +++++++++++++++++++++++++++++++------
> >  1 file changed, 81 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 1056bbf55b17..0e247bb7d6f6 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1868,7 +1868,7 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
> >        * mov QWORD PTR [rbp-0x10],rdi
> >        * mov QWORD PTR [rbp-0x8],rsi
> >        */
> > -     for (i = 0, j = 0; i < min(nr_regs, 6); i++) {
> > +     for (i = 0, j = 0; i < min(nr_regs, MAX_BPF_FUNC_ARGS); i++) {
> >               /* The arg_size is at most 16 bytes, enforced by the verifier. */
> >               arg_size = m->arg_size[j];
> >               if (arg_size > 8) {
> > @@ -1876,10 +1876,22 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
> >                       next_same_struct = !next_same_struct;
> >               }
> >
> > -             emit_stx(prog, bytes_to_bpf_size(arg_size),
> > -                      BPF_REG_FP,
> > -                      i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> > -                      -(stack_size - i * 8));
> > +             if (i <= 5) {
> > +                     /* store function arguments in regs */
> > +                     emit_stx(prog, bytes_to_bpf_size(arg_size),
> > +                              BPF_REG_FP,
> > +                              i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> > +                              -(stack_size - i * 8));
> > +             } else {
> > +                     /* store function arguments in stack */
> > +                     emit_ldx(prog, bytes_to_bpf_size(arg_size),
> > +                              BPF_REG_0, BPF_REG_FP,
> > +                              (i - 6) * 8 + 0x18);
> > +                     emit_stx(prog, bytes_to_bpf_size(arg_size),
> > +                              BPF_REG_FP,
> > +                              BPF_REG_0,
> > +                              -(stack_size - i * 8));
> > +             }
> >
> >               j = next_same_struct ? j : j + 1;
> >       }
> > @@ -1913,6 +1925,41 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_regs,
> >       }
> >  }
> >
> > +static void prepare_origin_stack(const struct btf_func_model *m, u8 **prog,
> > +                              int nr_regs, int stack_size)
> > +{
> > +     int i, j, arg_size;
> > +     bool next_same_struct = false;
> > +
> > +     if (nr_regs <= 6)
> > +             return;
> > +
> > +     /* Prepare the function arguments in stack before call origin
> > +      * function. These arguments must be stored in the top of the
> > +      * stack.
> > +      */
> > +     for (i = 0, j = 0; i < min(nr_regs, MAX_BPF_FUNC_ARGS); i++) {
> > +             /* The arg_size is at most 16 bytes, enforced by the verifier. */
> > +             arg_size = m->arg_size[j];
> > +             if (arg_size > 8) {
> > +                     arg_size = 8;
> > +                     next_same_struct = !next_same_struct;
> > +             }
> > +
> > +             if (i > 5) {
> > +                     emit_ldx(prog, bytes_to_bpf_size(arg_size),
> > +                              BPF_REG_0, BPF_REG_FP,
> > +                              (i - 6) * 8 + 0x18);
> > +                     emit_stx(prog, bytes_to_bpf_size(arg_size),
> > +                              BPF_REG_FP,
> > +                              BPF_REG_0,
> > +                              -(stack_size - (i - 6) * 8));
> > +             }
> > +
> > +             j = next_same_struct ? j : j + 1;
> > +     }
> > +}
> > +
> >  static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
> >                          struct bpf_tramp_link *l, int stack_size,
> >                          int run_ctx_off, bool save_ret)
> > @@ -1938,7 +1985,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
> >       /* arg1: mov rdi, progs[i] */
> >       emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
> >       /* arg2: lea rsi, [rbp - ctx_cookie_off] */
> > -     EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);
> > +     EMIT3_off32(0x48, 0x8D, 0xB5, -run_ctx_off);
> >
> >       if (emit_rsb_call(&prog, bpf_trampoline_enter(p), prog))
> >               return -EINVAL;
> > @@ -1954,7 +2001,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
> >       emit_nops(&prog, 2);
> >
> >       /* arg1: lea rdi, [rbp - stack_size] */
> > -     EMIT4(0x48, 0x8D, 0x7D, -stack_size);
> > +     EMIT3_off32(0x48, 0x8D, 0xBD, -stack_size);
> >       /* arg2: progs[i]->insnsi for interpreter */
> >       if (!p->jited)
> >               emit_mov_imm64(&prog, BPF_REG_2,
> > @@ -1984,7 +2031,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
> >       /* arg2: mov rsi, rbx <- start time in nsec */
> >       emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
> >       /* arg3: lea rdx, [rbp - run_ctx_off] */
> > -     EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
> > +     EMIT3_off32(0x48, 0x8D, 0x95, -run_ctx_off);
> >       if (emit_rsb_call(&prog, bpf_trampoline_exit(p), prog))
> >               return -EINVAL;
> >
> > @@ -2136,7 +2183,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >                               void *func_addr)
> >  {
> >       int i, ret, nr_regs = m->nr_args, stack_size = 0;
> > -     int regs_off, nregs_off, ip_off, run_ctx_off;
> > +     int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off;
> >       struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> >       struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> >       struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> > @@ -2150,8 +2197,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >               if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
> >                       nr_regs += (m->arg_size[i] + 7) / 8 - 1;
> >
> > -     /* x86-64 supports up to 6 arguments. 7+ can be added in the future */
> > -     if (nr_regs > 6)
> > +     /* x86-64 supports up to MAX_BPF_FUNC_ARGS arguments. 1-6
> > +      * are passed through regs, the remains are through stack.
> > +      */
> > +     if (nr_regs > MAX_BPF_FUNC_ARGS)
> >               return -ENOTSUPP;
> >
> >       /* Generated trampoline stack layout:
> > @@ -2170,7 +2219,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >        *
> >        * RBP - ip_off    [ traced function ]  BPF_TRAMP_F_IP_ARG flag
> >        *
> > +      * RBP - rbx_off   [ rbx value       ]  always
> > +      *
> >        * RBP - run_ctx_off [ bpf_tramp_run_ctx ]
> > +      *
> > +      *                     [ stack_argN ]  BPF_TRAMP_F_CALL_ORIG
> > +      *                     [ ...        ]
> > +      *                     [ stack_arg2 ]
> > +      * RBP - arg_stack_off [ stack_arg1 ]
> >        */
> >
> >       /* room for return value of orig_call or fentry prog */
> > @@ -2190,9 +2246,17 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >
> >       ip_off = stack_size;
> >
> > +     stack_size += 8;
> > +     rbx_off = stack_size;
> > +
> >       stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
> >       run_ctx_off = stack_size;
> >
> > +     if (nr_regs > 6 && (flags & BPF_TRAMP_F_CALL_ORIG))
> > +             stack_size += (nr_regs - 6) * 8;
> > +
> > +     arg_stack_off = stack_size;
> > +
> >       if (flags & BPF_TRAMP_F_SKIP_FRAME) {
> >               /* skip patched call instruction and point orig_call to actual
> >                * body of the kernel function.
> > @@ -2212,8 +2276,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >       x86_call_depth_emit_accounting(&prog, NULL);
> >       EMIT1(0x55);             /* push rbp */
> >       EMIT3(0x48, 0x89, 0xE5); /* mov rbp, rsp */
> > -     EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
> > -     EMIT1(0x53);             /* push rbx */
> > +     EMIT3_off32(0x48, 0x81, 0xEC, stack_size); /* sub rsp, stack_size */
> > +     /* mov QWORD PTR [rbp - rbx_off], rbx */
> > +     emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_6, -rbx_off);
> >
> >       /* Store number of argument registers of the traced function:
> >        *   mov rax, nr_regs
> > @@ -2262,6 +2327,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >
> >       if (flags & BPF_TRAMP_F_CALL_ORIG) {
> >               restore_regs(m, &prog, nr_regs, regs_off);
> > +             prepare_origin_stack(m, &prog, nr_regs, arg_stack_off);
> >
> >               if (flags & BPF_TRAMP_F_ORIG_STACK) {
> >                       emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > @@ -2321,14 +2387,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >       if (save_ret)
> >               emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8);
> >
> > -     EMIT1(0x5B); /* pop rbx */
> > +     emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, -rbx_off);
> >       EMIT1(0xC9); /* leave */
> >       if (flags & BPF_TRAMP_F_SKIP_FRAME)
> >               /* skip our return address and return to parent */
> >               EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
> >       emit_return(&prog, prog);
> >       /* Make sure the trampoline generation logic doesn't overflow */
> > -     if (WARN_ON_ONCE(prog > (u8 *)image_end - BPF_INSN_SAFETY)) {
> > +     if (prog > (u8 *)image_end - BPF_INSN_SAFETY) {
> >               ret = -EFAULT;
> >               goto cleanup;
> >       }
> > --
> > 2.40.1
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-06-06  2:02 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-02  6:59 [PATCH bpf-next v2 0/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
2023-06-02  6:59 ` [PATCH bpf-next v2 1/5] bpf: make MAX_BPF_FUNC_ARGS 14 menglong8.dong
2023-06-02 18:17   ` Alexei Starovoitov
2023-06-05  2:49     ` Menglong Dong
2023-06-03 14:13   ` Simon Horman
2023-06-05  2:54     ` Menglong Dong
2023-06-02  6:59 ` [PATCH bpf-next v2 2/5] bpf, x86: allow function arguments up to 14 for TRACING menglong8.dong
2023-06-02  7:40   ` Menglong Dong
2023-06-02 18:31   ` Alexei Starovoitov
2023-06-05  2:40     ` Menglong Dong
2023-06-05 20:10   ` Jiri Olsa
2023-06-06  2:02     ` Menglong Dong
2023-06-02  6:59 ` [PATCH bpf-next v2 3/5] libbpf: make BPF_PROG support 15 function arguments menglong8.dong
2023-06-02  6:59 ` [PATCH bpf-next v2 4/5] selftests/bpf: rename bpf_fentry_test{7,8,9} to bpf_fentry_test_ptr* menglong8.dong
2023-06-02  9:44   ` Menglong Dong
2023-06-02  6:59 ` [PATCH bpf-next v2 5/5] selftests/bpf: add testcase for FENTRY/FEXIT with 6+ arguments menglong8.dong
2023-06-02  8:24   ` Ilya Leoshkevich
2023-06-02  8:47     ` Menglong Dong
2023-06-02 18:32   ` Alexei Starovoitov
2023-06-05  2:55     ` Menglong Dong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).