* [PATCH bpf-next v4 0/9] bpf: tracing session supporting
@ 2025-12-17 9:54 Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 1/9] bpf: add tracing session support Menglong Dong
` (9 more replies)
0 siblings, 10 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
Hi, all.
In this version, I combined Alexei's and Andrii's advice, which makes the
architecture-specific code much simpler.
Sometimes we need to hook both the entry and the exit of a function with
TRACING. Currently, that requires defining both a FENTRY and a FEXIT
program for the target function, which is not convenient.
Therefore, add tracing session support to TRACING. Generally speaking,
it is similar to the kprobe session: it can hook both the entry and the
exit of a function with a single BPF program. Session cookies are also
supported with the kfunc bpf_fsession_cookie(). In order to limit the
stack usage, the maximum number of cookies is limited to 4.
The kfuncs bpf_fsession_is_return() and bpf_fsession_cookie() are both
inlined by the verifier.
We allow the usage of bpf_get_func_ret() to get the return value in the
fentry leg of a tracing session, as it will always read "0" there, which
is safe. Alternatively, we could prohibit bpf_get_func_ret() in the
fentry in the verifier, which would make the architecture-specific code
simpler.
The fsession implementation is arch-specific, so -EOPNOTSUPP is returned
if it is not yet supported by the arch. In this series, only x86_64 is
supported; other archs will follow later.
Changes since v3:
* instead of adding a new hlist to progs_hlist in trampoline, add the bpf
program to both the fentry hlist and the fexit hlist.
* introduce the 2nd patch to reuse the nr_args field in the stack to
store all the information we need (except the session cookies).
* limit the maximum number of cookies to 4.
* remove the logic to skip fexit if the fentry return non-zero.
Changes since v2:
* squash some patches:
- the 2 patches for the kfuncs bpf_tracing_is_exit() and
bpf_fsession_cookie() are merged into the second patch.
- the testcases for fsession are also squashed.
* fix the CI error by moving the testcase for bpf_get_func_ip to
fsession_test.c
Changes since v1:
* session cookie support.
In this version, the session cookie is implemented, and the kfunc
bpf_fsession_cookie() is added.
* restructure the layout of the stack.
In this version, the session data stored in the stack is changed: it is
now located after the return value so as not to break
bpf_get_func_ip().
* testcase enhancement.
Some nits in the testcases pointed out by Jiri are fixed. Meanwhile,
testcases for get_func_ip and the session cookie are added too.
Menglong Dong (9):
bpf: add tracing session support
bpf: use last 8-bits for the nr_args in trampoline
bpf: add the kfunc bpf_fsession_is_return
bpf: add the kfunc bpf_fsession_cookie
bpf,x86: introduce emit_st_r0_imm64() for trampoline
bpf,x86: add tracing session supporting for x86_64
libbpf: add support for tracing session
selftests/bpf: add testcases for tracing session
selftests/bpf: test fsession mixed with fentry and fexit
arch/x86/net/bpf_jit_comp.c | 47 +++-
include/linux/bpf.h | 39 +++
include/uapi/linux/bpf.h | 1 +
kernel/bpf/btf.c | 2 +
kernel/bpf/syscall.c | 18 +-
kernel/bpf/trampoline.c | 50 +++-
kernel/bpf/verifier.c | 75 ++++--
kernel/trace/bpf_trace.c | 56 ++++-
net/bpf/test_run.c | 1 +
net/core/bpf_sk_storage.c | 1 +
tools/bpf/bpftool/common.c | 1 +
tools/include/uapi/linux/bpf.h | 1 +
tools/lib/bpf/bpf.c | 2 +
tools/lib/bpf/libbpf.c | 3 +
.../selftests/bpf/prog_tests/fsession_test.c | 90 +++++++
.../bpf/prog_tests/tracing_failure.c | 2 +-
.../selftests/bpf/progs/fsession_test.c | 226 ++++++++++++++++++
17 files changed, 571 insertions(+), 44 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/fsession_test.c
create mode 100644 tools/testing/selftests/bpf/progs/fsession_test.c
--
2.52.0
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 1/9] bpf: add tracing session support
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
@ 2025-12-17 9:54 ` Menglong Dong
2025-12-19 0:55 ` Andrii Nakryiko
2025-12-17 9:54 ` [PATCH bpf-next v4 2/9] bpf: use last 8-bits for the nr_args in trampoline Menglong Dong
` (8 subsequent siblings)
9 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
The tracing session is similar to the kprobe session. It allows
attaching a single BPF program to both the entry and the exit of the
target functions.
Introduce the struct bpf_fsession_link, which allows adding the link to
both the fentry and the fexit progs_hlist of the trampoline.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
v4:
- instead of adding a new hlist to progs_hlist in trampoline, add the bpf
program to both the fentry hlist and the fexit hlist.
---
include/linux/bpf.h | 20 +++++++++++
include/uapi/linux/bpf.h | 1 +
kernel/bpf/btf.c | 2 ++
kernel/bpf/syscall.c | 18 +++++++++-
kernel/bpf/trampoline.c | 36 +++++++++++++++----
kernel/bpf/verifier.c | 12 +++++--
net/bpf/test_run.c | 1 +
net/core/bpf_sk_storage.c | 1 +
tools/include/uapi/linux/bpf.h | 1 +
.../bpf/prog_tests/tracing_failure.c | 2 +-
10 files changed, 83 insertions(+), 11 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 28d8d6b7bb1e..3b2273b110b8 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1291,6 +1291,7 @@ enum bpf_tramp_prog_type {
BPF_TRAMP_MODIFY_RETURN,
BPF_TRAMP_MAX,
BPF_TRAMP_REPLACE, /* more than MAX */
+ BPF_TRAMP_SESSION,
};
struct bpf_tramp_image {
@@ -1854,6 +1855,11 @@ struct bpf_tracing_link {
struct bpf_prog *tgt_prog;
};
+struct bpf_fsession_link {
+ struct bpf_tracing_link link;
+ struct bpf_tramp_link fexit;
+};
+
struct bpf_raw_tp_link {
struct bpf_link link;
struct bpf_raw_event_map *btp;
@@ -2114,6 +2120,20 @@ static inline void bpf_struct_ops_desc_release(struct bpf_struct_ops_desc *st_op
#endif
+static inline int bpf_fsession_cnt(struct bpf_tramp_links *links)
+{
+ struct bpf_tramp_links fentries = links[BPF_TRAMP_FENTRY];
+ int cnt = 0;
+
+ for (int i = 0; i < links[BPF_TRAMP_FENTRY].nr_links; i++) {
+ if (fentries.links[i]->link.prog->expected_attach_type ==
+ BPF_TRACE_SESSION)
+ cnt++;
+ }
+
+ return cnt;
+}
+
int bpf_prog_ctx_arg_info_init(struct bpf_prog *prog,
const struct bpf_ctx_arg_aux *info, u32 cnt);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 84ced3ed2d21..696a7d37db0e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1145,6 +1145,7 @@ enum bpf_attach_type {
BPF_NETKIT_PEER,
BPF_TRACE_KPROBE_SESSION,
BPF_TRACE_UPROBE_SESSION,
+ BPF_TRACE_SESSION,
__MAX_BPF_ATTACH_TYPE
};
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 0de8fc8a0e0b..2c1c3e0caff8 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6107,6 +6107,7 @@ static int btf_validate_prog_ctx_type(struct bpf_verifier_log *log, const struct
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
case BPF_MODIFY_RETURN:
+ case BPF_TRACE_SESSION:
/* allow u64* as ctx */
if (btf_is_int(t) && t->size == 8)
return 0;
@@ -6704,6 +6705,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
fallthrough;
case BPF_LSM_CGROUP:
case BPF_TRACE_FEXIT:
+ case BPF_TRACE_SESSION:
/* When LSM programs are attached to void LSM hooks
* they use FEXIT trampolines and when attached to
* int LSM hooks, they use MODIFY_RETURN trampolines.
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 3080cc48bfc3..91c77f63261a 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3579,6 +3579,7 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
case BPF_PROG_TYPE_TRACING:
if (prog->expected_attach_type != BPF_TRACE_FENTRY &&
prog->expected_attach_type != BPF_TRACE_FEXIT &&
+ prog->expected_attach_type != BPF_TRACE_SESSION &&
prog->expected_attach_type != BPF_MODIFY_RETURN) {
err = -EINVAL;
goto out_put_prog;
@@ -3628,7 +3629,21 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
key = bpf_trampoline_compute_key(tgt_prog, NULL, btf_id);
}
- link = kzalloc(sizeof(*link), GFP_USER);
+ if (prog->expected_attach_type == BPF_TRACE_SESSION) {
+ struct bpf_fsession_link *fslink;
+
+ fslink = kzalloc(sizeof(*fslink), GFP_USER);
+ if (fslink) {
+ bpf_link_init(&fslink->fexit.link, BPF_LINK_TYPE_TRACING,
+ &bpf_tracing_link_lops, prog, attach_type);
+ fslink->fexit.cookie = bpf_cookie;
+ link = &fslink->link;
+ } else {
+ link = NULL;
+ }
+ } else {
+ link = kzalloc(sizeof(*link), GFP_USER);
+ }
if (!link) {
err = -ENOMEM;
goto out_put_prog;
@@ -4352,6 +4367,7 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type)
case BPF_TRACE_RAW_TP:
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
+ case BPF_TRACE_SESSION:
case BPF_MODIFY_RETURN:
return BPF_PROG_TYPE_TRACING;
case BPF_LSM_MAC:
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 976d89011b15..3b9fc99e1e89 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -111,7 +111,7 @@ bool bpf_prog_has_trampoline(const struct bpf_prog *prog)
return (ptype == BPF_PROG_TYPE_TRACING &&
(eatype == BPF_TRACE_FENTRY || eatype == BPF_TRACE_FEXIT ||
- eatype == BPF_MODIFY_RETURN)) ||
+ eatype == BPF_MODIFY_RETURN || eatype == BPF_TRACE_SESSION)) ||
(ptype == BPF_PROG_TYPE_LSM && eatype == BPF_LSM_MAC);
}
@@ -559,6 +559,8 @@ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(struct bpf_prog *prog)
return BPF_TRAMP_MODIFY_RETURN;
case BPF_TRACE_FEXIT:
return BPF_TRAMP_FEXIT;
+ case BPF_TRACE_SESSION:
+ return BPF_TRAMP_SESSION;
case BPF_LSM_MAC:
if (!prog->aux->attach_func_proto->type)
/* The function returns void, we cannot modify its
@@ -594,12 +596,13 @@ static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
struct bpf_trampoline *tr,
struct bpf_prog *tgt_prog)
{
- enum bpf_tramp_prog_type kind;
- struct bpf_tramp_link *link_exiting;
+ enum bpf_tramp_prog_type kind, okind;
+ struct bpf_tramp_link *link_existing;
+ struct bpf_fsession_link *fslink;
int err = 0;
int cnt = 0, i;
- kind = bpf_attach_type_to_tramp(link->link.prog);
+ okind = kind = bpf_attach_type_to_tramp(link->link.prog);
if (tr->extension_prog)
/* cannot attach fentry/fexit if extension prog is attached.
* cannot overwrite extension prog either.
@@ -621,13 +624,18 @@ static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
BPF_MOD_JUMP, NULL,
link->link.prog->bpf_func);
}
+ if (kind == BPF_TRAMP_SESSION) {
+ /* deal with fsession as fentry by default */
+ kind = BPF_TRAMP_FENTRY;
+ cnt++;
+ }
if (cnt >= BPF_MAX_TRAMP_LINKS)
return -E2BIG;
if (!hlist_unhashed(&link->tramp_hlist))
/* prog already linked */
return -EBUSY;
- hlist_for_each_entry(link_exiting, &tr->progs_hlist[kind], tramp_hlist) {
- if (link_exiting->link.prog != link->link.prog)
+ hlist_for_each_entry(link_existing, &tr->progs_hlist[kind], tramp_hlist) {
+ if (link_existing->link.prog != link->link.prog)
continue;
/* prog already linked */
return -EBUSY;
@@ -635,8 +643,18 @@ static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
hlist_add_head(&link->tramp_hlist, &tr->progs_hlist[kind]);
tr->progs_cnt[kind]++;
+ if (okind == BPF_TRAMP_SESSION) {
+ fslink = container_of(link, struct bpf_fsession_link, link.link);
+ hlist_add_head(&fslink->fexit.tramp_hlist,
+ &tr->progs_hlist[BPF_TRAMP_FEXIT]);
+ tr->progs_cnt[BPF_TRAMP_FEXIT]++;
+ }
err = bpf_trampoline_update(tr, true /* lock_direct_mutex */);
if (err) {
+ if (okind == BPF_TRAMP_SESSION) {
+ hlist_del_init(&fslink->fexit.tramp_hlist);
+ tr->progs_cnt[BPF_TRAMP_FEXIT]--;
+ }
hlist_del_init(&link->tramp_hlist);
tr->progs_cnt[kind]--;
}
@@ -659,6 +677,7 @@ static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link,
struct bpf_trampoline *tr,
struct bpf_prog *tgt_prog)
{
+ struct bpf_fsession_link *fslink;
enum bpf_tramp_prog_type kind;
int err;
@@ -672,6 +691,11 @@ static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link,
guard(mutex)(&tgt_prog->aux->ext_mutex);
tgt_prog->aux->is_extended = false;
return err;
+ } else if (kind == BPF_TRAMP_SESSION) {
+ fslink = container_of(link, struct bpf_fsession_link, link.link);
+ hlist_del_init(&fslink->fexit.tramp_hlist);
+ tr->progs_cnt[BPF_TRAMP_FEXIT]--;
+ kind = BPF_TRAMP_FENTRY;
}
hlist_del_init(&link->tramp_hlist);
tr->progs_cnt[kind]--;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a31c032b2dd6..d399bfd2413f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17402,6 +17402,7 @@ static int check_return_code(struct bpf_verifier_env *env, int regno, const char
switch (env->prog->expected_attach_type) {
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
+ case BPF_TRACE_SESSION:
range = retval_range(0, 0);
break;
case BPF_TRACE_RAW_TP:
@@ -23298,6 +23299,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
if (prog_type == BPF_PROG_TYPE_TRACING &&
insn->imm == BPF_FUNC_get_func_ret) {
if (eatype == BPF_TRACE_FEXIT ||
+ eatype == BPF_TRACE_SESSION ||
eatype == BPF_MODIFY_RETURN) {
/* Load nr_args from ctx - 8 */
insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
@@ -24242,7 +24244,8 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
if (tgt_prog->type == BPF_PROG_TYPE_TRACING &&
prog_extension &&
(tgt_prog->expected_attach_type == BPF_TRACE_FENTRY ||
- tgt_prog->expected_attach_type == BPF_TRACE_FEXIT)) {
+ tgt_prog->expected_attach_type == BPF_TRACE_FEXIT ||
+ tgt_prog->expected_attach_type == BPF_TRACE_SESSION)) {
/* Program extensions can extend all program types
* except fentry/fexit. The reason is the following.
* The fentry/fexit programs are used for performance
@@ -24257,7 +24260,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
* beyond reasonable stack size. Hence extending fentry
* is not allowed.
*/
- bpf_log(log, "Cannot extend fentry/fexit\n");
+ bpf_log(log, "Cannot extend fentry/fexit/session\n");
return -EINVAL;
}
} else {
@@ -24341,6 +24344,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
case BPF_LSM_CGROUP:
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
+ case BPF_TRACE_SESSION:
if (!btf_type_is_func(t)) {
bpf_log(log, "attach_btf_id %u is not a function\n",
btf_id);
@@ -24507,6 +24511,7 @@ static bool can_be_sleepable(struct bpf_prog *prog)
case BPF_TRACE_FEXIT:
case BPF_MODIFY_RETURN:
case BPF_TRACE_ITER:
+ case BPF_TRACE_SESSION:
return true;
default:
return false;
@@ -24588,9 +24593,10 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
tgt_info.tgt_name);
return -EINVAL;
} else if ((prog->expected_attach_type == BPF_TRACE_FEXIT ||
+ prog->expected_attach_type == BPF_TRACE_SESSION ||
prog->expected_attach_type == BPF_MODIFY_RETURN) &&
btf_id_set_contains(&noreturn_deny, btf_id)) {
- verbose(env, "Attaching fexit/fmod_ret to __noreturn function '%s' is rejected.\n",
+ verbose(env, "Attaching fexit/session/fmod_ret to __noreturn function '%s' is rejected.\n",
tgt_info.tgt_name);
return -EINVAL;
}
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 655efac6f133..ddec08b696de 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -685,6 +685,7 @@ int bpf_prog_test_run_tracing(struct bpf_prog *prog,
switch (prog->expected_attach_type) {
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
+ case BPF_TRACE_SESSION:
if (bpf_fentry_test1(1) != 2 ||
bpf_fentry_test2(2, 3) != 5 ||
bpf_fentry_test3(4, 5, 6) != 15 ||
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 850dd736ccd1..afe28b558716 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -365,6 +365,7 @@ static bool bpf_sk_storage_tracing_allowed(const struct bpf_prog *prog)
return true;
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
+ case BPF_TRACE_SESSION:
return !!strncmp(prog->aux->attach_func_name, "bpf_sk_storage",
strlen("bpf_sk_storage"));
default:
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 6b92b0847ec2..f0dec9f8f416 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1145,6 +1145,7 @@ enum bpf_attach_type {
BPF_NETKIT_PEER,
BPF_TRACE_KPROBE_SESSION,
BPF_TRACE_UPROBE_SESSION,
+ BPF_TRACE_SESSION,
__MAX_BPF_ATTACH_TYPE
};
diff --git a/tools/testing/selftests/bpf/prog_tests/tracing_failure.c b/tools/testing/selftests/bpf/prog_tests/tracing_failure.c
index 10e231965589..58b02552507d 100644
--- a/tools/testing/selftests/bpf/prog_tests/tracing_failure.c
+++ b/tools/testing/selftests/bpf/prog_tests/tracing_failure.c
@@ -73,7 +73,7 @@ static void test_tracing_deny(void)
static void test_fexit_noreturns(void)
{
test_tracing_fail_prog("fexit_noreturns",
- "Attaching fexit/fmod_ret to __noreturn function 'do_exit' is rejected.");
+ "Attaching fexit/session/fmod_ret to __noreturn function 'do_exit' is rejected.");
}
void test_tracing_failure(void)
--
2.52.0
* [PATCH bpf-next v4 2/9] bpf: use last 8-bits for the nr_args in trampoline
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 1/9] bpf: add tracing session support Menglong Dong
@ 2025-12-17 9:54 ` Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 3/9] bpf: add the kfunc bpf_fsession_is_return Menglong Dong
` (7 subsequent siblings)
9 siblings, 0 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
For now, ctx[-1] is used to store nr_args in the trampoline. However,
1 byte is enough to store that information. Therefore, use only the
last byte of ctx[-1] to store nr_args, and reserve the rest for other
usages.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
kernel/bpf/verifier.c | 35 +++++++++++++++++++----------------
kernel/trace/bpf_trace.c | 4 ++--
2 files changed, 21 insertions(+), 18 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d399bfd2413f..96753833c090 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -23275,15 +23275,16 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
insn->imm == BPF_FUNC_get_func_arg) {
/* Load nr_args from ctx - 8 */
insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
- insn_buf[1] = BPF_JMP32_REG(BPF_JGE, BPF_REG_2, BPF_REG_0, 6);
- insn_buf[2] = BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 3);
- insn_buf[3] = BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_1);
- insn_buf[4] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_2, 0);
- insn_buf[5] = BPF_STX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0);
- insn_buf[6] = BPF_MOV64_IMM(BPF_REG_0, 0);
- insn_buf[7] = BPF_JMP_A(1);
- insn_buf[8] = BPF_MOV64_IMM(BPF_REG_0, -EINVAL);
- cnt = 9;
+ insn_buf[1] = BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 0xFF);
+ insn_buf[2] = BPF_JMP32_REG(BPF_JGE, BPF_REG_2, BPF_REG_0, 6);
+ insn_buf[3] = BPF_ALU64_IMM(BPF_LSH, BPF_REG_2, 3);
+ insn_buf[4] = BPF_ALU64_REG(BPF_ADD, BPF_REG_2, BPF_REG_1);
+ insn_buf[5] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_2, 0);
+ insn_buf[6] = BPF_STX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0);
+ insn_buf[7] = BPF_MOV64_IMM(BPF_REG_0, 0);
+ insn_buf[8] = BPF_JMP_A(1);
+ insn_buf[9] = BPF_MOV64_IMM(BPF_REG_0, -EINVAL);
+ cnt = 10;
new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
if (!new_prog)
@@ -23303,12 +23304,13 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
eatype == BPF_MODIFY_RETURN) {
/* Load nr_args from ctx - 8 */
insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
- insn_buf[1] = BPF_ALU64_IMM(BPF_LSH, BPF_REG_0, 3);
- insn_buf[2] = BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1);
- insn_buf[3] = BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0);
- insn_buf[4] = BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_3, 0);
- insn_buf[5] = BPF_MOV64_IMM(BPF_REG_0, 0);
- cnt = 6;
+ insn_buf[1] = BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 0xFF);
+ insn_buf[2] = BPF_ALU64_IMM(BPF_LSH, BPF_REG_0, 3);
+ insn_buf[3] = BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1);
+ insn_buf[4] = BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0);
+ insn_buf[5] = BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_3, 0);
+ insn_buf[6] = BPF_MOV64_IMM(BPF_REG_0, 0);
+ cnt = 7;
} else {
insn_buf[0] = BPF_MOV64_IMM(BPF_REG_0, -EOPNOTSUPP);
cnt = 1;
@@ -23329,8 +23331,9 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
insn->imm == BPF_FUNC_get_func_arg_cnt) {
/* Load nr_args from ctx - 8 */
insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
+ insn_buf[1] = BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 0xFF);
- new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1);
+ new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 2);
if (!new_prog)
return -ENOMEM;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index d57727abaade..10c9992d2745 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1194,7 +1194,7 @@ const struct bpf_func_proto bpf_get_branch_snapshot_proto = {
BPF_CALL_3(get_func_arg, void *, ctx, u32, n, u64 *, value)
{
/* This helper call is inlined by verifier. */
- u64 nr_args = ((u64 *)ctx)[-1];
+ u64 nr_args = ((u64 *)ctx)[-1] & 0xFF;
if ((u64) n >= nr_args)
return -EINVAL;
@@ -1214,7 +1214,7 @@ static const struct bpf_func_proto bpf_get_func_arg_proto = {
BPF_CALL_2(get_func_ret, void *, ctx, u64 *, value)
{
/* This helper call is inlined by verifier. */
- u64 nr_args = ((u64 *)ctx)[-1];
+ u64 nr_args = ((u64 *)ctx)[-1] & 0xFF;
*value = ((u64 *)ctx)[nr_args];
return 0;
--
2.52.0
* [PATCH bpf-next v4 3/9] bpf: add the kfunc bpf_fsession_is_return
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 1/9] bpf: add tracing session support Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 2/9] bpf: use last 8-bits for the nr_args in trampoline Menglong Dong
@ 2025-12-17 9:54 ` Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 4/9] bpf: add the kfunc bpf_fsession_cookie Menglong Dong
` (6 subsequent siblings)
9 siblings, 0 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
If TRACE_SESSION exists, we will use the bit (1 << BPF_TRAMP_M_IS_RETURN)
in ctx[-1] to store the "is_return" flag.
Introduce the kfunc bpf_fsession_is_return(), which tells whether the
program is currently running in the fexit leg. Meanwhile, inline it in
the verifier.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
v4:
- split out the bpf_fsession_cookie() to another patch
v3:
- merge the bpf_tracing_is_exit and bpf_fsession_cookie into a single
patch
v2:
- store the session flags after return value, instead of before nr_args
- inline the bpf_tracing_is_exit, as Jiri suggested
---
include/linux/bpf.h | 3 +++
kernel/bpf/verifier.c | 11 +++++++++-
kernel/trace/bpf_trace.c | 43 +++++++++++++++++++++++++++++++++++++---
3 files changed, 53 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 3b2273b110b8..d165ace5cc9b 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1213,6 +1213,9 @@ enum {
#endif
};
+#define BPF_TRAMP_M_NR_ARGS 0
+#define BPF_TRAMP_M_IS_RETURN 8
+
struct bpf_tramp_links {
struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
int nr_links;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 96753833c090..b0dcd715150f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12380,6 +12380,7 @@ enum special_kfunc_type {
KF___bpf_trap,
KF_bpf_task_work_schedule_signal_impl,
KF_bpf_task_work_schedule_resume_impl,
+ KF_bpf_fsession_is_return,
};
BTF_ID_LIST(special_kfunc_list)
@@ -12454,6 +12455,7 @@ BTF_ID(func, bpf_dynptr_file_discard)
BTF_ID(func, __bpf_trap)
BTF_ID(func, bpf_task_work_schedule_signal_impl)
BTF_ID(func, bpf_task_work_schedule_resume_impl)
+BTF_ID(func, bpf_fsession_is_return)
static bool is_task_work_add_kfunc(u32 func_id)
{
@@ -12508,7 +12510,8 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
struct bpf_reg_state *reg = ®s[regno];
bool arg_mem_size = false;
- if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx])
+ if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
+ meta->func_id == special_kfunc_list[KF_bpf_fsession_is_return])
return KF_ARG_PTR_TO_CTX;
/* In this function, we verify the kfunc's BTF as per the argument type,
@@ -22556,6 +22559,12 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
desc->func_id == special_kfunc_list[KF_bpf_rdonly_cast]) {
insn_buf[0] = BPF_MOV64_REG(BPF_REG_0, BPF_REG_1);
*cnt = 1;
+ } else if (desc->func_id == special_kfunc_list[KF_bpf_fsession_is_return]) {
+ /* Load nr_args from ctx - 8 */
+ insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
+ insn_buf[1] = BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, BPF_TRAMP_M_IS_RETURN);
+ insn_buf[2] = BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 1);
+ *cnt = 3;
}
if (env->insn_aux_data[insn_idx].arg_prog) {
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 10c9992d2745..0857d77eb34c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -3356,12 +3356,49 @@ static const struct btf_kfunc_id_set bpf_kprobe_multi_kfunc_set = {
.filter = bpf_kprobe_multi_filter,
};
-static int __init bpf_kprobe_multi_kfuncs_init(void)
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc bool bpf_fsession_is_return(void *ctx)
+{
+ /* This helper call is inlined by verifier. */
+ return !!(((u64 *)ctx)[-1] & (1 << BPF_TRAMP_M_IS_RETURN));
+}
+
+__bpf_kfunc_end_defs();
+
+BTF_KFUNCS_START(tracing_kfunc_set_ids)
+BTF_ID_FLAGS(func, bpf_fsession_is_return, KF_FASTCALL)
+BTF_KFUNCS_END(tracing_kfunc_set_ids)
+
+static int bpf_tracing_filter(const struct bpf_prog *prog, u32 kfunc_id)
{
- return register_btf_kfunc_id_set(BPF_PROG_TYPE_KPROBE, &bpf_kprobe_multi_kfunc_set);
+ if (!btf_id_set8_contains(&tracing_kfunc_set_ids, kfunc_id))
+ return 0;
+
+ if (prog->type != BPF_PROG_TYPE_TRACING ||
+ prog->expected_attach_type != BPF_TRACE_SESSION)
+ return -EINVAL;
+
+ return 0;
+}
+
+static const struct btf_kfunc_id_set bpf_tracing_kfunc_set = {
+ .owner = THIS_MODULE,
+ .set = &tracing_kfunc_set_ids,
+ .filter = bpf_tracing_filter,
+};
+
+static int __init bpf_trace_kfuncs_init(void)
+{
+ int err = 0;
+
+ err = err ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_KPROBE, &bpf_kprobe_multi_kfunc_set);
+ err = err ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_tracing_kfunc_set);
+
+ return err;
}
-late_initcall(bpf_kprobe_multi_kfuncs_init);
+late_initcall(bpf_trace_kfuncs_init);
typedef int (*copy_fn_t)(void *dst, const void *src, u32 size, struct task_struct *tsk);
--
2.52.0
* [PATCH bpf-next v4 4/9] bpf: add the kfunc bpf_fsession_cookie
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
` (2 preceding siblings ...)
2025-12-17 9:54 ` [PATCH bpf-next v4 3/9] bpf: add the kfunc bpf_fsession_is_return Menglong Dong
@ 2025-12-17 9:54 ` Menglong Dong
2025-12-19 0:55 ` Andrii Nakryiko
2025-12-17 9:54 ` [PATCH bpf-next v4 5/9] bpf,x86: introduce emit_st_r0_imm64() for trampoline Menglong Dong
` (5 subsequent siblings)
9 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
Implement the session cookie for fsession. In order to limit the stack
usage, the maximum cookie count is 4.
The offset of the current cookie is stored in
"(ctx[-1] >> BPF_TRAMP_M_COOKIE) & 0xFF". Therefore, we can get the
session cookie with ctx[-offset].
The stack will look like this:
return value -> 8 bytes
argN -> 8 bytes
...
arg1 -> 8 bytes
nr_args -> 8 bytes
ip(optional) -> 8 bytes
cookie2 -> 8 bytes
cookie1 -> 8 bytes
Inline the bpf_fsession_cookie() in the verifier too.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
v4:
- limit the maximum of the cookie count to 4
- store the session cookies before nr_regs in stack
---
include/linux/bpf.h | 16 ++++++++++++++++
kernel/bpf/trampoline.c | 14 +++++++++++++-
kernel/bpf/verifier.c | 20 ++++++++++++++++++--
kernel/trace/bpf_trace.c | 9 +++++++++
4 files changed, 56 insertions(+), 3 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index d165ace5cc9b..0f35c6ab538c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1215,6 +1215,7 @@ enum {
#define BPF_TRAMP_M_NR_ARGS 0
#define BPF_TRAMP_M_IS_RETURN 8
+#define BPF_TRAMP_M_COOKIE 9
struct bpf_tramp_links {
struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
@@ -1318,6 +1319,7 @@ struct bpf_trampoline {
struct mutex mutex;
refcount_t refcnt;
u32 flags;
+ int cookie_cnt;
u64 key;
struct {
struct btf_func_model model;
@@ -1762,6 +1764,7 @@ struct bpf_prog {
enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
call_get_func_ip:1, /* Do we call get_func_ip() */
+ call_session_cookie:1, /* Do we call bpf_fsession_cookie() */
tstamp_type_access:1, /* Accessed __sk_buff->tstamp_type */
sleepable:1; /* BPF program is sleepable */
enum bpf_prog_type type; /* Type of BPF program */
@@ -2137,6 +2140,19 @@ static inline int bpf_fsession_cnt(struct bpf_tramp_links *links)
return cnt;
}
+static inline int bpf_fsession_cookie_cnt(struct bpf_tramp_links *links)
+{
+ struct bpf_tramp_links fentries = links[BPF_TRAMP_FENTRY];
+ int cnt = 0;
+
+ for (int i = 0; i < links[BPF_TRAMP_FENTRY].nr_links; i++) {
+ if (fentries.links[i]->link.prog->call_session_cookie)
+ cnt++;
+ }
+
+ return cnt;
+}
+
int bpf_prog_ctx_arg_info_init(struct bpf_prog *prog,
const struct bpf_ctx_arg_aux *info, u32 cnt);
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 3b9fc99e1e89..68f3d73a94a9 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -592,6 +592,8 @@ static int bpf_freplace_check_tgt_prog(struct bpf_prog *tgt_prog)
return 0;
}
+#define BPF_TRAMP_MAX_COOKIES 4
+
static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
struct bpf_trampoline *tr,
struct bpf_prog *tgt_prog)
@@ -599,7 +601,7 @@ static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
enum bpf_tramp_prog_type kind, okind;
struct bpf_tramp_link *link_existing;
struct bpf_fsession_link *fslink;
- int err = 0;
+ int err = 0, cookie_cnt;
int cnt = 0, i;
okind = kind = bpf_attach_type_to_tramp(link->link.prog);
@@ -640,6 +642,12 @@ static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
/* prog already linked */
return -EBUSY;
}
+ cookie_cnt = tr->cookie_cnt;
+ if (link->link.prog->call_session_cookie) {
+ if (cookie_cnt >= BPF_TRAMP_MAX_COOKIES)
+ return -E2BIG;
+ cookie_cnt++;
+ }
hlist_add_head(&link->tramp_hlist, &tr->progs_hlist[kind]);
tr->progs_cnt[kind]++;
@@ -657,6 +665,8 @@ static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
}
hlist_del_init(&link->tramp_hlist);
tr->progs_cnt[kind]--;
+ } else {
+ tr->cookie_cnt = cookie_cnt;
}
return err;
}
@@ -696,6 +706,8 @@ static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link,
hlist_del_init(&fslink->fexit.tramp_hlist);
tr->progs_cnt[BPF_TRAMP_FEXIT]--;
kind = BPF_TRAMP_FENTRY;
+ if (link->link.prog->call_session_cookie)
+ tr->cookie_cnt--;
}
hlist_del_init(&link->tramp_hlist);
tr->progs_cnt[kind]--;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b0dcd715150f..da8335cf7c46 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12381,6 +12381,7 @@ enum special_kfunc_type {
KF_bpf_task_work_schedule_signal_impl,
KF_bpf_task_work_schedule_resume_impl,
KF_bpf_fsession_is_return,
+ KF_bpf_fsession_cookie,
};
BTF_ID_LIST(special_kfunc_list)
@@ -12456,6 +12457,7 @@ BTF_ID(func, __bpf_trap)
BTF_ID(func, bpf_task_work_schedule_signal_impl)
BTF_ID(func, bpf_task_work_schedule_resume_impl)
BTF_ID(func, bpf_fsession_is_return)
+BTF_ID(func, bpf_fsession_cookie)
static bool is_task_work_add_kfunc(u32 func_id)
{
@@ -12511,7 +12513,8 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
bool arg_mem_size = false;
if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
- meta->func_id == special_kfunc_list[KF_bpf_fsession_is_return])
+ meta->func_id == special_kfunc_list[KF_bpf_fsession_is_return] ||
+ meta->func_id == special_kfunc_list[KF_bpf_fsession_cookie])
return KF_ARG_PTR_TO_CTX;
/* In this function, we verify the kfunc's BTF as per the argument type,
@@ -14009,7 +14012,8 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
}
}
- if (meta.func_id == special_kfunc_list[KF_bpf_session_cookie]) {
+ if (meta.func_id == special_kfunc_list[KF_bpf_session_cookie] ||
+ meta.func_id == special_kfunc_list[KF_bpf_fsession_cookie]) {
meta.r0_size = sizeof(u64);
meta.r0_rdonly = false;
}
@@ -14293,6 +14297,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
return err;
}
+ if (meta.func_id == special_kfunc_list[KF_bpf_fsession_cookie])
+ env->prog->call_session_cookie = true;
+
return 0;
}
@@ -22565,6 +22572,15 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
insn_buf[1] = BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, BPF_TRAMP_M_IS_RETURN);
insn_buf[2] = BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 1);
*cnt = 3;
+ } else if (desc->func_id == special_kfunc_list[KF_bpf_fsession_cookie]) {
+ /* Load nr_args from ctx - 8 */
+ insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
+ insn_buf[1] = BPF_ALU64_IMM(BPF_RSH, BPF_REG_0, BPF_TRAMP_M_COOKIE);
+ insn_buf[2] = BPF_ALU64_IMM(BPF_AND, BPF_REG_0, 0xFF);
+ insn_buf[3] = BPF_ALU64_IMM(BPF_LSH, BPF_REG_0, 3);
+ insn_buf[4] = BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1);
+ insn_buf[5] = BPF_ALU64_IMM(BPF_NEG, BPF_REG_0, 0);
+ *cnt = 6;
}
if (env->insn_aux_data[insn_idx].arg_prog) {
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 0857d77eb34c..ae7866d8ea62 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -3364,10 +3364,19 @@ __bpf_kfunc bool bpf_fsession_is_return(void *ctx)
return !!(((u64 *)ctx)[-1] & (1 << BPF_TRAMP_M_IS_RETURN));
}
+__bpf_kfunc u64 *bpf_fsession_cookie(void *ctx)
+{
+ /* This helper call is inlined by verifier. */
+ u64 off = (((u64 *)ctx)[-1] >> BPF_TRAMP_M_COOKIE) & 0xFF;
+
+ return &((u64 *)ctx)[-off];
+}
+
__bpf_kfunc_end_defs();
BTF_KFUNCS_START(tracing_kfunc_set_ids)
BTF_ID_FLAGS(func, bpf_fsession_is_return, KF_FASTCALL)
+BTF_ID_FLAGS(func, bpf_fsession_cookie, KF_FASTCALL)
BTF_KFUNCS_END(tracing_kfunc_set_ids)
static int bpf_tracing_filter(const struct bpf_prog *prog, u32 kfunc_id)
--
2.52.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 5/9] bpf,x86: introduce emit_st_r0_imm64() for trampoline
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
` (3 preceding siblings ...)
2025-12-17 9:54 ` [PATCH bpf-next v4 4/9] bpf: add the kfunc bpf_fsession_cookie Menglong Dong
@ 2025-12-17 9:54 ` Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting for x86_64 Menglong Dong
` (4 subsequent siblings)
9 siblings, 0 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
Introduce the helper emit_st_r0_imm64(), which stores an imm64 value to
the stack with the help of r0.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
arch/x86/net/bpf_jit_comp.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index b69dc7194e2c..8cbeefb26192 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1300,6 +1300,15 @@ static void emit_st_r12(u8 **pprog, u32 size, u32 dst_reg, int off, int imm)
emit_st_index(pprog, size, dst_reg, X86_REG_R12, off, imm);
}
+static void emit_st_r0_imm64(u8 **pprog, u64 value, int off)
+{
+ /* mov rax, value
+ * mov QWORD PTR [rbp - off], rax
+ */
+ emit_mov_imm64(pprog, BPF_REG_0, value >> 32, (u32) value);
+ emit_stx(pprog, BPF_DW, BPF_REG_FP, BPF_REG_0, -off);
+}
+
static int emit_atomic_rmw(u8 **pprog, u32 atomic_op,
u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size)
{
@@ -3341,16 +3350,14 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
* mov rax, nr_regs
* mov QWORD PTR [rbp - nregs_off], rax
*/
- emit_mov_imm64(&prog, BPF_REG_0, 0, (u32) nr_regs);
- emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -nregs_off);
+ emit_st_r0_imm64(&prog, nr_regs, nregs_off);
if (flags & BPF_TRAMP_F_IP_ARG) {
/* Store IP address of the traced function:
* movabsq rax, func_addr
* mov QWORD PTR [rbp - ip_off], rax
*/
- emit_mov_imm64(&prog, BPF_REG_0, (long) func_addr >> 32, (u32) (long) func_addr);
- emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -ip_off);
+ emit_st_r0_imm64(&prog, (long)func_addr, ip_off);
}
save_args(m, &prog, regs_off, false, flags);
--
2.52.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting for x86_64
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
` (4 preceding siblings ...)
2025-12-17 9:54 ` [PATCH bpf-next v4 5/9] bpf,x86: introduce emit_st_r0_imm64() for trampoline Menglong Dong
@ 2025-12-17 9:54 ` Menglong Dong
2025-12-19 0:55 ` Andrii Nakryiko
2025-12-17 9:54 ` [PATCH bpf-next v4 7/9] libbpf: add support for tracing session Menglong Dong
` (3 subsequent siblings)
9 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
Add BPF_TRACE_SESSION support to x86_64, including:
1. clear the return value on the stack before fentry, so that the fentry
of an fsession can only read 0 with bpf_get_func_ret(). If we could
restrict bpf_get_func_ret() to the
"bpf_fsession_is_return() == true" code path, this step would become
unnecessary.
2. clear all the session cookie values on the stack. If the verifier
could ensure that a session cookie is only read after it has been
initialized, this step would become unnecessary too.
3. store the index of the cookie to ctx[-1] before calling the fsession's
fentry.
4. store the "is_return" flag to ctx[-1] before calling the fsession's
fexit.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
v4:
- some adjustments to the 1st patch, such as getting the fsession prog
from the fentry and fexit hlists
- remove support for skipping fexit when fentry returns non-zero
v2:
- add session cookie support
- add the session stuff after return value, instead of before nr_args
---
arch/x86/net/bpf_jit_comp.c | 36 +++++++++++++++++++++++++++++++-----
1 file changed, 31 insertions(+), 5 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 8cbeefb26192..99b0223374bd 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -3086,12 +3086,17 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
struct bpf_tramp_links *tl, int stack_size,
int run_ctx_off, bool save_ret,
- void *image, void *rw_image)
+ void *image, void *rw_image, u64 nr_regs)
{
int i;
u8 *prog = *pprog;
for (i = 0; i < tl->nr_links; i++) {
+ if (tl->links[i]->link.prog->call_session_cookie) {
+ /* 'stack_size + 8' is the offset of nr_regs in stack */
+ emit_st_r0_imm64(&prog, nr_regs, stack_size + 8);
+ nr_regs -= (1 << BPF_TRAMP_M_COOKIE);
+ }
if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
run_ctx_off, save_ret, image, rw_image))
return -EINVAL;
@@ -3208,8 +3213,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
struct bpf_tramp_links *tlinks,
void *func_addr)
{
- int i, ret, nr_regs = m->nr_args, stack_size = 0;
- int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off;
+ int i, ret, nr_regs = m->nr_args, cookie_cnt, stack_size = 0;
+ int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off,
+ cookie_off;
struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
@@ -3282,6 +3288,11 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
ip_off = stack_size;
+ cookie_cnt = bpf_fsession_cookie_cnt(tlinks);
+ /* room for session cookies */
+ stack_size += cookie_cnt * 8;
+ cookie_off = stack_size;
+
stack_size += 8;
rbx_off = stack_size;
@@ -3372,9 +3383,19 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
}
}
+ if (bpf_fsession_cnt(tlinks)) {
+ /* clear all the session cookies' value */
+ for (int i = 0; i < cookie_cnt; i++)
+ emit_st_r0_imm64(&prog, 0, cookie_off - 8 * i);
+ /* clear the return value to make sure fentry always get 0 */
+ emit_st_r0_imm64(&prog, 0, 8);
+ nr_regs += (((cookie_off - regs_off) / 8) << BPF_TRAMP_M_COOKIE);
+ }
+
if (fentry->nr_links) {
if (invoke_bpf(m, &prog, fentry, regs_off, run_ctx_off,
- flags & BPF_TRAMP_F_RET_FENTRY_RET, image, rw_image))
+ flags & BPF_TRAMP_F_RET_FENTRY_RET, image, rw_image,
+ nr_regs))
return -EINVAL;
}
@@ -3434,9 +3455,14 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
}
}
+ /* set the "is_return" flag for fsession */
+ nr_regs += (1 << BPF_TRAMP_M_IS_RETURN);
+ if (bpf_fsession_cnt(tlinks))
+ emit_st_r0_imm64(&prog, nr_regs, nregs_off);
+
if (fexit->nr_links) {
if (invoke_bpf(m, &prog, fexit, regs_off, run_ctx_off,
- false, image, rw_image)) {
+ false, image, rw_image, nr_regs)) {
ret = -EINVAL;
goto cleanup;
}
--
2.52.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 7/9] libbpf: add support for tracing session
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
` (5 preceding siblings ...)
2025-12-17 9:54 ` [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting for x86_64 Menglong Dong
@ 2025-12-17 9:54 ` Menglong Dong
2025-12-19 0:55 ` Andrii Nakryiko
2025-12-17 9:54 ` [PATCH bpf-next v4 8/9] selftests/bpf: add testcases " Menglong Dong
` (2 subsequent siblings)
9 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
Add BPF_TRACE_SESSION to libbpf and bpftool.
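With the new section definitions in place, an fsession program can be declared as sketched below (illustrative only — the target function and the latency measurement are made up for the example; the kfunc prototypes are assumed to be available from vmlinux.h or the kernel's kfunc declarations):

```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

/* One program handles both entry and exit of the target function. */
SEC("fsession/bpf_fentry_test1")
int BPF_PROG(sample, int a)
{
	__u64 *cookie = bpf_fsession_cookie(ctx);

	if (!bpf_fsession_is_return(ctx)) {
		/* entry: stash a timestamp in the session cookie */
		*cookie = bpf_ktime_get_ns();
		return 0;
	}
	/* exit: the cookie written at entry is visible here */
	bpf_printk("took %llu ns", bpf_ktime_get_ns() - *cookie);
	return 0;
}
```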
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
tools/bpf/bpftool/common.c | 1 +
tools/lib/bpf/bpf.c | 2 ++
tools/lib/bpf/libbpf.c | 3 +++
3 files changed, 6 insertions(+)
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index e8daf963ecef..534be6cfa2be 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -1191,6 +1191,7 @@ const char *bpf_attach_type_input_str(enum bpf_attach_type t)
case BPF_TRACE_FENTRY: return "fentry";
case BPF_TRACE_FEXIT: return "fexit";
case BPF_MODIFY_RETURN: return "mod_ret";
+ case BPF_TRACE_SESSION: return "fsession";
case BPF_SK_REUSEPORT_SELECT: return "sk_skb_reuseport_select";
case BPF_SK_REUSEPORT_SELECT_OR_MIGRATE: return "sk_skb_reuseport_select_or_migrate";
default: return libbpf_bpf_attach_type_str(t);
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 21b57a629916..5042df4a5df7 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -794,6 +794,7 @@ int bpf_link_create(int prog_fd, int target_fd,
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
case BPF_MODIFY_RETURN:
+ case BPF_TRACE_SESSION:
case BPF_LSM_MAC:
attr.link_create.tracing.cookie = OPTS_GET(opts, tracing.cookie, 0);
if (!OPTS_ZEROED(opts, tracing))
@@ -917,6 +918,7 @@ int bpf_link_create(int prog_fd, int target_fd,
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
case BPF_MODIFY_RETURN:
+ case BPF_TRACE_SESSION:
return bpf_raw_tracepoint_open(NULL, prog_fd);
default:
return libbpf_err(err);
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index c7c79014d46c..0c095195df31 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -115,6 +115,7 @@ static const char * const attach_type_name[] = {
[BPF_TRACE_FENTRY] = "trace_fentry",
[BPF_TRACE_FEXIT] = "trace_fexit",
[BPF_MODIFY_RETURN] = "modify_return",
+ [BPF_TRACE_SESSION] = "trace_session",
[BPF_LSM_MAC] = "lsm_mac",
[BPF_LSM_CGROUP] = "lsm_cgroup",
[BPF_SK_LOOKUP] = "sk_lookup",
@@ -9853,6 +9854,8 @@ static const struct bpf_sec_def section_defs[] = {
SEC_DEF("fentry.s+", TRACING, BPF_TRACE_FENTRY, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
SEC_DEF("fmod_ret.s+", TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
SEC_DEF("fexit.s+", TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
+ SEC_DEF("fsession+", TRACING, BPF_TRACE_SESSION, SEC_ATTACH_BTF, attach_trace),
+ SEC_DEF("fsession.s+", TRACING, BPF_TRACE_SESSION, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
SEC_DEF("freplace+", EXT, 0, SEC_ATTACH_BTF, attach_trace),
SEC_DEF("lsm+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
SEC_DEF("lsm.s+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
--
2.52.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 8/9] selftests/bpf: add testcases for tracing session
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
` (6 preceding siblings ...)
2025-12-17 9:54 ` [PATCH bpf-next v4 7/9] libbpf: add support for tracing session Menglong Dong
@ 2025-12-17 9:54 ` Menglong Dong
2025-12-17 10:24 ` bot+bpf-ci
2025-12-17 9:54 ` [PATCH bpf-next v4 9/9] selftests/bpf: test fsession mixed with fentry and fexit Menglong Dong
2025-12-19 0:55 ` [PATCH bpf-next v4 0/9] bpf: tracing session supporting Andrii Nakryiko
9 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
Add testcases for BPF_TRACE_SESSION. The function arguments and return
value are tested in both the entry and the exit, and the kfunc
bpf_fsession_is_return() is also tested.
Since the stack layout changed for fsession, we also test
bpf_get_func_ip() for it.
The session cookie for fsession is also tested: multiple fsession BPF
progs are attached to bpf_fentry_test1(), and the session cookie is read
and written in the testcase.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
v3:
- restructure the testcase by combine the testcases for session cookie and
get_func_ip into one patch
---
.../selftests/bpf/prog_tests/fsession_test.c | 90 ++++++++
.../selftests/bpf/progs/fsession_test.c | 192 ++++++++++++++++++
2 files changed, 282 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/fsession_test.c
create mode 100644 tools/testing/selftests/bpf/progs/fsession_test.c
diff --git a/tools/testing/selftests/bpf/prog_tests/fsession_test.c b/tools/testing/selftests/bpf/prog_tests/fsession_test.c
new file mode 100644
index 000000000000..83f3953a1ff6
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/fsession_test.c
@@ -0,0 +1,90 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 ChinaTelecom */
+#include <test_progs.h>
+#include "fsession_test.skel.h"
+
+static int check_result(struct fsession_test *skel)
+{
+ LIBBPF_OPTS(bpf_test_run_opts, topts);
+ int err, prog_fd;
+
+ /* Trigger test function calls */
+ prog_fd = bpf_program__fd(skel->progs.test1);
+ err = bpf_prog_test_run_opts(prog_fd, &topts);
+ if (!ASSERT_OK(err, "test_run_opts err"))
+ return err;
+ if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
+ return topts.retval;
+
+ for (int i = 0; i < sizeof(*skel->bss) / sizeof(__u64); i++) {
+ if (!ASSERT_EQ(((__u64 *)skel->bss)[i], 1, "test_result"))
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static void test_fsession_basic(void)
+{
+ struct fsession_test *skel = NULL;
+ int err;
+
+ skel = fsession_test__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "fsession_test__open_and_load"))
+ goto cleanup;
+
+ err = fsession_test__attach(skel);
+ if (!ASSERT_OK(err, "fsession_attach"))
+ goto cleanup;
+
+ check_result(skel);
+cleanup:
+ fsession_test__destroy(skel);
+}
+
+static void test_fsession_reattach(void)
+{
+ struct fsession_test *skel = NULL;
+ int err;
+
+ skel = fsession_test__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "fsession_test__open_and_load"))
+ goto cleanup;
+
+ /* First attach */
+ err = fsession_test__attach(skel);
+ if (!ASSERT_OK(err, "fsession_first_attach"))
+ goto cleanup;
+
+ if (check_result(skel))
+ goto cleanup;
+
+ /* Detach */
+ fsession_test__detach(skel);
+
+ /* Reset counters */
+ memset(skel->bss, 0, sizeof(*skel->bss));
+
+ /* Second attach */
+ err = fsession_test__attach(skel);
+ if (!ASSERT_OK(err, "fsession_second_attach"))
+ goto cleanup;
+
+ if (check_result(skel))
+ goto cleanup;
+
+cleanup:
+ fsession_test__destroy(skel);
+}
+
+void test_fsession_test(void)
+{
+#if !defined(__x86_64__)
+ test__skip();
+ return;
+#endif
+ if (test__start_subtest("fsession_basic"))
+ test_fsession_basic();
+ if (test__start_subtest("fsession_reattach"))
+ test_fsession_reattach();
+}
diff --git a/tools/testing/selftests/bpf/progs/fsession_test.c b/tools/testing/selftests/bpf/progs/fsession_test.c
new file mode 100644
index 000000000000..f7c96ef1c7a9
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/fsession_test.c
@@ -0,0 +1,192 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 ChinaTelecom */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+__u64 test1_entry_result = 0;
+__u64 test1_exit_result = 0;
+
+SEC("fsession/bpf_fentry_test1")
+int BPF_PROG(test1, int a, int ret)
+{
+ bool is_exit = bpf_fsession_is_return(ctx);
+
+ if (!is_exit) {
+ /* This is entry */
+ test1_entry_result = a == 1 && ret == 0;
+ /* Return 0 to allow exit to be called */
+ return 0;
+ }
+
+ /* This is exit */
+ test1_exit_result = a == 1 && ret == 2;
+ return 0;
+}
+
+__u64 test2_entry_result = 0;
+__u64 test2_exit_result = 0;
+
+SEC("fsession/bpf_fentry_test3")
+int BPF_PROG(test2, char a, int b, __u64 c, int ret)
+{
+ bool is_exit = bpf_fsession_is_return(ctx);
+
+ if (!is_exit) {
+ test2_entry_result = a == 4 && b == 5 && c == 6 && ret == 0;
+ return 0;
+ }
+
+ test2_exit_result = a == 4 && b == 5 && c == 6 && ret == 15;
+ return 0;
+}
+
+__u64 test3_entry_result = 0;
+__u64 test3_exit_result = 0;
+
+SEC("fsession/bpf_fentry_test4")
+int BPF_PROG(test3, void *a, char b, int c, __u64 d, int ret)
+{
+ bool is_exit = bpf_fsession_is_return(ctx);
+
+ if (!is_exit) {
+ test3_entry_result = a == (void *)7 && b == 8 && c == 9 && d == 10 && ret == 0;
+ return 0;
+ }
+
+ test3_exit_result = a == (void *)7 && b == 8 && c == 9 && d == 10 && ret == 34;
+ return 0;
+}
+
+__u64 test4_entry_result = 0;
+__u64 test4_exit_result = 0;
+
+SEC("fsession/bpf_fentry_test5")
+int BPF_PROG(test4, __u64 a, void *b, short c, int d, __u64 e, int ret)
+{
+ bool is_exit = bpf_fsession_is_return(ctx);
+
+ if (!is_exit) {
+ test4_entry_result = a == 11 && b == (void *)12 && c == 13 && d == 14 &&
+ e == 15 && ret == 0;
+ return 0;
+ }
+
+ test4_exit_result = a == 11 && b == (void *)12 && c == 13 && d == 14 &&
+ e == 15 && ret == 65;
+ return 0;
+}
+
+__u64 test5_entry_result = 0;
+__u64 test5_exit_result = 0;
+
+SEC("fsession/bpf_fentry_test7")
+int BPF_PROG(test5, struct bpf_fentry_test_t *arg, int ret)
+{
+ bool is_exit = bpf_fsession_is_return(ctx);
+
+ if (!is_exit) {
+ if (!arg)
+ test5_entry_result = ret == 0;
+ return 0;
+ }
+
+ if (!arg)
+ test5_exit_result = 1;
+ return 0;
+}
+
+__u64 test6_entry_result = 0;
+__u64 test6_exit_result = 0;
+/*
+ * test1, test8 and test9 hook the same target to verify the "ret" is always
+ * 0 in the entry.
+ */
+SEC("fsession/bpf_fentry_test1")
+int BPF_PROG(test6, int a, int ret)
+{
+ bool is_exit = bpf_fsession_is_return(ctx);
+
+ if (!is_exit) {
+ test6_entry_result = a == 1 && ret == 0;
+ return 0;
+ }
+
+ /* This is exit */
+ test6_exit_result = 1;
+ return 0;
+}
+
+__u64 test7_entry_result = 0;
+__u64 test7_exit_result = 0;
+
+SEC("fsession/bpf_fentry_test1")
+int BPF_PROG(test7, int a, int ret)
+{
+ bool is_exit = bpf_fsession_is_return(ctx);
+
+ if (!is_exit) {
+ test7_entry_result = a == 1 && ret == 0;
+ return 0;
+ }
+
+ test7_exit_result = 1;
+ return 0;
+}
+
+__u64 test8_entry_result = 0;
+__u64 test8_exit_result = 0;
+SEC("fsession/bpf_fentry_test1")
+int BPF_PROG(test8, int a)
+{
+ __u64 addr = bpf_get_func_ip(ctx);
+
+ if (bpf_fsession_is_return(ctx))
+ test8_exit_result = (const void *) addr == &bpf_fentry_test1;
+ else
+ test8_entry_result = (const void *) addr == &bpf_fentry_test1;
+ return 0;
+}
+
+__u64 test9_entry_ok = 0;
+__u64 test9_exit_ok = 0;
+SEC("fsession/bpf_fentry_test1")
+int BPF_PROG(test9, int a)
+{
+ __u64 *cookie = bpf_fsession_cookie(ctx);
+
+ if (!bpf_fsession_is_return(ctx)) {
+ if (cookie) {
+ *cookie = 0xAAAABBBBCCCCDDDDull;
+ test9_entry_ok = *cookie == 0xAAAABBBBCCCCDDDDull;
+ }
+ return 0;
+ }
+
+ if (cookie)
+ test9_exit_ok = *cookie == 0xAAAABBBBCCCCDDDDull;
+ return 0;
+}
+
+__u64 test10_entry_ok = 0;
+__u64 test10_exit_ok = 0;
+
+SEC("fsession/bpf_fentry_test1")
+int BPF_PROG(test10, int a)
+{
+ __u64 *cookie = bpf_fsession_cookie(ctx);
+
+ if (!bpf_fsession_is_return(ctx)) {
+ if (cookie) {
+ *cookie = 0x1111222233334444ull;
+ test10_entry_ok = *cookie == 0x1111222233334444ull;
+ }
+ return 0;
+ }
+
+ if (cookie)
+ test10_exit_ok = *cookie == 0x1111222233334444ull;
+ return 0;
+}
--
2.52.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v4 9/9] selftests/bpf: test fsession mixed with fentry and fexit
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
` (7 preceding siblings ...)
2025-12-17 9:54 ` [PATCH bpf-next v4 8/9] selftests/bpf: add testcases " Menglong Dong
@ 2025-12-17 9:54 ` Menglong Dong
2025-12-17 10:24 ` bot+bpf-ci
2025-12-19 0:55 ` [PATCH bpf-next v4 0/9] bpf: tracing session supporting Andrii Nakryiko
9 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 9:54 UTC (permalink / raw)
To: ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel
Test fsession when it is used together with fentry and fexit.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
.../selftests/bpf/progs/fsession_test.c | 34 +++++++++++++++++++
1 file changed, 34 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/fsession_test.c b/tools/testing/selftests/bpf/progs/fsession_test.c
index f7c96ef1c7a9..223a6ea47888 100644
--- a/tools/testing/selftests/bpf/progs/fsession_test.c
+++ b/tools/testing/selftests/bpf/progs/fsession_test.c
@@ -190,3 +190,37 @@ int BPF_PROG(test10, int a)
test10_exit_ok = *cookie == 0x1111222233334444ull;
return 0;
}
+
+__u64 test11_entry_result = 0;
+__u64 test11_exit_result = 0;
+
+SEC("fsession/bpf_fentry_test1")
+int BPF_PROG(test11, int a, int ret)
+{
+ __u64 *cookie = bpf_fsession_cookie(ctx);
+
+ if (!bpf_fsession_is_return(ctx)) {
+ test11_entry_result = a == 1 && ret == 0;
+ *cookie = 0x123456ULL;
+ return 0;
+ }
+
+ test11_exit_result = a == 1 && ret == 2 && *cookie == 0x123456ULL;
+ return 0;
+}
+
+__u64 test12_result = 0;
+SEC("fexit/bpf_fentry_test1")
+int BPF_PROG(test12, int a, int ret)
+{
+ test12_result = a == 1 && ret == 2;
+ return 0;
+}
+
+__u64 test13_result = 0;
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(test13, int a)
+{
+ test13_result = a == 1;
+ return 0;
+}
--
2.52.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v4 8/9] selftests/bpf: add testcases for tracing session
2025-12-17 9:54 ` [PATCH bpf-next v4 8/9] selftests/bpf: add testcases " Menglong Dong
@ 2025-12-17 10:24 ` bot+bpf-ci
2025-12-17 11:42 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2025-12-17 10:24 UTC (permalink / raw)
To: menglong8.dong, ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel, ast, andrii,
daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1523 bytes --]
> diff --git a/tools/testing/selftests/bpf/progs/fsession_test.c b/tools/testing/selftests/bpf/progs/fsession_test.c
> new file mode 100644
> index 000000000000..f7c96ef1c7a9
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/fsession_test.c
> @@ -0,0 +1,192 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2025 ChinaTelecom */
> +#include <vmlinux.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +char _license[] SEC("license") = "GPL";
[ ... ]
> +__u64 test8_entry_result = 0;
> +__u64 test8_exit_result = 0;
> +SEC("fsession/bpf_fentry_test1")
> +int BPF_PROG(test8, int a)
> +{
> + __u64 addr = bpf_get_func_ip(ctx);
> +
> + if (bpf_fsession_is_return(ctx))
> + test8_exit_result = (const void *) addr == &bpf_fentry_test1;
> + else
> + test8_entry_result = (const void *) addr == &bpf_fentry_test1;
> + return 0;
> +}
In test8, the code uses &bpf_fentry_test1 without declaring it. Looking at
similar tests like kprobe_multi.c and kprobe_multi_session.c, they include:
extern const void bpf_fentry_test1 __ksym;
at the top of the file before taking the address of kernel symbols.
Without the __ksym declaration, does the comparison work correctly, or
would this cause the test to fail during compilation or at runtime?
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20299185010
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v4 9/9] selftests/bpf: test fsession mixed with fentry and fexit
2025-12-17 9:54 ` [PATCH bpf-next v4 9/9] selftests/bpf: test fsession mixed with fentry and fexit Menglong Dong
@ 2025-12-17 10:24 ` bot+bpf-ci
2025-12-17 10:37 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2025-12-17 10:24 UTC (permalink / raw)
To: menglong8.dong, ast, andrii
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel, ast, andrii,
daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 2138 bytes --]
> diff --git a/tools/testing/selftests/bpf/progs/fsession_test.c b/tools/testing/selftests/bpf/progs/fsession_test.c
> index f7c96ef1c7a9..223a6ea47888 100644
> --- a/tools/testing/selftests/bpf/progs/fsession_test.c
> +++ b/tools/testing/selftests/bpf/progs/fsession_test.c
> @@ -190,3 +190,37 @@ int BPF_PROG(test10, int a)
> test10_exit_ok = *cookie == 0x1111222233334444ull;
> return 0;
> }
> +
> +__u64 test11_entry_result = 0;
> +__u64 test11_exit_result = 0;
> +
> +SEC("fsession/bpf_fentry_test1")
> +int BPF_PROG(test11, int a, int ret)
> +{
> + __u64 *cookie = bpf_fsession_cookie(ctx);
> +
> + if (!bpf_fsession_is_return(ctx)) {
> + test11_entry_result = a == 1 && ret == 0;
> + *cookie = 0x123456ULL;
Should the cookie pointer be checked for NULL before dereferencing?
Looking at the existing tests in this file, both test9 and test10 (which
also call bpf_fsession_cookie) explicitly check if the cookie pointer is
NULL before dereferencing it:
test9: if (cookie) { *cookie = 0xAAAA...; }
test10: if (cookie) { *cookie = 0x1111...; }
test11 dereferences cookie without this check. If cookie were NULL or
invalid, test11 would crash rather than fail gracefully like test9/test10
do. While the BPF verifier likely ensures cookie is valid when the
program is loaded, following the established pattern would make the test
more robust.
The same issue occurs in the exit path:
> + return 0;
> + }
> +
> + test11_exit_result = a == 1 && ret == 2 && *cookie == 0x123456ULL;
^^^^^^^
> + return 0;
> +}
> +
> +__u64 test12_result = 0;
> +SEC("fexit/bpf_fentry_test1")
> +int BPF_PROG(test12, int a, int ret)
> +{
> + test12_result = a == 1 && ret == 2;
> + return 0;
> +}
> +
> +__u64 test13_result = 0;
> +SEC("fentry/bpf_fentry_test1")
> +int BPF_PROG(test13, int a)
> +{
> + test13_result = a == 1;
> + return 0;
> +}
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20299185010
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v4 9/9] selftests/bpf: test fsession mixed with fentry and fexit
2025-12-17 10:24 ` bot+bpf-ci
@ 2025-12-17 10:37 ` Menglong Dong
0 siblings, 0 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 10:37 UTC (permalink / raw)
To: menglong8.dong, ast, andrii, bot+bpf-ci
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel, ast, andrii,
daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai
On 2025/12/17 18:24 bot+bpf-ci@kernel.org write:
> > diff --git a/tools/testing/selftests/bpf/progs/fsession_test.c b/tools/testing/selftests/bpf/progs/fsession_test.c
> > index f7c96ef1c7a9..223a6ea47888 100644
> > --- a/tools/testing/selftests/bpf/progs/fsession_test.c
> > +++ b/tools/testing/selftests/bpf/progs/fsession_test.c
> > @@ -190,3 +190,37 @@ int BPF_PROG(test10, int a)
> > test10_exit_ok = *cookie == 0x1111222233334444ull;
> > return 0;
> > }
> > +
> > +__u64 test11_entry_result = 0;
> > +__u64 test11_exit_result = 0;
> > +
> > +SEC("fsession/bpf_fentry_test1")
> > +int BPF_PROG(test11, int a, int ret)
> > +{
> > + __u64 *cookie = bpf_fsession_cookie(ctx);
> > +
> > + if (!bpf_fsession_is_return(ctx)) {
> > + test11_entry_result = a == 1 && ret == 0;
> > + *cookie = 0x123456ULL;
>
> Should the cookie pointer be checked for NULL before dereferencing?
>
> Looking at the existing tests in this file, both test9 and test10 (which
> also call bpf_fsession_cookie) explicitly check if the cookie pointer is
> NULL before dereferencing it:
>
> test9: if (cookie) { *cookie = 0xAAAA...; }
> test10: if (cookie) { *cookie = 0x1111...; }
The check is unnecessary, as bpf_fsession_cookie() can't
return NULL. I'll remove these checks from the tests in the next
version.
Thanks!
Menglong Dong
>
> test11 dereferences cookie without this check. If cookie were NULL or
> invalid, test11 would crash rather than fail gracefully like test9/test10
> do. While the BPF verifier likely ensures cookie is valid when the
> program is loaded, following the established pattern would make the test
> more robust.
>
> The same issue occurs in the exit path:
>
> > + return 0;
> > + }
> > +
> > + test11_exit_result = a == 1 && ret == 2 && *cookie == 0x123456ULL;
> ^^^^^^^
>
> > + return 0;
> > +}
> > +
> > +__u64 test12_result = 0;
> > +SEC("fexit/bpf_fentry_test1")
> > +int BPF_PROG(test12, int a, int ret)
> > +{
> > + test12_result = a == 1 && ret == 2;
> > + return 0;
> > +}
> > +
> > +__u64 test13_result = 0;
> > +SEC("fentry/bpf_fentry_test1")
> > +int BPF_PROG(test13, int a)
> > +{
> > + test13_result = a == 1;
> > + return 0;
> > +}
* Re: [PATCH bpf-next v4 8/9] selftests/bpf: add testcases for tracing session
2025-12-17 10:24 ` bot+bpf-ci
@ 2025-12-17 11:42 ` Menglong Dong
0 siblings, 0 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-17 11:42 UTC (permalink / raw)
To: menglong8.dong, ast, andrii, bot+bpf-ci
Cc: davem, dsahern, daniel, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx, mingo, bp,
dave.hansen, x86, hpa, netdev, bpf, linux-kernel, ast, andrii,
daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai
On 2025/12/17 18:24, bot+bpf-ci@kernel.org wrote:
> > diff --git a/tools/testing/selftests/bpf/progs/fsession_test.c b/tools/testing/selftests/bpf/progs/fsession_test.c
> > new file mode 100644
> > index 000000000000..f7c96ef1c7a9
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/fsession_test.c
> > @@ -0,0 +1,192 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2025 ChinaTelecom */
> > +#include <vmlinux.h>
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +
> > +char _license[] SEC("license") = "GPL";
>
> [ ... ]
>
> > +__u64 test8_entry_result = 0;
> > +__u64 test8_exit_result = 0;
> > +SEC("fsession/bpf_fentry_test1")
> > +int BPF_PROG(test8, int a)
> > +{
> > + __u64 addr = bpf_get_func_ip(ctx);
> > +
> > + if (bpf_fsession_is_return(ctx))
> > + test8_exit_result = (const void *) addr == &bpf_fentry_test1;
> > + else
> > + test8_entry_result = (const void *) addr == &bpf_fentry_test1;
> > + return 0;
> > +}
>
> In test8, the code uses &bpf_fentry_test1 without declaring it. Looking at
> similar tests like kprobe_multi.c and kprobe_multi_session.c, they include:
>
> extern const void bpf_fentry_test1 __ksym;
>
> at the top of the file before taking the address of kernel symbols.
> Without the __ksym declaration, does the comparison work correctly, or
> would this cause the test to fail during compilation or at runtime?
It seems that bpf_fentry_test1 is generated into vmlinux.h, which is
why it works at both compile time and runtime.
Thanks!
Menglong Dong
* Re: [PATCH bpf-next v4 0/9] bpf: tracing session supporting
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
` (8 preceding siblings ...)
2025-12-17 9:54 ` [PATCH bpf-next v4 9/9] selftests/bpf: test fsession mixed with fentry and fexit Menglong Dong
@ 2025-12-19 0:55 ` Andrii Nakryiko
2025-12-19 1:18 ` Menglong Dong
9 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2025-12-19 0:55 UTC (permalink / raw)
To: Menglong Dong
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On Wed, Dec 17, 2025 at 1:54 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> Hi, all.
>
> In this version, I combined Alexei and Andrii's advice, which makes the
> architecture specific code much simpler.
>
> Sometimes, we need to hook both the entry and exit of a function with
> TRACING. Therefore, we need define a FENTRY and a FEXIT for the target
> function, which is not convenient.
>
> Therefore, we add a tracing session support for TRACING. Generally
> speaking, it's similar to kprobe session, which can hook both the entry
> and exit of a function with a single BPF program. Session cookie is also
> supported with the kfunc bpf_fsession_cookie(). In order to limit the
> stack usage, we limit the maximum number of cookies to 4.
>
> The kfunc bpf_fsession_is_return() and bpf_fsession_cookie() are both
> inlined in the verifier.
We have the generic bpf_session_is_return() and bpf_session_cookie() (which
currently work for ksession); can't you just implement them for the
newly added program type instead of adding type-specific kfuncs?
>
> We allow the usage of bpf_get_func_ret() to get the return value in the
> fentry of the tracing session, as it will always get "0", which is safe
> enough and is OK. Maybe we can prohibit the usage of bpf_get_func_ret()
> in the fentry in verifier, which can make the architecture specific code
> simpler.
>
> The fsession stuff is arch related, so the -EOPNOTSUPP will be returned if
> it is not supported yet by the arch. In this series, we only support
> x86_64. And later, other arch will be implemented.
>
> Changes since v3:
> * instead of adding a new hlist to progs_hlist in trampoline, add the bpf
> program to both the fentry hlist and the fexit hlist.
> * introduce the 2nd patch to reuse the nr_args field in the stack to
> store all the information we need(except the session cookies).
> * limit the maximum number of cookies to 4.
> * remove the logic to skip fexit if the fentry return non-zero.
>
> Changes since v2:
> * squeeze some patches:
> - the 2 patches for the kfunc bpf_tracing_is_exit() and
> bpf_fsession_cookie() are merged into the second patch.
> - the testcases for fsession are also squeezed.
>
> * fix the CI error by move the testcase for bpf_get_func_ip to
> fsession_test.c
>
> Changes since v1:
> * session cookie support.
> In this version, session cookie is implemented, and the kfunc
> bpf_fsession_cookie() is added.
>
> * restructure the layout of the stack.
> In this version, the session stuff that stored in the stack is changed,
> and we locate them after the return value to not break
> bpf_get_func_ip().
>
> * testcase enhancement.
> Some nits in the testcase that suggested by Jiri is fixed. Meanwhile,
> the testcase for get_func_ip and session cookie is added too.
>
> Menglong Dong (9):
> bpf: add tracing session support
> bpf: use last 8-bits for the nr_args in trampoline
> bpf: add the kfunc bpf_fsession_is_return
> bpf: add the kfunc bpf_fsession_cookie
> bpf,x86: introduce emit_st_r0_imm64() for trampoline
> bpf,x86: add tracing session supporting for x86_64
> libbpf: add support for tracing session
> selftests/bpf: add testcases for tracing session
> selftests/bpf: test fsession mixed with fentry and fexit
>
> arch/x86/net/bpf_jit_comp.c | 47 +++-
> include/linux/bpf.h | 39 +++
> include/uapi/linux/bpf.h | 1 +
> kernel/bpf/btf.c | 2 +
> kernel/bpf/syscall.c | 18 +-
> kernel/bpf/trampoline.c | 50 +++-
> kernel/bpf/verifier.c | 75 ++++--
> kernel/trace/bpf_trace.c | 56 ++++-
> net/bpf/test_run.c | 1 +
> net/core/bpf_sk_storage.c | 1 +
> tools/bpf/bpftool/common.c | 1 +
> tools/include/uapi/linux/bpf.h | 1 +
> tools/lib/bpf/bpf.c | 2 +
> tools/lib/bpf/libbpf.c | 3 +
> .../selftests/bpf/prog_tests/fsession_test.c | 90 +++++++
> .../bpf/prog_tests/tracing_failure.c | 2 +-
> .../selftests/bpf/progs/fsession_test.c | 226 ++++++++++++++++++
> 17 files changed, 571 insertions(+), 44 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/fsession_test.c
> create mode 100644 tools/testing/selftests/bpf/progs/fsession_test.c
>
> --
> 2.52.0
>
* Re: [PATCH bpf-next v4 1/9] bpf: add tracing session support
2025-12-17 9:54 ` [PATCH bpf-next v4 1/9] bpf: add tracing session support Menglong Dong
@ 2025-12-19 0:55 ` Andrii Nakryiko
2025-12-19 1:24 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2025-12-19 0:55 UTC (permalink / raw)
To: Menglong Dong
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> The tracing session is something that similar to kprobe session. It allow
> to attach a single BPF program to both the entry and the exit of the
> target functions.
>
> Introduce the struct bpf_fsession_link, which allows to add the link to
> both the fentry and fexit progs_hlist of the trampoline.
>
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> ---
> v4:
> - instead of adding a new hlist to progs_hlist in trampoline, add the bpf
> program to both the fentry hlist and the fexit hlist.
> ---
> include/linux/bpf.h | 20 +++++++++++
> include/uapi/linux/bpf.h | 1 +
> kernel/bpf/btf.c | 2 ++
> kernel/bpf/syscall.c | 18 +++++++++-
> kernel/bpf/trampoline.c | 36 +++++++++++++++----
> kernel/bpf/verifier.c | 12 +++++--
> net/bpf/test_run.c | 1 +
> net/core/bpf_sk_storage.c | 1 +
> tools/include/uapi/linux/bpf.h | 1 +
> .../bpf/prog_tests/tracing_failure.c | 2 +-
> 10 files changed, 83 insertions(+), 11 deletions(-)
>
[...]
> int bpf_prog_ctx_arg_info_init(struct bpf_prog *prog,
> const struct bpf_ctx_arg_aux *info, u32 cnt);
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 84ced3ed2d21..696a7d37db0e 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1145,6 +1145,7 @@ enum bpf_attach_type {
> BPF_NETKIT_PEER,
> BPF_TRACE_KPROBE_SESSION,
> BPF_TRACE_UPROBE_SESSION,
> + BPF_TRACE_SESSION,
FSESSION for consistency with FENTRY and FEXIT
> __MAX_BPF_ATTACH_TYPE
> };
>
[...]
> {
> - enum bpf_tramp_prog_type kind;
> - struct bpf_tramp_link *link_exiting;
> + enum bpf_tramp_prog_type kind, okind;
> + struct bpf_tramp_link *link_existing;
> + struct bpf_fsession_link *fslink;
> int err = 0;
> int cnt = 0, i;
>
> - kind = bpf_attach_type_to_tramp(link->link.prog);
> + okind = kind = bpf_attach_type_to_tramp(link->link.prog);
> if (tr->extension_prog)
> /* cannot attach fentry/fexit if extension prog is attached.
> * cannot overwrite extension prog either.
> @@ -621,13 +624,18 @@ static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
> BPF_MOD_JUMP, NULL,
> link->link.prog->bpf_func);
> }
> + if (kind == BPF_TRAMP_SESSION) {
> + /* deal with fsession as fentry by default */
> + kind = BPF_TRAMP_FENTRY;
> + cnt++;
> + }
this "pretend we are BPF_TRAMP_FENTRY" looks a bit hacky and is very
hard to follow. I think it would be cleaner to have explicit small
special cases for BPF_TRAMP_SESSION, and then generalize
hlist_for_each_entry case by using a local variable for storing
&tr->progs_hlist[kind] (which for TRAMP_SESSION you'll set to
&tr->progs_hlist[BPF_TRAMP_FENTRY]). You'll then just do extra
hlist_add_head/hlist_del_init and count manipulation. IMO, it's better
than keeping in head what kind and okind is...
> if (cnt >= BPF_MAX_TRAMP_LINKS)
> return -E2BIG;
> if (!hlist_unhashed(&link->tramp_hlist))
> /* prog already linked */
> return -EBUSY;
> - hlist_for_each_entry(link_exiting, &tr->progs_hlist[kind], tramp_hlist) {
> - if (link_exiting->link.prog != link->link.prog)
> + hlist_for_each_entry(link_existing, &tr->progs_hlist[kind], tramp_hlist) {
> + if (link_existing->link.prog != link->link.prog)
> continue;
> /* prog already linked */
> return -EBUSY;
[...]
> @@ -23298,6 +23299,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> if (prog_type == BPF_PROG_TYPE_TRACING &&
> insn->imm == BPF_FUNC_get_func_ret) {
> if (eatype == BPF_TRACE_FEXIT ||
> + eatype == BPF_TRACE_SESSION ||
> eatype == BPF_MODIFY_RETURN) {
> /* Load nr_args from ctx - 8 */
> insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
> @@ -24242,7 +24244,8 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> if (tgt_prog->type == BPF_PROG_TYPE_TRACING &&
> prog_extension &&
> (tgt_prog->expected_attach_type == BPF_TRACE_FENTRY ||
> - tgt_prog->expected_attach_type == BPF_TRACE_FEXIT)) {
> + tgt_prog->expected_attach_type == BPF_TRACE_FEXIT ||
> + tgt_prog->expected_attach_type == BPF_TRACE_SESSION)) {
> /* Program extensions can extend all program types
> * except fentry/fexit. The reason is the following.
> * The fentry/fexit programs are used for performance
> @@ -24257,7 +24260,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> * beyond reasonable stack size. Hence extending fentry
> * is not allowed.
> */
> - bpf_log(log, "Cannot extend fentry/fexit\n");
> + bpf_log(log, "Cannot extend fentry/fexit/session\n");
fsession?
> return -EINVAL;
> }
> } else {
> @@ -24341,6 +24344,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> case BPF_LSM_CGROUP:
> case BPF_TRACE_FENTRY:
> case BPF_TRACE_FEXIT:
> + case BPF_TRACE_SESSION:
> if (!btf_type_is_func(t)) {
> bpf_log(log, "attach_btf_id %u is not a function\n",
> btf_id);
[...]
* Re: [PATCH bpf-next v4 4/9] bpf: add the kfunc bpf_fsession_cookie
2025-12-17 9:54 ` [PATCH bpf-next v4 4/9] bpf: add the kfunc bpf_fsession_cookie Menglong Dong
@ 2025-12-19 0:55 ` Andrii Nakryiko
2025-12-19 1:31 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2025-12-19 0:55 UTC (permalink / raw)
To: Menglong Dong
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> Implement session cookie for fsession. In order to limit the stack usage,
> we make 4 as the maximum of the cookie count.
>
> The offset of the current cookie is stored in the
> "(ctx[-1] >> BPF_TRAMP_M_COOKIE) & 0xFF". Therefore, we can get the
> session cookie with ctx[-offset].
>
> The stack will look like this:
>
> return value -> 8 bytes
> argN -> 8 bytes
> ...
> arg1 -> 8 bytes
> nr_args -> 8 bytes
> ip(optional) -> 8 bytes
> cookie2 -> 8 bytes
> cookie1 -> 8 bytes
>
> Inline the bpf_fsession_cookie() in the verifier too.
>
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---
> v4:
> - limit the maximum of the cookie count to 4
> - store the session cookies before nr_regs in stack
> ---
> include/linux/bpf.h | 16 ++++++++++++++++
> kernel/bpf/trampoline.c | 14 +++++++++++++-
> kernel/bpf/verifier.c | 20 ++++++++++++++++++--
> kernel/trace/bpf_trace.c | 9 +++++++++
> 4 files changed, 56 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index d165ace5cc9b..0f35c6ab538c 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1215,6 +1215,7 @@ enum {
>
> #define BPF_TRAMP_M_NR_ARGS 0
> #define BPF_TRAMP_M_IS_RETURN 8
> +#define BPF_TRAMP_M_COOKIE 9
>
> struct bpf_tramp_links {
> struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
> @@ -1318,6 +1319,7 @@ struct bpf_trampoline {
> struct mutex mutex;
> refcount_t refcnt;
> u32 flags;
> + int cookie_cnt;
can't you just count this each time you need to know instead of
keeping track of this? it's not that expensive and won't happen that
frequently (and we keep lock on trampoline, so it's also safe and
race-free to count)
> u64 key;
> struct {
> struct btf_func_model model;
> @@ -1762,6 +1764,7 @@ struct bpf_prog {
> enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
> call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
> call_get_func_ip:1, /* Do we call get_func_ip() */
> + call_session_cookie:1, /* Do we call bpf_fsession_cookie() */
> tstamp_type_access:1, /* Accessed __sk_buff->tstamp_type */
> sleepable:1; /* BPF program is sleepable */
> enum bpf_prog_type type; /* Type of BPF program */
[...]
* Re: [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting for x86_64
2025-12-17 9:54 ` [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting for x86_64 Menglong Dong
@ 2025-12-19 0:55 ` Andrii Nakryiko
2025-12-19 1:41 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2025-12-19 0:55 UTC (permalink / raw)
To: Menglong Dong
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> Add BPF_TRACE_SESSION supporting to x86_64, including:
>
> 1. clear the return value in the stack before fentry to make the fentry
> of the fsession can only get 0 with bpf_get_func_ret(). If we can limit
> that bpf_get_func_ret() can only be used in the
> "bpf_fsession_is_return() == true" code path, we don't need do this
> thing anymore.
What does bpf_get_func_ret() return today for fentry? zero or just
random garbage? If the latter, we can keep the same semantics for
fsession on entry. Ultimately, result of bpf_get_func_ret() is
meaningless outside of fexit/session-exit
>
> 2. clear all the session cookies' value in the stack. If we can make sure
> that the reading to session cookie can only be done after initialize in
> the verifier, we don't need this anymore.
>
> 2. store the index of the cookie to ctx[-1] before the calling to fsession
>
> 3. store the "is_return" flag to ctx[-1] before the calling to fexit of
> the fsession.
>
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> ---
> v4:
> - some adjustment to the 1st patch, such as we get the fsession prog from
> fentry and fexit hlist
> - remove the supporting of skipping fexit with fentry return non-zero
>
> v2:
> - add session cookie support
> - add the session stuff after return value, instead of before nr_args
> ---
> arch/x86/net/bpf_jit_comp.c | 36 +++++++++++++++++++++++++++++++-----
> 1 file changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 8cbeefb26192..99b0223374bd 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -3086,12 +3086,17 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
> static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> struct bpf_tramp_links *tl, int stack_size,
> int run_ctx_off, bool save_ret,
> - void *image, void *rw_image)
> + void *image, void *rw_image, u64 nr_regs)
> {
> int i;
> u8 *prog = *pprog;
>
> for (i = 0; i < tl->nr_links; i++) {
> + if (tl->links[i]->link.prog->call_session_cookie) {
> + /* 'stack_size + 8' is the offset of nr_regs in stack */
> + emit_st_r0_imm64(&prog, nr_regs, stack_size + 8);
> + nr_regs -= (1 << BPF_TRAMP_M_COOKIE);
you have to rename nr_regs to something more meaningful because it's
so weird to see some bit manipulations with *number of arguments*
> + }
> if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
> run_ctx_off, save_ret, image, rw_image))
> return -EINVAL;
> @@ -3208,8 +3213,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> struct bpf_tramp_links *tlinks,
> void *func_addr)
> {
> - int i, ret, nr_regs = m->nr_args, stack_size = 0;
> - int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off;
> + int i, ret, nr_regs = m->nr_args, cookie_cnt, stack_size = 0;
> + int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off,
> + cookie_off;
if it doesn't fit on a single line, just `int cookie_off;` on a
separate line, why wrap the line?
> struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
[...]
* Re: [PATCH bpf-next v4 7/9] libbpf: add support for tracing session
2025-12-17 9:54 ` [PATCH bpf-next v4 7/9] libbpf: add support for tracing session Menglong Dong
@ 2025-12-19 0:55 ` Andrii Nakryiko
2025-12-19 1:42 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2025-12-19 0:55 UTC (permalink / raw)
To: Menglong Dong
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> Add BPF_TRACE_SESSION to libbpf and bpftool.
>
> Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> ---
> tools/bpf/bpftool/common.c | 1 +
> tools/lib/bpf/bpf.c | 2 ++
> tools/lib/bpf/libbpf.c | 3 +++
> 3 files changed, 6 insertions(+)
>
> diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
> index e8daf963ecef..534be6cfa2be 100644
> --- a/tools/bpf/bpftool/common.c
> +++ b/tools/bpf/bpftool/common.c
> @@ -1191,6 +1191,7 @@ const char *bpf_attach_type_input_str(enum bpf_attach_type t)
> case BPF_TRACE_FENTRY: return "fentry";
> case BPF_TRACE_FEXIT: return "fexit";
> case BPF_MODIFY_RETURN: return "mod_ret";
> + case BPF_TRACE_SESSION: return "fsession";
> case BPF_SK_REUSEPORT_SELECT: return "sk_skb_reuseport_select";
> case BPF_SK_REUSEPORT_SELECT_OR_MIGRATE: return "sk_skb_reuseport_select_or_migrate";
> default: return libbpf_bpf_attach_type_str(t);
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 21b57a629916..5042df4a5df7 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -794,6 +794,7 @@ int bpf_link_create(int prog_fd, int target_fd,
> case BPF_TRACE_FENTRY:
> case BPF_TRACE_FEXIT:
> case BPF_MODIFY_RETURN:
> + case BPF_TRACE_SESSION:
> case BPF_LSM_MAC:
> attr.link_create.tracing.cookie = OPTS_GET(opts, tracing.cookie, 0);
> if (!OPTS_ZEROED(opts, tracing))
> @@ -917,6 +918,7 @@ int bpf_link_create(int prog_fd, int target_fd,
> case BPF_TRACE_FENTRY:
> case BPF_TRACE_FEXIT:
> case BPF_MODIFY_RETURN:
> + case BPF_TRACE_SESSION:
no need, this is a legacy fallback path for programs that were (at
some point for older kernels) attachable only through
BPF_RAW_TRACEPOINT_OPEN. BPF_LINK_CREATE is sufficient, drop this
line.
> return bpf_raw_tracepoint_open(NULL, prog_fd);
> default:
> return libbpf_err(err);
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index c7c79014d46c..0c095195df31 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -115,6 +115,7 @@ static const char * const attach_type_name[] = {
> [BPF_TRACE_FENTRY] = "trace_fentry",
> [BPF_TRACE_FEXIT] = "trace_fexit",
> [BPF_MODIFY_RETURN] = "modify_return",
> + [BPF_TRACE_SESSION] = "trace_session",
let's use fsession terminology consistently
> [BPF_LSM_MAC] = "lsm_mac",
> [BPF_LSM_CGROUP] = "lsm_cgroup",
> [BPF_SK_LOOKUP] = "sk_lookup",
> @@ -9853,6 +9854,8 @@ static const struct bpf_sec_def section_defs[] = {
> SEC_DEF("fentry.s+", TRACING, BPF_TRACE_FENTRY, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> SEC_DEF("fmod_ret.s+", TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> SEC_DEF("fexit.s+", TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> + SEC_DEF("fsession+", TRACING, BPF_TRACE_SESSION, SEC_ATTACH_BTF, attach_trace),
> + SEC_DEF("fsession.s+", TRACING, BPF_TRACE_SESSION, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> SEC_DEF("freplace+", EXT, 0, SEC_ATTACH_BTF, attach_trace),
> SEC_DEF("lsm+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
> SEC_DEF("lsm.s+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
> --
> 2.52.0
>
* Re: [PATCH bpf-next v4 0/9] bpf: tracing session supporting
2025-12-19 0:55 ` [PATCH bpf-next v4 0/9] bpf: tracing session supporting Andrii Nakryiko
@ 2025-12-19 1:18 ` Menglong Dong
2025-12-19 16:55 ` Andrii Nakryiko
0 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-19 1:18 UTC (permalink / raw)
To: Menglong Dong, Andrii Nakryiko
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On 2025/12/19 08:55, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Wed, Dec 17, 2025 at 1:54 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > Hi, all.
> >
> > In this version, I combined Alexei and Andrii's advice, which makes the
> > architecture specific code much simpler.
> >
> > Sometimes, we need to hook both the entry and exit of a function with
> > TRACING. Therefore, we need define a FENTRY and a FEXIT for the target
> > function, which is not convenient.
> >
> > Therefore, we add a tracing session support for TRACING. Generally
> > speaking, it's similar to kprobe session, which can hook both the entry
> > and exit of a function with a single BPF program. Session cookie is also
> > supported with the kfunc bpf_fsession_cookie(). In order to limit the
> > stack usage, we limit the maximum number of cookies to 4.
> >
> > The kfunc bpf_fsession_is_return() and bpf_fsession_cookie() are both
> > inlined in the verifier.
>
> We have generic bpf_session_is_return() and bpf_session_cookie() (that
> currently works for ksession), can't you just implement them for the
> newly added program type instead of adding type-specific kfuncs?
Hi, Andrii. I tried, and found that it's a little hard to reuse them.
bpf_session_is_return() and bpf_session_cookie() are defined as kfuncs, which
means we can't provide different implementations for different attach types,
the way BPF helpers can.
The way fsession stores "is_return" and the cookie also differs from
ksession: ksession keeps "is_return" in struct bpf_session_run_ctx.
Even if we moved "nr_regs" from the stack into struct bpf_tramp_run_ctx,
it would still be hard to reuse bpf_session_is_return() or bpf_session_cookie(),
as fsession and ksession store "is_return" and the cookie differently, and
unifying them would be rather difficult and complex.
What's more, we would lose the advantage of inlining bpf_fsession_is_return()
and bpf_fsession_cookie() in the verifier.
I'll look into whether there is a simpler way to reuse them.
Thanks!
Menglong Dong
>
> >
> > We allow the usage of bpf_get_func_ret() to get the return value in the
> > fentry of the tracing session, as it will always get "0", which is safe
> > enough and is OK. Maybe we can prohibit the usage of bpf_get_func_ret()
> > in the fentry in verifier, which can make the architecture specific code
> > simpler.
> >
> > The fsession stuff is arch related, so the -EOPNOTSUPP will be returned if
> > it is not supported yet by the arch. In this series, we only support
> > x86_64. And later, other arch will be implemented.
> >
> > Changes since v3:
> > * instead of adding a new hlist to progs_hlist in trampoline, add the bpf
> > program to both the fentry hlist and the fexit hlist.
> > * introduce the 2nd patch to reuse the nr_args field in the stack to
> > store all the information we need(except the session cookies).
> > * limit the maximum number of cookies to 4.
> > * remove the logic to skip fexit if the fentry return non-zero.
> >
> > Changes since v2:
> > * squeeze some patches:
> > - the 2 patches for the kfunc bpf_tracing_is_exit() and
> > bpf_fsession_cookie() are merged into the second patch.
> > - the testcases for fsession are also squeezed.
> >
> > * fix the CI error by move the testcase for bpf_get_func_ip to
> > fsession_test.c
> >
> > Changes since v1:
> > * session cookie support.
> > In this version, session cookie is implemented, and the kfunc
> > bpf_fsession_cookie() is added.
> >
> > * restructure the layout of the stack.
> > In this version, the session stuff that stored in the stack is changed,
> > and we locate them after the return value to not break
> > bpf_get_func_ip().
> >
> > * testcase enhancement.
> > Some nits in the testcase that suggested by Jiri is fixed. Meanwhile,
> > the testcase for get_func_ip and session cookie is added too.
> >
> > Menglong Dong (9):
> > bpf: add tracing session support
> > bpf: use last 8-bits for the nr_args in trampoline
> > bpf: add the kfunc bpf_fsession_is_return
> > bpf: add the kfunc bpf_fsession_cookie
> > bpf,x86: introduce emit_st_r0_imm64() for trampoline
> > bpf,x86: add tracing session supporting for x86_64
> > libbpf: add support for tracing session
> > selftests/bpf: add testcases for tracing session
> > selftests/bpf: test fsession mixed with fentry and fexit
> >
> > arch/x86/net/bpf_jit_comp.c | 47 +++-
> > include/linux/bpf.h | 39 +++
> > include/uapi/linux/bpf.h | 1 +
> > kernel/bpf/btf.c | 2 +
> > kernel/bpf/syscall.c | 18 +-
> > kernel/bpf/trampoline.c | 50 +++-
> > kernel/bpf/verifier.c | 75 ++++--
> > kernel/trace/bpf_trace.c | 56 ++++-
> > net/bpf/test_run.c | 1 +
> > net/core/bpf_sk_storage.c | 1 +
> > tools/bpf/bpftool/common.c | 1 +
> > tools/include/uapi/linux/bpf.h | 1 +
> > tools/lib/bpf/bpf.c | 2 +
> > tools/lib/bpf/libbpf.c | 3 +
> > .../selftests/bpf/prog_tests/fsession_test.c | 90 +++++++
> > .../bpf/prog_tests/tracing_failure.c | 2 +-
> > .../selftests/bpf/progs/fsession_test.c | 226 ++++++++++++++++++
> > 17 files changed, 571 insertions(+), 44 deletions(-)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/fsession_test.c
> > create mode 100644 tools/testing/selftests/bpf/progs/fsession_test.c
> >
> > --
> > 2.52.0
> >
>
* Re: [PATCH bpf-next v4 1/9] bpf: add tracing session support
2025-12-19 0:55 ` Andrii Nakryiko
@ 2025-12-19 1:24 ` Menglong Dong
0 siblings, 0 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-19 1:24 UTC (permalink / raw)
To: Menglong Dong, Andrii Nakryiko
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On 2025/12/19 08:55, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > The tracing session is something that similar to kprobe session. It allow
> > to attach a single BPF program to both the entry and the exit of the
> > target functions.
> >
> > Introduce the struct bpf_fsession_link, which allows to add the link to
> > both the fentry and fexit progs_hlist of the trampoline.
> >
> > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
> > Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> > ---
> > v4:
> > - instead of adding a new hlist to progs_hlist in trampoline, add the bpf
> > program to both the fentry hlist and the fexit hlist.
> > ---
> > include/linux/bpf.h | 20 +++++++++++
> > include/uapi/linux/bpf.h | 1 +
> > kernel/bpf/btf.c | 2 ++
> > kernel/bpf/syscall.c | 18 +++++++++-
> > kernel/bpf/trampoline.c | 36 +++++++++++++++----
> > kernel/bpf/verifier.c | 12 +++++--
> > net/bpf/test_run.c | 1 +
> > net/core/bpf_sk_storage.c | 1 +
> > tools/include/uapi/linux/bpf.h | 1 +
> > .../bpf/prog_tests/tracing_failure.c | 2 +-
> > 10 files changed, 83 insertions(+), 11 deletions(-)
> >
>
> [...]
>
> > int bpf_prog_ctx_arg_info_init(struct bpf_prog *prog,
> > const struct bpf_ctx_arg_aux *info, u32 cnt);
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 84ced3ed2d21..696a7d37db0e 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1145,6 +1145,7 @@ enum bpf_attach_type {
> > BPF_NETKIT_PEER,
> > BPF_TRACE_KPROBE_SESSION,
> > BPF_TRACE_UPROBE_SESSION,
> > + BPF_TRACE_SESSION,
>
> FSESSION for consistency with FENTRY and FEXIT
OK
>
> > __MAX_BPF_ATTACH_TYPE
> > };
> >
>
> [...]
>
> > {
> > - enum bpf_tramp_prog_type kind;
> > - struct bpf_tramp_link *link_exiting;
> > + enum bpf_tramp_prog_type kind, okind;
> > + struct bpf_tramp_link *link_existing;
> > + struct bpf_fsession_link *fslink;
> > int err = 0;
> > int cnt = 0, i;
> >
> > - kind = bpf_attach_type_to_tramp(link->link.prog);
> > + okind = kind = bpf_attach_type_to_tramp(link->link.prog);
> > if (tr->extension_prog)
> > /* cannot attach fentry/fexit if extension prog is attached.
> > * cannot overwrite extension prog either.
> > @@ -621,13 +624,18 @@ static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link,
> > BPF_MOD_JUMP, NULL,
> > link->link.prog->bpf_func);
> > }
> > + if (kind == BPF_TRAMP_SESSION) {
> > + /* deal with fsession as fentry by default */
> > + kind = BPF_TRAMP_FENTRY;
> > + cnt++;
> > + }
>
> this "pretend we are BPF_TRAMP_FENTRY" looks a bit hacky and is very
> hard to follow. I think it would be cleaner to have explicit small
> special cases for BPF_TRAMP_SESSION, and then generalize
> hlist_for_each_entry case by using a local variable for storing
> &tr->progs_hlist[kind] (which for TRAMP_SESSION you'll set to
> &tr->progs_hlist[BPF_TRAMP_FENTRY]). You'll then just do extra
> hlist_add_head/hlist_del_init and count manipulation. IMO, it's better
> than keeping in head what kind and okind is...
Ah, the current approach does seem a little hacky. I tried to keep this
series looking less complex by modifying as little code as possible.
I'll handle it explicitly here as you advised.
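Andrii's suggestion above could be sketched roughly like the following userspace mock, with the kernel's hlists reduced to plain arrays. The function name, `MAX_LINKS` stand-in, and the exact accounting are illustrative assumptions, not the real trampoline code; the point is only that a session program is inserted into both the fentry and fexit lists explicitly and occupies two slots toward the link limit.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified mock: a session program is added to both the fentry and
 * the fexit list explicitly, instead of pretending its kind is FENTRY.
 * Names loosely mirror the kernel code; structures are reduced to
 * arrays for illustration. */
enum kind { FENTRY, FEXIT, SESSION };
#define MAX_LINKS 38 /* stand-in for BPF_MAX_TRAMP_LINKS */

struct tramp {
	const void *fentry[MAX_LINKS];
	const void *fexit[MAX_LINKS];
	int nentry, nexit;
};

static int link_prog(struct tramp *tr, const void *prog, enum kind kind)
{
	/* a session program occupies a slot in both lists */
	int cnt = tr->nentry + tr->nexit + (kind == SESSION ? 2 : 1);

	if (cnt > MAX_LINKS)
		return -1; /* -E2BIG in the kernel */
	if (kind == FENTRY || kind == SESSION)
		tr->fentry[tr->nentry++] = prog;
	if (kind == FEXIT || kind == SESSION)
		tr->fexit[tr->nexit++] = prog;
	return 0;
}
```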
>
>
> > if (cnt >= BPF_MAX_TRAMP_LINKS)
> > return -E2BIG;
> > if (!hlist_unhashed(&link->tramp_hlist))
> > /* prog already linked */
> > return -EBUSY;
> > - hlist_for_each_entry(link_exiting, &tr->progs_hlist[kind], tramp_hlist) {
> > - if (link_exiting->link.prog != link->link.prog)
> > + hlist_for_each_entry(link_existing, &tr->progs_hlist[kind], tramp_hlist) {
> > + if (link_existing->link.prog != link->link.prog)
> > continue;
> > /* prog already linked */
> > return -EBUSY;
>
> [...]
>
> > @@ -23298,6 +23299,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > if (prog_type == BPF_PROG_TYPE_TRACING &&
> > insn->imm == BPF_FUNC_get_func_ret) {
> > if (eatype == BPF_TRACE_FEXIT ||
> > + eatype == BPF_TRACE_SESSION ||
> > eatype == BPF_MODIFY_RETURN) {
> > /* Load nr_args from ctx - 8 */
> > insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
> > @@ -24242,7 +24244,8 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> > if (tgt_prog->type == BPF_PROG_TYPE_TRACING &&
> > prog_extension &&
> > (tgt_prog->expected_attach_type == BPF_TRACE_FENTRY ||
> > - tgt_prog->expected_attach_type == BPF_TRACE_FEXIT)) {
> > + tgt_prog->expected_attach_type == BPF_TRACE_FEXIT ||
> > + tgt_prog->expected_attach_type == BPF_TRACE_SESSION)) {
> > /* Program extensions can extend all program types
> > * except fentry/fexit. The reason is the following.
> > * The fentry/fexit programs are used for performance
> > @@ -24257,7 +24260,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> > * beyond reasonable stack size. Hence extending fentry
> > * is not allowed.
> > */
> > - bpf_log(log, "Cannot extend fentry/fexit\n");
> > + bpf_log(log, "Cannot extend fentry/fexit/session\n");
>
> fsession?
OK
Thanks!
Menglong Dong
>
> > return -EINVAL;
> > }
> > } else {
> > @@ -24341,6 +24344,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> > case BPF_LSM_CGROUP:
> > case BPF_TRACE_FENTRY:
> > case BPF_TRACE_FEXIT:
> > + case BPF_TRACE_SESSION:
> > if (!btf_type_is_func(t)) {
> > bpf_log(log, "attach_btf_id %u is not a function\n",
> > btf_id);
>
> [...]
>
* Re: [PATCH bpf-next v4 4/9] bpf: add the kfunc bpf_fsession_cookie
2025-12-19 0:55 ` Andrii Nakryiko
@ 2025-12-19 1:31 ` Menglong Dong
2025-12-19 12:01 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-19 1:31 UTC (permalink / raw)
To: Menglong Dong, Andrii Nakryiko
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > Implement session cookie for fsession. In order to limit the stack usage,
> > we make 4 as the maximum of the cookie count.
> >
> > The offset of the current cookie is stored in the
> > "(ctx[-1] >> BPF_TRAMP_M_COOKIE) & 0xFF". Therefore, we can get the
> > session cookie with ctx[-offset].
> >
> > The stack will look like this:
> >
> > return value -> 8 bytes
> > argN -> 8 bytes
> > ...
> > arg1 -> 8 bytes
> > nr_args -> 8 bytes
> > ip(optional) -> 8 bytes
> > cookie2 -> 8 bytes
> > cookie1 -> 8 bytes
> >
> > Inline the bpf_fsession_cookie() in the verifer too.
> >
> > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > ---
> > v4:
> > - limit the maximum of the cookie count to 4
> > - store the session cookies before nr_regs in stack
> > ---
> > include/linux/bpf.h | 16 ++++++++++++++++
> > kernel/bpf/trampoline.c | 14 +++++++++++++-
> > kernel/bpf/verifier.c | 20 ++++++++++++++++++--
> > kernel/trace/bpf_trace.c | 9 +++++++++
> > 4 files changed, 56 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index d165ace5cc9b..0f35c6ab538c 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1215,6 +1215,7 @@ enum {
> >
> > #define BPF_TRAMP_M_NR_ARGS 0
> > #define BPF_TRAMP_M_IS_RETURN 8
> > +#define BPF_TRAMP_M_COOKIE 9
> >
> > struct bpf_tramp_links {
> > struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
> > @@ -1318,6 +1319,7 @@ struct bpf_trampoline {
> > struct mutex mutex;
> > refcount_t refcnt;
> > u32 flags;
> > + int cookie_cnt;
>
> can't you just count this each time you need to know instead of
> keeping track of this? it's not that expensive and won't happen that
> frequently (and we keep lock on trampoline, so it's also safe and
> race-free to count)
There is a for-loop below that uses "cookie_cnt" to clear all the
cookies to zero. We limited the maximum cookie_cnt to 4, so
I guess we can count it directly there. I'll change it in the
next version.
Thanks!
Menglong Dong
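Counting on demand, as suggested, might look like the following simplified userspace sketch. The `call_session_cookie` flag mirrors the bitfield added in this patch; `struct prog`, the function name, and the error convention are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of counting cookie slots on demand instead of caching a
 * cookie_cnt field in the trampoline: walk the attached programs and
 * count the ones that call bpf_fsession_cookie(), enforcing the
 * limit of 4 cookies. */
#define MAX_COOKIES 4

struct prog {
	bool call_session_cookie; /* mirrors the bpf_prog bitfield */
};

static int count_cookies(const struct prog *progs, int nr)
{
	int cnt = 0;

	for (int i = 0; i < nr; i++)
		if (progs[i].call_session_cookie)
			cnt++;
	return cnt > MAX_COOKIES ? -1 : cnt;
}
```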
>
> > u64 key;
> > struct {
> > struct btf_func_model model;
> > @@ -1762,6 +1764,7 @@ struct bpf_prog {
> > enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
> > call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
> > call_get_func_ip:1, /* Do we call get_func_ip() */
> > + call_session_cookie:1, /* Do we call bpf_fsession_cookie() */
> > tstamp_type_access:1, /* Accessed __sk_buff->tstamp_type */
> > sleepable:1; /* BPF program is sleepable */
> > enum bpf_prog_type type; /* Type of BPF program */
>
> [...]
>
* Re: [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting for x86_64
2025-12-19 0:55 ` Andrii Nakryiko
@ 2025-12-19 1:41 ` Menglong Dong
2025-12-19 16:56 ` Andrii Nakryiko
0 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-19 1:41 UTC (permalink / raw)
To: Menglong Dong, Andrii Nakryiko
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > Add BPF_TRACE_SESSION supporting to x86_64, including:
> >
> > 1. clear the return value in the stack before fentry to make the fentry
> > of the fsession can only get 0 with bpf_get_func_ret(). If we can limit
> > that bpf_get_func_ret() can only be used in the
> > "bpf_fsession_is_return() == true" code path, we don't need do this
> > thing anymore.
>
> What does bpf_get_func_ret() return today for fentry? zero or just
> random garbage? If the latter, we can keep the same semantics for
> fsession on entry. Ultimately, result of bpf_get_func_ret() is
> meaningless outside of fexit/session-exit
For fentry, bpf_get_func_ret() is not allowed to be called. For fsession,
I think the best way would be to allow bpf_get_func_ret() in the
"bpf_fsession_is_return() == true" branch and prohibit it in the
"bpf_fsession_is_return() == false" branch. However, we would need to
track that condition in the verifier, which would complicate things. So
I think we can allow the usage of bpf_get_func_ret() in fsession and
make sure it always returns zero in the fsession fentry for now.
Thanks!
Menglong Dong
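The semantics discussed here (the entry half of an fsession program can only observe 0 from bpf_get_func_ret(), while the exit half sees the real return value) can be sketched as a userspace mock of the trampoline's call sequence. All names are illustrative stand-ins, not the real JIT-emitted code.

```c
#include <assert.h>
#include <stdint.h>

/* Mock of the trampoline sequence for an fsession program: the
 * return-value slot on the stack is cleared before the entry call,
 * so "bpf_get_func_ret()" in the entry half can only read 0; the
 * target's real return value is stored before the exit call. */
static uint64_t ret_slot;
static uint64_t seen_on_entry, seen_on_exit;

static void session_prog(int is_return)
{
	/* stand-in for bpf_get_func_ret() reading the stack slot */
	if (is_return)
		seen_on_exit = ret_slot;
	else
		seen_on_entry = ret_slot;
}

static uint64_t traced_func(void)
{
	return 42; /* the traced kernel function's return value */
}

static void trampoline(void)
{
	ret_slot = 0;             /* cleared before the entry call */
	session_prog(0);          /* entry: reads 0 */
	ret_slot = traced_func(); /* call the target, save its return */
	session_prog(1);          /* exit: reads the real value */
}
```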
>
> >
> > 2. clear all the session cookies' value in the stack. If we can make sure
> > that the reading to session cookie can only be done after initialize in
> > the verifier, we don't need this anymore.
> >
> > 2. store the index of the cookie to ctx[-1] before the calling to fsession
> >
> > 3. store the "is_return" flag to ctx[-1] before the calling to fexit of
> > the fsession.
> >
> > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
> > Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> > ---
> > v4:
> > - some adjustment to the 1st patch, such as we get the fsession prog from
> > fentry and fexit hlist
> > - remove the supporting of skipping fexit with fentry return non-zero
> >
> > v2:
> > - add session cookie support
> > - add the session stuff after return value, instead of before nr_args
> > ---
> > arch/x86/net/bpf_jit_comp.c | 36 +++++++++++++++++++++++++++++++-----
> > 1 file changed, 31 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 8cbeefb26192..99b0223374bd 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -3086,12 +3086,17 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
> > static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > struct bpf_tramp_links *tl, int stack_size,
> > int run_ctx_off, bool save_ret,
> > - void *image, void *rw_image)
> > + void *image, void *rw_image, u64 nr_regs)
> > {
> > int i;
> > u8 *prog = *pprog;
> >
> > for (i = 0; i < tl->nr_links; i++) {
> > + if (tl->links[i]->link.prog->call_session_cookie) {
> > + /* 'stack_size + 8' is the offset of nr_regs in stack */
> > + emit_st_r0_imm64(&prog, nr_regs, stack_size + 8);
> > + nr_regs -= (1 << BPF_TRAMP_M_COOKIE);
>
> you have to rename nr_regs to something more meaningful because it's
> so weird to see some bit manipulations with *number of arguments*
>
> > + }
> > if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
> > run_ctx_off, save_ret, image, rw_image))
> > return -EINVAL;
> > @@ -3208,8 +3213,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> > struct bpf_tramp_links *tlinks,
> > void *func_addr)
> > {
> > - int i, ret, nr_regs = m->nr_args, stack_size = 0;
> > - int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off;
> > + int i, ret, nr_regs = m->nr_args, cookie_cnt, stack_size = 0;
> > + int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off,
> > + cookie_off;
>
> if it doesn't fit on a single line, just `int cookie_off;` on a
> separate line, why wrap the line?
>
> > struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> > struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> > struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
>
> [...]
>
* Re: [PATCH bpf-next v4 7/9] libbpf: add support for tracing session
2025-12-19 0:55 ` Andrii Nakryiko
@ 2025-12-19 1:42 ` Menglong Dong
0 siblings, 0 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-19 1:42 UTC (permalink / raw)
To: Menglong Dong, Andrii Nakryiko
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > Add BPF_TRACE_SESSION to libbpf and bpftool.
> >
> > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > ---
> > tools/bpf/bpftool/common.c | 1 +
> > tools/lib/bpf/bpf.c | 2 ++
> > tools/lib/bpf/libbpf.c | 3 +++
> > 3 files changed, 6 insertions(+)
> >
> > diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
> > index e8daf963ecef..534be6cfa2be 100644
> > --- a/tools/bpf/bpftool/common.c
> > +++ b/tools/bpf/bpftool/common.c
> > @@ -1191,6 +1191,7 @@ const char *bpf_attach_type_input_str(enum bpf_attach_type t)
> > case BPF_TRACE_FENTRY: return "fentry";
> > case BPF_TRACE_FEXIT: return "fexit";
> > case BPF_MODIFY_RETURN: return "mod_ret";
> > + case BPF_TRACE_SESSION: return "fsession";
> > case BPF_SK_REUSEPORT_SELECT: return "sk_skb_reuseport_select";
> > case BPF_SK_REUSEPORT_SELECT_OR_MIGRATE: return "sk_skb_reuseport_select_or_migrate";
> > default: return libbpf_bpf_attach_type_str(t);
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index 21b57a629916..5042df4a5df7 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
> > @@ -794,6 +794,7 @@ int bpf_link_create(int prog_fd, int target_fd,
> > case BPF_TRACE_FENTRY:
> > case BPF_TRACE_FEXIT:
> > case BPF_MODIFY_RETURN:
> > + case BPF_TRACE_SESSION:
> > case BPF_LSM_MAC:
> > attr.link_create.tracing.cookie = OPTS_GET(opts, tracing.cookie, 0);
> > if (!OPTS_ZEROED(opts, tracing))
> > @@ -917,6 +918,7 @@ int bpf_link_create(int prog_fd, int target_fd,
> > case BPF_TRACE_FENTRY:
> > case BPF_TRACE_FEXIT:
> > case BPF_MODIFY_RETURN:
> > + case BPF_TRACE_SESSION:
>
> no need, this is a legacy fallback path for programs that were (at
> some point for older kernels) attachable only through
> BPF_RAW_TRACEPOINT_OPEN. BPF_LINK_CREATE is sufficient, drop this
> line.
OK, I see.
>
> > return bpf_raw_tracepoint_open(NULL, prog_fd);
> > default:
> > return libbpf_err(err);
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index c7c79014d46c..0c095195df31 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -115,6 +115,7 @@ static const char * const attach_type_name[] = {
> > [BPF_TRACE_FENTRY] = "trace_fentry",
> > [BPF_TRACE_FEXIT] = "trace_fexit",
> > [BPF_MODIFY_RETURN] = "modify_return",
> > + [BPF_TRACE_SESSION] = "trace_session",
>
> let's use fsession terminology consistently
OK, so we will use "trace_fsession" here.
Thanks!
Menglong Dong
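For reference, a user-side program using the "fsession+" section definition added in this patch might look roughly like the sketch below. The kfunc names come from the cover letter, but their signatures are assumed by analogy with bpf_session_is_return()/bpf_session_cookie(); the target function and the latency logic are purely illustrative and not tested against this series.

```c
/* Hypothetical fsession usage sketch; signatures are assumptions. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

extern bool bpf_fsession_is_return(void) __ksym;
extern __u64 *bpf_fsession_cookie(void) __ksym;

SEC("fsession/do_nanosleep")
int BPF_PROG(handle_both)
{
	__u64 *cookie = bpf_fsession_cookie();

	if (!cookie)
		return 0;
	if (!bpf_fsession_is_return()) {
		*cookie = bpf_ktime_get_ns(); /* stash entry timestamp */
		return 0;
	}
	bpf_printk("latency: %llu ns", bpf_ktime_get_ns() - *cookie);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```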
>
>
> > [BPF_LSM_MAC] = "lsm_mac",
> > [BPF_LSM_CGROUP] = "lsm_cgroup",
> > [BPF_SK_LOOKUP] = "sk_lookup",
> > @@ -9853,6 +9854,8 @@ static const struct bpf_sec_def section_defs[] = {
> > SEC_DEF("fentry.s+", TRACING, BPF_TRACE_FENTRY, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> > SEC_DEF("fmod_ret.s+", TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> > SEC_DEF("fexit.s+", TRACING, BPF_TRACE_FEXIT, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> > + SEC_DEF("fsession+", TRACING, BPF_TRACE_SESSION, SEC_ATTACH_BTF, attach_trace),
> > + SEC_DEF("fsession.s+", TRACING, BPF_TRACE_SESSION, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_trace),
> > SEC_DEF("freplace+", EXT, 0, SEC_ATTACH_BTF, attach_trace),
> > SEC_DEF("lsm+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
> > SEC_DEF("lsm.s+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
> > --
> > 2.52.0
> >
>
* Re: [PATCH bpf-next v4 4/9] bpf: add the kfunc bpf_fsession_cookie
2025-12-19 1:31 ` Menglong Dong
@ 2025-12-19 12:01 ` Menglong Dong
0 siblings, 0 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-19 12:01 UTC (permalink / raw)
To: Menglong Dong, Andrii Nakryiko
Cc: ast, andrii, davem, dsahern, daniel, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, tglx,
mingo, bp, dave.hansen, x86, hpa, netdev, bpf, linux-kernel
On 2025/12/19 09:31, Menglong Dong wrote:
> On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@gmail.com> write:
> > On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > >
> > > Implement session cookie for fsession. In order to limit the stack usage,
> > > we make 4 as the maximum of the cookie count.
> > >
> > > The offset of the current cookie is stored in the
> > > "(ctx[-1] >> BPF_TRAMP_M_COOKIE) & 0xFF". Therefore, we can get the
> > > session cookie with ctx[-offset].
> > >
> > > The stack will look like this:
> > >
> > > return value -> 8 bytes
> > > argN -> 8 bytes
> > > ...
> > > arg1 -> 8 bytes
> > > nr_args -> 8 bytes
> > > ip(optional) -> 8 bytes
> > > cookie2 -> 8 bytes
> > > cookie1 -> 8 bytes
> > >
> > > Inline the bpf_fsession_cookie() in the verifer too.
> > >
> > > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > > ---
> > > v4:
> > > - limit the maximum of the cookie count to 4
> > > - store the session cookies before nr_regs in stack
> > > ---
> > > include/linux/bpf.h | 16 ++++++++++++++++
> > > kernel/bpf/trampoline.c | 14 +++++++++++++-
> > > kernel/bpf/verifier.c | 20 ++++++++++++++++++--
> > > kernel/trace/bpf_trace.c | 9 +++++++++
> > > 4 files changed, 56 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index d165ace5cc9b..0f35c6ab538c 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -1215,6 +1215,7 @@ enum {
> > >
> > > #define BPF_TRAMP_M_NR_ARGS 0
> > > #define BPF_TRAMP_M_IS_RETURN 8
> > > +#define BPF_TRAMP_M_COOKIE 9
> > >
> > > struct bpf_tramp_links {
> > > struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
> > > @@ -1318,6 +1319,7 @@ struct bpf_trampoline {
> > > struct mutex mutex;
> > > refcount_t refcnt;
> > > u32 flags;
> > > + int cookie_cnt;
> >
> > can't you just count this each time you need to know instead of
> > keeping track of this? it's not that expensive and won't happen that
> > frequently (and we keep lock on trampoline, so it's also safe and
> > race-free to count)
>
> There is a for-loop below that use the "cookie_cnt" to clear all the
> cookie to zero. We limited the maximum of cookie_cnt to 4, so
> I guess we can count it directly there. I'll change it in the
> next version.
Sorry, I mixed it up with the 5th patch. I'll remove this cookie_cnt
and count it directly in __bpf_trampoline_link_prog(). As for the
other comments on the patch: all ACK.
Thanks!
Menglong Dong
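The metadata word described in the quoted commit message (nr_args in the low bits, the is_return flag at bit 8, and the cookie offset read as "(ctx[-1] >> BPF_TRAMP_M_COOKIE) & 0xFF") can be illustrated with a small self-contained demo. The three BPF_TRAMP_M_* values are taken from the patch; the pack/extract helpers are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout of the word stored at ctx[-1], per the patch:
 * bits 0-7: nr_args, bit 8: is_return, bits 9+: cookie offset. */
#define BPF_TRAMP_M_NR_ARGS   0
#define BPF_TRAMP_M_IS_RETURN 8
#define BPF_TRAMP_M_COOKIE    9

/* hypothetical helper packing the fields the trampoline would emit */
static uint64_t pack(uint64_t nr_args, uint64_t is_return,
		     uint64_t cookie_off)
{
	return (nr_args << BPF_TRAMP_M_NR_ARGS) |
	       (is_return << BPF_TRAMP_M_IS_RETURN) |
	       (cookie_off << BPF_TRAMP_M_COOKIE);
}

static uint64_t get_nr_args(uint64_t m)
{
	return m & 0xFF;
}

static uint64_t get_is_return(uint64_t m)
{
	return (m >> BPF_TRAMP_M_IS_RETURN) & 1;
}

static uint64_t get_cookie_off(uint64_t m)
{
	/* matches "(ctx[-1] >> BPF_TRAMP_M_COOKIE) & 0xFF" */
	return (m >> BPF_TRAMP_M_COOKIE) & 0xFF;
}
```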
>
> Thanks!
> Menglong Dong
>
> >
> > > u64 key;
> > > struct {
> > > struct btf_func_model model;
> > > @@ -1762,6 +1764,7 @@ struct bpf_prog {
> > > enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
> > > call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
> > > call_get_func_ip:1, /* Do we call get_func_ip() */
> > > + call_session_cookie:1, /* Do we call bpf_fsession_cookie() */
> > > tstamp_type_access:1, /* Accessed __sk_buff->tstamp_type */
> > > sleepable:1; /* BPF program is sleepable */
> > > enum bpf_prog_type type; /* Type of BPF program */
> >
> > [...]
> >
>
>
>
>
>
* Re: [PATCH bpf-next v4 0/9] bpf: tracing session supporting
2025-12-19 1:18 ` Menglong Dong
@ 2025-12-19 16:55 ` Andrii Nakryiko
2025-12-20 1:12 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2025-12-19 16:55 UTC (permalink / raw)
To: Menglong Dong
Cc: Menglong Dong, ast, andrii, davem, dsahern, daniel, martin.lau,
eddyz87, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, netdev,
bpf, linux-kernel
On Thu, Dec 18, 2025 at 5:18 PM Menglong Dong <menglong.dong@linux.dev> wrote:
>
> On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@gmail.com> write:
> > On Wed, Dec 17, 2025 at 1:54 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > >
> > > Hi, all.
> > >
> > > In this version, I combined Alexei and Andrii's advice, which makes the
> > > architecture specific code much simpler.
> > >
> > > Sometimes, we need to hook both the entry and exit of a function with
> > > TRACING. Therefore, we need define a FENTRY and a FEXIT for the target
> > > function, which is not convenient.
> > >
> > > Therefore, we add a tracing session support for TRACING. Generally
> > > speaking, it's similar to kprobe session, which can hook both the entry
> > > and exit of a function with a single BPF program. Session cookie is also
> > > supported with the kfunc bpf_fsession_cookie(). In order to limit the
> > > stack usage, we limit the maximum number of cookies to 4.
> > >
> > > The kfunc bpf_fsession_is_return() and bpf_fsession_cookie() are both
> > > inlined in the verifier.
> >
> > We have generic bpf_session_is_return() and bpf_session_cookie() (that
> > currently works for ksession), can't you just implement them for the
> > newly added program type instead of adding type-specific kfuncs?
>
Hi, Andrii. I tried and found that it's a little hard to reuse them.
bpf_session_is_return() and bpf_session_cookie() are defined as kfuncs,
which means we can't provide different implementations for different
attach types, the way BPF helpers do.
Are you sure? We certainly support kfunc implementation specialization
for sleepable vs non-sleepable BPF programs. Check specialize_kfunc()
in verifier.c
>
The way we store "is_return" and "cookie" in fsession differs from
ksession. For ksession, "is_return" is stored in struct bpf_session_run_ctx.
Even if we moved "nr_regs" from the stack to struct bpf_tramp_run_ctx,
it would still be hard to reuse bpf_session_is_return() or
bpf_session_cookie(), as fsession and ksession store "is_return" and the
cookie differently, and unifying them would be difficult and complex.
I'm not saying we should unify the implementations; you would have to
implement a different version of logically the same kfunc, of course.
>
What's more, we would lose the advantage of inlining
bpf_fsession_is_return() and bpf_fsession_cookie() in the verifier.
>
I'd double-check that too. The BPF verifier and JIT do know the program
type, so you can pick how to inline
bpf_session_is_return()/bpf_session_cookie() based on that.
> I'll check more to see if there is a more simple way to reuse them.
>
> Thanks!
> Menglong Dong
>
> >
> > >
> > > We allow the usage of bpf_get_func_ret() to get the return value in the
> > > fentry of the tracing session, as it will always get "0", which is safe
> > > enough and is OK. Maybe we can prohibit the usage of bpf_get_func_ret()
> > > in the fentry in verifier, which can make the architecture specific code
> > > simpler.
> > >
> > > The fsession stuff is arch related, so the -EOPNOTSUPP will be returned if
> > > it is not supported yet by the arch. In this series, we only support
> > > x86_64. And later, other arch will be implemented.
> > >
> > > Changes since v3:
> > > * instead of adding a new hlist to progs_hlist in trampoline, add the bpf
> > > program to both the fentry hlist and the fexit hlist.
> > > * introduce the 2nd patch to reuse the nr_args field in the stack to
> > > store all the information we need(except the session cookies).
> > > * limit the maximum number of cookies to 4.
> > > * remove the logic to skip fexit if the fentry return non-zero.
> > >
> > > Changes since v2:
> > > * squeeze some patches:
> > > - the 2 patches for the kfunc bpf_tracing_is_exit() and
> > > bpf_fsession_cookie() are merged into the second patch.
> > > - the testcases for fsession are also squeezed.
> > >
> > > * fix the CI error by move the testcase for bpf_get_func_ip to
> > > fsession_test.c
> > >
> > > Changes since v1:
> > > * session cookie support.
> > > In this version, session cookie is implemented, and the kfunc
> > > bpf_fsession_cookie() is added.
> > >
> > > * restructure the layout of the stack.
> > > In this version, the session stuff that stored in the stack is changed,
> > > and we locate them after the return value to not break
> > > bpf_get_func_ip().
> > >
> > > * testcase enhancement.
> > > Some nits in the testcase that suggested by Jiri is fixed. Meanwhile,
> > > the testcase for get_func_ip and session cookie is added too.
> > >
> > > Menglong Dong (9):
> > > bpf: add tracing session support
> > > bpf: use last 8-bits for the nr_args in trampoline
> > > bpf: add the kfunc bpf_fsession_is_return
> > > bpf: add the kfunc bpf_fsession_cookie
> > > bpf,x86: introduce emit_st_r0_imm64() for trampoline
> > > bpf,x86: add tracing session supporting for x86_64
> > > libbpf: add support for tracing session
> > > selftests/bpf: add testcases for tracing session
> > > selftests/bpf: test fsession mixed with fentry and fexit
> > >
> > > arch/x86/net/bpf_jit_comp.c | 47 +++-
> > > include/linux/bpf.h | 39 +++
> > > include/uapi/linux/bpf.h | 1 +
> > > kernel/bpf/btf.c | 2 +
> > > kernel/bpf/syscall.c | 18 +-
> > > kernel/bpf/trampoline.c | 50 +++-
> > > kernel/bpf/verifier.c | 75 ++++--
> > > kernel/trace/bpf_trace.c | 56 ++++-
> > > net/bpf/test_run.c | 1 +
> > > net/core/bpf_sk_storage.c | 1 +
> > > tools/bpf/bpftool/common.c | 1 +
> > > tools/include/uapi/linux/bpf.h | 1 +
> > > tools/lib/bpf/bpf.c | 2 +
> > > tools/lib/bpf/libbpf.c | 3 +
> > > .../selftests/bpf/prog_tests/fsession_test.c | 90 +++++++
> > > .../bpf/prog_tests/tracing_failure.c | 2 +-
> > > .../selftests/bpf/progs/fsession_test.c | 226 ++++++++++++++++++
> > > 17 files changed, 571 insertions(+), 44 deletions(-)
> > > create mode 100644 tools/testing/selftests/bpf/prog_tests/fsession_test.c
> > > create mode 100644 tools/testing/selftests/bpf/progs/fsession_test.c
> > >
> > > --
> > > 2.52.0
> > >
> >
>
>
>
>
* Re: [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting for x86_64
2025-12-19 1:41 ` Menglong Dong
@ 2025-12-19 16:56 ` Andrii Nakryiko
0 siblings, 0 replies; 30+ messages in thread
From: Andrii Nakryiko @ 2025-12-19 16:56 UTC (permalink / raw)
To: Menglong Dong
Cc: Menglong Dong, ast, andrii, davem, dsahern, daniel, martin.lau,
eddyz87, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, netdev,
bpf, linux-kernel
On Thu, Dec 18, 2025 at 5:42 PM Menglong Dong <menglong.dong@linux.dev> wrote:
>
> On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@gmail.com> write:
> > On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > >
> > > Add BPF_TRACE_SESSION supporting to x86_64, including:
> > >
> > > 1. clear the return value in the stack before fentry to make the fentry
> > > of the fsession can only get 0 with bpf_get_func_ret(). If we can limit
> > > that bpf_get_func_ret() can only be used in the
> > > "bpf_fsession_is_return() == true" code path, we don't need do this
> > > thing anymore.
> >
> > What does bpf_get_func_ret() return today for fentry? zero or just
> > random garbage? If the latter, we can keep the same semantics for
> > fsession on entry. Ultimately, result of bpf_get_func_ret() is
> > meaningless outside of fexit/session-exit
>
> For fentry, bpf_get_func_ret() is not allowed to be called. For fsession,
> I think the best way is that we allow to call bpf_get_func_ret() in the
> "bpf_fsession_is_return() == true" branch, and prohibit it in
> "bpf_fsession_is_return() == false" branch. However, we need to track
> such condition in verifier, which will make things complicated. So
> I think we can allow the usage of bpf_get_func_ret() in fsession and
> make sure it will always get zero in the fsession-fentry for now.
yeah, that's fine. and the assembly complication is not that big: just
zero out a slot on the stack, right? I think it's fine.
>
> Thanks!
> Menglong Dong
>
> >
> > >
> > > 2. clear all the session cookies' value in the stack. If we can make sure
> > > that the reading to session cookie can only be done after initialize in
> > > the verifier, we don't need this anymore.
> > >
> > > 2. store the index of the cookie to ctx[-1] before the calling to fsession
> > >
> > > 3. store the "is_return" flag to ctx[-1] before the calling to fexit of
> > > the fsession.
> > >
> > > Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
> > > Co-developed-by: Leon Hwang <leon.hwang@linux.dev>
> > > Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> > > ---
> > > v4:
> > > - some adjustment to the 1st patch, such as we get the fsession prog from
> > > fentry and fexit hlist
> > > - remove the supporting of skipping fexit with fentry return non-zero
> > >
> > > v2:
> > > - add session cookie support
> > > - add the session stuff after return value, instead of before nr_args
> > > ---
> > > arch/x86/net/bpf_jit_comp.c | 36 +++++++++++++++++++++++++++++++-----
> > > 1 file changed, 31 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > index 8cbeefb26192..99b0223374bd 100644
> > > --- a/arch/x86/net/bpf_jit_comp.c
> > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > @@ -3086,12 +3086,17 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
> > > static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > > struct bpf_tramp_links *tl, int stack_size,
> > > int run_ctx_off, bool save_ret,
> > > - void *image, void *rw_image)
> > > + void *image, void *rw_image, u64 nr_regs)
> > > {
> > > int i;
> > > u8 *prog = *pprog;
> > >
> > > for (i = 0; i < tl->nr_links; i++) {
> > > + if (tl->links[i]->link.prog->call_session_cookie) {
> > > + /* 'stack_size + 8' is the offset of nr_regs in stack */
> > > + emit_st_r0_imm64(&prog, nr_regs, stack_size + 8);
> > > + nr_regs -= (1 << BPF_TRAMP_M_COOKIE);
> >
> > you have to rename nr_regs to something more meaningful because it's
> > so weird to see some bit manipulations with *number of arguments*
> >
> > > + }
> > > if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
> > > run_ctx_off, save_ret, image, rw_image))
> > > return -EINVAL;
> > > @@ -3208,8 +3213,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> > > struct bpf_tramp_links *tlinks,
> > > void *func_addr)
> > > {
> > > - int i, ret, nr_regs = m->nr_args, stack_size = 0;
> > > - int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off;
> > > + int i, ret, nr_regs = m->nr_args, cookie_cnt, stack_size = 0;
> > > + int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off,
> > > + cookie_off;
> >
> > if it doesn't fit on a single line, just `int cookie_off;` on a
> > separate line, why wrap the line?
> >
> > > struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> > > struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> > > struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> >
> > [...]
> >
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v4 0/9] bpf: tracing session supporting
2025-12-19 16:55 ` Andrii Nakryiko
@ 2025-12-20 1:12 ` Menglong Dong
2025-12-20 9:01 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-20 1:12 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Menglong Dong, ast, andrii, davem, dsahern, daniel, martin.lau,
eddyz87, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, netdev,
bpf, linux-kernel
On 2025/12/20 00:55, Andrii Nakryiko wrote:
> On Thu, Dec 18, 2025 at 5:18 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> >
> > On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@gmail.com> write:
> > > On Wed, Dec 17, 2025 at 1:54 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > >
> > > > Hi, all.
> > > >
> > > > In this version, I combined Alexei and Andrii's advice, which makes the
> > > > architecture specific code much simpler.
> > > >
> > > > Sometimes, we need to hook both the entry and exit of a function with
> > > > TRACING. Therefore, we need define a FENTRY and a FEXIT for the target
> > > > function, which is not convenient.
> > > >
> > > > Therefore, we add a tracing session support for TRACING. Generally
> > > > speaking, it's similar to kprobe session, which can hook both the entry
> > > > and exit of a function with a single BPF program. Session cookie is also
> > > > supported with the kfunc bpf_fsession_cookie(). In order to limit the
> > > > stack usage, we limit the maximum number of cookies to 4.
> > > >
> > > > The kfunc bpf_fsession_is_return() and bpf_fsession_cookie() are both
> > > > inlined in the verifier.
> > >
> > > We have generic bpf_session_is_return() and bpf_session_cookie() (that
> > > currently works for ksession), can't you just implement them for the
> > > newly added program type instead of adding type-specific kfuncs?
> >
> > Hi, Andrii. I tried and found that it's a little hard to reuse them. The
> > bpf_session_is_return() and bpf_session_cookie() are defined as kfunc, which
> > makes we can't implement different functions for different attach type, like
> > what bpf helper does.
>
> Are you sure? We certainly support kfunc implementation specialization
> for sleepable vs non-sleepable BPF programs. Check specialize_kfunc()
> in verifier.c
Ah, I remember it now. We can indeed use different kfunc versions
for different cases in specialize_kfunc().
>
> >
> > The way we store "is_return" and "cookie" in fsession is different with
> > ksession. For ksession, it store the "is_return" in struct bpf_session_run_ctx.
> > Even if we move the "nr_regs" from stack to struct bpf_tramp_run_ctx,
> > it's still hard to reuse the bpf_session_is_return() or bpf_session_cookie(),
> > as the way of storing the "is_return" and "cookie" in fsession and ksession
> > is different, and it's a little difficult and complex to unify them.
>
> I'm not saying we should unify the implementation, you have to
> implement different version of logically the same kfunc, of course.
I see. The problem now is that the prototypes of bpf_session_cookie()
and bpf_session_is_return() don't satisfy our needs. For bpf_session_cookie(),
we at least need the context as an argument, but neither of them takes
any argument. After all, different versions of logically the same kfunc
should share the same prototype.
I don't think it's a good idea to modify the prototype of an existing
kfunc, is it?
>
> >
> > What's more, we will lose the advantage of inline bpf_fsession_is_return
> > and bpf_fsession_cookie in verifier.
> >
>
> I'd double check that either. BPF verifier and JIT do know program
> type, so you can pick how to inline
> bpf_session_is_return()/bpf_session_cookie() based on that.
Yeah, we can inline it depending on the program type if we can solve
the prototype problem.
Thanks!
Menglong Dong
>
> > I'll check more to see if there is a more simple way to reuse them.
> >
> > Thanks!
> > Menglong Dong
> >
> > >
[...]
* Re: [PATCH bpf-next v4 0/9] bpf: tracing session supporting
2025-12-20 1:12 ` Menglong Dong
@ 2025-12-20 9:01 ` Menglong Dong
2025-12-20 12:22 ` Menglong Dong
0 siblings, 1 reply; 30+ messages in thread
From: Menglong Dong @ 2025-12-20 9:01 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Menglong Dong, ast, andrii, davem, dsahern, daniel, martin.lau,
eddyz87, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, netdev,
bpf, linux-kernel
On 2025/12/20 09:12, Menglong Dong wrote:
> On 2025/12/20 00:55, Andrii Nakryiko wrote:
> > On Thu, Dec 18, 2025 at 5:18 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> > >
> > > On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@gmail.com> write:
> > > > On Wed, Dec 17, 2025 at 1:54 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > > >
> > > > > Hi, all.
> > > > >
> > > > > In this version, I combined Alexei and Andrii's advice, which makes the
> > > > > architecture specific code much simpler.
> > > > >
> > > > > Sometimes, we need to hook both the entry and exit of a function with
> > > > > TRACING. Therefore, we need define a FENTRY and a FEXIT for the target
> > > > > function, which is not convenient.
> > > > >
> > > > > Therefore, we add a tracing session support for TRACING. Generally
> > > > > speaking, it's similar to kprobe session, which can hook both the entry
> > > > > and exit of a function with a single BPF program. Session cookie is also
> > > > > supported with the kfunc bpf_fsession_cookie(). In order to limit the
> > > > > stack usage, we limit the maximum number of cookies to 4.
> > > > >
> > > > > The kfunc bpf_fsession_is_return() and bpf_fsession_cookie() are both
> > > > > inlined in the verifier.
> > > >
> > > > We have generic bpf_session_is_return() and bpf_session_cookie() (that
> > > > currently works for ksession), can't you just implement them for the
> > > > newly added program type instead of adding type-specific kfuncs?
> > >
> > > Hi, Andrii. I tried and found that it's a little hard to reuse them. The
> > > bpf_session_is_return() and bpf_session_cookie() are defined as kfunc, which
> > > makes we can't implement different functions for different attach type, like
> > > what bpf helper does.
> >
> > Are you sure? We certainly support kfunc implementation specialization
> > for sleepable vs non-sleepable BPF programs. Check specialize_kfunc()
> > in verifier.c
>
> Ah, I remember it now. We do can use different kfunc version
> for different case in specialize_kfunc().
>
> >
> > >
> > > The way we store "is_return" and "cookie" in fsession is different with
> > > ksession. For ksession, it store the "is_return" in struct bpf_session_run_ctx.
> > > Even if we move the "nr_regs" from stack to struct bpf_tramp_run_ctx,
> > > it's still hard to reuse the bpf_session_is_return() or bpf_session_cookie(),
> > > as the way of storing the "is_return" and "cookie" in fsession and ksession
> > > is different, and it's a little difficult and complex to unify them.
> >
> > I'm not saying we should unify the implementation, you have to
> > implement different version of logically the same kfunc, of course.
>
> I see. The problem now is that the prototype of bpf_session_cookie()
> or bpf_session_is_return() don't satisfy our need. For bpf_session_cookie(),
> we at least need the context to be the argument. However, both
> of them don't have any function argument. After all, the prototype of
> different version of logically the same kfunc should be the same.
Hi, Andrii. I see that you want to make the API consistent between
ksession and fsession, which is more friendly to the user.
After my analysis, I think we have the following approaches:
1. Change the function prototypes of bpf_session_cookie() and
bpf_session_is_return() to:
bool bpf_session_is_return(void *ctx);
__u64 *bpf_session_cookie(void *ctx);
And we do the fixup in specialize_kfunc(), which I think is the easiest
way. The drawback is that it will break existing users.
2. Define a fixup_kfunc_call_early() and call it in add_subprog_and_kfunc().
In fixup_kfunc_call_early(), we change the target kfunc (which is insn->imm)
from bpf_session_cookie() to bpf_fsession_cookie(). For bpf_session_cookie(),
we change its prototype to:
__bpf_kfunc __u64 *bpf_session_cookie(void *ctx__ign)
Therefore, it won't break existing users, and ksession programs that use
the old prototype can still pass the verifier. Following is a demo patch
of this approach. This way, we also leave room to extend the prototype of
a kfunc in the future.
What do you think?
Thanks!
Menglong Dong
>patch<
+static int fixup_kfunc_call_early(struct bpf_verifier_env *env, struct bpf_insn *insn)
+{
+ struct bpf_prog *prog = env->prog;
+
+ if (prog->expected_attach_type == BPF_TRACE_FSESSION) {
+ if (insn->imm == special_kfunc_list[KF_bpf_session_cookie])
+ insn->imm = special_kfunc_list[KF_bpf_fsession_cookie];
+ else if (insn->imm == special_kfunc_list[KF_bpf_session_is_return])
+ insn->imm = special_kfunc_list[KF_bpf_fsession_is_return];
+ }
+
+ return 0;
+}
@@ -3489,10 +3490,12 @@ static int add_subprog_and_kfunc(struct bpf_verifier_env *env)
return -EPERM;
}
- if (bpf_pseudo_func(insn) || bpf_pseudo_call(insn))
+ if (bpf_pseudo_func(insn) || bpf_pseudo_call(insn)) {
ret = add_subprog(env, i + insn->imm + 1);
- else
- ret = add_kfunc_call(env, insn->imm, insn->off);
+ } else {
+ ret = fixup_kfunc_call_early(env, insn);
+ ret = ret ?: add_kfunc_call(env, insn->imm, insn->off);
+ }
@@ -3316,7 +3321,7 @@ static u64 bpf_uprobe_multi_entry_ip(struct bpf_run_ctx *ctx)
__bpf_kfunc_start_defs();
-__bpf_kfunc bool bpf_session_is_return(void)
+__bpf_kfunc bool bpf_session_is_return(void *ctx__ign)
{
struct bpf_session_run_ctx *session_ctx;
@@ -3324,7 +3329,7 @@ __bpf_kfunc bool bpf_session_is_return(void)
return session_ctx->is_return;
}
-__bpf_kfunc __u64 *bpf_session_cookie(void)
+__bpf_kfunc __u64 *bpf_session_cookie(void *ctx__ign)
{
struct bpf_session_run_ctx *session_ctx;
>
> I think it's not a good idea to modify the prototype of existing kfunc,
> can we?
>
> >
> > >
> > > What's more, we will lose the advantage of inline bpf_fsession_is_return
> > > and bpf_fsession_cookie in verifier.
> > >
> >
> > I'd double check that either. BPF verifier and JIT do know program
> > type, so you can pick how to inline
> > bpf_session_is_return()/bpf_session_cookie() based on that.
>
> Yeah, we can inline it depend on the program type if we can solve
> the prototype problem.
>
> Thanks!
> Menglong Dong
>
>
> >
> > > I'll check more to see if there is a more simple way to reuse them.
> > >
> > > Thanks!
> > > Menglong Dong
> > >
> > > >
> [...]
* Re: [PATCH bpf-next v4 0/9] bpf: tracing session supporting
2025-12-20 9:01 ` Menglong Dong
@ 2025-12-20 12:22 ` Menglong Dong
0 siblings, 0 replies; 30+ messages in thread
From: Menglong Dong @ 2025-12-20 12:22 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Menglong Dong, ast, andrii, davem, dsahern, daniel, martin.lau,
eddyz87, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, netdev,
bpf, linux-kernel
On 2025/12/20 17:01, Menglong Dong wrote:
> On 2025/12/20 09:12, Menglong Dong wrote:
> > On 2025/12/20 00:55, Andrii Nakryiko wrote:
> > > On Thu, Dec 18, 2025 at 5:18 PM Menglong Dong <menglong.dong@linux.dev> wrote:
> > > >
> > > > On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@gmail.com> write:
> > > > > On Wed, Dec 17, 2025 at 1:54 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > > > >
> > > > > > Hi, all.
> > > > > >
> > > > > > In this version, I combined Alexei and Andrii's advice, which makes the
> > > > > > architecture specific code much simpler.
> > > > > >
> > > > > > Sometimes, we need to hook both the entry and exit of a function with
> > > > > > TRACING. Therefore, we need define a FENTRY and a FEXIT for the target
> > > > > > function, which is not convenient.
> > > > > >
> > > > > > Therefore, we add a tracing session support for TRACING. Generally
> > > > > > speaking, it's similar to kprobe session, which can hook both the entry
> > > > > > and exit of a function with a single BPF program. Session cookie is also
> > > > > > supported with the kfunc bpf_fsession_cookie(). In order to limit the
> > > > > > stack usage, we limit the maximum number of cookies to 4.
> > > > > >
> > > > > > The kfunc bpf_fsession_is_return() and bpf_fsession_cookie() are both
> > > > > > inlined in the verifier.
> > > > >
> > > > > We have generic bpf_session_is_return() and bpf_session_cookie() (that
> > > > > currently works for ksession), can't you just implement them for the
> > > > > newly added program type instead of adding type-specific kfuncs?
> > > >
> > > > Hi, Andrii. I tried and found that it's a little hard to reuse them. The
> > > > bpf_session_is_return() and bpf_session_cookie() are defined as kfunc, which
> > > > makes we can't implement different functions for different attach type, like
> > > > what bpf helper does.
> > >
> > > Are you sure? We certainly support kfunc implementation specialization
> > > for sleepable vs non-sleepable BPF programs. Check specialize_kfunc()
> > > in verifier.c
> >
> > Ah, I remember it now. We do can use different kfunc version
> > for different case in specialize_kfunc().
> >
> > >
> > > >
> > > > The way we store "is_return" and "cookie" in fsession is different with
> > > > ksession. For ksession, it store the "is_return" in struct bpf_session_run_ctx.
> > > > Even if we move the "nr_regs" from stack to struct bpf_tramp_run_ctx,
> > > > it's still hard to reuse the bpf_session_is_return() or bpf_session_cookie(),
> > > > as the way of storing the "is_return" and "cookie" in fsession and ksession
> > > > is different, and it's a little difficult and complex to unify them.
> > >
> > > I'm not saying we should unify the implementation, you have to
> > > implement different version of logically the same kfunc, of course.
> >
> > I see. The problem now is that the prototype of bpf_session_cookie()
> > or bpf_session_is_return() don't satisfy our need. For bpf_session_cookie(),
> > we at least need the context to be the argument. However, both
> > of them don't have any function argument. After all, the prototype of
> > different version of logically the same kfunc should be the same.
>
> Hi, Andrii. I see that you want to make the API consistent between
> ksession and fsession, which is more friendly for the user.
>
> After my analysis, I think we have following approach:
> 1. change the function prototype of bpf_session_cookie and bpf_session_is_return
> to:
> bool bpf_session_is_return(void *ctx);
> __u64 *bpf_session_cookie(void *ctx);
> And we do the fix up in specialize_kfunc(), which I think is the easiest
> way. The defect is that it will break existing users.
>
> 2. We define a fixup_kfunc_call_early() and call it in add_subprog_and_kfunc.
> In the fixup_kfunc_call_early(), we will change the target kfunc(which is insn->imm)
> from bpf_session_cookie() to bpf_fsession_cookie(). For the bpf_session_cookie(),
> we make its prototype to:
> __bpf_kfunc __u64 *bpf_session_cookie(void *ctx__ign)
Ah, it's not a good idea. libbpf checks whether the
prototype of bpf_session_cookie() is compatible between the
local and vmlinux BTF. In current libbpf we can skip the
arguments whose name carries the "__ign" suffix in
__bpf_core_types_are_compat(). But with old libbpf versions,
the compatibility check will fail, which means it will still
break existing users :(
> Therefore, it won't break the existing users. For the ksession that uses the
> old prototype, it can pass the verifier too. Following is a demo patch of this
> approach. In this way, we can allow a extension in the prototype for a kfunc
> in the feature too.
>
> What do you think?
>
> Thanks!
> Menglong Dong
>
> >patch<
>
> +static int fixup_kfunc_call_early(struct bpf_verifier_env *env, struct bpf_insn *insn)
> +{
> + struct bpf_prog *prog = env->prog;
> +
> + if (prog->expected_attach_type == BPF_TRACE_FSESSION) {
> + if (insn->imm == special_kfunc_list[KF_bpf_session_cookie])
> + insn->imm = special_kfunc_list[KF_bpf_fsession_cookie];
> + else if (insn->imm == special_kfunc_list[KF_bpf_session_is_return])
> + insn->imm = special_kfunc_list[KF_bpf_fsession_is_return];
> + }
> +
> + return 0;
> +}
>
> @@ -3489,10 +3490,12 @@ static int add_subprog_and_kfunc(struct bpf_verifier_env *env)
> return -EPERM;
> }
>
> - if (bpf_pseudo_func(insn) || bpf_pseudo_call(insn))
> + if (bpf_pseudo_func(insn) || bpf_pseudo_call(insn)) {
> ret = add_subprog(env, i + insn->imm + 1);
> - else
> - ret = add_kfunc_call(env, insn->imm, insn->off);
> + } else {
> + ret = fixup_kfunc_call_early(env, insn);
> + ret = ret ?: add_kfunc_call(env, insn->imm, insn->off);
> + }
>
> @@ -3316,7 +3321,7 @@ static u64 bpf_uprobe_multi_entry_ip(struct bpf_run_ctx *ctx)
>
> __bpf_kfunc_start_defs();
>
> -__bpf_kfunc bool bpf_session_is_return(void)
> +__bpf_kfunc bool bpf_session_is_return(void *ctx__ign)
> {
> struct bpf_session_run_ctx *session_ctx;
>
> @@ -3324,7 +3329,7 @@ __bpf_kfunc bool bpf_session_is_return(void)
> return session_ctx->is_return;
> }
>
> -__bpf_kfunc __u64 *bpf_session_cookie(void)
> +__bpf_kfunc __u64 *bpf_session_cookie(void *ctx__ign)
> {
> struct bpf_session_run_ctx *session_ctx;
>
> >
> > I think it's not a good idea to modify the prototype of existing kfunc,
> > can we?
> >
> > >
> > > >
> > > > What's more, we will lose the advantage of inline bpf_fsession_is_return
> > > > and bpf_fsession_cookie in verifier.
> > > >
> > >
> > > I'd double check that either. BPF verifier and JIT do know program
> > > type, so you can pick how to inline
> > > bpf_session_is_return()/bpf_session_cookie() based on that.
> >
> > Yeah, we can inline it depend on the program type if we can solve
> > the prototype problem.
> >
> > Thanks!
> > Menglong Dong
> >
> >
> > >
> > > > I'll check more to see if there is a more simple way to reuse them.
> > > >
> > > > Thanks!
> > > > Menglong Dong
> > > >
> > > > >
> > [...]
end of thread, other threads:[~2025-12-20 12:23 UTC | newest]
Thread overview: 30+ messages
2025-12-17 9:54 [PATCH bpf-next v4 0/9] bpf: tracing session supporting Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 1/9] bpf: add tracing session support Menglong Dong
2025-12-19 0:55 ` Andrii Nakryiko
2025-12-19 1:24 ` Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 2/9] bpf: use last 8-bits for the nr_args in trampoline Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 3/9] bpf: add the kfunc bpf_fsession_is_return Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 4/9] bpf: add the kfunc bpf_fsession_cookie Menglong Dong
2025-12-19 0:55 ` Andrii Nakryiko
2025-12-19 1:31 ` Menglong Dong
2025-12-19 12:01 ` Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 5/9] bpf,x86: introduce emit_st_r0_imm64() for trampoline Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting for x86_64 Menglong Dong
2025-12-19 0:55 ` Andrii Nakryiko
2025-12-19 1:41 ` Menglong Dong
2025-12-19 16:56 ` Andrii Nakryiko
2025-12-17 9:54 ` [PATCH bpf-next v4 7/9] libbpf: add support for tracing session Menglong Dong
2025-12-19 0:55 ` Andrii Nakryiko
2025-12-19 1:42 ` Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 8/9] selftests/bpf: add testcases " Menglong Dong
2025-12-17 10:24 ` bot+bpf-ci
2025-12-17 11:42 ` Menglong Dong
2025-12-17 9:54 ` [PATCH bpf-next v4 9/9] selftests/bpf: test fsession mixed with fentry and fexit Menglong Dong
2025-12-17 10:24 ` bot+bpf-ci
2025-12-17 10:37 ` Menglong Dong
2025-12-19 0:55 ` [PATCH bpf-next v4 0/9] bpf: tracing session supporting Andrii Nakryiko
2025-12-19 1:18 ` Menglong Dong
2025-12-19 16:55 ` Andrii Nakryiko
2025-12-20 1:12 ` Menglong Dong
2025-12-20 9:01 ` Menglong Dong
2025-12-20 12:22 ` Menglong Dong