bpf.vger.kernel.org archive mirror
* [RFC bpf-next v1 0/4] Support cookie for link-based struct_ops attachment
@ 2025-07-08 23:08 Amery Hung
  2025-07-08 23:08 ` [RFC bpf-next v1 1/4] bpf: Factor out bpf_struct_ops_prepare_attach() Amery Hung
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Amery Hung @ 2025-07-08 23:08 UTC (permalink / raw)
  To: bpf
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, ameryhung,
	kernel-team

* Motivation *

Currently, sched_ext only supports a single scheduler. scx kfuncs can
reference the calling scheduler simply through a global pointer,
scx_root, whether called from struct_ops, tracing, syscall or async
callback programs. However, once sched_ext moves to supporting multiple
scx schedulers, scx kfuncs without a task_struct argument won't be able
to easily refer to the calling scheduler without passing additional
information.

The vision is to be able to view struct_ops, tracing, syscall programs
and async callbacks in a file as part of a scheduler, and have the
ability to refer to the scheduler in kfuncs called from any of the
programs.

However, the kernel currently has no notion that connects the bpf
programs in a file. Until it is clear what that should be, as a
stopgap, we would like to reuse the bpf cookie to associate bpf
programs belonging to the same scheduler.

* Design *

The interim solution is to use the bpf cookie as the scheduler id, and
add the cookie as an argument to scx kfuncs without a task_struct
argument. By maintaining the cookie-to-scheduler mapping in sched_ext,
kfuncs will be able to refer to the calling scheduler internally using
the cookie.

The user space loader should use the same cookie when attaching the bpf
programs of one scheduler. The bpf programs then call
bpf_get_attach_cookie() to get the cookie and pass it to scx kfuncs.
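As a rough illustration of the kernel-side half of this design, the
cookie-to-scheduler mapping could look like the userspace sketch below.
All names here (scx_sched, scx_register_sched(), scx_sched_from_cookie())
are hypothetical stand-ins for illustration, not actual sched_ext symbols,
and the fixed-size, lock-free table is a simplification:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an scx scheduler instance. */
struct scx_sched {
	char name[16];
};

/* Cookie-to-scheduler registry kept by sched_ext (simplified:
 * fixed-size array, no locking). */
#define MAX_SCHEDS 8

static struct {
	uint64_t cookie;
	struct scx_sched *sched;
} registry[MAX_SCHEDS];

static int scx_register_sched(uint64_t cookie, struct scx_sched *sched)
{
	for (int i = 0; i < MAX_SCHEDS; i++) {
		if (!registry[i].sched) {
			registry[i].cookie = cookie;
			registry[i].sched = sched;
			return 0;
		}
	}
	return -1;
}

/* What a kfunc without a task_struct argument would do: resolve the
 * calling scheduler from the cookie passed in by the bpf program. */
static struct scx_sched *scx_sched_from_cookie(uint64_t cookie)
{
	for (int i = 0; i < MAX_SCHEDS; i++)
		if (registry[i].sched && registry[i].cookie == cookie)
			return registry[i].sched;
	return NULL;
}
```

A kfunc that today dereferences scx_root would instead take the cookie
argument and perform this lookup.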

The following patches are the first part of the kernel-side plumbing:
supporting cookie for link-based struct_ops attachment. struct_ops
already uses the bpf trampoline. The only problem is that the trampoline
is populated during map_update, when the cookie is not known yet. This
patchset moves the trampoline preparation from map_update to link_create
for link-based struct_ops attachment. Then, we extend the UAPI to take
the cookie and propagate it all the way to the struct_ops link_create.
Finally, bpf_get_attach_cookie() is allowed in struct_ops programs.

A problem that has not been addressed in this RFC is that, by moving
trampoline preparation to link_create, creating multiple links from the
same struct_ops map becomes problematic. A simple solution may be to
just disable it if there is no obvious use case for it.

* Alternative *

Martin has suggested an approach that involves fewer kernel changes:
using the struct_ops map id as the sched_ext scheduler id. Instead of
further complicating struct_ops, sched_ext can build the
struct-ops-map-id-to-scheduler mapping in reg(). The id will also be
stored in a global variable in the file, referenced by the different
types of programs. The rest is similar to the cookie approach: kfuncs
without a task_struct argument gain an id argument and look up the
scheduler by that id.
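Under the same caveat as before (hypothetical names, plain userspace C,
not kernel code), the map-id variant differs mainly in who picks the
identifier: the kernel hands out the id automatically at map creation,
and sched_ext records the mapping itself in reg(), so no struct_ops or
UAPI changes are needed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scheduler instance keyed by its struct_ops map id. */
struct scx_sched {
	uint32_t map_id;
};

#define MAX_SCHEDS 8
static struct scx_sched *by_map_id[MAX_SCHEDS + 1];
static uint32_t next_map_id = 1; /* kernel-style: ids start at 1 */

/* Map creation: the kernel generates the id, no user input needed. */
static uint32_t map_alloc_id(void)
{
	return next_map_id++;
}

/* reg(): sched_ext records the map-id-to-scheduler mapping itself. */
static int scx_reg(struct scx_sched *sched, uint32_t map_id)
{
	if (map_id > MAX_SCHEDS || by_map_id[map_id])
		return -1;
	sched->map_id = map_id;
	by_map_id[map_id] = sched;
	return 0;
}

static struct scx_sched *scx_sched_from_id(uint32_t map_id)
{
	return map_id <= MAX_SCHEDS ? by_map_id[map_id] : NULL;
}
```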

* Previous approach *

The previous solution was to set a "this pointer" pointing to the
struct_ops struct during struct_ops map_update. Then kfuncs could get
a pointer to the scheduler through a __prog argument. However, this
approach is broken, as it would only work for struct_ops programs.


* Some thoughts after Martin brought up the idea *

I think we should use the map id instead of the cookie. Especially if
we are going to disable creating multiple links per map, the cookie
effectively becomes the same as the map id.


Amery Hung (4):
  bpf: Factor out bpf_struct_ops_prepare_attach()
  bpf: Support cookie for link-based struct_ops attachment
  libbpf: Support link-based struct_ops attach with options
  selftests/bpf: Test bpf_get_attach_cookie() in struct_ops program

 include/uapi/linux/bpf.h                      |   3 +
 kernel/bpf/bpf_struct_ops.c                   | 199 ++++++++++++------
 kernel/bpf/helpers.c                          |  18 ++
 tools/include/uapi/linux/bpf.h                |   3 +
 tools/lib/bpf/bpf.c                           |   5 +
 tools/lib/bpf/bpf.h                           |   3 +
 tools/lib/bpf/libbpf.c                        |  14 +-
 tools/lib/bpf/libbpf.h                        |  10 +
 tools/lib/bpf/libbpf.map                      |   1 +
 .../bpf/prog_tests/test_struct_ops_cookie.c   |  42 ++++
 .../selftests/bpf/progs/struct_ops_cookie.c   |  48 +++++
 .../selftests/bpf/test_kmods/bpf_testmod.c    |   1 +
 12 files changed, 280 insertions(+), 67 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_cookie.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_cookie.c

-- 
2.47.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [RFC bpf-next v1 1/4] bpf: Factor out bpf_struct_ops_prepare_attach()
  2025-07-08 23:08 [RFC bpf-next v1 0/4] Support cookie for link-based struct_ops attachment Amery Hung
@ 2025-07-08 23:08 ` Amery Hung
  2025-07-08 23:08 ` [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment Amery Hung
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Amery Hung @ 2025-07-08 23:08 UTC (permalink / raw)
  To: bpf
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, ameryhung,
	kernel-team

To support cookie for struct_ops attachment, the trampoline needs to be
created during link_create. To prepare for that, factor the creation of
the trampoline and ksyms out of struct_ops map update. No functional
change.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
 kernel/bpf/bpf_struct_ops.c | 148 ++++++++++++++++++++++--------------
 1 file changed, 93 insertions(+), 55 deletions(-)

diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 96113633e391..4d150e99a86c 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -673,6 +673,95 @@ static void bpf_struct_ops_map_free_ksyms(struct bpf_struct_ops_map *st_map)
 	}
 }
 
+static int bpf_struct_ops_prepare_attach(struct bpf_struct_ops_map *st_map)
+{
+	const struct bpf_struct_ops *st_ops = st_map->st_ops_desc->st_ops;
+	const struct btf_type *t = st_map->st_ops_desc->type;
+	struct bpf_link **plink = st_map->links;
+	struct bpf_ksym **pksym = st_map->ksyms;
+	const struct btf_member *member;
+	struct bpf_tramp_links *tlinks;
+	void *udata, *kdata;
+	int i, err = 0;
+	u32 trampoline_start, image_off = 0;
+	void *cur_image = NULL, *image = NULL;
+	const char *tname, *mname;
+
+	tlinks = kcalloc(BPF_TRAMP_MAX, sizeof(*tlinks), GFP_KERNEL);
+	if (!tlinks)
+		return -ENOMEM;
+
+	udata = &st_map->uvalue->data;
+	kdata = &st_map->kvalue.data;
+
+	tname = btf_name_by_offset(st_map->btf, t->name_off);
+
+	for_each_member(i, t, member) {
+		const struct btf_type *ptype;
+		struct bpf_tramp_link *link;
+		struct bpf_ksym *ksym;
+		u32 moff, id;
+
+		ptype = btf_type_resolve_ptr(st_map->btf, member->type, NULL);
+		if (!ptype || !btf_type_is_func_proto(ptype))
+			continue;
+
+		moff = __btf_member_bit_offset(t, member) / 8;
+
+		id = *(unsigned long *)(udata + moff);
+		if (!id)
+			continue;
+
+		mname = btf_name_by_offset(st_map->btf, member->name_off);
+		link = container_of(*plink++, struct bpf_tramp_link, link);
+
+		ksym = kzalloc(sizeof(*ksym), GFP_USER);
+		if (!ksym) {
+			err = -ENOMEM;
+			goto out;
+		}
+		*pksym++ = ksym;
+
+		trampoline_start = image_off;
+		err = bpf_struct_ops_prepare_trampoline(tlinks, link,
+						&st_ops->func_models[i],
+						*(void **)(st_ops->cfi_stubs + moff),
+						&image, &image_off,
+						st_map->image_pages_cnt < MAX_TRAMP_IMAGE_PAGES);
+		if (err)
+			goto out;
+
+		if (cur_image != image) {
+			st_map->image_pages[st_map->image_pages_cnt++] = image;
+			cur_image = image;
+			trampoline_start = 0;
+		}
+
+		*(void **)(kdata + moff) = image + trampoline_start + cfi_get_offset();
+
+		/* init ksym for this trampoline */
+		bpf_struct_ops_ksym_init(tname, mname,
+					 image + trampoline_start,
+					 image_off - trampoline_start,
+					 ksym);
+	}
+
+	if (st_ops->validate) {
+		err = st_ops->validate(kdata);
+		if (err)
+			goto out;
+	}
+	for (i = 0; i < st_map->image_pages_cnt; i++) {
+		err = arch_protect_bpf_trampoline(st_map->image_pages[i],
+						  PAGE_SIZE);
+		if (err)
+			goto out;
+	}
+out:
+	kfree(tlinks);
+	return err;
+}
+
 static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 					   void *value, u64 flags)
 {
@@ -683,14 +772,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 	const struct btf_type *module_type;
 	const struct btf_member *member;
 	const struct btf_type *t = st_ops_desc->type;
-	struct bpf_tramp_links *tlinks;
 	void *udata, *kdata;
 	int prog_fd, err;
-	u32 i, trampoline_start, image_off = 0;
-	void *cur_image = NULL, *image = NULL;
+	u32 i;
 	struct bpf_link **plink;
-	struct bpf_ksym **pksym;
-	const char *tname, *mname;
 
 	if (flags)
 		return -EINVAL;
@@ -710,10 +795,6 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 	if (uvalue->common.state || refcount_read(&uvalue->common.refcnt))
 		return -EINVAL;
 
-	tlinks = kcalloc(BPF_TRAMP_MAX, sizeof(*tlinks), GFP_KERNEL);
-	if (!tlinks)
-		return -ENOMEM;
-
 	uvalue = (struct bpf_struct_ops_value *)st_map->uvalue;
 	kvalue = (struct bpf_struct_ops_value *)&st_map->kvalue;
 
@@ -730,18 +811,14 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 	kdata = &kvalue->data;
 
 	plink = st_map->links;
-	pksym = st_map->ksyms;
-	tname = btf_name_by_offset(st_map->btf, t->name_off);
 	module_type = btf_type_by_id(btf_vmlinux, st_ops_ids[IDX_MODULE_ID]);
 	for_each_member(i, t, member) {
 		const struct btf_type *mtype, *ptype;
 		struct bpf_prog *prog;
 		struct bpf_tramp_link *link;
-		struct bpf_ksym *ksym;
 		u32 moff;
 
 		moff = __btf_member_bit_offset(t, member) / 8;
-		mname = btf_name_by_offset(st_map->btf, member->name_off);
 		ptype = btf_type_resolve_ptr(st_map->btf, member->type, NULL);
 		if (ptype == module_type) {
 			if (*(void **)(udata + moff))
@@ -811,51 +888,13 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 			      &bpf_struct_ops_link_lops, prog);
 		*plink++ = &link->link;
 
-		ksym = kzalloc(sizeof(*ksym), GFP_USER);
-		if (!ksym) {
-			err = -ENOMEM;
-			goto reset_unlock;
-		}
-		*pksym++ = ksym;
-
-		trampoline_start = image_off;
-		err = bpf_struct_ops_prepare_trampoline(tlinks, link,
-						&st_ops->func_models[i],
-						*(void **)(st_ops->cfi_stubs + moff),
-						&image, &image_off,
-						st_map->image_pages_cnt < MAX_TRAMP_IMAGE_PAGES);
-		if (err)
-			goto reset_unlock;
-
-		if (cur_image != image) {
-			st_map->image_pages[st_map->image_pages_cnt++] = image;
-			cur_image = image;
-			trampoline_start = 0;
-		}
-
-		*(void **)(kdata + moff) = image + trampoline_start + cfi_get_offset();
-
 		/* put prog_id to udata */
 		*(unsigned long *)(udata + moff) = prog->aux->id;
-
-		/* init ksym for this trampoline */
-		bpf_struct_ops_ksym_init(tname, mname,
-					 image + trampoline_start,
-					 image_off - trampoline_start,
-					 ksym);
 	}
 
-	if (st_ops->validate) {
-		err = st_ops->validate(kdata);
-		if (err)
-			goto reset_unlock;
-	}
-	for (i = 0; i < st_map->image_pages_cnt; i++) {
-		err = arch_protect_bpf_trampoline(st_map->image_pages[i],
-						  PAGE_SIZE);
-		if (err)
-			goto reset_unlock;
-	}
+	err = bpf_struct_ops_prepare_attach(st_map);
+	if (err)
+		goto reset_unlock;
 
 	if (st_map->map.map_flags & BPF_F_LINK) {
 		err = 0;
@@ -897,7 +936,6 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 	memset(uvalue, 0, map->value_size);
 	memset(kvalue, 0, map->value_size);
 unlock:
-	kfree(tlinks);
 	mutex_unlock(&st_map->lock);
 	if (!err)
 		bpf_struct_ops_map_add_ksyms(st_map);
-- 
2.47.1



* [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-08 23:08 [RFC bpf-next v1 0/4] Support cookie for link-based struct_ops attachment Amery Hung
  2025-07-08 23:08 ` [RFC bpf-next v1 1/4] bpf: Factor out bpf_struct_ops_prepare_attach() Amery Hung
@ 2025-07-08 23:08 ` Amery Hung
  2025-07-09 22:13   ` Martin KaFai Lau
  2025-07-08 23:08 ` [RFC bpf-next v1 3/4] libbpf: Support link-based struct_ops attach with options Amery Hung
  2025-07-08 23:08 ` [RFC bpf-next v1 4/4] selftests/bpf: Test bpf_get_attach_cookie() in struct_ops program Amery Hung
  3 siblings, 1 reply; 18+ messages in thread
From: Amery Hung @ 2025-07-08 23:08 UTC (permalink / raw)
  To: bpf
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, ameryhung,
	kernel-team

Support cookie when attaching a struct_ops map using link. The cookie is
associated with the link and can be retrieved in bpf struct_ops program
using bpf_get_attach_cookie().

Implementation-wise, trampoline and ksym preparation are deferred from
map_update to link_create for struct_ops maps with BPF_F_LINK. Since the
bpf cookie is hardcoded into the trampoline in
arch_prepare_bpf_trampoline(), the trampoline must be prepared once the
cookie is known (i.e., at link_create). The trampoline and ksyms are
freed once a link is detached, as the struct_ops map is no longer
associated with the link.

TODO:
After this patch, a struct_ops map with BPF_F_LINK should be prevented
from being used to create another link, since a struct_ops map
currently only supports a single set of trampolines. This may be done
by reusing the state machinery of non-BPF_F_LINK struct_ops maps: set
the state from READY to INUSE in link_create, and check the state in
link_create and link_update.
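The READY/INUSE idea above can be sketched as a tiny state machine.
This is a hypothetical userspace simulation of the proposed behavior,
not the actual kernel state handling:

```c
#include <assert.h>
#include <errno.h>

enum st_state { ST_READY, ST_INUSE };

struct st_map_sim {
	enum st_state state;
};

/* link_create: only one link per map; transition READY -> INUSE,
 * and reject a second attach while a link holds the map. */
static int link_create_sim(struct st_map_sim *m)
{
	if (m->state != ST_READY)
		return -EBUSY;
	m->state = ST_INUSE;
	return 0;
}

/* Link detach releases the map so it can be attached again. */
static void link_detach_sim(struct st_map_sim *m)
{
	m->state = ST_READY;
}
```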

Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
 include/uapi/linux/bpf.h       |  3 ++
 kernel/bpf/bpf_struct_ops.c    | 59 +++++++++++++++++++++++++---------
 kernel/bpf/helpers.c           | 18 +++++++++++
 tools/include/uapi/linux/bpf.h |  3 ++
 4 files changed, 68 insertions(+), 15 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0670e15a6100..4708d0783130 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1818,6 +1818,9 @@ union bpf_attr {
 				};
 				__u64		expected_revision;
 			} cgroup;
+			struct {
+				__u64		cookie;
+			} struct_ops;
 		};
 	} link_create;
 
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 4d150e99a86c..3ad0697a3c00 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -59,6 +59,7 @@ struct bpf_struct_ops_link {
 	struct bpf_link link;
 	struct bpf_map __rcu *map;
 	wait_queue_head_t wait_hup;
+	u64 cookie;
 };
 
 static DEFINE_MUTEX(update_mutex);
@@ -673,7 +674,7 @@ static void bpf_struct_ops_map_free_ksyms(struct bpf_struct_ops_map *st_map)
 	}
 }
 
-static int bpf_struct_ops_prepare_attach(struct bpf_struct_ops_map *st_map)
+static int bpf_struct_ops_prepare_attach(struct bpf_struct_ops_map *st_map, u64 cookie)
 {
 	const struct bpf_struct_ops *st_ops = st_map->st_ops_desc->st_ops;
 	const struct btf_type *t = st_map->st_ops_desc->type;
@@ -714,6 +715,7 @@ static int bpf_struct_ops_prepare_attach(struct bpf_struct_ops_map *st_map)
 
 		mname = btf_name_by_offset(st_map->btf, member->name_off);
 		link = container_of(*plink++, struct bpf_tramp_link, link);
+		link->cookie = cookie;
 
 		ksym = kzalloc(sizeof(*ksym), GFP_USER);
 		if (!ksym) {
@@ -892,10 +894,6 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 		*(unsigned long *)(udata + moff) = prog->aux->id;
 	}
 
-	err = bpf_struct_ops_prepare_attach(st_map);
-	if (err)
-		goto reset_unlock;
-
 	if (st_map->map.map_flags & BPF_F_LINK) {
 		err = 0;
 		/* Let bpf_link handle registration & unregistration.
@@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 		goto unlock;
 	}
 
+	err = bpf_struct_ops_prepare_attach(st_map, 0);
+	if (err)
+		goto reset_unlock;
+
 	err = st_ops->reg(kdata, NULL);
 	if (likely(!err)) {
 		/* This refcnt increment on the map here after
@@ -915,6 +917,7 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 		 * or transition it to TOBEFREE concurrently.
 		 */
 		bpf_map_inc(map);
+		bpf_struct_ops_map_add_ksyms(st_map);
 		/* Pair with smp_load_acquire() during lookup_elem().
 		 * It ensures the above udata updates (e.g. prog->aux->id)
 		 * can be seen once BPF_STRUCT_OPS_STATE_INUSE is set.
@@ -937,8 +940,6 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 	memset(kvalue, 0, map->value_size);
 unlock:
 	mutex_unlock(&st_map->lock);
-	if (!err)
-		bpf_struct_ops_map_add_ksyms(st_map);
 	return err;
 }
 
@@ -1247,7 +1248,11 @@ static void bpf_struct_ops_map_link_show_fdinfo(const struct bpf_link *link,
 	rcu_read_lock();
 	map = rcu_dereference(st_link->map);
 	if (map)
-		seq_printf(seq, "map_id:\t%d\n", map->id);
+		seq_printf(seq,
+			   "map_id:\t%d\n"
+			   "cookie:\t%llu\n",
+			   map->id,
+			   st_link->cookie);
 	rcu_read_unlock();
 }
 
@@ -1302,14 +1307,28 @@ static int bpf_struct_ops_map_link_update(struct bpf_link *link, struct bpf_map
 		goto err_out;
 	}
 
+	err = bpf_struct_ops_prepare_attach(st_map, st_link->cookie);
+	if (err)
+		goto free_image;
+
 	err = st_map->st_ops_desc->st_ops->update(st_map->kvalue.data, old_st_map->kvalue.data, link);
 	if (err)
-		goto err_out;
+		goto free_image;
 
 	bpf_map_inc(new_map);
 	rcu_assign_pointer(st_link->map, new_map);
+	bpf_struct_ops_map_add_ksyms(st_map);
+	bpf_struct_ops_map_del_ksyms(old_st_map);
+	bpf_struct_ops_map_free_ksyms(old_st_map);
+	bpf_struct_ops_map_free_image(old_st_map);
 	bpf_map_put(old_map);
+	mutex_unlock(&update_mutex);
+
+	return 0;
 
+free_image:
+	bpf_struct_ops_map_free_ksyms(st_map);
+	bpf_struct_ops_map_free_image(st_map);
 err_out:
 	mutex_unlock(&update_mutex);
 
@@ -1395,24 +1414,34 @@ int bpf_struct_ops_link_create(union bpf_attr *attr)
 	if (err)
 		goto err_out;
 
+	link->cookie = attr->link_create.struct_ops.cookie;
+
 	init_waitqueue_head(&link->wait_hup);
 
 	/* Hold the update_mutex such that the subsystem cannot
 	 * do link->ops->detach() before the link is fully initialized.
 	 */
 	mutex_lock(&update_mutex);
+	err = bpf_struct_ops_prepare_attach(st_map, link->cookie);
+	if (err)
+		goto free_image;
+
 	err = st_map->st_ops_desc->st_ops->reg(st_map->kvalue.data, &link->link);
-	if (err) {
-		mutex_unlock(&update_mutex);
-		bpf_link_cleanup(&link_primer);
-		link = NULL;
-		goto err_out;
-	}
+	if (err)
+		goto free_image;
+
 	RCU_INIT_POINTER(link->map, map);
+	bpf_struct_ops_map_add_ksyms(st_map);
 	mutex_unlock(&update_mutex);
 
 	return bpf_link_settle(&link_primer);
 
+free_image:
+	bpf_struct_ops_map_free_ksyms(st_map);
+	bpf_struct_ops_map_free_image(st_map);
+	mutex_unlock(&update_mutex);
+	bpf_link_cleanup(&link_primer);
+	link = NULL;
 err_out:
 	bpf_map_put(map);
 	kfree(link);
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 3d33181d5e67..4075fdd1533f 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1894,6 +1894,21 @@ static const struct bpf_func_proto bpf_dynptr_data_proto = {
 	.arg3_type	= ARG_CONST_ALLOC_SIZE_OR_ZERO,
 };
 
+BPF_CALL_1(bpf_get_attach_cookie_struct_ops, void *, ctx)
+{
+	struct bpf_tramp_run_ctx *run_ctx;
+
+	run_ctx = container_of(current->bpf_ctx, struct bpf_tramp_run_ctx, run_ctx);
+	return run_ctx->bpf_cookie;
+}
+
+static const struct bpf_func_proto bpf_get_attach_cookie_proto_struct_ops = {
+	.func		= bpf_get_attach_cookie_struct_ops,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+};
+
 const struct bpf_func_proto bpf_get_current_task_proto __weak;
 const struct bpf_func_proto bpf_get_current_task_btf_proto __weak;
 const struct bpf_func_proto bpf_probe_read_user_proto __weak;
@@ -1962,6 +1977,9 @@ bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_ns_current_pid_tgid_proto;
 	case BPF_FUNC_get_current_uid_gid:
 		return &bpf_get_current_uid_gid_proto;
+	case BPF_FUNC_get_attach_cookie:
+		return prog->type == BPF_PROG_TYPE_STRUCT_OPS ?
+		       &bpf_get_attach_cookie_proto_struct_ops : NULL;
 	default:
 		break;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 0670e15a6100..4708d0783130 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1818,6 +1818,9 @@ union bpf_attr {
 				};
 				__u64		expected_revision;
 			} cgroup;
+			struct {
+				__u64		cookie;
+			} struct_ops;
 		};
 	} link_create;
 
-- 
2.47.1



* [RFC bpf-next v1 3/4] libbpf: Support link-based struct_ops attach with options
  2025-07-08 23:08 [RFC bpf-next v1 0/4] Support cookie for link-based struct_ops attachment Amery Hung
  2025-07-08 23:08 ` [RFC bpf-next v1 1/4] bpf: Factor out bpf_struct_ops_prepare_attach() Amery Hung
  2025-07-08 23:08 ` [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment Amery Hung
@ 2025-07-08 23:08 ` Amery Hung
  2025-07-08 23:08 ` [RFC bpf-next v1 4/4] selftests/bpf: Test bpf_get_attach_cookie() in struct_ops program Amery Hung
  3 siblings, 0 replies; 18+ messages in thread
From: Amery Hung @ 2025-07-08 23:08 UTC (permalink / raw)
  To: bpf
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, ameryhung,
	kernel-team

Currently libbpf supports bpf_map__attach_struct_ops() with the signature:
  LIBBPF_API struct bpf_link *
  bpf_map__attach_struct_ops(const struct bpf_map *map);

To support cookie for struct_ops attachment, an additional argument is
needed: a pointer to a structure containing all the options used for
bpf_link_create.

Add the new API:
  LIBBPF_API struct bpf_link *
  bpf_map__attach_struct_ops_opts(const struct bpf_map *map,
                                  const struct bpf_struct_ops_opts *opts);

Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
 tools/lib/bpf/bpf.c      |  5 +++++
 tools/lib/bpf/bpf.h      |  3 +++
 tools/lib/bpf/libbpf.c   | 14 +++++++++++++-
 tools/lib/bpf/libbpf.h   | 10 ++++++++++
 tools/lib/bpf/libbpf.map |  1 +
 5 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index ab40dbf9f020..c1e66e190982 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -881,6 +881,11 @@ int bpf_link_create(int prog_fd, int target_fd,
 		if (!OPTS_ZEROED(opts, cgroup))
 			return libbpf_err(-EINVAL);
 		break;
+	case BPF_STRUCT_OPS:
+		attr.link_create.struct_ops.cookie = OPTS_GET(opts, struct_ops.cookie, 0);
+		if (!OPTS_ZEROED(opts, struct_ops))
+			return libbpf_err(-EINVAL);
+		break;
 	default:
 		if (!OPTS_ZEROED(opts, flags))
 			return libbpf_err(-EINVAL);
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 7252150e7ad3..2386ad4a295c 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -443,6 +443,9 @@ struct bpf_link_create_opts {
 			__u32 relative_id;
 			__u64 expected_revision;
 		} cgroup;
+		struct {
+			__u64 cookie;
+		} struct_ops;
 	};
 	size_t :0;
 };
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index aee36402f0a3..f5b71eccd5cc 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -13110,10 +13110,20 @@ static int bpf_link__detach_struct_ops(struct bpf_link *link)
 
 struct bpf_link *bpf_map__attach_struct_ops(const struct bpf_map *map)
 {
+	return bpf_map__attach_struct_ops_opts(map, NULL);
+}
+
+struct bpf_link *bpf_map__attach_struct_ops_opts(const struct bpf_map *map,
+						 const struct bpf_struct_ops_opts *opts)
+{
+	LIBBPF_OPTS(bpf_link_create_opts, link_create_opts);
 	struct bpf_link_struct_ops *link;
 	__u32 zero = 0;
 	int err, fd;
 
+	if (!OPTS_VALID(opts, bpf_struct_ops_opts))
+		return libbpf_err_ptr(-EINVAL);
+
 	if (!bpf_map__is_struct_ops(map)) {
 		pr_warn("map '%s': can't attach non-struct_ops map\n", map->name);
 		return libbpf_err_ptr(-EINVAL);
@@ -13149,7 +13159,9 @@ struct bpf_link *bpf_map__attach_struct_ops(const struct bpf_map *map)
 		return &link->link;
 	}
 
-	fd = bpf_link_create(map->fd, 0, BPF_STRUCT_OPS, NULL);
+	link_create_opts.struct_ops.cookie = OPTS_GET(opts, cookie, 0);
+
+	fd = bpf_link_create(map->fd, 0, BPF_STRUCT_OPS, &link_create_opts);
 	if (fd < 0) {
 		free(link);
 		return libbpf_err_ptr(fd);
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index d1cf813a057b..a4a758a8a25b 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -894,6 +894,16 @@ bpf_program__attach_cgroup_opts(const struct bpf_program *prog, int cgroup_fd,
 
 struct bpf_map;
 
+struct bpf_struct_ops_opts {
+	/* size of this struct, for forward/backward compatibility */
+	size_t sz;
+	__u64 cookie;
+	size_t :0;
+};
+#define bpf_struct_ops_opts__last_field cookie
+
+LIBBPF_API struct bpf_link *bpf_map__attach_struct_ops_opts(const struct bpf_map *map,
+							    const struct bpf_struct_ops_opts* opts);
 LIBBPF_API struct bpf_link *bpf_map__attach_struct_ops(const struct bpf_map *map);
 LIBBPF_API int bpf_link__update_map(struct bpf_link *link, const struct bpf_map *map);
 
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 1bbf77326420..6998300c766a 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -439,6 +439,7 @@ LIBBPF_1.6.0 {
 		bpf_object__prepare;
 		bpf_prog_stream_read;
 		bpf_program__attach_cgroup_opts;
+		bpf_map__attach_struct_ops_opts;
 		bpf_program__func_info;
 		bpf_program__func_info_cnt;
 		bpf_program__line_info;
-- 
2.47.1



* [RFC bpf-next v1 4/4] selftests/bpf: Test bpf_get_attach_cookie() in struct_ops program
  2025-07-08 23:08 [RFC bpf-next v1 0/4] Support cookie for link-based struct_ops attachment Amery Hung
                   ` (2 preceding siblings ...)
  2025-07-08 23:08 ` [RFC bpf-next v1 3/4] libbpf: Support link-based struct_ops attach with options Amery Hung
@ 2025-07-08 23:08 ` Amery Hung
  3 siblings, 0 replies; 18+ messages in thread
From: Amery Hung @ 2025-07-08 23:08 UTC (permalink / raw)
  To: bpf
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, ameryhung,
	kernel-team

Attach a struct_ops map with a cookie and test that struct_ops programs
can read the cookie using bpf_get_attach_cookie().

Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
 .../bpf/prog_tests/test_struct_ops_cookie.c   | 42 ++++++++++++++++
 .../selftests/bpf/progs/struct_ops_cookie.c   | 48 +++++++++++++++++++
 .../selftests/bpf/test_kmods/bpf_testmod.c    |  1 +
 3 files changed, 91 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_cookie.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_cookie.c

diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_cookie.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_cookie.c
new file mode 100644
index 000000000000..36179785b173
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_cookie.c
@@ -0,0 +1,42 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+
+#include "struct_ops_cookie.skel.h"
+
+static void test_struct_ops_cookie_basic(void)
+{
+	LIBBPF_OPTS(bpf_struct_ops_opts, attach_opts);
+	LIBBPF_OPTS(bpf_test_run_opts, run_opts);
+	struct struct_ops_cookie *skel;
+	struct bpf_link *link;
+	int err, prog_fd;
+
+	skel = struct_ops_cookie__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "struct_ops_cookie__open_and_load"))
+		return;
+
+	attach_opts.cookie = 0x12345678;
+	link = bpf_map__attach_struct_ops_opts(skel->maps.testmod_cookie, &attach_opts);
+	if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops_opts"))
+		goto out;
+
+	prog_fd = bpf_program__fd(skel->progs.trigger_test_1);
+	err = bpf_prog_test_run_opts(prog_fd, &run_opts);
+	ASSERT_OK(err, "bpf_prog_test_run_opts");
+	ASSERT_EQ(skel->bss->cookie_test_1, 0x12345678, "cookie_1_value");
+
+	prog_fd = bpf_program__fd(skel->progs.trigger_test_2);
+	err = bpf_prog_test_run_opts(prog_fd, &run_opts);
+	ASSERT_OK(err, "bpf_prog_test_run_opts");
+	ASSERT_EQ(skel->bss->cookie_test_2, 0x12345678, "cookie_2_value");
+
+out:
+	bpf_link__destroy(link);
+	struct_ops_cookie__destroy(skel);
+}
+
+void serial_test_struct_ops_cookie(void)
+{
+	if (test__start_subtest("struct_ops_cookie_basic"))
+		test_struct_ops_cookie_basic();
+}
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_cookie.c b/tools/testing/selftests/bpf/progs/struct_ops_cookie.c
new file mode 100644
index 000000000000..1033939fbc1e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_cookie.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "../test_kmods/bpf_testmod.h"
+
+char _license[] SEC("license") = "GPL";
+
+__u64 cookie_test_1 = 0;
+__u64 cookie_test_2 = 0;
+
+void bpf_testmod_ops3_call_test_1(void) __ksym;
+void bpf_testmod_ops3_call_test_2(void) __ksym;
+
+SEC("struct_ops/test_cookie_1")
+int BPF_PROG(test_cookie_1)
+{
+	cookie_test_1 = bpf_get_attach_cookie(ctx);
+	return 0;
+}
+
+SEC("struct_ops/test_cookie_2")
+int BPF_PROG(test_cookie_2)
+{
+	cookie_test_2 = bpf_get_attach_cookie(ctx);
+	return 0;
+}
+
+SEC("syscall")
+int trigger_test_1(void *ctx)
+{
+	bpf_testmod_ops3_call_test_1();
+	return 0;
+}
+
+SEC("syscall")
+int trigger_test_2(void *ctx)
+{
+	bpf_testmod_ops3_call_test_2();
+	return 0;
+}
+
+/* Struct ops map that will be attached with a cookie */
+SEC(".struct_ops.link")
+struct bpf_testmod_ops3 testmod_cookie = {
+	.test_1 = (void *)test_cookie_1,
+	.test_2 = (void *)test_cookie_2,
+};
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index e9e918cdf31f..d273b327a1c0 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -1139,6 +1139,7 @@ static const struct bpf_verifier_ops bpf_testmod_verifier_ops = {
 };
 
 static const struct bpf_verifier_ops bpf_testmod_verifier_ops3 = {
+	.get_func_proto	 = bpf_base_func_proto,
 	.is_valid_access = bpf_testmod_ops_is_valid_access,
 };
 
-- 
2.47.1



* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-08 23:08 ` [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment Amery Hung
@ 2025-07-09 22:13   ` Martin KaFai Lau
  2025-07-10 18:26     ` Martin KaFai Lau
  2025-07-10 18:39     ` Amery Hung
  0 siblings, 2 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2025-07-09 22:13 UTC (permalink / raw)
  To: Amery Hung
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, kernel-team,
	bpf

On 7/8/25 4:08 PM, Amery Hung wrote:
> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
>   		goto unlock;
>   	}
>   
> +	err = bpf_struct_ops_prepare_attach(st_map, 0);

A follow-up on the "using the map->id as the cookie" comment in the cover
letter. I meant to use the map->id here instead of 0. If the cookie is
intended to identify a particular struct_ops instance (i.e., the
struct_ops map), then map->id should be a good fit, and it is
automatically generated by the kernel during map creation. As a result,
I suspect that most of the changes in patch 1 and patch 2 will not be
needed.

If I understand correctly, the kfunc implementation needs to look up the scx_ops 
instance (i.e., the struct_ops map) from the map->id/cookie. There is a similar 
map->id lookup in bpf_map_get_fd_by_id(), which requires acquiring a spin_lock. 
If performance is a concern, we can investigate whether it can be rcu-ified. 
From a quick glance, bpf_map_free_id() is called before call_rcu(). Note that
bpf_struct_ops_map_free() will wait for an RCU grace period.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-09 22:13   ` Martin KaFai Lau
@ 2025-07-10 18:26     ` Martin KaFai Lau
  2025-07-10 18:39     ` Amery Hung
  1 sibling, 0 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2025-07-10 18:26 UTC (permalink / raw)
  To: Amery Hung
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, kernel-team,
	bpf

On 7/9/25 3:13 PM, Martin KaFai Lau wrote:
> If I understand correctly, the kfunc implementation needs to look up the scx_ops 
> instance (i.e., the struct_ops map) from the map->id/cookie. There is a similar 
> map->id lookup in bpf_map_get_fd_by_id(), which requires acquiring a spin_lock. 
> If performance is a concern, we can investigate whether it can be rcu-ified. 
> From a quick glance, bpf_map_free_id() is called before call_rcu(). Note that 
> bpf_struct_ops_map_free() will wait for an RCU grace period.
> 

After looking at bpf_map_put() more closely, I see that not all BPF map frees 
wait for RCU. It may be cleaner to store the sched_ext_ops map in a separate 
(i.e., new) lookup data structure. That would also make it easier to ensure the 
looked-up map is a struct_ops map (or, in particular, a sched_ext_ops).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-09 22:13   ` Martin KaFai Lau
  2025-07-10 18:26     ` Martin KaFai Lau
@ 2025-07-10 18:39     ` Amery Hung
  2025-07-10 19:47       ` Martin KaFai Lau
  1 sibling, 1 reply; 18+ messages in thread
From: Amery Hung @ 2025-07-10 18:39 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, kernel-team,
	bpf

On Wed, Jul 9, 2025 at 3:13 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 7/8/25 4:08 PM, Amery Hung wrote:
> > @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> >               goto unlock;
> >       }
> >
> > +     err = bpf_struct_ops_prepare_attach(st_map, 0);
>
> A follow-up on the "using the map->id as the cookie" comment in the cover
> letter. I meant to use the map->id here instead of 0. If the cookie is intended
> to identify a particular struct_ops instance (i.e., the struct_ops map), then
> map->id should be a good fit, and it is automatically generated by the kernel
> during the map creation. As a result, I suspect that most of the changes in
> patch 1 and patch 2 will not be needed.
>

Do you mean keep using cookie as the mechanism to associate programs,
but for struct_ops the cookie will be map->id (i.e.,
bpf_get_attach_cookie() in struct_ops will return map->id)?

> If I understand correctly, the kfunc implementation needs to look up the scx_ops
> instance (i.e., the struct_ops map) from the map->id/cookie. There is a similar
> map->id lookup in bpf_map_get_fd_by_id(), which requires acquiring a spin_lock.
> If performance is a concern, we can investigate whether it can be rcu-ified.
> From a quick glance, bpf_map_free_id() is called before call_rcu(). Note that
> bpf_struct_ops_map_free() will wait for an RCU grace period.
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-10 18:39     ` Amery Hung
@ 2025-07-10 19:47       ` Martin KaFai Lau
  2025-07-10 21:00         ` Amery Hung
  0 siblings, 1 reply; 18+ messages in thread
From: Martin KaFai Lau @ 2025-07-10 19:47 UTC (permalink / raw)
  To: Amery Hung
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, kernel-team,
	bpf

On 7/10/25 11:39 AM, Amery Hung wrote:
>> On 7/8/25 4:08 PM, Amery Hung wrote:
>>> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
>>>                goto unlock;
>>>        }
>>>
>>> +     err = bpf_struct_ops_prepare_attach(st_map, 0);
>> A follow-up on the "using the map->id as the cookie" comment in the cover
>> letter. I meant to use the map->id here instead of 0. If the cookie is intended
>> to identify a particular struct_ops instance (i.e., the struct_ops map), then
>> map->id should be a good fit, and it is automatically generated by the kernel
>> during the map creation. As a result, I suspect that most of the changes in
>> patch 1 and patch 2 will not be needed.
>>
> Do you mean keep using cookie as the mechanism to associate programs,
> but for struct_ops the cookie will be map->id (i.e.,
> bpf_get_attach_cookie() in struct_ops will return map->id)?

I meant to use the map->id as the bpf_cookie stored in the bpf_tramp_run_ctx. 
Then there is no need for user space to generate a unique cookie during 
link_create. The kernel has already generated a unique ID in the map->id. The 
map->id is available during the bpf_struct_ops_map_update_elem(). Then there is 
also no need to distinguish between SEC(".struct_ops") vs 
SEC(".struct_ops.link"). Most of the patch 1 and patch 2 will not be needed.

A minor detail: note that the same struct ops program can be used in different 
trampolines. Thus, to be specific, the bpf cookie is stored in the trampoline.

If the question is about bpf global variable vs bpf cookie, yeah, I think using 
a bpf global variable should also work. The global variable can be initialized 
before libbpf's bpf_map__attach_struct_ops(). At that time, the map->id should 
be known already. I don't have a strong opinion on reusing the bpf cookie in the 
struct ops trampoline. No one is using it now, so it is available to be used. 
Exposing BPF_FUNC_get_attach_cookie for struct ops programs is pretty cheap 
also. Using bpf cookie to allow the struct ops program to tell which struct_ops 
map is calling it seems to fit well also after sleeping on it a bit. bpf global 
variable will also break if a bpf_prog.o has more than one SEC(".struct_ops").

For tracing programs, the bpf cookie seems to be an existing mechanism that can 
have any value (?). Thus, user space is free to store the map->id in it also. It 
can also choose to store the map->id in a bpf global variable if it has other 
uses for the bpf cookie. I think both should work similarly.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-10 19:47       ` Martin KaFai Lau
@ 2025-07-10 21:00         ` Amery Hung
  2025-07-11 18:41           ` Andrii Nakryiko
  0 siblings, 1 reply; 18+ messages in thread
From: Amery Hung @ 2025-07-10 21:00 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: alexei.starovoitov, andrii, daniel, tj, martin.lau, kernel-team,
	bpf

On Thu, Jul 10, 2025 at 12:47 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 7/10/25 11:39 AM, Amery Hung wrote:
> >> On 7/8/25 4:08 PM, Amery Hung wrote:
> >>> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> >>>                goto unlock;
> >>>        }
> >>>
> >>> +     err = bpf_struct_ops_prepare_attach(st_map, 0);
> >> A follow-up on the "using the map->id as the cookie" comment in the cover
> >> letter. I meant to use the map->id here instead of 0. If the cookie is intended
> >> to identify a particular struct_ops instance (i.e., the struct_ops map), then
> >> map->id should be a good fit, and it is automatically generated by the kernel
> >> during the map creation. As a result, I suspect that most of the changes in
> >> patch 1 and patch 2 will not be needed.
> >>
> > Do you mean keep using cookie as the mechanism to associate programs,
> > but for struct_ops the cookie will be map->id (i.e.,
> > bpf_get_attach_cookie() in struct_ops will return map->id)?
>
> I meant to use the map->id as the bpf_cookie stored in the bpf_tramp_run_ctx.
> Then there is no need for user space to generate a unique cookie during
> link_create. The kernel has already generated a unique ID in the map->id. The
> map->id is available during the bpf_struct_ops_map_update_elem(). Then there is
> also no need to distinguish between SEC(".struct_ops") vs
> SEC(".struct_ops.link"). Most of the patch 1 and patch 2 will not be needed.
>
> A minor detail: note that the same struct ops program can be used in different
> trampolines. Thus, to be specific, the bpf cookie is stored in the trampoline.
>
> If the question is about bpf global variable vs bpf cookie, yeah, I think using
> a bpf global variable should also work. The global variable can be initialized
> before libbpf's bpf_map__attach_struct_ops(). At that time, the map->id should
> be known already. I don't have a strong opinion on reusing the bpf cookie in the
> struct ops trampoline. No one is using it now, so it is available to be used.
> Exposing BPF_FUNC_get_attach_cookie for struct ops programs is pretty cheap
> also. Using bpf cookie to allow the struct ops program to tell which struct_ops
> map is calling it seems to fit well also after sleeping on it a bit. bpf global
> variable will also break if a bpf_prog.o has more than one SEC(".struct_ops").
>

While both of them work, using cookie instead of global variable is
one less thing for the user to take care of (i.e., slightly better
usability).

With the approach you suggested, to not mix the existing semantics of
bpf cookie, I think a new struct_ops kfunc is needed to retrieve the
map->id stored in bpf_tramp_run_ctx::bpf_cookie. Maybe
bpf_struct_ops_get_map_id()?
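A rough sketch of what such a kfunc could look like, modeled on how the existing bpf_get_attach_cookie helper for tracing reads the cookie stashed in the run context. The name bpf_struct_ops_get_map_id() is hypothetical and this is not code from the posted patches:

/* Hypothetical kfunc: return the map->id that the struct_ops
 * trampoline stored in the cookie slot of its run context. */
__bpf_kfunc u64 bpf_struct_ops_get_map_id(void)
{
	struct bpf_tramp_run_ctx *run_ctx;

	run_ctx = container_of(current->bpf_ctx,
			       struct bpf_tramp_run_ctx, run_ctx);
	return run_ctx->bpf_cookie;	/* holds map->id in this scheme */
}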

Another approach is to complete the plumbing of this patchset by
moving trampoline and ksyms from map to link. Right now it is broken
when creating multiple links from the same map as can be seen in the
CI. I think this is better as we don't create another unique thing for
struct_ops.

WDYT?

> For tracing programs, the bpf cookie seems to be an existing mechanism that can
> have any value (?). Thus, user space is free to store the map->id in it also. It
> can also choose to store the map->id in a bpf global variable if it has other
> uses for the bpf cookie. I think both should work similarly.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-10 21:00         ` Amery Hung
@ 2025-07-11 18:41           ` Andrii Nakryiko
  2025-07-11 19:29             ` Amery Hung
  0 siblings, 1 reply; 18+ messages in thread
From: Andrii Nakryiko @ 2025-07-11 18:41 UTC (permalink / raw)
  To: Amery Hung
  Cc: Martin KaFai Lau, alexei.starovoitov, andrii, daniel, tj,
	martin.lau, kernel-team, bpf

On Thu, Jul 10, 2025 at 2:00 PM Amery Hung <ameryhung@gmail.com> wrote:
>
> On Thu, Jul 10, 2025 at 12:47 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >
> > On 7/10/25 11:39 AM, Amery Hung wrote:
> > >> On 7/8/25 4:08 PM, Amery Hung wrote:
> > >>> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > >>>                goto unlock;
> > >>>        }
> > >>>
> > >>> +     err = bpf_struct_ops_prepare_attach(st_map, 0);
> > >> A follow-up on the "using the map->id as the cookie" comment in the cover
> > >> letter. I meant to use the map->id here instead of 0. If the cookie is intended
> > >> to identify a particular struct_ops instance (i.e., the struct_ops map), then
> > >> map->id should be a good fit, and it is automatically generated by the kernel
> > >> during the map creation. As a result, I suspect that most of the changes in
> > >> patch 1 and patch 2 will not be needed.
> > >>
> > > Do you mean keep using cookie as the mechanism to associate programs,
> > > but for struct_ops the cookie will be map->id (i.e.,
> > > bpf_get_attach_cookie() in struct_ops will return map->id)?
> >
> > I meant to use the map->id as the bpf_cookie stored in the bpf_tramp_run_ctx.
> > Then there is no need for user space to generate a unique cookie during
> > link_create. The kernel has already generated a unique ID in the map->id. The
> > map->id is available during the bpf_struct_ops_map_update_elem(). Then there is
> > also no need to distinguish between SEC(".struct_ops") vs
> > SEC(".struct_ops.link"). Most of the patch 1 and patch 2 will not be needed.
> >
> > A minor detail: note that the same struct ops program can be used in different
> > trampolines. Thus, to be specific, the bpf cookie is stored in the trampoline.
> >
> > If the question is about bpf global variable vs bpf cookie, yeah, I think using
> > a bpf global variable should also work. The global variable can be initialized
> > before libbpf's bpf_map__attach_struct_ops(). At that time, the map->id should
> > be known already. I don't have a strong opinion on reusing the bpf cookie in the
> > struct ops trampoline. No one is using it now, so it is available to be used.
> > Exposing BPF_FUNC_get_attach_cookie for struct ops programs is pretty cheap
> > also. Using bpf cookie to allow the struct ops program to tell which struct_ops
> > map is calling it seems to fit well also after sleeping on it a bit. bpf global
> > variable will also break if a bpf_prog.o has more than one SEC(".struct_ops").
> >
>
> While both of them work, using cookie instead of global variable is
> one less thing for the user to take care of (i.e., slightly better
> usability).
>
> With the approach you suggested, to not mix the existing semantics of
> bpf cookie, I think a new struct_ops kfunc is needed to retrieve the

yes, if absolutely necessary, sure, let's reuse the spot that is
reserved for cookie inside the trampoline, but let's not expose this
as real BPF cookie (i.e., let's not allow bpf_get_attach_cookie()
helper for struct_ops), because BPF cookie is meant to be fully user
controllable and used for whatever they deem necessary. Not
necessarily to just identify the struct_ops map. So it will be a huge
violation to just pre-define what BPF cookie value is for struct_ops.

> map->id stored in bpf_tramp_run_ctx::bpf_cookie. Maybe
> bpf_struct_ops_get_map_id()?
>
> Another approach is to complete the plumbing of this patchset by
> moving trampoline and ksyms from map to link. Right now it is broken
> when creating multiple links from the same map as can be seen in the
> CI. I think this is better as we don't create another unique thing for
> struct_ops.
>
> WDYT?

I think that is a logical thing to do, because BPF link represents
attachment, and trampoline should conceptually correspond to an
attachment, not to the thing that is being attached (and might be
attached to multiple places, potentially). We have this approach with
the fentry/fexit program's trampoline, so it would be nice to move
struct_ops to the same model.

>
> > For tracing programs, the bpf cookie seems to be an existing mechanism that can
> > have any value (?). Thus, user space is free to store the map->id in it also. It
> > can also choose to store the map->id in a bpf global variable if it has other
> > uses for the bpf cookie. I think both should work similarly.
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-11 18:41           ` Andrii Nakryiko
@ 2025-07-11 19:29             ` Amery Hung
  2025-07-11 20:21               ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Amery Hung @ 2025-07-11 19:29 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Martin KaFai Lau, alexei.starovoitov, andrii, daniel, tj,
	martin.lau, kernel-team, bpf

On Fri, Jul 11, 2025 at 11:41 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Jul 10, 2025 at 2:00 PM Amery Hung <ameryhung@gmail.com> wrote:
> >
> > On Thu, Jul 10, 2025 at 12:47 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> > >
> > > On 7/10/25 11:39 AM, Amery Hung wrote:
> > > >> On 7/8/25 4:08 PM, Amery Hung wrote:
> > > >>> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > > >>>                goto unlock;
> > > >>>        }
> > > >>>
> > > >>> +     err = bpf_struct_ops_prepare_attach(st_map, 0);
> > > >> A follow-up on the "using the map->id as the cookie" comment in the cover
> > > >> letter. I meant to use the map->id here instead of 0. If the cookie is intended
> > > >> to identify a particular struct_ops instance (i.e., the struct_ops map), then
> > > >> map->id should be a good fit, and it is automatically generated by the kernel
> > > >> during the map creation. As a result, I suspect that most of the changes in
> > > >> patch 1 and patch 2 will not be needed.
> > > >>
> > > > Do you mean keep using cookie as the mechanism to associate programs,
> > > > but for struct_ops the cookie will be map->id (i.e.,
> > > > bpf_get_attach_cookie() in struct_ops will return map->id)?
> > >
> > > I meant to use the map->id as the bpf_cookie stored in the bpf_tramp_run_ctx.
> > > Then there is no need for user space to generate a unique cookie during
> > > link_create. The kernel has already generated a unique ID in the map->id. The
> > > map->id is available during the bpf_struct_ops_map_update_elem(). Then there is
> > > also no need to distinguish between SEC(".struct_ops") vs
> > > SEC(".struct_ops.link"). Most of the patch 1 and patch 2 will not be needed.
> > >
> > > A minor detail: note that the same struct ops program can be used in different
> > > trampolines. Thus, to be specific, the bpf cookie is stored in the trampoline.
> > >
> > > If the question is about bpf global variable vs bpf cookie, yeah, I think using
> > > a bpf global variable should also work. The global variable can be initialized
> > > before libbpf's bpf_map__attach_struct_ops(). At that time, the map->id should
> > > be known already. I don't have a strong opinion on reusing the bpf cookie in the
> > > struct ops trampoline. No one is using it now, so it is available to be used.
> > > Exposing BPF_FUNC_get_attach_cookie for struct ops programs is pretty cheap
> > > also. Using bpf cookie to allow the struct ops program to tell which struct_ops
> > > map is calling it seems to fit well also after sleeping on it a bit. bpf global
> > > variable will also break if a bpf_prog.o has more than one SEC(".struct_ops").
> > >
> >
> > While both of them work, using cookie instead of global variable is
> > one less thing for the user to take care of (i.e., slightly better
> > usability).
> >
> > With the approach you suggested, to not mix the existing semantics of
> > bpf cookie, I think a new struct_ops kfunc is needed to retrieve the
>
> yes, if absolutely necessary, sure, let's reuse the spot that is
> reserved for cookie inside the trampoline, but let's not expose this
> as real BPF cookie (i.e., let's not allow bpf_get_attach_cookie()
> helper for struct_ops), because BPF cookie is meant to be fully user
> controllable and used for whatever they deem necessary. Not
> necessarily to just identify the struct_ops map. So it will be a huge
> violation to just pre-define what BPF cookie value is for struct_ops.
>

We had some offline discussions and figured out this will not work well.

sched_ext users already call scx kfuncs in global subprograms. If we
choose to add bpf_get_struct_ops_id() to get the id to be passed to
scx kfuncs, it will force users to write two copies of the same
global subprog: one called from struct_ops programs, which calls
bpf_get_struct_ops_id(), and one called from tracing programs, which
calls bpf_get_attach_cookie().
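The duplication would look roughly like this in a scheduler's BPF code. This is an illustrative sketch only: bpf_get_struct_ops_id() is the hypothetical kfunc named above, and scx_bpf_consume_id() stands in for any scx kfunc that would take a scheduler id:

/* Copy used by struct_ops programs. */
int subprog_for_ops(struct task_struct *p)
{
	u64 id = bpf_get_struct_ops_id();	/* struct_ops-only */

	return scx_bpf_consume_id(id, p);	/* hypothetical scx kfunc */
}

/* Identical logic, duplicated for tracing programs, where the id
 * comes from the attach cookie instead. */
int subprog_for_tracing(struct task_struct *p, void *ctx)
{
	u64 id = bpf_get_attach_cookie(ctx);	/* tracing-only */

	return scx_bpf_consume_id(id, p);
}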

> > map->id stored in bpf_tramp_run_ctx::bpf_cookie. Maybe
> > bpf_struct_ops_get_map_id()?
> >
> > Another approach is to complete the plumbing of this patchset by
> > moving trampoline and ksyms from map to link. Right now it is broken
> > when creating multiple links from the same map as can be seen in the
> > CI. I think this is better as we don't create another unique thing for
> > struct_ops.
> >
> > WDYT?
>
> I think that is a logical thing to do, because BPF link represents
> attachment, and trampoline should conceptually correspond to an
> attachment, not to the thing that is being attached (and might be
> attached to multiple places, potentially). We have this approach with
> the fentry/fexit program's trampoline, so it would be nice to move
> struct_ops to the same model.
>
> >
> > > For tracing programs, the bpf cookie seems to be an existing mechanism that can
> > > have any value (?). Thus, user space is free to store the map->id in it also. It
> > > can also choose to store the map->id in a bpf global variable if it has other
> > > uses for the bpf cookie. I think both should work similarly.
> >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-11 19:29             ` Amery Hung
@ 2025-07-11 20:21               ` Alexei Starovoitov
  2025-07-11 21:38                 ` Amery Hung
  2025-07-11 21:55                 ` Martin KaFai Lau
  0 siblings, 2 replies; 18+ messages in thread
From: Alexei Starovoitov @ 2025-07-11 20:21 UTC (permalink / raw)
  To: Amery Hung
  Cc: Andrii Nakryiko, Martin KaFai Lau, Andrii Nakryiko,
	Daniel Borkmann, Tejun Heo, Martin KaFai Lau, Kernel Team, bpf

On Fri, Jul 11, 2025 at 12:29 PM Amery Hung <ameryhung@gmail.com> wrote:
>
> On Fri, Jul 11, 2025 at 11:41 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Jul 10, 2025 at 2:00 PM Amery Hung <ameryhung@gmail.com> wrote:
> > >
> > > On Thu, Jul 10, 2025 at 12:47 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> > > >
> > > > On 7/10/25 11:39 AM, Amery Hung wrote:
> > > > >> On 7/8/25 4:08 PM, Amery Hung wrote:
> > > > >>> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > > > >>>                goto unlock;
> > > > >>>        }
> > > > >>>
> > > > >>> +     err = bpf_struct_ops_prepare_attach(st_map, 0);
> > > > >> A follow-up on the "using the map->id as the cookie" comment in the cover
> > > > >> letter. I meant to use the map->id here instead of 0. If the cookie is intended
> > > > >> to identify a particular struct_ops instance (i.e., the struct_ops map), then
> > > > >> map->id should be a good fit, and it is automatically generated by the kernel
> > > > >> during the map creation. As a result, I suspect that most of the changes in
> > > > >> patch 1 and patch 2 will not be needed.
> > > > >>
> > > > > Do you mean keep using cookie as the mechanism to associate programs,
> > > > > but for struct_ops the cookie will be map->id (i.e.,
> > > > > bpf_get_attach_cookie() in struct_ops will return map->id)?
> > > >
> > > > I meant to use the map->id as the bpf_cookie stored in the bpf_tramp_run_ctx.
> > > > Then there is no need for user space to generate a unique cookie during
> > > > link_create. The kernel has already generated a unique ID in the map->id. The
> > > > map->id is available during the bpf_struct_ops_map_update_elem(). Then there is
> > > > also no need to distinguish between SEC(".struct_ops") vs
> > > > SEC(".struct_ops.link"). Most of the patch 1 and patch 2 will not be needed.
> > > >
> > > > A minor detail: note that the same struct ops program can be used in different
> > > > trampolines. Thus, to be specific, the bpf cookie is stored in the trampoline.
> > > >
> > > > If the question is about bpf global variable vs bpf cookie, yeah, I think using
> > > > a bpf global variable should also work. The global variable can be initialized
> > > > before libbpf's bpf_map__attach_struct_ops(). At that time, the map->id should
> > > > be known already. I don't have a strong opinion on reusing the bpf cookie in the
> > > > struct ops trampoline. No one is using it now, so it is available to be used.
> > > > Exposing BPF_FUNC_get_attach_cookie for struct ops programs is pretty cheap
> > > > also. Using bpf cookie to allow the struct ops program to tell which struct_ops
> > > > map is calling it seems to fit well also after sleeping on it a bit. bpf global
> > > > variable will also break if a bpf_prog.o has more than one SEC(".struct_ops").
> > > >
> > >
> > > While both of them work, using cookie instead of global variable is
> > > one less thing for the user to take care of (i.e., slightly better
> > > usability).
> > >
> > > With the approach you suggested, to not mix the existing semantics of
> > > bpf cookie, I think a new struct_ops kfunc is needed to retrieve the
> >
> > yes, if absolutely necessary, sure, let's reuse the spot that is
> > reserved for cookie inside the trampoline, but let's not expose this
> > as real BPF cookie (i.e., let's not allow bpf_get_attach_cookie()
> > helper for struct_ops), because BPF cookie is meant to be fully user
> > controllable and used for whatever they deem necessary. Not
> > necessarily to just identify the struct_ops map. So it will be a huge
> > violation to just pre-define what BPF cookie value is for struct_ops.
> >
>
> We had some offline discussions and figured out this will not work well.
>
> sched_ext users already call scx kfuncs in global subprograms. If we
> choose to add bpf_get_struct_ops_id() to get the id to be passed to
> scx kfuncs, it will force users to write two copies of the same
> global subprog: one called from struct_ops programs, which calls
> bpf_get_struct_ops_id(), and one called from tracing programs, which
> calls bpf_get_attach_cookie().

Can we put cookie into map_extra during st_ops map creation time and
later copy it into actual cookie place in a trampoline?
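On the loader side, this suggestion could look roughly like the following. A sketch under stated assumptions: the skeleton (sched_bpf) and map name (my_sched_ops) are made up, and whether the kernel accepts a non-zero map_extra for struct_ops maps depends on the eventual patch; bpf_map__set_map_extra() itself is an existing libbpf setter:

/* Stash the scheduler cookie in map_extra at map-creation time; the
 * kernel would later copy it into the cookie slot of each struct_ops
 * trampoline for this map. */
struct sched_bpf *skel = sched_bpf__open();
__u64 cookie = 0x1234;	/* any value the loader picks */

bpf_map__set_map_extra(skel->maps.my_sched_ops, cookie);	/* before load */
sched_bpf__load(skel);
bpf_map__attach_struct_ops(skel->maps.my_sched_ops);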

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-11 20:21               ` Alexei Starovoitov
@ 2025-07-11 21:38                 ` Amery Hung
  2025-07-14 20:46                   ` Andrii Nakryiko
  2025-07-11 21:55                 ` Martin KaFai Lau
  1 sibling, 1 reply; 18+ messages in thread
From: Amery Hung @ 2025-07-11 21:38 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Martin KaFai Lau, Andrii Nakryiko,
	Daniel Borkmann, Tejun Heo, Martin KaFai Lau, Kernel Team, bpf

On Fri, Jul 11, 2025 at 1:21 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Jul 11, 2025 at 12:29 PM Amery Hung <ameryhung@gmail.com> wrote:
> >
> > On Fri, Jul 11, 2025 at 11:41 AM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Thu, Jul 10, 2025 at 2:00 PM Amery Hung <ameryhung@gmail.com> wrote:
> > > >
> > > > On Thu, Jul 10, 2025 at 12:47 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> > > > >
> > > > > On 7/10/25 11:39 AM, Amery Hung wrote:
> > > > > >> On 7/8/25 4:08 PM, Amery Hung wrote:
> > > > > >>> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > > > > >>>                goto unlock;
> > > > > >>>        }
> > > > > >>>
> > > > > >>> +     err = bpf_struct_ops_prepare_attach(st_map, 0);
> > > > > >> A follow-up on the "using the map->id as the cookie" comment in the cover
> > > > > >> letter. I meant to use the map->id here instead of 0. If the cookie is intended
> > > > > >> to identify a particular struct_ops instance (i.e., the struct_ops map), then
> > > > > >> map->id should be a good fit, and it is automatically generated by the kernel
> > > > > >> during the map creation. As a result, I suspect that most of the changes in
> > > > > >> patch 1 and patch 2 will not be needed.
> > > > > >>
> > > > > > Do you mean keep using cookie as the mechanism to associate programs,
> > > > > > but for struct_ops the cookie will be map->id (i.e.,
> > > > > > bpf_get_attach_cookie() in struct_ops will return map->id)?
> > > > >
> > > > > I meant to use the map->id as the bpf_cookie stored in the bpf_tramp_run_ctx.
> > > > > Then there is no need for user space to generate a unique cookie during
> > > > > link_create. The kernel has already generated a unique ID in the map->id. The
> > > > > map->id is available during the bpf_struct_ops_map_update_elem(). Then there is
> > > > > also no need to distinguish between SEC(".struct_ops") vs
> > > > > SEC(".struct_ops.link"). Most of the patch 1 and patch 2 will not be needed.
> > > > >
> > > > > A minor detail: note that the same struct ops program can be used in different
> > > > > trampolines. Thus, to be specific, the bpf cookie is stored in the trampoline.
> > > > >
> > > > > If the question is about bpf global variable vs bpf cookie, yeah, I think using
> > > > > a bpf global variable should also work. The global variable can be initialized
> > > > > before libbpf's bpf_map__attach_struct_ops(). At that time, the map->id should
> > > > > be known already. I don't have a strong opinion on reusing the bpf cookie in the
> > > > > struct ops trampoline. No one is using it now, so it is available to be used.
> > > > > Exposing BPF_FUNC_get_attach_cookie for struct ops programs is pretty cheap
> > > > > also. Using bpf cookie to allow the struct ops program to tell which struct_ops
> > > > > map is calling it seems to fit well also after sleeping on it a bit. bpf global
> > > > > variable will also break if a bpf_prog.o has more than one SEC(".struct_ops").
> > > > >
> > > >
> > > > While both of them work, using cookie instead of global variable is
> > > > one less thing for the user to take care of (i.e., slightly better
> > > > usability).
> > > >
> > > > With the approach you suggested, to not mix the existing semantics of
> > > > bpf cookie, I think a new struct_ops kfunc is needed to retrieve the
> > >
> > > yes, if absolutely necessary, sure, let's reuse the spot that is
> > > reserved for cookie inside the trampoline, but let's not expose this
> > > as real BPF cookie (i.e., let's not allow bpf_get_attach_cookie()
> > > helper for struct_ops), because BPF cookie is meant to be fully user
> > > controllable and used for whatever they deem necessary. Not
> > > necessarily to just identify the struct_ops map. So it will be a huge
> > > violation to just pre-define what BPF cookie value is for struct_ops.
> > >
> >
> > We had some offline discussions and figured out this will not work well.
> >
> > sched_ext users already call scx kfuncs in global subprograms. If we
> > choose to add bpf_get_struct_ops_id() to get the id to be passed to
> > scx kfuncs, it will force users to write two copies of the same
> > global subprog: one called from struct_ops programs, which calls
> > bpf_get_struct_ops_id(), and one called from tracing programs, which
> > calls bpf_get_attach_cookie().
>
> Can we put cookie into map_extra during st_ops map creation time and
> later copy it into actual cookie place in a trampoline?

It should work. Currently, it makes sense to keep the cookie in the
struct_ops map, as no struct_ops implementer (hid, tcp cc, qdisc,
sched_ext) allows multi-attachment of the same map. A struct_ops map is
effectively a unique attachment for now.

Additionally, this way we will be able to support cookie for non-link
struct_ops.

This approach will not block future effort to support link-specific
cookie if there is such a use case. We can revisit this patchset then.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-11 20:21               ` Alexei Starovoitov
  2025-07-11 21:38                 ` Amery Hung
@ 2025-07-11 21:55                 ` Martin KaFai Lau
  1 sibling, 0 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2025-07-11 21:55 UTC (permalink / raw)
  To: Alexei Starovoitov, Amery Hung
  Cc: Andrii Nakryiko, Andrii Nakryiko, Daniel Borkmann, Tejun Heo,
	Martin KaFai Lau, Kernel Team, bpf

On 7/11/25 1:21 PM, Alexei Starovoitov wrote:
> Can we put cookie into map_extra during st_ops map creation time and
> later copy it into actual cookie place in a trampoline?

+1. good idea. This should do.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for link-based struct_ops attachment
  2025-07-11 21:38                 ` Amery Hung
@ 2025-07-14 20:46                   ` Andrii Nakryiko
  2025-07-14 21:02                     ` Amery Hung
  0 siblings, 1 reply; 18+ messages in thread
From: Andrii Nakryiko @ 2025-07-14 20:46 UTC (permalink / raw)
  To: Amery Hung
  Cc: Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko,
	Daniel Borkmann, Tejun Heo, Martin KaFai Lau, Kernel Team, bpf

On Fri, Jul 11, 2025 at 2:38 PM Amery Hung <ameryhung@gmail.com> wrote:
>
> On Fri, Jul 11, 2025 at 1:21 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Jul 11, 2025 at 12:29 PM Amery Hung <ameryhung@gmail.com> wrote:
> > >
> > > On Fri, Jul 11, 2025 at 11:41 AM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Thu, Jul 10, 2025 at 2:00 PM Amery Hung <ameryhung@gmail.com> wrote:
> > > > >
> > > > > On Thu, Jul 10, 2025 at 12:47 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> > > > > >
> > > > > > On 7/10/25 11:39 AM, Amery Hung wrote:
> > > > > > >> On 7/8/25 4:08 PM, Amery Hung wrote:
> > > > > > >>> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > > > > > >>>                goto unlock;
> > > > > > >>>        }
> > > > > > >>>
> > > > > > >>> +     err = bpf_struct_ops_prepare_attach(st_map, 0);
> > > > > > >> A follow-up on the "using the map->id as the cookie" comment in the cover
> > > > > > >> letter. I meant to use the map->id here instead of 0. If the cookie is intended
> > > > > > >> to identify a particular struct_ops instance (i.e., the struct_ops map), then
> > > > > > >> map->id should be a good fit, and it is automatically generated by the kernel
> > > > > > >> during the map creation. As a result, I suspect that most of the changes in
> > > > > > >> patch 1 and patch 2 will not be needed.
> > > > > > >>
> > > > > > > Do you mean keep using cookie as the mechanism to associate programs,
> > > > > > > but for struct_ops the cookie will be map->id (i.e.,
> > > > > > > bpf_get_attach_cookie() in struct_ops will return map->id)?
> > > > > >
> > > > > > I meant to use the map->id as the bpf_cookie stored in the bpf_tramp_run_ctx.
> > > > > > Then there is no need for user space to generate a unique cookie during
> > > > > > link_create. The kernel has already generated a unique ID in the map->id. The
> > > > > > map->id is available during the bpf_struct_ops_map_update_elem(). Then there is
> > > > > > also no need to distinguish between SEC(".struct_ops") vs
> > > > > > SEC(".struct_ops.link"). Most of the patch 1 and patch 2 will not be needed.
> > > > > >
> > > > > > A minor detail: note that the same struct ops program can be used in different
> > > > > > trampolines. Thus, to be specific, the bpf cookie is stored in the trampoline.
> > > > > >
> > > > > > If the question is about bpf global variable vs bpf cookie, yeah, I think using
> > > > > > a bpf global variable should also work. The global variable can be initialized
> > > > > > before libbpf's bpf_map__attach_struct_ops(). At that time, the map->id should
> > > > > > be known already. I don't have a strong opinion on reusing the bpf cookie in the
> > > > > > struct ops trampoline. No one is using it now, so it is available to be used.
> > > > > > Exposing BPF_FUNC_get_attach_cookie for struct ops programs is pretty cheap
> > > > > > also. Using bpf cookie to allow the struct ops program to tell which struct_ops
> > > > > > map is calling it seems to fit well also after sleeping on it a bit. bpf global
> > > > > > variable will also break if a bpf_prog.o has more than one SEC(".struct_ops").
> > > > > >
> > > > >
> > > > > While both of them work, using cookie instead of global variable is
> > > > > one less thing for the user to take care of (i.e., slightly better
> > > > > usability).
> > > > >
> > > > > With the approach you suggested, to not mix the existing semantics of
> > > > > bpf cookie, I think a new struct_ops kfuncs is needed to retrieve the
> > > >
> > > > yes, if absolutely necessary, sure, let's reuse the spot that is
> > > > reserved for cookie inside the trampoline, but let's not expose this
> > > > as real BPF cookie (i.e., let's not allow bpf_get_attach_cookie()
> > > > helper for struct_ops), because BPF cookie is meant to be fully user
> > > > controllable and used for whatever they deem necessary. Not
> > > > necessarily to just identify the struct_ops map. So it will be a huge
> > > > violation to just pre-define what BPF cookie value is for struct_ops.
> > > >
> > >
> > > We had some offline discussions and figured out this will not work well.
> > >
> > > sched_ext users already call scx kfuncs in global subprograms. If we
> > > choose to add bpf_get_struct_ops_id() to get the id to be passed to
> > > scx kfuncs, it will force users to create two copies of the same
> > > global subprog: one called from struct_ops programs that calls
> > > bpf_get_struct_ops_id(), and one called from tracing programs that
> > > calls bpf_get_attach_cookie().
> >
> > Can we put cookie into map_extra during st_ops map creation time and
> > later copy it into actual cookie place in a trampoline?
>
> > It should work. Currently, it makes sense to keep the cookie in the
> > struct_ops map, as none of the struct_ops implementers (hid, tcp cc,
> > qdisc, sched_ext) allow multi-attachment of the same map. A struct_ops
> > map is effectively a unique attachment for now.

Is there any conceptual reason why struct_ops map shouldn't be allowed
to be attached to multiple places? If not, should we try to lift this
restriction and not invent struct_ops-specific BPF cookie APIs?

From libbpf POV, struct_ops map is created implicitly from the
struct_ops variable. There is no map_flags field there. We'll need to
add new special APIs and/or conventions just to be able to set
map_flags.

>
> > Additionally, this way we will be able to support cookies for
> > non-link struct_ops attachment.
>
> This approach will not block future effort to support link-specific
> cookie if there is such a use case. We can revisit this patchset then.

It will create two ways to specify a BPF cookie for struct_ops: a
legacy, special-purpose one through map_flags, and the common one
through the LINK_CREATE command (and I guess we'd need to reject
LINK_CREATE if the cookie was already set through map_flags, right?).
Why confuse users like that?

From what I understand, the problem is that currently a struct_ops
map's BPF trampoline(s) are created a bit too early, before the
attachment step. How hard would it be to move trampoline creation to
actual attachment time? Should we seriously consider this before we
invent new struct_ops-specific exceptions for BPF cookie?
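To spell out the double-specification concern: if a cookie could
arrive both at map creation and at LINK_CREATE, the kernel would need
a rejection rule along these lines. Purely illustrative C with
hypothetical names (st_ops_map_sk, link_create_cookie), not kernel
code:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical sketch of the ambiguity between a cookie set at map
 * creation time and one supplied via LINK_CREATE. */
struct st_ops_map_sk {
	uint64_t map_cookie;	/* set at BPF_MAP_CREATE, 0 = unset */
};

/* Resolve the effective cookie for a new link, refusing the case
 * where both the map and the LINK_CREATE request carry one. */
static int link_create_cookie(const struct st_ops_map_sk *map,
			      uint64_t link_cookie, uint64_t *out)
{
	if (map->map_cookie && link_cookie)
		return -EINVAL;	/* two sources of truth: refuse */
	*out = link_cookie ? link_cookie : map->map_cookie;
	return 0;
}
```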

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for linked-based struct_ops attachment
  2025-07-14 20:46                   ` Andrii Nakryiko
@ 2025-07-14 21:02                     ` Amery Hung
  2025-07-14 22:51                       ` Martin KaFai Lau
  0 siblings, 1 reply; 18+ messages in thread
From: Amery Hung @ 2025-07-14 21:02 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Martin KaFai Lau, Andrii Nakryiko,
	Daniel Borkmann, Tejun Heo, Martin KaFai Lau, Kernel Team, bpf

On Mon, Jul 14, 2025 at 1:46 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Jul 11, 2025 at 2:38 PM Amery Hung <ameryhung@gmail.com> wrote:
> >
> > On Fri, Jul 11, 2025 at 1:21 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Fri, Jul 11, 2025 at 12:29 PM Amery Hung <ameryhung@gmail.com> wrote:
> > > >
> > > > On Fri, Jul 11, 2025 at 11:41 AM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Thu, Jul 10, 2025 at 2:00 PM Amery Hung <ameryhung@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Jul 10, 2025 at 12:47 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> > > > > > >
> > > > > > > On 7/10/25 11:39 AM, Amery Hung wrote:
> > > > > > > >> On 7/8/25 4:08 PM, Amery Hung wrote:
> > > > > > > >>> @@ -906,6 +904,10 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > > > > > > >>>                goto unlock;
> > > > > > > >>>        }
> > > > > > > >>>
> > > > > > > >>> +     err = bpf_struct_ops_prepare_attach(st_map, 0);
> > > > > > > >> A follow-up on the "using the map->id as the cookie" comment in the cover
> > > > > > > >> letter. I meant to use the map->id here instead of 0. If the cookie is intended
> > > > > > > >> to identify a particular struct_ops instance (i.e., the struct_ops map), then
> > > > > > > >> map->id should be a good fit, and it is automatically generated by the kernel
> > > > > > > >> during the map creation. As a result, I suspect that most of the changes in
> > > > > > > >> patch 1 and patch 2 will not be needed.
> > > > > > > >>
> > > > > > > > Do you mean keep using cookie as the mechanism to associate programs,
> > > > > > > > but for struct_ops the cookie will be map->id (i.e.,
> > > > > > > > bpf_get_attach_cookie() in struct_ops will return map->id)?
> > > > > > >
> > > > > > > I meant to use the map->id as the bpf_cookie stored in the bpf_tramp_run_ctx.
> > > > > > > Then there is no need for user space to generate a unique cookie during
> > > > > > > link_create. The kernel has already generated a unique ID in the map->id. The
> > > > > > > map->id is available during the bpf_struct_ops_map_update_elem(). Then there is
> > > > > > > also no need to distinguish between SEC(".struct_ops") vs
> > > > > > > SEC(".struct_ops.link"). Most of the patch 1 and patch 2 will not be needed.
> > > > > > >
> > > > > > > A minor detail: note that the same struct ops program can be used in different
> > > > > > > trampolines. Thus, to be specific, the bpf cookie is stored in the trampoline.
> > > > > > >
> > > > > > > If the question is about bpf global variable vs bpf cookie, yeah, I think using
> > > > > > > a bpf global variable should also work. The global variable can be initialized
> > > > > > > before libbpf's bpf_map__attach_struct_ops(). At that time, the map->id should
> > > > > > > be known already. I don't have a strong opinion on reusing the bpf cookie in the
> > > > > > > struct ops trampoline. No one is using it now, so it is available to be used.
> > > > > > > Exposing BPF_FUNC_get_attach_cookie for struct ops programs is pretty cheap
> > > > > > > also. Using bpf cookie to allow the struct ops program to tell which struct_ops
> > > > > > > map is calling it seems to fit well also after sleeping on it a bit. bpf global
> > > > > > > variable will also break if a bpf_prog.o has more than one SEC(".struct_ops").
> > > > > > >
> > > > > >
> > > > > > While both of them work, using cookie instead of global variable is
> > > > > > one less thing for the user to take care of (i.e., slightly better
> > > > > > usability).
> > > > > >
> > > > > > With the approach you suggested, to not mix the existing semantics of
> > > > > > bpf cookie, I think a new struct_ops kfuncs is needed to retrieve the
> > > > >
> > > > > yes, if absolutely necessary, sure, let's reuse the spot that is
> > > > > reserved for cookie inside the trampoline, but let's not expose this
> > > > > as real BPF cookie (i.e., let's not allow bpf_get_attach_cookie()
> > > > > helper for struct_ops), because BPF cookie is meant to be fully user
> > > > > controllable and used for whatever they deem necessary. Not
> > > > > necessarily to just identify the struct_ops map. So it will be a huge
> > > > > violation to just pre-define what BPF cookie value is for struct_ops.
> > > > >
> > > >
> > > > We had some offline discussions and figured out this will not work well.
> > > >
> > > > sched_ext users already call scx kfuncs in global subprograms. If we
> > > > choose to add bpf_get_struct_ops_id() to get the id to be passed to
> > > > scx kfuncs, it will force users to create two copies of the same
> > > > global subprog: one called from struct_ops programs that calls
> > > > bpf_get_struct_ops_id(), and one called from tracing programs that
> > > > calls bpf_get_attach_cookie().
> > >
> > > Can we put cookie into map_extra during st_ops map creation time and
> > > later copy it into actual cookie place in a trampoline?
> >
> > It should work. Currently, it makes sense to keep the cookie in the
> > struct_ops map, as none of the struct_ops implementers (hid, tcp cc,
> > qdisc, sched_ext) allow multi-attachment of the same map. A struct_ops
> > map is effectively a unique attachment for now.
>
> Is there any conceptual reason why struct_ops map shouldn't be allowed
> to be attached to multiple places? If not, should we try to lift this
> restriction and not invent struct_ops-specific BPF cookie APIs?
>
> From libbpf POV, struct_ops map is created implicitly from the
> struct_ops variable. There is no map_flags field there. We'll need to
> add new special APIs and/or conventions just to be able to set
> map_flags.
>
> >
> > Additionally, this way we will be able to support cookies for
> > non-link struct_ops attachment.
> >
> > This approach will not block future effort to support link-specific
> > cookie if there is such a use case. We can revisit this patchset then.
>
> It will create two ways to specify a BPF cookie for struct_ops: a
> legacy, special-purpose one through map_flags, and the common one
> through the LINK_CREATE command (and I guess we'd need to reject
> LINK_CREATE if the cookie was already set through map_flags, right?).
> Why confuse users like that?
>
> From what I understand, the problem is that currently a struct_ops
> map's BPF trampoline(s) are created a bit too early, before the
> attachment step. How hard would it be to move trampoline creation to
> actual attachment time? Should we seriously consider this before we
> invent new struct_ops-specific exceptions for BPF cookie?

Yeah...

I realized that while map_extra works, the cookie would need to be
passed to the kernel at map creation time, which creates the libbpf
confusion you mentioned.

I am fixing the patchset by moving the trampoline and ksyms into
bpf_struct_ops_link. It shouldn't complicate the struct_ops code too
much (fingers crossed).
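For illustration, a rough C sketch of that direction, with the link
created at attach time owning both the trampoline image and the
cookie, so the cookie naturally becomes per-attachment. Field and
function names (tramp_image, st_ops_link, st_ops_link_attach) are
hypothetical; the real bpf_struct_ops_link layout will differ:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified shapes -- not actual kernel structs. */
struct tramp_image {
	uint64_t cookie;	/* value bpf_get_attach_cookie() returns */
};

struct st_ops_link {
	struct tramp_image *image;	/* owned by the link */
	uint64_t cookie;		/* supplied via LINK_CREATE */
};

/* Attach-time step: build the trampoline only now, embedding the
 * per-link cookie, instead of at map update time. */
static int st_ops_link_attach(struct st_ops_link *link)
{
	link->image = malloc(sizeof(*link->image));
	if (!link->image)
		return -1;
	link->image->cookie = link->cookie;
	return 0;
}

static void st_ops_link_detach(struct st_ops_link *link)
{
	free(link->image);	/* lifecycle: image dies with the link */
	link->image = NULL;
}
```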

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC bpf-next v1 2/4] bpf: Support cookie for linked-based struct_ops attachment
  2025-07-14 21:02                     ` Amery Hung
@ 2025-07-14 22:51                       ` Martin KaFai Lau
  0 siblings, 0 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2025-07-14 22:51 UTC (permalink / raw)
  To: Amery Hung, Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Tejun Heo,
	Martin KaFai Lau, Kernel Team, bpf

On 7/14/25 2:02 PM, Amery Hung wrote:
>>> It should work. Currently, it makes sense to have cookie in struct_ops
>>> map as all struct_ops implementers (hid, tcp cc, qdisc, sched_ext) do
>>> not allow multi-attachment of the same map. A struct_ops map is
>>> effectively an unique attachment for now.
>>
>> Is there any conceptual reason why struct_ops map shouldn't be allowed
>> to be attached to multiple places? If not, should we try to lift this
>> restriction and not invent struct_ops-specific BPF cookie APIs?

A bpf_struct_ops map currently does allow attaching multiple times through the
link interface. It is up to the subsystem (tcp-cc/hid/qdisc/scx) how to handle it.

> I am fixing the patchset by moving trampoline and ksyms to
> bpf_struct_ops_link. It shouldn't complicate struct_ops code too much
> (finger cross).

Seems reasonable. I think it will be useful to have comments in v2 on how their
lifecycle will be handled.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-07-14 22:51 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-07-08 23:08 [RFC bpf-next v1 0/4] Support cookie for link-based struct_ops attachment Amery Hung
2025-07-08 23:08 ` [RFC bpf-next v1 1/4] bpf: Factor out bpf_struct_ops_prepare_attach() Amery Hung
2025-07-08 23:08 ` [RFC bpf-next v1 2/4] bpf: Support cookie for linked-based struct_ops attachment Amery Hung
2025-07-09 22:13   ` Martin KaFai Lau
2025-07-10 18:26     ` Martin KaFai Lau
2025-07-10 18:39     ` Amery Hung
2025-07-10 19:47       ` Martin KaFai Lau
2025-07-10 21:00         ` Amery Hung
2025-07-11 18:41           ` Andrii Nakryiko
2025-07-11 19:29             ` Amery Hung
2025-07-11 20:21               ` Alexei Starovoitov
2025-07-11 21:38                 ` Amery Hung
2025-07-14 20:46                   ` Andrii Nakryiko
2025-07-14 21:02                     ` Amery Hung
2025-07-14 22:51                       ` Martin KaFai Lau
2025-07-11 21:55                 ` Martin KaFai Lau
2025-07-08 23:08 ` [RFC bpf-next v1 3/4] libbpf: Support link-based struct_ops attach with options Amery Hung
2025-07-08 23:08 ` [RFC bpf-next v1 4/4] selftests/bpf: Test bpf_get_attach_cookie() in struct_ops program Amery Hung

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).