* [PATCH bpf-next v5 0/7] Support associating BPF programs with struct_ops
@ 2025-11-04 17:26 Amery Hung
2025-11-04 17:26 ` [PATCH bpf-next v5 1/7] bpf: Allow verifier to fixup kernel module kfuncs Amery Hung
` (6 more replies)
0 siblings, 7 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 17:26 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team
v4 -> v5
- Simplify the API for getting associated struct_ops and don't
expose struct_ops map lifecycle management (Andrii, Alexei)
Link: https://lore.kernel.org/bpf/20251024212914.1474337-1-ameryhung@gmail.com/
v3 -> v4
- Fix potential dangling pointer in timer callback. Protect
st_ops_assoc with RCU. The get helper now needs to be paired with
bpf_struct_ops_put()
- The command should only increase refcount once for a program
(Andrii)
- Test a struct_ops program reused in two struct_ops maps
- Test getting associated struct_ops in timer callback
Link: https://lore.kernel.org/bpf/20251017215627.722338-1-ameryhung@gmail.com/
v2 -> v3
- Change the type of st_ops_assoc from void* (i.e., kdata) to bpf_map
(Andrii)
- Fix a bug that clears BPF_PTR_POISON when a struct_ops map is freed
(Andrii)
- Return NULL if the map is not fully initialized (Martin)
- Move struct_ops map refcount inc/dec into internal helpers (Martin)
- Add libbpf API, bpf_program__assoc_struct_ops (Andrii)
Link: https://lore.kernel.org/bpf/20251016204503.3203690-1-ameryhung@gmail.com/
v1 -> v2
- Poison st_ops_assoc when reusing the program in more than one
struct_ops maps and add a helper to access the pointer (Andrii)
- Minor style and naming changes (Andrii)
Link: https://lore.kernel.org/bpf/20251010174953.2884682-1-ameryhung@gmail.com/
---
Hi,
This patchset adds a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to
the bpf() syscall to allow associating a BPF program with a struct_ops.
The command is introduced to address an emerging need from struct_ops
users. As the number of subsystems adopting struct_ops grows, more
users are building their struct_ops-based solutions with help from
other BPF programs. For example, scx_layered uses a syscall program as
a user space trigger to refresh layers [0]. It also uses a tracing
program to infer whether a task is using a GPU and needs to be
prioritized [1]. In these use cases, when there are multiple
struct_ops instances, struct_ops kfuncs called from different BPF
programs, struct_ops or not, need to be able to refer to a specific
instance, which currently is not possible.
The new BPF command will allow users to explicitly associate a BPF
program with a struct_ops map. The libbpf wrapper can be called after
loading programs and before attaching programs and struct_ops.
Internally, it will set prog->aux->st_ops_assoc to the struct_ops
map. struct_ops kfuncs can then get the associated struct_ops struct
by calling bpf_prog_get_assoc_struct_ops() with prog->aux, which can
be acquired from a "__prog" argument. The value of the special
argument will be fixed up by the verifier during verification.
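As a rough sketch, the intended user space flow with the libbpf API
added in patch 4 looks like this (the skeleton, program, and map names
are illustrative):
    struct my_skel *skel;
    int err;
    skel = my_skel__open_and_load();
    if (!skel)
            return -1;
    /* associate after load, before attaching programs and struct_ops */
    err = bpf_program__assoc_struct_ops(skel->progs.my_tracing_prog,
                                        skel->maps.my_st_ops_map, NULL);
    if (!err)
            err = my_skel__attach(skel);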
The command conceptually associates the implementation of BPF programs
with a struct_ops map, not the attachment. A program associated with
the map takes a refcount on it so that st_ops_assoc always points to a
valid struct_ops struct. struct_ops implementers can use the helper
bpf_prog_get_assoc_struct_ops() to get the pointer. The returned
struct_ops, if not NULL, is guaranteed to be valid and initialized.
However, it is not guaranteed that the struct_ops is attached. The
struct_ops implementer still needs to take steps to track and check
the state of the struct_ops in kdata if the use case demands an
attached struct_ops.
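For kernel struct_ops implementers, a minimal sketch of a kfunc that
resolves the associated instance, assuming a hypothetical my_ops whose
attachment is tracked with an "attached" flag set in .reg() and
cleared in .unreg():
    struct my_ops {
            int (*test_1)(struct st_ops_args *args);
            bool attached;  /* hypothetical; set in .reg(), cleared in .unreg() */
    };
    __bpf_kfunc int my_kfunc(struct st_ops_args *args, void *aux__prog)
    {
            struct bpf_prog_aux *aux = (struct bpf_prog_aux *)aux__prog;
            struct my_ops *ops;
            /* non-NULL means valid and initialized, not necessarily attached */
            ops = bpf_prog_get_assoc_struct_ops(aux);
            if (!ops || !READ_ONCE(ops->attached))
                    return -ENODEV;
            return ops->test_1(args);
    }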
We can also consider supporting associating a struct_ops link with BPF
programs, which would on one hand make struct_ops implementers' job
easier, but might complicate the libbpf workflow and would not apply
to legacy struct_ops attachment.
[0] https://github.com/sched-ext/scx/blob/main/scheds/rust/scx_layered/src/bpf/main.bpf.c#L557
[1] https://github.com/sched-ext/scx/blob/main/scheds/rust/scx_layered/src/bpf/main.bpf.c#L754
---
Amery Hung (7):
bpf: Allow verifier to fixup kernel module kfuncs
bpf: Support associating BPF program with struct_ops
bpf: Pin associated struct_ops when registering async callback
libbpf: Add support for associating BPF program with struct_ops
selftests/bpf: Test BPF_PROG_ASSOC_STRUCT_OPS command
selftests/bpf: Test ambiguous associated struct_ops
selftests/bpf: Test getting associated struct_ops in timer callback
include/linux/bpf.h | 16 ++
include/uapi/linux/bpf.h | 17 ++
kernel/bpf/bpf_struct_ops.c | 90 ++++++++
kernel/bpf/core.c | 3 +
kernel/bpf/helpers.c | 105 +++++++---
kernel/bpf/syscall.c | 46 +++++
kernel/bpf/verifier.c | 3 +-
tools/include/uapi/linux/bpf.h | 17 ++
tools/lib/bpf/bpf.c | 19 ++
tools/lib/bpf/bpf.h | 21 ++
tools/lib/bpf/libbpf.c | 30 +++
tools/lib/bpf/libbpf.h | 16 ++
tools/lib/bpf/libbpf.map | 2 +
.../bpf/prog_tests/test_struct_ops_assoc.c | 194 ++++++++++++++++++
.../selftests/bpf/progs/struct_ops_assoc.c | 105 ++++++++++
.../bpf/progs/struct_ops_assoc_in_timer.c | 77 +++++++
.../bpf/progs/struct_ops_assoc_reuse.c | 75 +++++++
.../selftests/bpf/test_kmods/bpf_testmod.c | 17 ++
.../bpf/test_kmods/bpf_testmod_kfunc.h | 1 +
19 files changed, 819 insertions(+), 35 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_assoc.c
create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_assoc_in_timer.c
create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_assoc_reuse.c
--
2.47.3
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH bpf-next v5 1/7] bpf: Allow verifier to fixup kernel module kfuncs
2025-11-04 17:26 [PATCH bpf-next v5 0/7] Support associating BPF programs with struct_ops Amery Hung
@ 2025-11-04 17:26 ` Amery Hung
2025-11-04 17:26 ` [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops Amery Hung
` (5 subsequent siblings)
6 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 17:26 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team
Allow the verifier to fix up kfuncs in kernel modules to support
kfuncs with __prog arguments. Currently, special kfuncs and kfuncs
with __prog arguments are all kernel kfuncs. Allowing kernel module
kfuncs should not affect the existing kfunc fixup, as kernel module
kfuncs have BTF IDs greater than kernel kfuncs' BTF IDs.
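From the BPF program's point of view, the "__prog" argument is passed
as NULL and the verifier rewrites the call to pass the program's
bpf_prog_aux, as in the selftests added later in this series:
    struct st_ops_args args = {};
    int ret;
    /* NULL is replaced by the verifier during kfunc fixup */
    ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(&args, NULL);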
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
kernel/bpf/verifier.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 542e23fb19c7..8f4410eee3b6 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21973,8 +21973,7 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
if (!bpf_jit_supports_far_kfunc_call())
insn->imm = BPF_CALL_IMM(desc->addr);
- if (insn->off)
- return 0;
+
if (desc->func_id == special_kfunc_list[KF_bpf_obj_new_impl] ||
desc->func_id == special_kfunc_list[KF_bpf_percpu_obj_new_impl]) {
struct btf_struct_meta *kptr_struct_meta = env->insn_aux_data[insn_idx].kptr_struct_meta;
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-04 17:26 [PATCH bpf-next v5 0/7] Support associating BPF programs with struct_ops Amery Hung
2025-11-04 17:26 ` [PATCH bpf-next v5 1/7] bpf: Allow verifier to fixup kernel module kfuncs Amery Hung
@ 2025-11-04 17:26 ` Amery Hung
2025-11-04 17:54 ` bot+bpf-ci
` (3 more replies)
2025-11-04 17:26 ` [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback Amery Hung
` (4 subsequent siblings)
6 siblings, 4 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 17:26 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team
Add a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to allow associating
a BPF program with a struct_ops map. This command takes file
descriptors of a struct_ops map and a BPF program and sets
prog->aux->st_ops_assoc to the struct_ops map.
The command accepts neither a struct_ops program nor a non-struct_ops
map. Programs of a struct_ops map are automatically associated with
the map during map update. If a program is shared between two
struct_ops maps, prog->aux->st_ops_assoc will be poisoned to indicate
that the associated struct_ops is ambiguous. The pointer, once
poisoned, cannot be reset since we have lost track of the associated
struct_ops. For other program types, the associated struct_ops map,
once set, cannot be changed later. This restriction may be lifted in
the future if there is a use case.
A kernel helper bpf_prog_get_assoc_struct_ops() can be used to retrieve
the associated struct_ops pointer. The returned pointer, if not NULL, is
guaranteed to be valid and point to a fully updated struct_ops struct.
For a struct_ops program reused in multiple struct_ops maps, the
returned pointer will be NULL.
To make sure the returned pointer is valid, the command increases the
refcount of the map for every associated non-struct_ops program. For
struct_ops programs, the destruction of a struct_ops map already waits
for its BPF programs to finish running. A later patch will further
make sure the map will not be freed while an async callback scheduled
from a struct_ops program is running.
struct_ops implementers should note that the struct_ops returned may or
may not be attached. The struct_ops implementer will be responsible for
tracking and checking the state of the associated struct_ops map if the
use case requires an attached struct_ops.
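For reference, a sketch of invoking the new command directly via the
bpf() syscall (using the attr fields added in this patch):
    union bpf_attr attr = {};
    int err;
    attr.prog_assoc_struct_ops.map_fd = map_fd;
    attr.prog_assoc_struct_ops.prog_fd = prog_fd;
    attr.prog_assoc_struct_ops.flags = 0;   /* no flags defined yet */
    err = syscall(__NR_bpf, BPF_PROG_ASSOC_STRUCT_OPS, &attr, sizeof(attr));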
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
include/linux/bpf.h | 16 ++++++
include/uapi/linux/bpf.h | 17 +++++++
kernel/bpf/bpf_struct_ops.c | 90 ++++++++++++++++++++++++++++++++++
kernel/bpf/core.c | 3 ++
kernel/bpf/syscall.c | 46 +++++++++++++++++
tools/include/uapi/linux/bpf.h | 17 +++++++
6 files changed, 189 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a47d67db3be5..0f71030c03e1 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1726,6 +1726,8 @@ struct bpf_prog_aux {
struct rcu_head rcu;
};
struct bpf_stream stream[2];
+ struct mutex st_ops_assoc_mutex;
+ struct bpf_map *st_ops_assoc;
};
struct bpf_prog {
@@ -2026,6 +2028,9 @@ static inline void bpf_module_put(const void *data, struct module *owner)
module_put(owner);
}
int bpf_struct_ops_link_create(union bpf_attr *attr);
+int bpf_prog_assoc_struct_ops(struct bpf_prog *prog, struct bpf_map *map);
+void bpf_prog_disassoc_struct_ops(struct bpf_prog *prog);
+void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux);
u32 bpf_struct_ops_id(const void *kdata);
#ifdef CONFIG_NET
@@ -2073,6 +2078,17 @@ static inline int bpf_struct_ops_link_create(union bpf_attr *attr)
{
return -EOPNOTSUPP;
}
+static inline int bpf_prog_assoc_struct_ops(struct bpf_prog *prog, struct bpf_map *map)
+{
+ return -EOPNOTSUPP;
+}
+static inline void bpf_prog_disassoc_struct_ops(struct bpf_prog *prog)
+{
+}
+static inline void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux)
+{
+ return NULL;
+}
static inline void bpf_map_struct_ops_info_fill(struct bpf_map_info *info, struct bpf_map *map)
{
}
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 1d73f165394d..a2d8cfac8555 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -918,6 +918,16 @@ union bpf_iter_link_info {
* Number of bytes read from the stream on success, or -1 if an
* error occurred (in which case, *errno* is set appropriately).
*
+ * BPF_PROG_ASSOC_STRUCT_OPS
+ * Description
+ * Associate a BPF program with a struct_ops map. The struct_ops
+ * map is identified by *map_fd* and the BPF program is
+ * identified by *prog_fd*.
+ *
+ * Return
+ * 0 on success or -1 if an error occurred (in which case,
+ * *errno* is set appropriately).
+ *
* NOTES
* eBPF objects (maps and programs) can be shared between processes.
*
@@ -974,6 +984,7 @@ enum bpf_cmd {
BPF_PROG_BIND_MAP,
BPF_TOKEN_CREATE,
BPF_PROG_STREAM_READ_BY_FD,
+ BPF_PROG_ASSOC_STRUCT_OPS,
__MAX_BPF_CMD,
};
@@ -1893,6 +1904,12 @@ union bpf_attr {
__u32 prog_fd;
} prog_stream_read;
+ struct {
+ __u32 map_fd;
+ __u32 prog_fd;
+ __u32 flags;
+ } prog_assoc_struct_ops;
+
} __attribute__((aligned(8)));
/* The description below is an attempt at providing documentation to eBPF
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index a41e6730edcf..0a19842da7a2 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -533,6 +533,17 @@ static void bpf_struct_ops_map_put_progs(struct bpf_struct_ops_map *st_map)
}
}
+static void bpf_struct_ops_map_dissoc_progs(struct bpf_struct_ops_map *st_map)
+{
+ u32 i;
+
+ for (i = 0; i < st_map->funcs_cnt; i++) {
+ if (!st_map->links[i])
+ break;
+ bpf_prog_disassoc_struct_ops(st_map->links[i]->prog);
+ }
+}
+
static void bpf_struct_ops_map_free_image(struct bpf_struct_ops_map *st_map)
{
int i;
@@ -801,6 +812,12 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
goto reset_unlock;
}
+ err = bpf_prog_assoc_struct_ops(prog, &st_map->map);
+ if (err) {
+ bpf_prog_put(prog);
+ goto reset_unlock;
+ }
+
link = kzalloc(sizeof(*link), GFP_USER);
if (!link) {
bpf_prog_put(prog);
@@ -980,6 +997,8 @@ static void bpf_struct_ops_map_free(struct bpf_map *map)
if (btf_is_module(st_map->btf))
module_put(st_map->st_ops_desc->st_ops->owner);
+ bpf_struct_ops_map_dissoc_progs(st_map);
+
bpf_struct_ops_map_del_ksyms(st_map);
/* The struct_ops's function may switch to another struct_ops.
@@ -1394,6 +1413,77 @@ int bpf_struct_ops_link_create(union bpf_attr *attr)
return err;
}
+int bpf_prog_assoc_struct_ops(struct bpf_prog *prog, struct bpf_map *map)
+{
+ struct bpf_map *st_ops_assoc;
+
+ guard(mutex)(&prog->aux->st_ops_assoc_mutex);
+
+ st_ops_assoc = prog->aux->st_ops_assoc;
+
+ if (st_ops_assoc && st_ops_assoc == map)
+ return 0;
+
+ if (st_ops_assoc) {
+ if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
+ return -EBUSY;
+
+ WRITE_ONCE(prog->aux->st_ops_assoc, BPF_PTR_POISON);
+ } else {
+ if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
+ bpf_map_inc(map);
+
+ WRITE_ONCE(prog->aux->st_ops_assoc, map);
+ }
+
+ return 0;
+}
+
+void bpf_prog_disassoc_struct_ops(struct bpf_prog *prog)
+{
+ struct bpf_map *st_ops_assoc;
+
+ guard(mutex)(&prog->aux->st_ops_assoc_mutex);
+
+ st_ops_assoc = prog->aux->st_ops_assoc;
+
+ if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
+ return;
+
+ if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
+ bpf_map_put(st_ops_assoc);
+
+ WRITE_ONCE(prog->aux->st_ops_assoc, NULL);
+}
+
+/*
+ * Get a reference to the struct_ops struct (i.e., kdata) associated with a
+ * program.
+ *
+ * If the returned pointer is not NULL, it must point to a valid and
+ * initialized struct_ops. The struct_ops may or may not be attached.
+ * Kernel struct_ops implementers are responsible for tracking and checking
+ * the state of the struct_ops if the use case requires an attached struct_ops.
+ */
+void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux)
+{
+ struct bpf_map *st_ops_assoc = READ_ONCE(aux->st_ops_assoc);
+ struct bpf_struct_ops_map *st_map;
+
+ if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
+ return NULL;
+
+ st_map = (struct bpf_struct_ops_map *)st_ops_assoc;
+
+ if (smp_load_acquire(&st_map->kvalue.common.state) == BPF_STRUCT_OPS_STATE_INIT) {
+ bpf_map_put(st_ops_assoc);
+ return NULL;
+ }
+
+ return &st_map->kvalue.data;
+}
+EXPORT_SYMBOL_GPL(bpf_prog_get_assoc_struct_ops);
+
void bpf_map_struct_ops_info_fill(struct bpf_map_info *info, struct bpf_map *map)
{
struct bpf_struct_ops_map *st_map = (struct bpf_struct_ops_map *)map;
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index d595fe512498..441bfeece377 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -136,6 +136,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
mutex_init(&fp->aux->used_maps_mutex);
mutex_init(&fp->aux->ext_mutex);
mutex_init(&fp->aux->dst_mutex);
+ mutex_init(&fp->aux->st_ops_assoc_mutex);
#ifdef CONFIG_BPF_SYSCALL
bpf_prog_stream_init(fp);
@@ -286,6 +287,7 @@ void __bpf_prog_free(struct bpf_prog *fp)
if (fp->aux) {
mutex_destroy(&fp->aux->used_maps_mutex);
mutex_destroy(&fp->aux->dst_mutex);
+ mutex_destroy(&fp->aux->st_ops_assoc_mutex);
kfree(fp->aux->poke_tab);
kfree(fp->aux);
}
@@ -2875,6 +2877,7 @@ static void bpf_prog_free_deferred(struct work_struct *work)
#endif
bpf_free_used_maps(aux);
bpf_free_used_btfs(aux);
+ bpf_prog_disassoc_struct_ops(aux->prog);
if (bpf_prog_is_dev_bound(aux))
bpf_prog_dev_bound_destroy(aux->prog);
#ifdef CONFIG_PERF_EVENTS
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 8a129746bd6c..dddbe89e9718 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -6107,6 +6107,49 @@ static int prog_stream_read(union bpf_attr *attr)
return ret;
}
+#define BPF_PROG_ASSOC_STRUCT_OPS_LAST_FIELD prog_assoc_struct_ops.prog_fd
+
+static int prog_assoc_struct_ops(union bpf_attr *attr)
+{
+ struct bpf_prog *prog;
+ struct bpf_map *map;
+ int ret;
+
+ if (CHECK_ATTR(BPF_PROG_ASSOC_STRUCT_OPS))
+ return -EINVAL;
+
+ if (attr->prog_assoc_struct_ops.flags)
+ return -EINVAL;
+
+ prog = bpf_prog_get(attr->prog_assoc_struct_ops.prog_fd);
+ if (IS_ERR(prog))
+ return PTR_ERR(prog);
+
+ if (prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
+ ret = -EINVAL;
+ goto put_prog;
+ }
+
+ map = bpf_map_get(attr->prog_assoc_struct_ops.map_fd);
+ if (IS_ERR(map)) {
+ ret = PTR_ERR(map);
+ goto put_prog;
+ }
+
+ if (map->map_type != BPF_MAP_TYPE_STRUCT_OPS) {
+ ret = -EINVAL;
+ goto put_map;
+ }
+
+ ret = bpf_prog_assoc_struct_ops(prog, map);
+
+put_map:
+ bpf_map_put(map);
+put_prog:
+ bpf_prog_put(prog);
+ return ret;
+}
+
static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size)
{
union bpf_attr attr;
@@ -6246,6 +6289,9 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size)
case BPF_PROG_STREAM_READ_BY_FD:
err = prog_stream_read(&attr);
break;
+ case BPF_PROG_ASSOC_STRUCT_OPS:
+ err = prog_assoc_struct_ops(&attr);
+ break;
default:
err = -EINVAL;
break;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 1d73f165394d..a2d8cfac8555 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -918,6 +918,16 @@ union bpf_iter_link_info {
* Number of bytes read from the stream on success, or -1 if an
* error occurred (in which case, *errno* is set appropriately).
*
+ * BPF_PROG_ASSOC_STRUCT_OPS
+ * Description
+ * Associate a BPF program with a struct_ops map. The struct_ops
+ * map is identified by *map_fd* and the BPF program is
+ * identified by *prog_fd*.
+ *
+ * Return
+ * 0 on success or -1 if an error occurred (in which case,
+ * *errno* is set appropriately).
+ *
* NOTES
* eBPF objects (maps and programs) can be shared between processes.
*
@@ -974,6 +984,7 @@ enum bpf_cmd {
BPF_PROG_BIND_MAP,
BPF_TOKEN_CREATE,
BPF_PROG_STREAM_READ_BY_FD,
+ BPF_PROG_ASSOC_STRUCT_OPS,
__MAX_BPF_CMD,
};
@@ -1893,6 +1904,12 @@ union bpf_attr {
__u32 prog_fd;
} prog_stream_read;
+ struct {
+ __u32 map_fd;
+ __u32 prog_fd;
+ __u32 flags;
+ } prog_assoc_struct_ops;
+
} __attribute__((aligned(8)));
/* The description below is an attempt at providing documentation to eBPF
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback
2025-11-04 17:26 [PATCH bpf-next v5 0/7] Support associating BPF programs with struct_ops Amery Hung
2025-11-04 17:26 ` [PATCH bpf-next v5 1/7] bpf: Allow verifier to fixup kernel module kfuncs Amery Hung
2025-11-04 17:26 ` [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops Amery Hung
@ 2025-11-04 17:26 ` Amery Hung
2025-11-04 18:03 ` bot+bpf-ci
` (2 more replies)
2025-11-04 17:26 ` [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops Amery Hung
` (3 subsequent siblings)
6 siblings, 3 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 17:26 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team
Take a refcount of the associated struct_ops map to prevent the map from
being freed when an async callback scheduled from a struct_ops program
runs.
Since struct_ops programs do not take refcounts on the struct_ops map,
it is possible for a struct_ops map to be freed when an async callback
scheduled from it runs. To prevent this, take a refcount on
prog->aux->st_ops_assoc and save it in a newly created struct
bpf_async_res for every async mechanism. The reference needs to be
preserved in bpf_async_res since prog->aux->st_ops_assoc can be
poisoned at any time and a reference leak could happen.
bpf_async_res will contain an async callback's BPF program and
resources related to the BPF program. The resources will be acquired when
registering a callback and released when cancelled or when the map
associated with the callback is freed.
Also rename drop_prog_refcnt to bpf_async_cb_reset to better reflect
what it now does.
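In short, the intended pairing of the helpers added in this patch
(error handling elided):
    struct bpf_async_res res;
    int err;
    /* at callback registration: bumps the prog refcount and, for
     * struct_ops programs, the refcount of the associated map
     */
    err = bpf_async_res_get(&res, prog);
    if (err)
            return err;
    /* later, on cancel, callback swap, or map free */
    bpf_async_res_put(&res);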
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
kernel/bpf/helpers.c | 105 +++++++++++++++++++++++++++++--------------
1 file changed, 72 insertions(+), 33 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 930e132f440f..5c081cd604d5 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1092,9 +1092,14 @@ static void *map_key_from_value(struct bpf_map *map, void *value, u32 *arr_idx)
return (void *)value - round_up(map->key_size, 8);
}
+struct bpf_async_res {
+ struct bpf_prog *prog;
+ struct bpf_map *st_ops_assoc;
+};
+
struct bpf_async_cb {
struct bpf_map *map;
- struct bpf_prog *prog;
+ struct bpf_async_res res;
void __rcu *callback_fn;
void *value;
union {
@@ -1299,8 +1304,8 @@ static int __bpf_async_init(struct bpf_async_kern *async, struct bpf_map *map, u
break;
}
cb->map = map;
- cb->prog = NULL;
cb->flags = flags;
+ memset(&cb->res, 0, sizeof(cb->res));
rcu_assign_pointer(cb->callback_fn, NULL);
WRITE_ONCE(async->cb, cb);
@@ -1351,11 +1356,47 @@ static const struct bpf_func_proto bpf_timer_init_proto = {
.arg3_type = ARG_ANYTHING,
};
+static void bpf_async_res_put(struct bpf_async_res *res)
+{
+ bpf_prog_put(res->prog);
+
+ if (res->st_ops_assoc)
+ bpf_map_put(res->st_ops_assoc);
+}
+
+static int bpf_async_res_get(struct bpf_async_res *res, struct bpf_prog *prog)
+{
+ struct bpf_map *st_ops_assoc = NULL;
+ int err;
+
+ prog = bpf_prog_inc_not_zero(prog);
+ if (IS_ERR(prog))
+ return PTR_ERR(prog);
+
+ st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
+ if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
+ st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
+ st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
+ if (IS_ERR(st_ops_assoc)) {
+ err = PTR_ERR(st_ops_assoc);
+ goto put_prog;
+ }
+ }
+
+ res->prog = prog;
+ res->st_ops_assoc = st_ops_assoc;
+ return 0;
+put_prog:
+ bpf_prog_put(prog);
+ return err;
+}
+
static int __bpf_async_set_callback(struct bpf_async_kern *async, void *callback_fn,
struct bpf_prog_aux *aux, unsigned int flags,
enum bpf_async_type type)
{
struct bpf_prog *prev, *prog = aux->prog;
+ struct bpf_async_res res;
struct bpf_async_cb *cb;
int ret = 0;
@@ -1376,20 +1417,18 @@ static int __bpf_async_set_callback(struct bpf_async_kern *async, void *callback
ret = -EPERM;
goto out;
}
- prev = cb->prog;
+ prev = cb->res.prog;
if (prev != prog) {
- /* Bump prog refcnt once. Every bpf_timer_set_callback()
+ /* Get prog and related resources once. Every bpf_timer_set_callback()
* can pick different callback_fn-s within the same prog.
*/
- prog = bpf_prog_inc_not_zero(prog);
- if (IS_ERR(prog)) {
- ret = PTR_ERR(prog);
+ ret = bpf_async_res_get(&res, prog);
+ if (ret)
goto out;
- }
if (prev)
- /* Drop prev prog refcnt when swapping with new prog */
- bpf_prog_put(prev);
- cb->prog = prog;
+ /* Put prev prog and related resources when swapping with new prog */
+ bpf_async_res_put(&cb->res);
+ cb->res = res;
}
rcu_assign_pointer(cb->callback_fn, callback_fn);
out:
@@ -1423,7 +1462,7 @@ BPF_CALL_3(bpf_timer_start, struct bpf_async_kern *, timer, u64, nsecs, u64, fla
return -EINVAL;
__bpf_spin_lock_irqsave(&timer->lock);
t = timer->timer;
- if (!t || !t->cb.prog) {
+ if (!t || !t->cb.res.prog) {
ret = -EINVAL;
goto out;
}
@@ -1451,14 +1490,14 @@ static const struct bpf_func_proto bpf_timer_start_proto = {
.arg3_type = ARG_ANYTHING,
};
-static void drop_prog_refcnt(struct bpf_async_cb *async)
+static void bpf_async_cb_reset(struct bpf_async_cb *cb)
{
- struct bpf_prog *prog = async->prog;
+ struct bpf_prog *prog = cb->res.prog;
if (prog) {
- bpf_prog_put(prog);
- async->prog = NULL;
- rcu_assign_pointer(async->callback_fn, NULL);
+ bpf_async_res_put(&cb->res);
+ memset(&cb->res, 0, sizeof(cb->res));
+ rcu_assign_pointer(cb->callback_fn, NULL);
}
}
@@ -1512,7 +1551,7 @@ BPF_CALL_1(bpf_timer_cancel, struct bpf_async_kern *, timer)
goto out;
}
drop:
- drop_prog_refcnt(&t->cb);
+ bpf_async_cb_reset(&t->cb);
out:
__bpf_spin_unlock_irqrestore(&timer->lock);
/* Cancel the timer and wait for associated callback to finish
@@ -1545,7 +1584,7 @@ static struct bpf_async_cb *__bpf_async_cancel_and_free(struct bpf_async_kern *a
cb = async->cb;
if (!cb)
goto out;
- drop_prog_refcnt(cb);
+ bpf_async_cb_reset(cb);
/* The subsequent bpf_timer_start/cancel() helpers won't be able to use
* this timer, since it won't be initialized.
*/
@@ -3112,7 +3151,7 @@ __bpf_kfunc int bpf_wq_start(struct bpf_wq *wq, unsigned int flags)
if (flags)
return -EINVAL;
w = READ_ONCE(async->work);
- if (!w || !READ_ONCE(w->cb.prog))
+ if (!w || !READ_ONCE(w->cb.res.prog))
return -EINVAL;
schedule_work(&w->work);
@@ -4034,8 +4073,8 @@ struct bpf_task_work_ctx {
refcount_t refcnt;
struct callback_head work;
struct irq_work irq_work;
- /* bpf_prog that schedules task work */
- struct bpf_prog *prog;
+ /* bpf_prog that schedules task work and related resources */
+ struct bpf_async_res res;
/* task for which callback is scheduled */
struct task_struct *task;
/* the map and map value associated with this context */
@@ -4053,9 +4092,9 @@ struct bpf_task_work_kern {
static void bpf_task_work_ctx_reset(struct bpf_task_work_ctx *ctx)
{
- if (ctx->prog) {
- bpf_prog_put(ctx->prog);
- ctx->prog = NULL;
+ if (ctx->res.prog) {
+ bpf_async_res_put(&ctx->res);
+ memset(&ctx->res, 0, sizeof(ctx->res));
}
if (ctx->task) {
bpf_task_release(ctx->task);
@@ -4233,19 +4272,19 @@ static int bpf_task_work_schedule(struct task_struct *task, struct bpf_task_work
struct bpf_map *map, bpf_task_work_callback_t callback_fn,
struct bpf_prog_aux *aux, enum task_work_notify_mode mode)
{
- struct bpf_prog *prog;
struct bpf_task_work_ctx *ctx;
+ struct bpf_async_res res;
int err;
BTF_TYPE_EMIT(struct bpf_task_work);
- prog = bpf_prog_inc_not_zero(aux->prog);
- if (IS_ERR(prog))
- return -EBADF;
+ err = bpf_async_res_get(&res, aux->prog);
+ if (err)
+ return err;
task = bpf_task_acquire(task);
if (!task) {
err = -EBADF;
- goto release_prog;
+ goto release_res;
}
ctx = bpf_task_work_acquire_ctx(tw, map);
@@ -4256,7 +4295,7 @@ static int bpf_task_work_schedule(struct task_struct *task, struct bpf_task_work
ctx->task = task;
ctx->callback_fn = callback_fn;
- ctx->prog = prog;
+ ctx->res = res;
ctx->mode = mode;
ctx->map = map;
ctx->map_val = (void *)tw - map->record->task_work_off;
@@ -4268,8 +4307,8 @@ static int bpf_task_work_schedule(struct task_struct *task, struct bpf_task_work
release_all:
bpf_task_release(task);
-release_prog:
- bpf_prog_put(prog);
+release_res:
+ bpf_async_res_put(&res);
return err;
}
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops
2025-11-04 17:26 [PATCH bpf-next v5 0/7] Support associating BPF programs with struct_ops Amery Hung
` (2 preceding siblings ...)
2025-11-04 17:26 ` [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback Amery Hung
@ 2025-11-04 17:26 ` Amery Hung
2025-11-04 17:54 ` bot+bpf-ci
2025-11-04 23:26 ` Andrii Nakryiko
2025-11-04 17:26 ` [PATCH bpf-next v5 5/7] selftests/bpf: Test BPF_PROG_ASSOC_STRUCT_OPS command Amery Hung
` (2 subsequent siblings)
6 siblings, 2 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 17:26 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team
Add a low-level wrapper and a libbpf API for the
BPF_PROG_ASSOC_STRUCT_OPS command of the bpf() syscall.
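Example usage once the object is loaded (a sketch; flags must
currently be zero, and passing NULL opts is also fine):
    LIBBPF_OPTS(bpf_prog_assoc_struct_ops_opts, opts);
    int err;
    err = bpf_program__assoc_struct_ops(skel->progs.my_prog,
                                        skel->maps.my_st_ops_map, &opts);
    if (err)
            /* e.g. struct_ops prog or non-struct_ops map */
            return err;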
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
tools/lib/bpf/bpf.c | 19 +++++++++++++++++++
tools/lib/bpf/bpf.h | 21 +++++++++++++++++++++
tools/lib/bpf/libbpf.c | 30 ++++++++++++++++++++++++++++++
tools/lib/bpf/libbpf.h | 16 ++++++++++++++++
tools/lib/bpf/libbpf.map | 2 ++
5 files changed, 88 insertions(+)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index b66f5fbfbbb2..21b57a629916 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -1397,3 +1397,22 @@ int bpf_prog_stream_read(int prog_fd, __u32 stream_id, void *buf, __u32 buf_len,
err = sys_bpf(BPF_PROG_STREAM_READ_BY_FD, &attr, attr_sz);
return libbpf_err_errno(err);
}
+
+int bpf_prog_assoc_struct_ops(int prog_fd, int map_fd,
+ struct bpf_prog_assoc_struct_ops_opts *opts)
+{
+ const size_t attr_sz = offsetofend(union bpf_attr, prog_assoc_struct_ops);
+ union bpf_attr attr;
+ int err;
+
+ if (!OPTS_VALID(opts, bpf_prog_assoc_struct_ops_opts))
+ return libbpf_err(-EINVAL);
+
+ memset(&attr, 0, attr_sz);
+ attr.prog_assoc_struct_ops.map_fd = map_fd;
+ attr.prog_assoc_struct_ops.prog_fd = prog_fd;
+ attr.prog_assoc_struct_ops.flags = OPTS_GET(opts, flags, 0);
+
+ err = sys_bpf(BPF_PROG_ASSOC_STRUCT_OPS, &attr, attr_sz);
+ return libbpf_err_errno(err);
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index e983a3e40d61..1f9c28d27795 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -733,6 +733,27 @@ struct bpf_prog_stream_read_opts {
LIBBPF_API int bpf_prog_stream_read(int prog_fd, __u32 stream_id, void *buf, __u32 buf_len,
struct bpf_prog_stream_read_opts *opts);
+struct bpf_prog_assoc_struct_ops_opts {
+ size_t sz;
+ __u32 flags;
+ size_t :0;
+};
+#define bpf_prog_assoc_struct_ops_opts__last_field flags
+
+/**
+ * @brief **bpf_prog_assoc_struct_ops** associates a BPF program with a
+ * struct_ops map.
+ *
+ * @param prog_fd FD for the BPF program
+ * @param map_fd FD for the struct_ops map to be associated with the BPF program
+ * @param opts optional options, can be NULL
+ *
+ * @return 0 on success; negative error code, otherwise (errno is also set to
+ * the error code)
+ */
+LIBBPF_API int bpf_prog_assoc_struct_ops(int prog_fd, int map_fd,
+ struct bpf_prog_assoc_struct_ops_opts *opts);
+
#ifdef __cplusplus
} /* extern "C" */
#endif
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index fbe74686c97d..260e1feaa665 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -13891,6 +13891,36 @@ int bpf_program__set_attach_target(struct bpf_program *prog,
return 0;
}
+int bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
+ struct bpf_prog_assoc_struct_ops_opts *opts)
+{
+ int prog_fd;
+
+ prog_fd = bpf_program__fd(prog);
+ if (prog_fd < 0) {
+ pr_warn("prog '%s': can't associate BPF program without FD (was it loaded?)\n",
+ prog->name);
+ return -EINVAL;
+ }
+
+ if (prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
+ pr_warn("prog '%s': can't associate struct_ops program\n", prog->name);
+ return -EINVAL;
+ }
+
+ if (map->fd < 0) {
+ pr_warn("map '%s': can't associate BPF map without FD (was it created?)\n", map->name);
+ return -EINVAL;
+ }
+
+ if (!bpf_map__is_struct_ops(map)) {
+ pr_warn("map '%s': can't associate non-struct_ops map\n", map->name);
+ return -EINVAL;
+ }
+
+ return bpf_prog_assoc_struct_ops(prog_fd, map->fd, opts);
+}
+
int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz)
{
int err = 0, n, len, start, end = -1;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 5118d0a90e24..45720b7c2aaa 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1003,6 +1003,22 @@ LIBBPF_API int
bpf_program__set_attach_target(struct bpf_program *prog, int attach_prog_fd,
const char *attach_func_name);
+struct bpf_prog_assoc_struct_ops_opts; /* defined in bpf.h */
+
+/**
+ * @brief **bpf_program__assoc_struct_ops()** associates a BPF program with a
+ * struct_ops map.
+ *
+ * @param prog BPF program
+ * @param map struct_ops map to be associated with the BPF program
+ * @param opts optional options, can be NULL
+ *
+ * @return error code; or 0 if no error occurred.
+ */
+LIBBPF_API int
+bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
+ struct bpf_prog_assoc_struct_ops_opts *opts);
+
/**
* @brief **bpf_object__find_map_by_name()** returns BPF map of
* the given name, if it exists within the passed BPF object
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 8ed8749907d4..84fb90a016c9 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -451,4 +451,6 @@ LIBBPF_1.7.0 {
global:
bpf_map__set_exclusive_program;
bpf_map__exclusive_program;
+ bpf_prog_assoc_struct_ops;
+ bpf_program__assoc_struct_ops;
} LIBBPF_1.6.0;
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v5 5/7] selftests/bpf: Test BPF_PROG_ASSOC_STRUCT_OPS command
2025-11-04 17:26 [PATCH bpf-next v5 0/7] Support associating BPF programs with struct_ops Amery Hung
` (3 preceding siblings ...)
2025-11-04 17:26 ` [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops Amery Hung
@ 2025-11-04 17:26 ` Amery Hung
2025-11-04 17:26 ` [PATCH bpf-next v5 6/7] selftests/bpf: Test ambiguous associated struct_ops Amery Hung
2025-11-04 17:26 ` [PATCH bpf-next v5 7/7] selftests/bpf: Test getting associated struct_ops in timer callback Amery Hung
6 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 17:26 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team
Test BPF_PROG_ASSOC_STRUCT_OPS command that associates a BPF program
with a struct_ops map. The test follows the same logic as commit
ba7000f1c360 ("selftests/bpf: Test multi_st_ops and calling kfuncs from
different programs"), but instead of using a map ID to identify a
specific struct_ops, this test uses the new BPF command to associate a
struct_ops with a program.
The test consists of two sets of almost identical struct_ops maps and BPF
programs associated with the map. Their only difference is the unique
value returned by bpf_testmod_multi_st_ops::test_1().
The test first loads the programs and associates them with struct_ops
maps. Then, it exercises the BPF programs. They will in turn call the
kfunc bpf_kfunc_multi_st_ops_test_1_prog_arg() to trigger test_1() of
the associated struct_ops map, and then check that the right unique
value is returned.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../bpf/prog_tests/test_struct_ops_assoc.c | 72 ++++++++++++
.../selftests/bpf/progs/struct_ops_assoc.c | 105 ++++++++++++++++++
.../selftests/bpf/test_kmods/bpf_testmod.c | 17 +++
.../bpf/test_kmods/bpf_testmod_kfunc.h | 1 +
4 files changed, 195 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_assoc.c
diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
new file mode 100644
index 000000000000..29e8b58a14fa
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+#include "struct_ops_assoc.skel.h"
+
+static void test_st_ops_assoc(void)
+{
+ struct struct_ops_assoc *skel = NULL;
+ int err, pid;
+
+ skel = struct_ops_assoc__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "struct_ops_assoc__open"))
+ goto out;
+
+ /* cannot explicitly associate struct_ops program */
+ err = bpf_program__assoc_struct_ops(skel->progs.test_1_a,
+ skel->maps.st_ops_map_a, NULL);
+ ASSERT_ERR(err, "bpf_program__assoc_struct_ops");
+
+ err = bpf_program__assoc_struct_ops(skel->progs.syscall_prog_a,
+ skel->maps.st_ops_map_a, NULL);
+ ASSERT_OK(err, "bpf_program__assoc_struct_ops");
+
+ err = bpf_program__assoc_struct_ops(skel->progs.sys_enter_prog_a,
+ skel->maps.st_ops_map_a, NULL);
+ ASSERT_OK(err, "bpf_program__assoc_struct_ops");
+
+ err = bpf_program__assoc_struct_ops(skel->progs.syscall_prog_b,
+ skel->maps.st_ops_map_b, NULL);
+ ASSERT_OK(err, "bpf_program__assoc_struct_ops");
+
+ err = bpf_program__assoc_struct_ops(skel->progs.sys_enter_prog_b,
+ skel->maps.st_ops_map_b, NULL);
+ ASSERT_OK(err, "bpf_program__assoc_struct_ops");
+
+ /* sys_enter_prog_a already associated with map_a */
+ err = bpf_program__assoc_struct_ops(skel->progs.sys_enter_prog_a,
+ skel->maps.st_ops_map_b, NULL);
+ ASSERT_ERR(err, "bpf_program__assoc_struct_ops");
+
+ err = struct_ops_assoc__attach(skel);
+ if (!ASSERT_OK(err, "struct_ops_assoc__attach"))
+ goto out;
+
+ /* run tracing prog that calls .test_1 and checks return */
+ pid = getpid();
+ skel->bss->test_pid = pid;
+ sys_gettid();
+ skel->bss->test_pid = 0;
+
+ ASSERT_EQ(skel->bss->test_err_a, 0, "skel->bss->test_err_a");
+ ASSERT_EQ(skel->bss->test_err_b, 0, "skel->bss->test_err_b");
+
+ /* run syscall_prog that calls .test_1 and checks return */
+ err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.syscall_prog_a), NULL);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
+
+ err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.syscall_prog_b), NULL);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
+
+ ASSERT_EQ(skel->bss->test_err_a, 0, "skel->bss->test_err_a");
+ ASSERT_EQ(skel->bss->test_err_b, 0, "skel->bss->test_err_b");
+
+out:
+ struct_ops_assoc__destroy(skel);
+}
+
+void test_struct_ops_assoc(void)
+{
+ if (test__start_subtest("st_ops_assoc"))
+ test_st_ops_assoc();
+}
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_assoc.c b/tools/testing/selftests/bpf/progs/struct_ops_assoc.c
new file mode 100644
index 000000000000..fe47287a49f0
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_assoc.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include "bpf_misc.h"
+#include "../test_kmods/bpf_testmod.h"
+#include "../test_kmods/bpf_testmod_kfunc.h"
+
+char _license[] SEC("license") = "GPL";
+
+int test_pid;
+
+/* Programs associated with st_ops_map_a */
+
+#define MAP_A_MAGIC 1234
+int test_err_a;
+
+SEC("struct_ops")
+int BPF_PROG(test_1_a, struct st_ops_args *args)
+{
+ return MAP_A_MAGIC;
+}
+
+SEC("tp_btf/sys_enter")
+int BPF_PROG(sys_enter_prog_a, struct pt_regs *regs, long id)
+{
+ struct st_ops_args args = {};
+ struct task_struct *task;
+ int ret;
+
+ task = bpf_get_current_task_btf();
+ if (!test_pid || task->pid != test_pid)
+ return 0;
+
+ ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(&args, NULL);
+ if (ret != MAP_A_MAGIC)
+ test_err_a++;
+
+ return 0;
+}
+
+SEC("syscall")
+int syscall_prog_a(void *ctx)
+{
+ struct st_ops_args args = {};
+ int ret;
+
+ ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(&args, NULL);
+ if (ret != MAP_A_MAGIC)
+ test_err_a++;
+
+ return 0;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_multi_st_ops st_ops_map_a = {
+ .test_1 = (void *)test_1_a,
+};
+
+/* Programs associated with st_ops_map_b */
+
+#define MAP_B_MAGIC 5678
+int test_err_b;
+
+SEC("struct_ops")
+int BPF_PROG(test_1_b, struct st_ops_args *args)
+{
+ return MAP_B_MAGIC;
+}
+
+SEC("tp_btf/sys_enter")
+int BPF_PROG(sys_enter_prog_b, struct pt_regs *regs, long id)
+{
+ struct st_ops_args args = {};
+ struct task_struct *task;
+ int ret;
+
+ task = bpf_get_current_task_btf();
+ if (!test_pid || task->pid != test_pid)
+ return 0;
+
+ ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(&args, NULL);
+ if (ret != MAP_B_MAGIC)
+ test_err_b++;
+
+ return 0;
+}
+
+SEC("syscall")
+int syscall_prog_b(void *ctx)
+{
+ struct st_ops_args args = {};
+ int ret;
+
+ ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(&args, NULL);
+ if (ret != MAP_B_MAGIC)
+ test_err_b++;
+
+ return 0;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_multi_st_ops st_ops_map_b = {
+ .test_1 = (void *)test_1_b,
+};
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index 8074bc5f6f20..f45258202997 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -1108,6 +1108,7 @@ __bpf_kfunc int bpf_kfunc_st_ops_inc10(struct st_ops_args *args)
}
__bpf_kfunc int bpf_kfunc_multi_st_ops_test_1(struct st_ops_args *args, u32 id);
+__bpf_kfunc int bpf_kfunc_multi_st_ops_test_1_prog_arg(struct st_ops_args *args, void *aux_prog);
BTF_KFUNCS_START(bpf_testmod_check_kfunc_ids)
BTF_ID_FLAGS(func, bpf_testmod_test_mod_kfunc)
@@ -1150,6 +1151,7 @@ BTF_ID_FLAGS(func, bpf_kfunc_st_ops_test_epilogue, KF_TRUSTED_ARGS | KF_SLEEPABL
BTF_ID_FLAGS(func, bpf_kfunc_st_ops_test_pro_epilogue, KF_TRUSTED_ARGS | KF_SLEEPABLE)
BTF_ID_FLAGS(func, bpf_kfunc_st_ops_inc10, KF_TRUSTED_ARGS)
BTF_ID_FLAGS(func, bpf_kfunc_multi_st_ops_test_1, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_kfunc_multi_st_ops_test_1_prog_arg, KF_TRUSTED_ARGS)
BTF_KFUNCS_END(bpf_testmod_check_kfunc_ids)
static int bpf_testmod_ops_init(struct btf *btf)
@@ -1611,6 +1613,7 @@ static struct bpf_testmod_multi_st_ops *multi_st_ops_find_nolock(u32 id)
return NULL;
}
+/* Call test_1() of the struct_ops map identified by the id */
int bpf_kfunc_multi_st_ops_test_1(struct st_ops_args *args, u32 id)
{
struct bpf_testmod_multi_st_ops *st_ops;
@@ -1626,6 +1629,20 @@ int bpf_kfunc_multi_st_ops_test_1(struct st_ops_args *args, u32 id)
return ret;
}
+/* Call test_1() of the associated struct_ops map */
+int bpf_kfunc_multi_st_ops_test_1_prog_arg(struct st_ops_args *args, void *aux__prog)
+{
+ struct bpf_prog_aux *prog_aux = (struct bpf_prog_aux *)aux__prog;
+ struct bpf_testmod_multi_st_ops *st_ops;
+ int ret = -1;
+
+ st_ops = (struct bpf_testmod_multi_st_ops *)bpf_prog_get_assoc_struct_ops(prog_aux);
+ if (st_ops)
+ ret = st_ops->test_1(args);
+
+ return ret;
+}
+
static int multi_st_ops_reg(void *kdata, struct bpf_link *link)
{
struct bpf_testmod_multi_st_ops *st_ops =
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
index 4df6fa6a92cb..d40f4cddbd1e 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod_kfunc.h
@@ -162,5 +162,6 @@ struct task_struct *bpf_kfunc_ret_rcu_test(void) __ksym;
int *bpf_kfunc_ret_rcu_test_nostruct(int rdonly_buf_size) __ksym;
int bpf_kfunc_multi_st_ops_test_1(struct st_ops_args *args, u32 id) __ksym;
+int bpf_kfunc_multi_st_ops_test_1_prog_arg(struct st_ops_args *args, void *aux__prog) __ksym;
#endif /* _BPF_TESTMOD_KFUNC_H */
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v5 6/7] selftests/bpf: Test ambiguous associated struct_ops
2025-11-04 17:26 [PATCH bpf-next v5 0/7] Support associating BPF programs with struct_ops Amery Hung
` (4 preceding siblings ...)
2025-11-04 17:26 ` [PATCH bpf-next v5 5/7] selftests/bpf: Test BPF_PROG_ASSOC_STRUCT_OPS command Amery Hung
@ 2025-11-04 17:26 ` Amery Hung
2025-11-04 17:26 ` [PATCH bpf-next v5 7/7] selftests/bpf: Test getting associated struct_ops in timer callback Amery Hung
6 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 17:26 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team
Add a test to make sure implicit struct_ops association does not
break backward compatibility or return an incorrect struct_ops.
struct_ops programs should still be allowed to be reused in
different struct_ops maps. The implicitly set associated struct_ops
map, however, will be poisoned. Trying to read it through the helper
bpf_prog_get_assoc_struct_ops() should result in a NULL pointer.
While recursion of test_1() cannot happen since the associated
struct_ops is ambiguous, explicitly check for it to prevent a stack
overflow if the test regresses.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../bpf/prog_tests/test_struct_ops_assoc.c | 38 ++++++++++
.../bpf/progs/struct_ops_assoc_reuse.c | 75 +++++++++++++++++++
2 files changed, 113 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_assoc_reuse.c
diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
index 29e8b58a14fa..f69306cb8974 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
@@ -2,6 +2,7 @@
#include <test_progs.h>
#include "struct_ops_assoc.skel.h"
+#include "struct_ops_assoc_reuse.skel.h"
static void test_st_ops_assoc(void)
{
@@ -65,8 +66,45 @@ static void test_st_ops_assoc(void)
struct_ops_assoc__destroy(skel);
}
+static void test_st_ops_assoc_reuse(void)
+{
+ struct struct_ops_assoc_reuse *skel = NULL;
+ int err;
+
+ skel = struct_ops_assoc_reuse__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "struct_ops_assoc_reuse__open"))
+ goto out;
+
+ err = bpf_program__assoc_struct_ops(skel->progs.syscall_prog_a,
+ skel->maps.st_ops_map_a, NULL);
+ ASSERT_OK(err, "bpf_program__assoc_struct_ops");
+
+ err = bpf_program__assoc_struct_ops(skel->progs.syscall_prog_b,
+ skel->maps.st_ops_map_b, NULL);
+ ASSERT_OK(err, "bpf_program__assoc_struct_ops");
+
+ err = struct_ops_assoc_reuse__attach(skel);
+ if (!ASSERT_OK(err, "struct_ops_assoc__attach"))
+ goto out;
+
+ /* run syscall_prog that calls .test_1 and checks return */
+ err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.syscall_prog_a), NULL);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
+
+ err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.syscall_prog_b), NULL);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
+
+ ASSERT_EQ(skel->bss->test_err_a, 0, "skel->bss->test_err_a");
+ ASSERT_EQ(skel->bss->test_err_b, 0, "skel->bss->test_err_b");
+
+out:
+ struct_ops_assoc_reuse__destroy(skel);
+}
+
void test_struct_ops_assoc(void)
{
if (test__start_subtest("st_ops_assoc"))
test_st_ops_assoc();
+ if (test__start_subtest("st_ops_assoc_reuse"))
+ test_st_ops_assoc_reuse();
}
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_assoc_reuse.c b/tools/testing/selftests/bpf/progs/struct_ops_assoc_reuse.c
new file mode 100644
index 000000000000..caaa45bdccc2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_assoc_reuse.c
@@ -0,0 +1,75 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include "bpf_misc.h"
+#include "../test_kmods/bpf_testmod.h"
+#include "../test_kmods/bpf_testmod_kfunc.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define MAP_A_MAGIC 1234
+int test_err_a;
+int recur;
+
+/*
+ * test_1_a is reused. The kfunc should not be able to get the associated
+ * struct_ops and call test_1 recursively as it is ambiguous.
+ */
+SEC("struct_ops")
+int BPF_PROG(test_1_a, struct st_ops_args *args)
+{
+ int ret;
+
+ if (!recur) {
+ recur++;
+ ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(args, NULL);
+ if (ret != -1)
+ test_err_a++;
+ recur--;
+ }
+
+ return MAP_A_MAGIC;
+}
+
+/* Programs associated with st_ops_map_a */
+
+SEC("syscall")
+int syscall_prog_a(void *ctx)
+{
+ struct st_ops_args args = {};
+ int ret;
+
+ ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(&args, NULL);
+ if (ret != MAP_A_MAGIC)
+ test_err_a++;
+
+ return 0;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_multi_st_ops st_ops_map_a = {
+ .test_1 = (void *)test_1_a,
+};
+
+/* Programs associated with st_ops_map_b */
+
+int test_err_b;
+
+SEC("syscall")
+int syscall_prog_b(void *ctx)
+{
+ struct st_ops_args args = {};
+ int ret;
+
+ ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(&args, NULL);
+ if (ret != MAP_A_MAGIC)
+ test_err_b++;
+
+ return 0;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_multi_st_ops st_ops_map_b = {
+ .test_1 = (void *)test_1_a,
+};
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH bpf-next v5 7/7] selftests/bpf: Test getting associated struct_ops in timer callback
2025-11-04 17:26 [PATCH bpf-next v5 0/7] Support associating BPF programs with struct_ops Amery Hung
` (5 preceding siblings ...)
2025-11-04 17:26 ` [PATCH bpf-next v5 6/7] selftests/bpf: Test ambiguous associated struct_ops Amery Hung
@ 2025-11-04 17:26 ` Amery Hung
6 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 17:26 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team
Make sure that 1) a timer callback can also reference the associated
struct_ops, and 2) the timer callback cannot get a dangling pointer to
the struct_ops when the map is freed.
The test schedules a timer callback from a struct_ops program. Since
struct_ops programs do not pin the map, the timer callback may run
after the map is freed. The timer callback calls a
kfunc that runs .test_1() of the associated struct_ops, which should
return MAP_MAGIC when the map is still alive or -1 when the map is
gone.
The first subtest added in this patch schedules the timer callback to
run immediately, while the map is still alive. The second subtest
schedules the callback to run 500ms after syscall_prog runs and then
frees the map right after syscall_prog returns. Both subtests then
wait until the callback runs to check the return value of the kfunc.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../bpf/prog_tests/test_struct_ops_assoc.c | 84 +++++++++++++++++++
.../bpf/progs/struct_ops_assoc_in_timer.c | 77 +++++++++++++++++
2 files changed, 161 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_assoc_in_timer.c
diff --git a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
index f69306cb8974..902d210a3551 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_struct_ops_assoc.c
@@ -3,6 +3,7 @@
#include <test_progs.h>
#include "struct_ops_assoc.skel.h"
#include "struct_ops_assoc_reuse.skel.h"
+#include "struct_ops_assoc_in_timer.skel.h"
static void test_st_ops_assoc(void)
{
@@ -101,10 +102,93 @@ static void test_st_ops_assoc_reuse(void)
struct_ops_assoc_reuse__destroy(skel);
}
+static void test_st_ops_assoc_in_timer(void)
+{
+ struct struct_ops_assoc_in_timer *skel = NULL;
+ int err;
+
+ skel = struct_ops_assoc_in_timer__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "struct_ops_assoc_reuse__open"))
+ goto out;
+
+ err = bpf_program__assoc_struct_ops(skel->progs.syscall_prog,
+ skel->maps.st_ops_map, NULL);
+ ASSERT_OK(err, "bpf_program__assoc_struct_ops");
+
+ err = struct_ops_assoc_in_timer__attach(skel);
+ if (!ASSERT_OK(err, "struct_ops_assoc__attach"))
+ goto out;
+
+ /*
+ * Run .test_1 by calling the kfunc bpf_kfunc_multi_st_ops_test_1_prog_arg() and check
+ * the return value. .test_1 will also schedule timer_cb, which runs .test_1 again
+ * immediately.
+ */
+ err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.syscall_prog), NULL);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
+
+ /* Check the return of the kfunc after timer_cb runs */
+ while (!READ_ONCE(skel->bss->timer_cb_run))
+ sched_yield();
+ ASSERT_EQ(skel->bss->timer_test_1_ret, 1234, "skel->bss->timer_test_1_ret");
+ ASSERT_EQ(skel->bss->test_err, 0, "skel->bss->test_err_a");
+out:
+ struct_ops_assoc_in_timer__destroy(skel);
+}
+
+static void test_st_ops_assoc_in_timer_no_uref(void)
+{
+ struct struct_ops_assoc_in_timer *skel = NULL;
+ struct bpf_link *link;
+ int err;
+
+ skel = struct_ops_assoc_in_timer__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "struct_ops_assoc_reuse__open"))
+ goto out;
+
+ err = bpf_program__assoc_struct_ops(skel->progs.syscall_prog,
+ skel->maps.st_ops_map, NULL);
+ ASSERT_OK(err, "bpf_program__assoc_struct_ops");
+
+ link = bpf_map__attach_struct_ops(skel->maps.st_ops_map);
+ if (!ASSERT_OK_PTR(link, "bpf_map__attach_struct_ops"))
+ goto out;
+
+ /*
+ * Run .test_1 by calling the kfunc bpf_kfunc_multi_st_ops_test_1_prog_arg() and check
+ * the return value. .test_1 will also schedule timer_cb, which runs .test_1 again.
+ * timer_cb will run 500ms after syscall_prog runs, when the user space no longer
+ * holds a reference to st_ops_map.
+ */
+ skel->bss->timer_ns = 500000000;
+ err = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.syscall_prog), NULL);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
+
+ /*
+ * Detach and close struct_ops map. timer_cb holding a reference to the map should
+ * prevent it from being freed.
+ */
+ bpf_link__destroy(link);
+ close(bpf_program__fd(skel->progs.syscall_prog));
+ close(bpf_map__fd(skel->maps.st_ops_map));
+
+ /* Check the return of the kfunc after timer_cb runs */
+ while (!READ_ONCE(skel->bss->timer_cb_run))
+ sched_yield();
+ ASSERT_EQ(skel->bss->timer_test_1_ret, 1234, "skel->bss->timer_test_1_ret");
+ ASSERT_EQ(skel->bss->test_err, 0, "skel->bss->test_err_a");
+out:
+ struct_ops_assoc_in_timer__destroy(skel);
+}
+
void test_struct_ops_assoc(void)
{
if (test__start_subtest("st_ops_assoc"))
test_st_ops_assoc();
if (test__start_subtest("st_ops_assoc_reuse"))
test_st_ops_assoc_reuse();
+ if (test__start_subtest("st_ops_assoc_in_timer"))
+ test_st_ops_assoc_in_timer();
+ if (test__start_subtest("st_ops_assoc_in_timer_no_uref"))
+ test_st_ops_assoc_in_timer_no_uref();
}
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_assoc_in_timer.c b/tools/testing/selftests/bpf/progs/struct_ops_assoc_in_timer.c
new file mode 100644
index 000000000000..9d4e427568b2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_assoc_in_timer.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include "bpf_misc.h"
+#include "../test_kmods/bpf_testmod.h"
+#include "../test_kmods/bpf_testmod_kfunc.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct elem {
+ struct bpf_timer timer;
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(max_entries, 1);
+ __type(key, int);
+ __type(value, struct elem);
+} array_map SEC(".maps");
+
+#define MAP_MAGIC 1234
+int recur;
+int test_err;
+int timer_ns;
+int timer_test_1_ret;
+int timer_cb_run;
+
+__noinline static int timer_cb(void *map, int *key, struct bpf_timer *timer)
+{
+ struct st_ops_args args = {};
+
+ recur++;
+ timer_test_1_ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(&args, NULL);
+ recur--;
+
+ timer_cb_run++;
+
+ return 0;
+}
+
+SEC("struct_ops")
+int BPF_PROG(test_1, struct st_ops_args *args)
+{
+ struct bpf_timer *timer;
+ int key = 0;
+
+ if (!recur) {
+ timer = bpf_map_lookup_elem(&array_map, &key);
+ if (!timer)
+ return 0;
+
+ bpf_timer_init(timer, &array_map, 1);
+ bpf_timer_set_callback(timer, timer_cb);
+ bpf_timer_start(timer, timer_ns, 0);
+ }
+
+ return MAP_MAGIC;
+}
+
+SEC("syscall")
+int syscall_prog(void *ctx)
+{
+ struct st_ops_args args = {};
+ int ret;
+
+ ret = bpf_kfunc_multi_st_ops_test_1_prog_arg(&args, NULL);
+ if (ret != MAP_MAGIC)
+ test_err++;
+
+ return 0;
+}
+
+SEC(".struct_ops.link")
+struct bpf_testmod_multi_st_ops st_ops_map = {
+ .test_1 = (void *)test_1,
+};
--
2.47.3
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops
2025-11-04 17:26 ` [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops Amery Hung
@ 2025-11-04 17:54 ` bot+bpf-ci
2025-11-04 23:27 ` Andrii Nakryiko
2025-11-04 23:26 ` Andrii Nakryiko
1 sibling, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2025-11-04 17:54 UTC (permalink / raw)
To: ameryhung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1945 bytes --]
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index fbe74686c..260e1feaa 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
[ ... ]
> @@ -13891,6 +13891,36 @@ int bpf_program__set_attach_target(struct bpf_program *prog,
> return 0;
> }
>
> +int bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
> + struct bpf_prog_assoc_struct_ops_opts *opts)
> +{
> + int prog_fd;
> +
> + prog_fd = bpf_program__fd(prog);
> + if (prog_fd < 0) {
> + pr_warn("prog '%s': can't associate BPF program without FD (was it loaded?)\n",
> + prog->name);
^^^^^^^^^^
Can prog be NULL here? If bpf_program__fd() returns -EINVAL because prog
was NULL, this dereferences prog->name without a NULL check. Looking at
bpf_program__set_attach_target() just above, it checks "if (!prog)"
before any dereferences.
> + return -EINVAL;
> + }
> +
> + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
^^^^^^^^^^
Same issue - prog->type is dereferenced without verifying prog is not
NULL.
> + pr_warn("prog '%s': can't associate struct_ops program\n", prog->name);
> + return -EINVAL;
> + }
> +
> + if (map->fd < 0) {
^^^^^^
Can map be NULL? There's no NULL check before dereferencing map->fd.
Other map APIs like bpf_map__fd() and bpf_map__name() check for NULL
before use.
> + pr_warn("map '%s': can't associate BPF map without FD (was it created?)\n", map->name);
> + return -EINVAL;
> + }
> +
> + if (!bpf_map__is_struct_ops(map)) {
> + pr_warn("map '%s': can't associate non-struct_ops map\n", map->name);
> + return -EINVAL;
> + }
> +
> + return bpf_prog_assoc_struct_ops(prog_fd, map->fd, opts);
> +}
> +
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19077679684
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-04 17:26 ` [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops Amery Hung
@ 2025-11-04 17:54 ` bot+bpf-ci
2025-11-04 18:03 ` Amery Hung
2025-11-04 21:59 ` Song Liu
` (2 subsequent siblings)
3 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2025-11-04 17:54 UTC (permalink / raw)
To: ameryhung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index a47d67db3..0f71030c0 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
[ ... ]
> @@ -2026,6 +2028,9 @@ static inline void bpf_module_put(const void *data, struct module *owner)
> module_put(owner);
> }
> int bpf_struct_ops_link_create(union bpf_attr *attr);
> +int bpf_prog_assoc_struct_ops(struct bpf_prog *prog, struct bpf_map *map);
> +void bpf_prog_disassoc_struct_ops(struct bpf_prog *prog);
> +void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux);
> u32 bpf_struct_ops_id(const void *kdata);
[ ... ]
> diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> index a41e6730e..0a19842da 100644
> --- a/kernel/bpf/bpf_struct_ops.c
> +++ b/kernel/bpf/bpf_struct_ops.c
[ ... ]
> @@ -1394,6 +1413,77 @@ int bpf_struct_ops_link_create(union bpf_attr *attr)
> return err;
> }
>
> +int bpf_prog_assoc_struct_ops(struct bpf_prog *prog, struct bpf_map *map)
> +{
> + struct bpf_map *st_ops_assoc;
> +
> + guard(mutex)(&prog->aux->st_ops_assoc_mutex);
> +
> + st_ops_assoc = prog->aux->st_ops_assoc;
> +
> + if (st_ops_assoc && st_ops_assoc == map)
> + return 0;
> +
> + if (st_ops_assoc) {
> + if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
> + return -EBUSY;
> +
> + WRITE_ONCE(prog->aux->st_ops_assoc, BPF_PTR_POISON);
> + } else {
> + if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
> + bpf_map_inc(map);
> +
> + WRITE_ONCE(prog->aux->st_ops_assoc, map);
> + }
> +
> + return 0;
> +}
> +
> +void bpf_prog_disassoc_struct_ops(struct bpf_prog *prog)
> +{
> + struct bpf_map *st_ops_assoc;
> +
> + guard(mutex)(&prog->aux->st_ops_assoc_mutex);
> +
> + st_ops_assoc = prog->aux->st_ops_assoc;
> +
> + if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
> + return;
> +
> + if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
> + bpf_map_put(st_ops_assoc);
> +
> + WRITE_ONCE(prog->aux->st_ops_assoc, NULL);
> +}
> +
> +/*
> + * Get a reference to the struct_ops struct (i.e., kdata) associated with a
> + * program.
> + *
> + * If the returned pointer is not NULL, it must point to a valid and
> + * initialized struct_ops. The struct_ops may or may not be attached.
> + * Kernel struct_ops implementers are responsible for tracking and checking
> + * the state of the struct_ops if the use case requires an attached struct_ops.
> + */
> +void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux)
> +{
> + struct bpf_map *st_ops_assoc = READ_ONCE(aux->st_ops_assoc);
> + struct bpf_struct_ops_map *st_map;
> +
> + if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
> + return NULL;
> +
> + st_map = (struct bpf_struct_ops_map *)st_ops_assoc;
> +
> + if (smp_load_acquire(&st_map->kvalue.common.state) == BPF_STRUCT_OPS_STATE_INIT) {
> + bpf_map_put(st_ops_assoc);
^^^^
Does bpf_prog_get_assoc_struct_ops() need to call bpf_map_put() here?
The function comment says "Get a reference to the struct_ops struct"
but the function never calls bpf_map_get/inc() to acquire a reference.
It only reads aux->st_ops_assoc via READ_ONCE().
When the state check fails (INIT state), the function calls bpf_map_put()
which drops the reference that was acquired in bpf_prog_assoc_struct_ops().
But on the success path below, it returns kdata without any refcount
operation.
This creates an imbalance:
- INIT state path: drops a reference, returns NULL
- Non-INIT path: no refcount change, returns kdata
The caller has no way to know whether the reference count was modified.
Also, the function is EXPORTED but doesn't document the reference counting
semantics clearly.
> + return NULL;
> + }
> +
> + return &st_map->kvalue.data;
> +}
> +EXPORT_SYMBOL_GPL(bpf_prog_get_assoc_struct_ops);
Can this function race with bpf_prog_disassoc_struct_ops()?
Since bpf_prog_get_assoc_struct_ops() reads aux->st_ops_assoc via
READ_ONCE() without holding the mutex or acquiring a reference, there's
a window where:
Thread A: reads st_ops_assoc pointer
Thread B: calls bpf_prog_disassoc_struct_ops()->bpf_map_put() (drops last ref)
Thread A: dereferences st_map->kvalue.common.state (use-after-free)
For non-struct_ops programs, bpf_prog_assoc_struct_ops() holds a
reference via bpf_map_inc(). This reference is dropped in
bpf_prog_disassoc_struct_ops() when the program is freed. Without
acquiring a new reference in bpf_prog_get_assoc_struct_ops(), the map
can be freed during the race window above.
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19077679684
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback
2025-11-04 17:26 ` [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback Amery Hung
@ 2025-11-04 18:03 ` bot+bpf-ci
2025-11-04 18:10 ` Amery Hung
2025-11-04 23:20 ` Andrii Nakryiko
2025-11-06 2:13 ` Martin KaFai Lau
2 siblings, 1 reply; 30+ messages in thread
From: bot+bpf-ci @ 2025-11-04 18:03 UTC (permalink / raw)
To: ameryhung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
ameryhung, kernel-team, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 865b0dae3..557570479 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
[ ... ]
> +static int bpf_async_res_get(struct bpf_async_res *res, struct bpf_prog *prog)
> +{
> + struct bpf_map *st_ops_assoc = NULL;
> + int err;
> +
> + prog = bpf_prog_inc_not_zero(prog);
> + if (IS_ERR(prog))
> + return PTR_ERR(prog);
> +
> + st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
> + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
> + st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
> + st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
> + if (IS_ERR(st_ops_assoc)) {
> + err = PTR_ERR(st_ops_assoc);
> + goto put_prog;
> + }
> + }
Can this race with bpf_prog_disassoc_struct_ops()? Between reading
st_ops_assoc and incrementing it, another thread could dissociate the
map:
bpf_async_res_get():
READ_ONCE(prog->aux->st_ops_assoc) // reads valid map pointer
bpf_prog_disassoc_struct_ops():
guard(mutex)(&prog->aux->st_ops_assoc_mutex)
WRITE_ONCE(prog->aux->st_ops_assoc, NULL)
bpf_map_put(st_ops_assoc) // potentially frees map
bpf_async_res_get():
bpf_map_inc_not_zero(st_ops_assoc) // use-after-free
The map could be freed via RCU and memory reused before
bpf_map_inc_not_zero() executes. Other functions that access
st_ops_assoc (bpf_prog_assoc_struct_ops and bpf_prog_disassoc_struct_ops)
hold prog->aux->st_ops_assoc_mutex. Additionally, bpf_map_inc_not_zero's
documentation states "map_idr_lock should have been held or the map
should have been protected by rcu read lock."
Should bpf_async_res_get() hold the st_ops_assoc_mutex or an RCU read
lock around the st_ops_assoc access?
> +
> + res->prog = prog;
> + res->st_ops_assoc = st_ops_assoc;
> + return 0;
> +put_prog:
> + bpf_prog_put(prog);
> + return err;
> +}
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19077679684
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-04 17:54 ` bot+bpf-ci
@ 2025-11-04 18:03 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 18:03 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team, ast, eddyz87, yonghong.song, clm, ihor.solodrai
On Tue, Nov 4, 2025 at 9:54 AM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index a47d67db3..0f71030c0 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
>
> [ ... ]
>
> > @@ -2026,6 +2028,9 @@ static inline void bpf_module_put(const void *data, struct module *owner)
> > module_put(owner);
> > }
> > int bpf_struct_ops_link_create(union bpf_attr *attr);
> > +int bpf_prog_assoc_struct_ops(struct bpf_prog *prog, struct bpf_map *map);
> > +void bpf_prog_disassoc_struct_ops(struct bpf_prog *prog);
> > +void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux);
> > u32 bpf_struct_ops_id(const void *kdata);
>
> [ ... ]
>
> > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> > index a41e6730e..0a19842da 100644
> > --- a/kernel/bpf/bpf_struct_ops.c
> > +++ b/kernel/bpf/bpf_struct_ops.c
>
> [ ... ]
>
> > @@ -1394,6 +1413,77 @@ int bpf_struct_ops_link_create(union bpf_attr *attr)
> > return err;
> > }
> >
> > +int bpf_prog_assoc_struct_ops(struct bpf_prog *prog, struct bpf_map *map)
> > +{
> > + struct bpf_map *st_ops_assoc;
> > +
> > + guard(mutex)(&prog->aux->st_ops_assoc_mutex);
> > +
> > + st_ops_assoc = prog->aux->st_ops_assoc;
> > +
> > + if (st_ops_assoc && st_ops_assoc == map)
> > + return 0;
> > +
> > + if (st_ops_assoc) {
> > + if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
> > + return -EBUSY;
> > +
> > + WRITE_ONCE(prog->aux->st_ops_assoc, BPF_PTR_POISON);
> > + } else {
> > + if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
> > + bpf_map_inc(map);
> > +
> > + WRITE_ONCE(prog->aux->st_ops_assoc, map);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +void bpf_prog_disassoc_struct_ops(struct bpf_prog *prog)
> > +{
> > + struct bpf_map *st_ops_assoc;
> > +
> > + guard(mutex)(&prog->aux->st_ops_assoc_mutex);
> > +
> > + st_ops_assoc = prog->aux->st_ops_assoc;
> > +
> > + if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
> > + return;
> > +
> > + if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
> > + bpf_map_put(st_ops_assoc);
> > +
> > + WRITE_ONCE(prog->aux->st_ops_assoc, NULL);
> > +}
> > +
> > +/*
> > + * Get a reference to the struct_ops struct (i.e., kdata) associated with a
> > + * program.
> > + *
> > + * If the returned pointer is not NULL, it must point to a valid and
> > + * initialized struct_ops. The struct_ops may or may not be attached.
> > + * Kernel struct_ops implementers are responsible for tracking and checking
> > + * the state of the struct_ops if the use case requires an attached struct_ops.
> > + */
> > +void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux)
> > +{
> > + struct bpf_map *st_ops_assoc = READ_ONCE(aux->st_ops_assoc);
> > + struct bpf_struct_ops_map *st_map;
> > +
> > + if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
> > + return NULL;
> > +
> > + st_map = (struct bpf_struct_ops_map *)st_ops_assoc;
> > +
> > + if (smp_load_acquire(&st_map->kvalue.common.state) == BPF_STRUCT_OPS_STATE_INIT) {
> > + bpf_map_put(st_ops_assoc);
> ^^^^
>
> Does bpf_prog_get_assoc_struct_ops() need to call bpf_map_put() here?
>
Right. bpf_map_put() is not needed here. It was an artifact of v4.
Will respin and fix it.
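For the record, the fixed helper would look like this (sketch only, same
as the v5 code minus the stray put):

	void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux)
	{
		struct bpf_map *st_ops_assoc = READ_ONCE(aux->st_ops_assoc);
		struct bpf_struct_ops_map *st_map;

		if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
			return NULL;

		st_map = (struct bpf_struct_ops_map *)st_ops_assoc;

		/* Not fully updated yet: report no association. No
		 * reference was taken here, so none is dropped.
		 */
		if (smp_load_acquire(&st_map->kvalue.common.state) ==
		    BPF_STRUCT_OPS_STATE_INIT)
			return NULL;

		return &st_map->kvalue.data;
	}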
> The function comment says "Get a reference to the struct_ops struct"
> but the function never calls bpf_map_get/inc() to acquire a reference.
> It only reads aux->st_ops_assoc via READ_ONCE().
>
> When the state check fails (INIT state), the function calls bpf_map_put()
> which drops the reference that was acquired in bpf_prog_assoc_struct_ops().
> But on the success path below, it returns kdata without any refcount
> operation.
>
> This creates an imbalance:
> - INIT state path: drops a reference, returns NULL
> - Non-INIT path: no refcount change, returns kdata
>
> The caller has no way to know whether the reference count was modified.
> Also, the function is EXPORTED but doesn't document the reference counting
> semantics clearly.
>
> > + return NULL;
> > + }
> > +
> > + return &st_map->kvalue.data;
> > +}
> > +EXPORT_SYMBOL_GPL(bpf_prog_get_assoc_struct_ops);
>
> Can this function race with bpf_prog_disassoc_struct_ops()?
>
> Since bpf_prog_get_assoc_struct_ops() reads aux->st_ops_assoc via
> READ_ONCE() without holding the mutex or acquiring a reference, there's
> a window where:
>
> Thread A: reads st_ops_assoc pointer
> Thread B: calls bpf_prog_disassoc_struct_ops()->bpf_map_put() (drops last ref)
> Thread A: dereferences st_map->kvalue.common.state (use-after-free)
>
> For non-struct_ops programs, bpf_prog_assoc_struct_ops() holds a
> reference via bpf_map_inc(). This reference is dropped in
> bpf_prog_disassoc_struct_ops() when the program is freed. Without
> acquiring a new reference in bpf_prog_get_assoc_struct_ops(), the map
> can be freed during the race window above.
This is fine. bpf_prog_get_assoc_struct_ops() should only be called from
a kfunc (i.e., while a BPF program is still alive). The refcount has
already been bumped during the BPF_PROG_ASSOC_STRUCT_OPS command or
during async callback registration.
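For context, the expected call site is a kfunc along these lines (sketch
only; modeled on the testmod kfunc used by the selftests, the exact
signature is an assumption):

	__bpf_kfunc int bpf_kfunc_multi_st_ops_test_1_prog_arg(struct st_ops_args *args,
							       struct bpf_prog_aux *aux__prog)
	{
		struct bpf_testmod_multi_st_ops *ops;

		/* aux__prog is fixed up by the verifier, so it stays valid
		 * for as long as the calling program runs.
		 */
		ops = bpf_prog_get_assoc_struct_ops(aux__prog);
		if (!ops)
			return -1;

		return ops->test_1(args);
	}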
>
> [ ... ]
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19077679684
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback
2025-11-04 18:03 ` bot+bpf-ci
@ 2025-11-04 18:10 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 18:10 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team, ast, eddyz87, yonghong.song, clm, ihor.solodrai
On Tue, Nov 4, 2025 at 10:03 AM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 865b0dae3..557570479 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
>
> [ ... ]
>
> > +static int bpf_async_res_get(struct bpf_async_res *res, struct bpf_prog *prog)
> > +{
> > + struct bpf_map *st_ops_assoc = NULL;
> > + int err;
> > +
> > + prog = bpf_prog_inc_not_zero(prog);
> > + if (IS_ERR(prog))
> > + return PTR_ERR(prog);
> > +
> > + st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
> > + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
> > + st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
> > + st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
> > + if (IS_ERR(st_ops_assoc)) {
> > + err = PTR_ERR(st_ops_assoc);
> > + goto put_prog;
> > + }
> > + }
>
> Can this race with bpf_prog_disassoc_struct_ops()? Between reading
> st_ops_assoc and incrementing it, another thread could dissociate the
> map:
This is fine. A struct_ops map waits for its struct_ops BPF programs to
finish running (an RCU grace period) before freeing itself. Therefore,
when another thread is freeing the map while this thread is calling
bpf_async_res_get(), if this thread still sees
prog->aux->st_ops_assoc, bpf_map_inc_not_zero() will fail and stop the
async callback from being registered.
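FWIW, this could be captured in a comment next to the inc_not_zero
(sketch, wording mine):

	/* The callback is registered from a struct_ops program of this
	 * map, and the map's free path waits an RCU grace period for
	 * such programs to finish. So the map memory stays valid while
	 * we run here, and if a free is already in flight,
	 * bpf_map_inc_not_zero() simply fails instead of racing into a
	 * use-after-free.
	 */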
>
> bpf_async_res_get():
> READ_ONCE(prog->aux->st_ops_assoc) // reads valid map pointer
>
> bpf_prog_disassoc_struct_ops():
> guard(mutex)(&prog->aux->st_ops_assoc_mutex)
> WRITE_ONCE(prog->aux->st_ops_assoc, NULL)
> bpf_map_put(st_ops_assoc) // potentially frees map
>
> bpf_async_res_get():
> bpf_map_inc_not_zero(st_ops_assoc) // use-after-free
>
> The map could be freed via RCU and memory reused before
> bpf_map_inc_not_zero() executes. Other functions that access
> st_ops_assoc (bpf_prog_assoc_struct_ops and bpf_prog_disassoc_struct_ops)
> hold prog->aux->st_ops_assoc_mutex. Additionally, bpf_map_inc_not_zero's
> documentation states "map_idr_lock should have been held or the map
> should have been protected by rcu read lock."
>
> Should bpf_async_res_get() hold the st_ops_assoc_mutex or an RCU read
> lock around the st_ops_assoc access?
>
> > +
> > + res->prog = prog;
> > + res->st_ops_assoc = st_ops_assoc;
> > + return 0;
> > +put_prog:
> > + bpf_prog_put(prog);
> > + return err;
> > +}
>
> [ ... ]
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19077679684
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-04 17:26 ` [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops Amery Hung
2025-11-04 17:54 ` bot+bpf-ci
@ 2025-11-04 21:59 ` Song Liu
2025-11-04 23:26 ` Amery Hung
2025-11-04 22:47 ` Andrii Nakryiko
2025-11-06 0:57 ` Martin KaFai Lau
3 siblings, 1 reply; 30+ messages in thread
From: Song Liu @ 2025-11-04 21:59 UTC (permalink / raw)
To: Amery Hung
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
alexei.starovoitov@gmail.com, andrii@kernel.org,
daniel@iogearbox.net, tj@kernel.org, martin.lau@kernel.org,
Kernel Team, roman.gushchin@linux.dev
> On Nov 4, 2025, at 9:26 AM, Amery Hung <ameryhung@gmail.com> wrote:
>
> Add a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to allow associating
> a BPF program with a struct_ops map. This command takes a file
> descriptor of a struct_ops map and a BPF program and sets
> prog->aux->st_ops_assoc to the kdata of the struct_ops map.
>
> The command does not accept a struct_ops program nor a non-struct_ops
> map. Programs of a struct_ops map are automatically associated with the
> map during map update. If a program is shared between two struct_ops
> maps, prog->aux->st_ops_assoc will be poisoned to indicate that the
> associated struct_ops is ambiguous. The pointer, once poisoned, cannot
> be reset since we have lost track of associated struct_ops. For other
> program types, the associated struct_ops map, once set, cannot be
> changed later. This restriction may be lifted in the future if there is
> a use case.
>
> A kernel helper bpf_prog_get_assoc_struct_ops() can be used to retrieve
> the associated struct_ops pointer. The returned pointer, if not NULL, is
> guaranteed to be valid and point to a fully updated struct_ops struct.
> For a struct_ops program reused in multiple struct_ops maps, the return
> value will be NULL.
>
> To make sure the returned pointer is valid, the command increases the
> refcount of the map for every associated non-struct_ops program. For
> struct_ops programs, the destruction of a struct_ops map already waits for
> its BPF programs to finish running. A later patch will further make sure
> the map will not be freed when an async callback scheduled from struct_ops
> is running.
>
> struct_ops implementers should note that the struct_ops returned may or
> may not be attached. The struct_ops implementer will be responsible for
> tracking and checking the state of the associated struct_ops map if the
> use case requires an attached struct_ops.
>
> Signed-off-by: Amery Hung <ameryhung@gmail.com>
> ---
> include/linux/bpf.h | 16 ++++++
> include/uapi/linux/bpf.h | 17 +++++++
> kernel/bpf/bpf_struct_ops.c | 90 ++++++++++++++++++++++++++++++++++
> kernel/bpf/core.c | 3 ++
> kernel/bpf/syscall.c | 46 +++++++++++++++++
> tools/include/uapi/linux/bpf.h | 17 +++++++
> 6 files changed, 189 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index a47d67db3be5..0f71030c03e1 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1726,6 +1726,8 @@ struct bpf_prog_aux {
> struct rcu_head rcu;
> };
> struct bpf_stream stream[2];
> + struct mutex st_ops_assoc_mutex;
> + struct bpf_map *st_ops_assoc;
> };
In the bpf-oom thread, we agreed (mostly agreed?) that we will allow
attaching a struct_ops map multiple times.
To match this design, shall we associate a BPF program with a
bpf_struct_ops_link instead of a bpf_map? This requires one more
pointer deref to get the pointer to the struct_ops map, but the
solution will be more future-proof.
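Roughly, resolving the association would then take one extra deref
(sketch only; the field name is illustrative):

	struct bpf_struct_ops_link *l = READ_ONCE(prog->aux->st_ops_assoc_link);
	struct bpf_map *map = l ? READ_ONCE(l->map) : NULL;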
Does this make sense?
Thanks,
Song
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-04 17:26 ` [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops Amery Hung
2025-11-04 17:54 ` bot+bpf-ci
2025-11-04 21:59 ` Song Liu
@ 2025-11-04 22:47 ` Andrii Nakryiko
2025-11-04 23:27 ` Amery Hung
2025-11-06 0:57 ` Martin KaFai Lau
3 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2025-11-04 22:47 UTC (permalink / raw)
To: Amery Hung
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team
On Tue, Nov 4, 2025 at 9:27 AM Amery Hung <ameryhung@gmail.com> wrote:
>
> Add a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to allow associating
> a BPF program with a struct_ops map. This command takes a file
> descriptor of a struct_ops map and a BPF program and sets
> prog->aux->st_ops_assoc to the kdata of the struct_ops map.
>
> The command does not accept a struct_ops program nor a non-struct_ops
> map. Programs of a struct_ops map are automatically associated with the
> map during map update. If a program is shared between two struct_ops
> maps, prog->aux->st_ops_assoc will be poisoned to indicate that the
> associated struct_ops is ambiguous. The pointer, once poisoned, cannot
> be reset since we have lost track of associated struct_ops. For other
> program types, the associated struct_ops map, once set, cannot be
> changed later. This restriction may be lifted in the future if there is
> a use case.
>
> A kernel helper bpf_prog_get_assoc_struct_ops() can be used to retrieve
> the associated struct_ops pointer. The returned pointer, if not NULL, is
> guaranteed to be valid and point to a fully updated struct_ops struct.
> For a struct_ops program reused in multiple struct_ops maps, the return
> value will be NULL.
>
> To make sure the returned pointer is valid, the command increases the
> refcount of the map for every associated non-struct_ops program. For
> struct_ops programs, the destruction of a struct_ops map already waits for
> its BPF programs to finish running. A later patch will further make sure
> the map will not be freed when an async callback scheduled from struct_ops
> is running.
>
> struct_ops implementers should note that the struct_ops returned may or
> may not be attached. The struct_ops implementer will be responsible for
> tracking and checking the state of the associated struct_ops map if the
> use case requires an attached struct_ops.
>
> Signed-off-by: Amery Hung <ameryhung@gmail.com>
> ---
> include/linux/bpf.h | 16 ++++++
> include/uapi/linux/bpf.h | 17 +++++++
> kernel/bpf/bpf_struct_ops.c | 90 ++++++++++++++++++++++++++++++++++
> kernel/bpf/core.c | 3 ++
> kernel/bpf/syscall.c | 46 +++++++++++++++++
> tools/include/uapi/linux/bpf.h | 17 +++++++
> 6 files changed, 189 insertions(+)
>
[...]
> static void bpf_struct_ops_map_free_image(struct bpf_struct_ops_map *st_map)
> {
> int i;
> @@ -801,6 +812,12 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> goto reset_unlock;
> }
>
> + err = bpf_prog_assoc_struct_ops(prog, &st_map->map);
> + if (err) {
> + bpf_prog_put(prog);
> + goto reset_unlock;
> + }
> +
> link = kzalloc(sizeof(*link), GFP_USER);
> if (!link) {
I think we need to call bpf_prog_disassoc_struct_ops() here if
kzalloc() fails (just like we do bpf_prog_put()). After kzalloc(), the
link will be put into the plink list and the generic cleanup path will
handle all this, but not here.
> bpf_prog_put(prog);
> @@ -980,6 +997,8 @@ static void bpf_struct_ops_map_free(struct bpf_map *map)
> if (btf_is_module(st_map->btf))
> module_put(st_map->st_ops_desc->st_ops->owner);
>
> + bpf_struct_ops_map_dissoc_progs(st_map);
> +
> bpf_struct_ops_map_del_ksyms(st_map);
>
> /* The struct_ops's function may switch to another struct_ops.
[...]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback
2025-11-04 17:26 ` [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback Amery Hung
2025-11-04 18:03 ` bot+bpf-ci
@ 2025-11-04 23:20 ` Andrii Nakryiko
2025-11-05 23:03 ` Amery Hung
2025-11-06 2:13 ` Martin KaFai Lau
2 siblings, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2025-11-04 23:20 UTC (permalink / raw)
To: Amery Hung
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team
On Tue, Nov 4, 2025 at 9:27 AM Amery Hung <ameryhung@gmail.com> wrote:
>
> Take a refcount of the associated struct_ops map to prevent the map from
> being freed when an async callback scheduled from a struct_ops program
> runs.
>
> Since struct_ops programs do not take refcounts on the struct_ops map,
> it is possible for a struct_ops map to be freed when an async callback
> scheduled from it runs. To prevent this, take a refcount on prog->aux->
> st_ops_assoc and save it in a newly created struct bpf_async_res for
> every async mechanism. The reference needs to be preserved in
> bpf_async_res since prog->aux->st_ops_assoc can be poisoned at any time
> and a reference leak could happen.
>
> bpf_async_res will contain an async callback's BPF program and resources
> related to the BPF program. The resources will be acquired when
> registering a callback and released when cancelled or when the map
> associated with the callback is freed.
>
> Also rename drop_prog_refcnt to bpf_async_cb_reset to better reflect
> what it now does.
>
> Signed-off-by: Amery Hung <ameryhung@gmail.com>
> ---
> kernel/bpf/helpers.c | 105 +++++++++++++++++++++++++++++--------------
> 1 file changed, 72 insertions(+), 33 deletions(-)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 930e132f440f..5c081cd604d5 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1092,9 +1092,14 @@ static void *map_key_from_value(struct bpf_map *map, void *value, u32 *arr_idx)
> return (void *)value - round_up(map->key_size, 8);
> }
>
> +struct bpf_async_res {
"res" has a strong "result" meaning, which is a distraction here.
Maybe "bpf_async_ctx"? And then we can use prep and put (reset?)
terminology?
> + struct bpf_prog *prog;
> + struct bpf_map *st_ops_assoc;
> +};
> +
> struct bpf_async_cb {
> struct bpf_map *map;
> - struct bpf_prog *prog;
> + struct bpf_async_res res;
> void __rcu *callback_fn;
> void *value;
> union {
> @@ -1299,8 +1304,8 @@ static int __bpf_async_init(struct bpf_async_kern *async, struct bpf_map *map, u
> break;
> }
> cb->map = map;
> - cb->prog = NULL;
> cb->flags = flags;
> + memset(&cb->res, 0, sizeof(cb->res));
> rcu_assign_pointer(cb->callback_fn, NULL);
>
> WRITE_ONCE(async->cb, cb);
> @@ -1351,11 +1356,47 @@ static const struct bpf_func_proto bpf_timer_init_proto = {
> .arg3_type = ARG_ANYTHING,
> };
>
> +static void bpf_async_res_put(struct bpf_async_res *res)
> +{
> + bpf_prog_put(res->prog);
> +
> + if (res->st_ops_assoc)
> + bpf_map_put(res->st_ops_assoc);
> +}
> +
> +static int bpf_async_res_get(struct bpf_async_res *res, struct bpf_prog *prog)
> +{
> + struct bpf_map *st_ops_assoc = NULL;
> + int err;
> +
> + prog = bpf_prog_inc_not_zero(prog);
> + if (IS_ERR(prog))
> + return PTR_ERR(prog);
> +
> + st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
I think in about a month we'll forget why we inc_not_zero the map only
for STRUCT_OPS programs, so I'd add a comment here that non-struct_ops
programs hold an explicit refcount on st_ops_assoc, so as long as we
have that inc_not_zero(prog) above, we don't need to also bump the map
refcount.
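Something like this (sketch, wording is mine):

	/* Non-struct_ops programs hold an explicit reference on
	 * st_ops_assoc, taken in bpf_prog_assoc_struct_ops(), so the
	 * bpf_prog_inc_not_zero() above already pins the map for them.
	 * Only struct_ops programs, which never take that reference,
	 * need to bump the map refcount here.
	 */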
> + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
> + st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
> + st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
> + if (IS_ERR(st_ops_assoc)) {
> + err = PTR_ERR(st_ops_assoc);
> + goto put_prog;
nit: might be a bit premature to structure code with goto put_prog. As of now:
bpf_prog_put(prog);
return PTR_ERR(st_ops_assoc);
is short and sweet and good enough?
> + }
> + }
> +
> + res->prog = prog;
> + res->st_ops_assoc = st_ops_assoc;
question: do we want to assign BPF_PTR_POISON to res->st_ops_assoc or
is it better to keep it as NULL in such a case? I'm not sure, just
bringing up the possibility.
> + return 0;
> +put_prog:
> + bpf_prog_put(prog);
> + return err;
> +}
> +
> static int __bpf_async_set_callback(struct bpf_async_kern *async, void *callback_fn,
> struct bpf_prog_aux *aux, unsigned int flags,
> enum bpf_async_type type)
> {
> struct bpf_prog *prev, *prog = aux->prog;
> + struct bpf_async_res res;
> struct bpf_async_cb *cb;
> int ret = 0;
>
> @@ -1376,20 +1417,18 @@ static int __bpf_async_set_callback(struct bpf_async_kern *async, void *callback
> ret = -EPERM;
> goto out;
> }
> - prev = cb->prog;
> + prev = cb->res.prog;
> if (prev != prog) {
> - /* Bump prog refcnt once. Every bpf_timer_set_callback()
> + /* Get prog and related resources once. Every bpf_timer_set_callback()
> * can pick different callback_fn-s within the same prog.
> */
> - prog = bpf_prog_inc_not_zero(prog);
> - if (IS_ERR(prog)) {
> - ret = PTR_ERR(prog);
> + ret = bpf_async_res_get(&res, prog);
> + if (ret)
> goto out;
> - }
> if (prev)
> - /* Drop prev prog refcnt when swapping with new prog */
> - bpf_prog_put(prev);
> - cb->prog = prog;
> + /* Put prev prog and related resources when swapping with new prog */
> + bpf_async_res_put(&cb->res);
> + cb->res = res;
> }
we discussed this offline, but I'll summarize here:
I think we need to abstract this away as bpf_async_ctx_update(), which
will accept a new prog pointer and internally deal with refcount
inc/put as necessary, depending on whether the prog changed or not.
With st_ops_assoc, the prog pointer may not change, but the underlying
st_ops_assoc might (it can go from NULL to a valid assoc). And the
implementation will be careful to leave the previous async ctx as it
was if anything goes wrong (just like the existing code behaves).
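Rough shape I have in mind (sketch only; names and details illustrative,
keeping the v5 type names for clarity):

	static int bpf_async_ctx_update(struct bpf_async_cb *cb, struct bpf_prog *prog)
	{
		struct bpf_async_res new_res;
		int ret;

		/* Re-resolve even when prog is unchanged: st_ops_assoc may
		 * have gone from NULL to a valid association meanwhile.
		 */
		ret = bpf_async_res_get(&new_res, prog);
		if (ret)
			return ret;	/* previous ctx left exactly as it was */

		if (cb->res.prog)
			bpf_async_res_put(&cb->res);
		cb->res = new_res;
		return 0;
	}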
> rcu_assign_pointer(cb->callback_fn, callback_fn);
> out:
> @@ -1423,7 +1462,7 @@ BPF_CALL_3(bpf_timer_start, struct bpf_async_kern *, timer, u64, nsecs, u64, fla
> return -EINVAL;
> __bpf_spin_lock_irqsave(&timer->lock);
> t = timer->timer;
> - if (!t || !t->cb.prog) {
> + if (!t || !t->cb.res.prog) {
> ret = -EINVAL;
> goto out;
> }
> @@ -1451,14 +1490,14 @@ static const struct bpf_func_proto bpf_timer_start_proto = {
> .arg3_type = ARG_ANYTHING,
> };
>
> -static void drop_prog_refcnt(struct bpf_async_cb *async)
> +static void bpf_async_cb_reset(struct bpf_async_cb *cb)
> {
> - struct bpf_prog *prog = async->prog;
> + struct bpf_prog *prog = cb->res.prog;
>
> if (prog) {
> - bpf_prog_put(prog);
> - async->prog = NULL;
> - rcu_assign_pointer(async->callback_fn, NULL);
> + bpf_async_res_put(&cb->res);
> + memset(&cb->res, 0, sizeof(cb->res));
shouldn't bpf_async_res_put() leave cb->res in a zeroed-out state? Why
the extra memset(0)?
> + rcu_assign_pointer(cb->callback_fn, NULL);
> }
> }
>
[...]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-04 21:59 ` Song Liu
@ 2025-11-04 23:26 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 23:26 UTC (permalink / raw)
To: Song Liu
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
alexei.starovoitov@gmail.com, andrii@kernel.org,
daniel@iogearbox.net, tj@kernel.org, martin.lau@kernel.org,
Kernel Team, roman.gushchin@linux.dev
On Tue, Nov 4, 2025 at 2:00 PM Song Liu <songliubraving@meta.com> wrote:
>
>
>
> > On Nov 4, 2025, at 9:26 AM, Amery Hung <ameryhung@gmail.com> wrote:
> >
> > Add a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to allow associating
> > a BPF program with a struct_ops map. This command takes a file
> > descriptor of a struct_ops map and a BPF program and sets
> > prog->aux->st_ops_assoc to the kdata of the struct_ops map.
> >
> > The command does not accept a struct_ops program nor a non-struct_ops
> > map. Programs of a struct_ops map are automatically associated with the
> > map during map update. If a program is shared between two struct_ops
> > maps, prog->aux->st_ops_assoc will be poisoned to indicate that the
> > associated struct_ops is ambiguous. The pointer, once poisoned, cannot
> > be reset since we have lost track of associated struct_ops. For other
> > program types, the associated struct_ops map, once set, cannot be
> > changed later. This restriction may be lifted in the future if there is
> > a use case.
> >
> > A kernel helper bpf_prog_get_assoc_struct_ops() can be used to retrieve
> > the associated struct_ops pointer. The returned pointer, if not NULL, is
> > guaranteed to be valid and point to a fully updated struct_ops struct.
> > For a struct_ops program reused in multiple struct_ops maps, the return
> > value will be NULL.
> >
> > To make sure the returned pointer is valid, the command increases the
> > refcount of the map for every associated non-struct_ops program. For
> > struct_ops programs, the destruction of a struct_ops map already waits for
> > its BPF programs to finish running. A later patch will further make sure
> > the map will not be freed when an async callback scheduled from struct_ops
> > is running.
> >
> > struct_ops implementers should note that the struct_ops returned may or
> > may not be attached. The struct_ops implementer will be responsible for
> > tracking and checking the state of the associated struct_ops map if the
> > use case requires an attached struct_ops.
> >
> > Signed-off-by: Amery Hung <ameryhung@gmail.com>
> > ---
> > include/linux/bpf.h | 16 ++++++
> > include/uapi/linux/bpf.h | 17 +++++++
> > kernel/bpf/bpf_struct_ops.c | 90 ++++++++++++++++++++++++++++++++++
> > kernel/bpf/core.c | 3 ++
> > kernel/bpf/syscall.c | 46 +++++++++++++++++
> > tools/include/uapi/linux/bpf.h | 17 +++++++
> > 6 files changed, 189 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index a47d67db3be5..0f71030c03e1 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1726,6 +1726,8 @@ struct bpf_prog_aux {
> > struct rcu_head rcu;
> > };
> > struct bpf_stream stream[2];
> > + struct mutex st_ops_assoc_mutex;
> > + struct bpf_map *st_ops_assoc;
> > };
>
> In the bpf-oom thread, we agreed (mostly agreed?) that we will allow
> attaching a struct_ops map multiple times.
>
> To match this design, shall we associate a BPF program with a
> bpf_struct_ops_link instead of bpf_map? This requires one more
> pointer deref to get the pointer to the struct_ops map. But the
> solution will be more future proof.
>
> Does this make sense?
I think it makes sense and can be future work to have the ability to
associate attachments. The command can be extended to take a link to a
struct_ops map and a link to a BPF program.
For this patchset, I think we should first still aim for providing a
way to associate the implementation (think of it as a C++ class).
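Roughly, on the uapi side the extension could look like (sketch only;
the flag name is made up):

	struct { /* BPF_PROG_ASSOC_STRUCT_OPS */
		__u32	prog_fd;
		__u32	map_fd;	/* or a link fd if BPF_F_ASSOC_LINK is set */
		__u32	flags;
	} prog_assoc_struct_ops;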
>
> Thanks,
> Song
>
>
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops
2025-11-04 17:26 ` [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops Amery Hung
2025-11-04 17:54 ` bot+bpf-ci
@ 2025-11-04 23:26 ` Andrii Nakryiko
2025-11-04 23:39 ` Amery Hung
1 sibling, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2025-11-04 23:26 UTC (permalink / raw)
To: Amery Hung
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team
On Tue, Nov 4, 2025 at 9:27 AM Amery Hung <ameryhung@gmail.com> wrote:
>
> Add low-level wrapper and libbpf API for BPF_PROG_ASSOC_STRUCT_OPS
> command in the bpf() syscall.
>
> Signed-off-by: Amery Hung <ameryhung@gmail.com>
> ---
> tools/lib/bpf/bpf.c | 19 +++++++++++++++++++
> tools/lib/bpf/bpf.h | 21 +++++++++++++++++++++
> tools/lib/bpf/libbpf.c | 30 ++++++++++++++++++++++++++++++
> tools/lib/bpf/libbpf.h | 16 ++++++++++++++++
> tools/lib/bpf/libbpf.map | 2 ++
> 5 files changed, 88 insertions(+)
>
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index b66f5fbfbbb2..21b57a629916 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -1397,3 +1397,22 @@ int bpf_prog_stream_read(int prog_fd, __u32 stream_id, void *buf, __u32 buf_len,
> err = sys_bpf(BPF_PROG_STREAM_READ_BY_FD, &attr, attr_sz);
> return libbpf_err_errno(err);
> }
> +
> +int bpf_prog_assoc_struct_ops(int prog_fd, int map_fd,
> + struct bpf_prog_assoc_struct_ops_opts *opts)
> +{
> + const size_t attr_sz = offsetofend(union bpf_attr, prog_assoc_struct_ops);
> + union bpf_attr attr;
> + int err;
> +
> + if (!OPTS_VALID(opts, bpf_prog_assoc_struct_ops_opts))
> + return libbpf_err(-EINVAL);
> +
> + memset(&attr, 0, attr_sz);
> + attr.prog_assoc_struct_ops.map_fd = map_fd;
> + attr.prog_assoc_struct_ops.prog_fd = prog_fd;
> + attr.prog_assoc_struct_ops.flags = OPTS_GET(opts, flags, 0);
> +
> + err = sys_bpf(BPF_PROG_ASSOC_STRUCT_OPS, &attr, attr_sz);
> + return libbpf_err_errno(err);
> +}
> diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
> index e983a3e40d61..1f9c28d27795 100644
> --- a/tools/lib/bpf/bpf.h
> +++ b/tools/lib/bpf/bpf.h
> @@ -733,6 +733,27 @@ struct bpf_prog_stream_read_opts {
> LIBBPF_API int bpf_prog_stream_read(int prog_fd, __u32 stream_id, void *buf, __u32 buf_len,
> struct bpf_prog_stream_read_opts *opts);
>
> +struct bpf_prog_assoc_struct_ops_opts {
> + size_t sz;
> + __u32 flags;
> + size_t :0;
> +};
> +#define bpf_prog_assoc_struct_ops_opts__last_field flags
> +
> +/**
> + * @brief **bpf_prog_assoc_struct_ops** associates a BPF program with a
> + * struct_ops map.
> + *
> + * @param prog_fd FD for the BPF program
> + * @param map_fd FD for the struct_ops map to be associated with the BPF program
> + * @param opts optional options, can be NULL
> + *
> + * @return 0 on success; negative error code, otherwise (errno is also set to
> + * the error code)
> + */
> +LIBBPF_API int bpf_prog_assoc_struct_ops(int prog_fd, int map_fd,
> + struct bpf_prog_assoc_struct_ops_opts *opts);
> +
> #ifdef __cplusplus
> } /* extern "C" */
> #endif
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index fbe74686c97d..260e1feaa665 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -13891,6 +13891,36 @@ int bpf_program__set_attach_target(struct bpf_program *prog,
> return 0;
> }
>
> +int bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
> + struct bpf_prog_assoc_struct_ops_opts *opts)
> +{
> + int prog_fd;
> +
> + prog_fd = bpf_program__fd(prog);
> + if (prog_fd < 0) {
> + pr_warn("prog '%s': can't associate BPF program without FD (was it loaded?)\n",
> + prog->name);
> + return -EINVAL;
> + }
> +
> + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
> + pr_warn("prog '%s': can't associate struct_ops program\n", prog->name);
> + return -EINVAL;
> + }
> +
> + if (map->fd < 0) {
heh, this is a bug. we use create_placeholder_fd() to create fixed FDs
associated with maps, and later we replace them with the real
underlying BPF map kernel objects. It's all details, but the point is
that this won't detect a map that wasn't created. Use bpf_map__fd()
instead; it handles that correctly.
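e.g. (sketch):

	int map_fd = bpf_map__fd(map);

	if (map_fd < 0) {
		pr_warn("map '%s': can't associate BPF map without FD (was it created?)\n",
			bpf_map__name(map));
		return -EINVAL;
	}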
> + pr_warn("map '%s': can't associate BPF map without FD (was it created?)\n", map->name);
> + return -EINVAL;
> + }
> +
> + if (!bpf_map__is_struct_ops(map)) {
> + pr_warn("map '%s': can't associate non-struct_ops map\n", map->name);
> + return -EINVAL;
> + }
> +
> + return bpf_prog_assoc_struct_ops(prog_fd, map->fd, opts);
> +}
> +
> int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz)
> {
> int err = 0, n, len, start, end = -1;
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 5118d0a90e24..45720b7c2aaa 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -1003,6 +1003,22 @@ LIBBPF_API int
> bpf_program__set_attach_target(struct bpf_program *prog, int attach_prog_fd,
> const char *attach_func_name);
>
> +struct bpf_prog_assoc_struct_ops_opts; /* defined in bpf.h */
> +
> +/**
> + * @brief **bpf_program__assoc_struct_ops()** associates a BPF program with a
> + * struct_ops map.
> + *
> + * @param prog BPF program
> + * @param map struct_ops map to be associated with the BPF program
> + * @param opts optional options, can be NULL
> + *
> + * @return error code; or 0 if no error occurred.
we normally specify returns like so:
@return 0, on success; negative error code, otherwise
keep it consistent?
> + */
> +LIBBPF_API int
> +bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
> + struct bpf_prog_assoc_struct_ops_opts *opts);
> +
> /**
> * @brief **bpf_object__find_map_by_name()** returns BPF map of
> * the given name, if it exists within the passed BPF object
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index 8ed8749907d4..84fb90a016c9 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -451,4 +451,6 @@ LIBBPF_1.7.0 {
> global:
> bpf_map__set_exclusive_program;
> bpf_map__exclusive_program;
> + bpf_prog_assoc_struct_ops;
> + bpf_program__assoc_struct_ops;
> } LIBBPF_1.6.0;
> --
> 2.47.3
>
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-04 22:47 ` Andrii Nakryiko
@ 2025-11-04 23:27 ` Amery Hung
0 siblings, 0 replies; 30+ messages in thread
From: Amery Hung @ 2025-11-04 23:27 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team
On Tue, Nov 4, 2025 at 2:47 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Nov 4, 2025 at 9:27 AM Amery Hung <ameryhung@gmail.com> wrote:
> >
> > Add a new BPF command BPF_PROG_ASSOC_STRUCT_OPS to allow associating
> > a BPF program with a struct_ops map. This command takes a file
> > descriptor of a struct_ops map and a BPF program and sets
> > prog->aux->st_ops_assoc to the kdata of the struct_ops map.
> >
> > The command does not accept a struct_ops program nor a non-struct_ops
> > map. Programs of a struct_ops map are automatically associated with the
> > map during map update. If a program is shared between two struct_ops
> > maps, prog->aux->st_ops_assoc will be poisoned to indicate that the
> > associated struct_ops is ambiguous. The pointer, once poisoned, cannot
> > be reset since we have lost track of associated struct_ops. For other
> > program types, the associated struct_ops map, once set, cannot be
> > changed later. This restriction may be lifted in the future if there is
> > a use case.
> >
> > A kernel helper bpf_prog_get_assoc_struct_ops() can be used to retrieve
> > the associated struct_ops pointer. The returned pointer, if not NULL, is
> > guaranteed to be valid and point to a fully updated struct_ops struct.
> > For a struct_ops program reused in multiple struct_ops maps, the return
> > value will be NULL.
> >
> > To make sure the returned pointer is valid, the command increases the
> > refcount of the map for every associated non-struct_ops program. For
> > struct_ops programs, the destruction of a struct_ops map already waits for
> > its BPF programs to finish running. A later patch will further make sure
> > the map will not be freed when an async callback scheduled from struct_ops
> > is running.
> >
> > struct_ops implementers should note that the struct_ops returned may or
> > may not be attached. The struct_ops implementer will be responsible for
> > tracking and checking the state of the associated struct_ops map if the
> > use case requires an attached struct_ops.
> >
> > Signed-off-by: Amery Hung <ameryhung@gmail.com>
> > ---
> > include/linux/bpf.h | 16 ++++++
> > include/uapi/linux/bpf.h | 17 +++++++
> > kernel/bpf/bpf_struct_ops.c | 90 ++++++++++++++++++++++++++++++++++
> > kernel/bpf/core.c | 3 ++
> > kernel/bpf/syscall.c | 46 +++++++++++++++++
> > tools/include/uapi/linux/bpf.h | 17 +++++++
> > 6 files changed, 189 insertions(+)
> >
>
> [...]
>
> > static void bpf_struct_ops_map_free_image(struct bpf_struct_ops_map *st_map)
> > {
> > int i;
> > @@ -801,6 +812,12 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > goto reset_unlock;
> > }
> >
> > + err = bpf_prog_assoc_struct_ops(prog, &st_map->map);
> > + if (err) {
> > + bpf_prog_put(prog);
> > + goto reset_unlock;
> > + }
> > +
> > link = kzalloc(sizeof(*link), GFP_USER);
> > if (!link) {
>
> I think we need to call bpf_prog_disassoc_struct_ops() here if
> kzalloc() fails (just like we do bpf_prog_put). After kzalloc link
> will be put into plink list and generic clean up path will handle all
> this, but not here.
>
1 point for catching a bug that was missed by AI. I will move
bpf_prog_assoc_struct_ops() after the link is successfully created.
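I.e., something like (sketch of the planned reordering, error values as
in the existing code):

	link = kzalloc(sizeof(*link), GFP_USER);
	if (!link) {
		bpf_prog_put(prog);
		err = -ENOMEM;
		goto reset_unlock;
	}

	err = bpf_prog_assoc_struct_ops(prog, &st_map->map);
	if (err) {
		kfree(link);
		bpf_prog_put(prog);
		goto reset_unlock;
	}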
> > bpf_prog_put(prog);
> > @@ -980,6 +997,8 @@ static void bpf_struct_ops_map_free(struct bpf_map *map)
> > if (btf_is_module(st_map->btf))
> > module_put(st_map->st_ops_desc->st_ops->owner);
> >
> > + bpf_struct_ops_map_dissoc_progs(st_map);
> > +
> > bpf_struct_ops_map_del_ksyms(st_map);
> >
> > /* The struct_ops's function may switch to another struct_ops.
>
> [...]
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops
2025-11-04 17:54 ` bot+bpf-ci
@ 2025-11-04 23:27 ` Andrii Nakryiko
0 siblings, 0 replies; 30+ messages in thread
From: Andrii Nakryiko @ 2025-11-04 23:27 UTC (permalink / raw)
To: bot+bpf-ci
Cc: ameryhung, bpf, netdev, alexei.starovoitov, andrii, daniel, tj,
martin.lau, kernel-team, ast, eddyz87, yonghong.song, clm,
ihor.solodrai
On Tue, Nov 4, 2025 at 9:54 AM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index fbe74686c..260e1feaa 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
>
> [ ... ]
>
> > @@ -13891,6 +13891,36 @@ int bpf_program__set_attach_target(struct bpf_program *prog,
> > return 0;
> > }
> >
> > +int bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
> > + struct bpf_prog_assoc_struct_ops_opts *opts)
> > +{
> > + int prog_fd;
> > +
> > + prog_fd = bpf_program__fd(prog);
> > + if (prog_fd < 0) {
> > + pr_warn("prog '%s': can't associate BPF program without FD (was it loaded?)\n",
> > + prog->name);
> ^^^^^^^^^^
>
> Can prog be NULL here? If bpf_program__fd() returns -EINVAL because prog
libbpf APIs generally do not tolerate passing NULL for bpf_program or
bpf_map pointers. This is fine; this shouldn't happen with valid usage
of the API (and we don't guard against it).
> was NULL, this dereferences prog->name without a NULL check. Looking at
> bpf_program__set_attach_target() just above, it checks "if (!prog)"
> before any dereferences.
>
> > + return -EINVAL;
> > + }
> > +
> > + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
> ^^^^^^^^^^
>
> Same issue - prog->type is dereferenced without verifying prog is not
> NULL.
>
> > + pr_warn("prog '%s': can't associate struct_ops program\n", prog->name);
> > + return -EINVAL;
> > + }
> > +
> > + if (map->fd < 0) {
> ^^^^^^
>
> Can map be NULL? There's no NULL check before dereferencing map->fd.
> Other map APIs like bpf_map__fd() and bpf_map__name() check for NULL
> before use.
>
> > + pr_warn("map '%s': can't associate BPF map without FD (was it created?)\n", map->name);
> > + return -EINVAL;
> > + }
> > +
> > + if (!bpf_map__is_struct_ops(map)) {
> > + pr_warn("map '%s': can't associate non-struct_ops map\n", map->name);
> > + return -EINVAL;
> > + }
> > +
> > + return bpf_prog_assoc_struct_ops(prog_fd, map->fd, opts);
> > +}
> > +
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19077679684
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops
2025-11-04 23:26 ` Andrii Nakryiko
@ 2025-11-04 23:39 ` Amery Hung
2025-11-05 0:46 ` Andrii Nakryiko
0 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2025-11-04 23:39 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team
On Tue, Nov 4, 2025 at 3:27 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Nov 4, 2025 at 9:27 AM Amery Hung <ameryhung@gmail.com> wrote:
> >
> > Add low-level wrapper and libbpf API for BPF_PROG_ASSOC_STRUCT_OPS
> > command in the bpf() syscall.
> >
> > Signed-off-by: Amery Hung <ameryhung@gmail.com>
> > ---
> > tools/lib/bpf/bpf.c | 19 +++++++++++++++++++
> > tools/lib/bpf/bpf.h | 21 +++++++++++++++++++++
> > tools/lib/bpf/libbpf.c | 30 ++++++++++++++++++++++++++++++
> > tools/lib/bpf/libbpf.h | 16 ++++++++++++++++
> > tools/lib/bpf/libbpf.map | 2 ++
> > 5 files changed, 88 insertions(+)
> >
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index b66f5fbfbbb2..21b57a629916 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
> > @@ -1397,3 +1397,22 @@ int bpf_prog_stream_read(int prog_fd, __u32 stream_id, void *buf, __u32 buf_len,
> > err = sys_bpf(BPF_PROG_STREAM_READ_BY_FD, &attr, attr_sz);
> > return libbpf_err_errno(err);
> > }
> > +
> > +int bpf_prog_assoc_struct_ops(int prog_fd, int map_fd,
> > + struct bpf_prog_assoc_struct_ops_opts *opts)
> > +{
> > + const size_t attr_sz = offsetofend(union bpf_attr, prog_assoc_struct_ops);
> > + union bpf_attr attr;
> > + int err;
> > +
> > + if (!OPTS_VALID(opts, bpf_prog_assoc_struct_ops_opts))
> > + return libbpf_err(-EINVAL);
> > +
> > + memset(&attr, 0, attr_sz);
> > + attr.prog_assoc_struct_ops.map_fd = map_fd;
> > + attr.prog_assoc_struct_ops.prog_fd = prog_fd;
> > + attr.prog_assoc_struct_ops.flags = OPTS_GET(opts, flags, 0);
> > +
> > + err = sys_bpf(BPF_PROG_ASSOC_STRUCT_OPS, &attr, attr_sz);
> > + return libbpf_err_errno(err);
> > +}
> > diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
> > index e983a3e40d61..1f9c28d27795 100644
> > --- a/tools/lib/bpf/bpf.h
> > +++ b/tools/lib/bpf/bpf.h
> > @@ -733,6 +733,27 @@ struct bpf_prog_stream_read_opts {
> > LIBBPF_API int bpf_prog_stream_read(int prog_fd, __u32 stream_id, void *buf, __u32 buf_len,
> > struct bpf_prog_stream_read_opts *opts);
> >
> > +struct bpf_prog_assoc_struct_ops_opts {
> > + size_t sz;
> > + __u32 flags;
> > + size_t :0;
> > +};
> > +#define bpf_prog_assoc_struct_ops_opts__last_field flags
> > +
> > +/**
> > + * @brief **bpf_prog_assoc_struct_ops** associates a BPF program with a
> > + * struct_ops map.
> > + *
> > + * @param prog_fd FD for the BPF program
> > + * @param map_fd FD for the struct_ops map to be associated with the BPF program
> > + * @param opts optional options, can be NULL
> > + *
> > + * @return 0 on success; negative error code, otherwise (errno is also set to
> > + * the error code)
> > + */
> > +LIBBPF_API int bpf_prog_assoc_struct_ops(int prog_fd, int map_fd,
> > + struct bpf_prog_assoc_struct_ops_opts *opts);
> > +
> > #ifdef __cplusplus
> > } /* extern "C" */
> > #endif
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index fbe74686c97d..260e1feaa665 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -13891,6 +13891,36 @@ int bpf_program__set_attach_target(struct bpf_program *prog,
> > return 0;
> > }
> >
> > +int bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
> > + struct bpf_prog_assoc_struct_ops_opts *opts)
> > +{
> > + int prog_fd;
> > +
> > + prog_fd = bpf_program__fd(prog);
> > + if (prog_fd < 0) {
> > + pr_warn("prog '%s': can't associate BPF program without FD (was it loaded?)\n",
> > + prog->name);
> > + return -EINVAL;
> > + }
> > +
> > + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
> > + pr_warn("prog '%s': can't associate struct_ops program\n", prog->name);
> > + return -EINVAL;
> > + }
> > +
> > + if (map->fd < 0) {
>
> heh, this is a bug. we use create_placeholder_fd() to create fixed FDs
> associated with maps, and later we replace them with the real
> underlying BPF map kernel objects. It's all details, but the point is
> that this won't detect map that wasn't created. Use bpf_map__fd()
> instead, it handles that correctly.
I saw quite a few libbpf APIs doing this check (e.g., bpf_map__pin(),
bpf_link__update_map(), bpf_map__attach_struct_ops()). Should we also
fix them?
>
> > + pr_warn("map '%s': can't associate BPF map without FD (was it created?)\n", map->name);
> > + return -EINVAL;
> > + }
> > +
> > + if (!bpf_map__is_struct_ops(map)) {
> > + pr_warn("map '%s': can't associate non-struct_ops map\n", map->name);
> > + return -EINVAL;
> > + }
> > +
> > + return bpf_prog_assoc_struct_ops(prog_fd, map->fd, opts);
> > +}
> > +
> > int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz)
> > {
> > int err = 0, n, len, start, end = -1;
> > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> > index 5118d0a90e24..45720b7c2aaa 100644
> > --- a/tools/lib/bpf/libbpf.h
> > +++ b/tools/lib/bpf/libbpf.h
> > @@ -1003,6 +1003,22 @@ LIBBPF_API int
> > bpf_program__set_attach_target(struct bpf_program *prog, int attach_prog_fd,
> > const char *attach_func_name);
> >
> > +struct bpf_prog_assoc_struct_ops_opts; /* defined in bpf.h */
> > +
> > +/**
> > + * @brief **bpf_program__assoc_struct_ops()** associates a BPF program with a
> > + * struct_ops map.
> > + *
> > + * @param prog BPF program
> > + * @param map struct_ops map to be associated with the BPF program
> > + * @param opts optional options, can be NULL
> > + *
> > + * @return error code; or 0 if no error occurred.
>
> we normally specify returns like so:
>
> @return 0, on success; negative error code, otherwise
>
> keep it consistent?
Okay. Will change.
BTW, the return comment was copied from bpf_program__set_attach_target().
>
> > + */
> > +LIBBPF_API int
> > +bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
> > + struct bpf_prog_assoc_struct_ops_opts *opts);
> > +
> > /**
> > * @brief **bpf_object__find_map_by_name()** returns BPF map of
> > * the given name, if it exists within the passed BPF object
> > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> > index 8ed8749907d4..84fb90a016c9 100644
> > --- a/tools/lib/bpf/libbpf.map
> > +++ b/tools/lib/bpf/libbpf.map
> > @@ -451,4 +451,6 @@ LIBBPF_1.7.0 {
> > global:
> > bpf_map__set_exclusive_program;
> > bpf_map__exclusive_program;
> > + bpf_prog_assoc_struct_ops;
> > + bpf_program__assoc_struct_ops;
> > } LIBBPF_1.6.0;
> > --
> > 2.47.3
> >
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH bpf-next v5 4/7] libbpf: Add support for associating BPF program with struct_ops
2025-11-04 23:39 ` Amery Hung
@ 2025-11-05 0:46 ` Andrii Nakryiko
0 siblings, 0 replies; 30+ messages in thread
From: Andrii Nakryiko @ 2025-11-05 0:46 UTC (permalink / raw)
To: Amery Hung
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team
On Tue, Nov 4, 2025 at 3:39 PM Amery Hung <ameryhung@gmail.com> wrote:
>
> On Tue, Nov 4, 2025 at 3:27 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Nov 4, 2025 at 9:27 AM Amery Hung <ameryhung@gmail.com> wrote:
> > >
> > > Add low-level wrapper and libbpf API for BPF_PROG_ASSOC_STRUCT_OPS
> > > command in the bpf() syscall.
> > >
> > > Signed-off-by: Amery Hung <ameryhung@gmail.com>
> > > ---
> > > tools/lib/bpf/bpf.c | 19 +++++++++++++++++++
> > > tools/lib/bpf/bpf.h | 21 +++++++++++++++++++++
> > > tools/lib/bpf/libbpf.c | 30 ++++++++++++++++++++++++++++++
> > > tools/lib/bpf/libbpf.h | 16 ++++++++++++++++
> > > tools/lib/bpf/libbpf.map | 2 ++
> > > 5 files changed, 88 insertions(+)
> > >
> > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > > index b66f5fbfbbb2..21b57a629916 100644
> > > --- a/tools/lib/bpf/bpf.c
> > > +++ b/tools/lib/bpf/bpf.c
> > > @@ -1397,3 +1397,22 @@ int bpf_prog_stream_read(int prog_fd, __u32 stream_id, void *buf, __u32 buf_len,
> > > err = sys_bpf(BPF_PROG_STREAM_READ_BY_FD, &attr, attr_sz);
> > > return libbpf_err_errno(err);
> > > }
> > > +
> > > +int bpf_prog_assoc_struct_ops(int prog_fd, int map_fd,
> > > + struct bpf_prog_assoc_struct_ops_opts *opts)
> > > +{
> > > + const size_t attr_sz = offsetofend(union bpf_attr, prog_assoc_struct_ops);
> > > + union bpf_attr attr;
> > > + int err;
> > > +
> > > + if (!OPTS_VALID(opts, bpf_prog_assoc_struct_ops_opts))
> > > + return libbpf_err(-EINVAL);
> > > +
> > > + memset(&attr, 0, attr_sz);
> > > + attr.prog_assoc_struct_ops.map_fd = map_fd;
> > > + attr.prog_assoc_struct_ops.prog_fd = prog_fd;
> > > + attr.prog_assoc_struct_ops.flags = OPTS_GET(opts, flags, 0);
> > > +
> > > + err = sys_bpf(BPF_PROG_ASSOC_STRUCT_OPS, &attr, attr_sz);
> > > + return libbpf_err_errno(err);
> > > +}
> > > diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
> > > index e983a3e40d61..1f9c28d27795 100644
> > > --- a/tools/lib/bpf/bpf.h
> > > +++ b/tools/lib/bpf/bpf.h
> > > @@ -733,6 +733,27 @@ struct bpf_prog_stream_read_opts {
> > > LIBBPF_API int bpf_prog_stream_read(int prog_fd, __u32 stream_id, void *buf, __u32 buf_len,
> > > struct bpf_prog_stream_read_opts *opts);
> > >
> > > +struct bpf_prog_assoc_struct_ops_opts {
> > > + size_t sz;
> > > + __u32 flags;
> > > + size_t :0;
> > > +};
> > > +#define bpf_prog_assoc_struct_ops_opts__last_field flags
> > > +
> > > +/**
> > > + * @brief **bpf_prog_assoc_struct_ops** associates a BPF program with a
> > > + * struct_ops map.
> > > + *
> > > + * @param prog_fd FD for the BPF program
> > > + * @param map_fd FD for the struct_ops map to be associated with the BPF program
> > > + * @param opts optional options, can be NULL
> > > + *
> > > + * @return 0 on success; negative error code, otherwise (errno is also set to
> > > + * the error code)
> > > + */
> > > +LIBBPF_API int bpf_prog_assoc_struct_ops(int prog_fd, int map_fd,
> > > + struct bpf_prog_assoc_struct_ops_opts *opts);
> > > +
> > > #ifdef __cplusplus
> > > } /* extern "C" */
> > > #endif
> > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > > index fbe74686c97d..260e1feaa665 100644
> > > --- a/tools/lib/bpf/libbpf.c
> > > +++ b/tools/lib/bpf/libbpf.c
> > > @@ -13891,6 +13891,36 @@ int bpf_program__set_attach_target(struct bpf_program *prog,
> > > return 0;
> > > }
> > >
> > > +int bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
> > > + struct bpf_prog_assoc_struct_ops_opts *opts)
> > > +{
> > > + int prog_fd;
> > > +
> > > + prog_fd = bpf_program__fd(prog);
> > > + if (prog_fd < 0) {
> > > + pr_warn("prog '%s': can't associate BPF program without FD (was it loaded?)\n",
> > > + prog->name);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS) {
> > > + pr_warn("prog '%s': can't associate struct_ops program\n", prog->name);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + if (map->fd < 0) {
> >
> > heh, this is a bug. we use create_placeholder_fd() to create fixed FDs
> > associated with maps, and later we replace them with the real
> > underlying BPF map kernel objects. It's all details, but the point is
> > that this won't detect a map that wasn't created. Use bpf_map__fd()
> > instead, it handles that correctly.
>
> I saw quite a few libbpf APIs doing this check (e.g., bpf_map__pin(),
> bpf_link__update_map(), bpf_map__attach_struct_ops()). Should we also
> fix them?
yep, probably :)
>
> >
> > > + pr_warn("map '%s': can't associate BPF map without FD (was it created?)\n", map->name);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + if (!bpf_map__is_struct_ops(map)) {
> > > + pr_warn("map '%s': can't associate non-struct_ops map\n", map->name);
> > > + return -EINVAL;
> > > + }
> > > +
> > > + return bpf_prog_assoc_struct_ops(prog_fd, map->fd, opts);
> > > +}
> > > +
> > > int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz)
> > > {
> > > int err = 0, n, len, start, end = -1;
> > > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> > > index 5118d0a90e24..45720b7c2aaa 100644
> > > --- a/tools/lib/bpf/libbpf.h
> > > +++ b/tools/lib/bpf/libbpf.h
> > > @@ -1003,6 +1003,22 @@ LIBBPF_API int
> > > bpf_program__set_attach_target(struct bpf_program *prog, int attach_prog_fd,
> > > const char *attach_func_name);
> > >
> > > +struct bpf_prog_assoc_struct_ops_opts; /* defined in bpf.h */
> > > +
> > > +/**
> > > + * @brief **bpf_program__assoc_struct_ops()** associates a BPF program with a
> > > + * struct_ops map.
> > > + *
> > > + * @param prog BPF program
> > > + * @param map struct_ops map to be associated with the BPF program
> > > + * @param opts optional options, can be NULL
> > > + *
> > > + * @return error code; or 0 if no error occurred.
> >
> > we normally specify returns like so:
> >
> > @return 0, on success; negative error code, otherwise
> >
> > keep it consistent?
>
> Okay. Will change.
>
> BTW. The return comment is copied from bpf_program__set_attach_target().
well, we should strive for consistency :)
>
> >
> > > + */
> > > +LIBBPF_API int
> > > +bpf_program__assoc_struct_ops(struct bpf_program *prog, struct bpf_map *map,
> > > + struct bpf_prog_assoc_struct_ops_opts *opts);
> > > +
> > > /**
> > > * @brief **bpf_object__find_map_by_name()** returns BPF map of
> > > * the given name, if it exists within the passed BPF object
> > > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> > > index 8ed8749907d4..84fb90a016c9 100644
> > > --- a/tools/lib/bpf/libbpf.map
> > > +++ b/tools/lib/bpf/libbpf.map
> > > @@ -451,4 +451,6 @@ LIBBPF_1.7.0 {
> > > global:
> > > bpf_map__set_exclusive_program;
> > > bpf_map__exclusive_program;
> > > + bpf_prog_assoc_struct_ops;
> > > + bpf_program__assoc_struct_ops;
> > > } LIBBPF_1.6.0;
> > > --
> > > 2.47.3
> > >
* Re: [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback
2025-11-04 23:20 ` Andrii Nakryiko
@ 2025-11-05 23:03 ` Amery Hung
2025-11-06 16:54 ` Andrii Nakryiko
0 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2025-11-05 23:03 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team
On Tue, Nov 4, 2025 at 3:21 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Nov 4, 2025 at 9:27 AM Amery Hung <ameryhung@gmail.com> wrote:
> >
> > Take a refcount of the associated struct_ops map to prevent the map from
> > being freed when an async callback scheduled from a struct_ops program
> > runs.
> >
> > Since struct_ops programs do not take refcounts on the struct_ops map,
> > it is possible for a struct_ops map to be freed when an async callback
> > scheduled from it runs. To prevent this, take a refcount on prog->aux->
> > st_ops_assoc and save it in a newly created struct bpf_async_res for
> > every async mechanism. The reference needs to be preserved in
> > bpf_async_res since prog->aux->st_ops_assoc can be poisoned anytime
> > and a reference leak could happen.
> >
> > bpf_async_res will contain an async callback's BPF program and resources
> > related to the BPF program. The resources will be acquired when
> > registering a callback and released when cancelled or when the map
> > associated with the callback is freed.
> >
> > Also rename drop_prog_refcnt to bpf_async_cb_reset to better reflect
> > what it now does.
> >
> > Signed-off-by: Amery Hung <ameryhung@gmail.com>
> > ---
> > kernel/bpf/helpers.c | 105 +++++++++++++++++++++++++++++--------------
> > 1 file changed, 72 insertions(+), 33 deletions(-)
> >
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 930e132f440f..5c081cd604d5 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -1092,9 +1092,14 @@ static void *map_key_from_value(struct bpf_map *map, void *value, u32 *arr_idx)
> > return (void *)value - round_up(map->key_size, 8);
> > }
> >
> > +struct bpf_async_res {
>
> "res" has a strong "result" meaning, which is a distraction here.
> Maybe "bpf_async_ctx"? And then we can use prep and put (reset?)
> terminology?
>
> > + struct bpf_prog *prog;
> > + struct bpf_map *st_ops_assoc;
> > +};
> > +
> > struct bpf_async_cb {
> > struct bpf_map *map;
> > - struct bpf_prog *prog;
> > + struct bpf_async_res res;
> > void __rcu *callback_fn;
> > void *value;
> > union {
> > @@ -1299,8 +1304,8 @@ static int __bpf_async_init(struct bpf_async_kern *async, struct bpf_map *map, u
> > break;
> > }
> > cb->map = map;
> > - cb->prog = NULL;
> > cb->flags = flags;
> > + memset(&cb->res, 0, sizeof(cb->res));
> > rcu_assign_pointer(cb->callback_fn, NULL);
> >
> > WRITE_ONCE(async->cb, cb);
> > @@ -1351,11 +1356,47 @@ static const struct bpf_func_proto bpf_timer_init_proto = {
> > .arg3_type = ARG_ANYTHING,
> > };
> >
> > +static void bpf_async_res_put(struct bpf_async_res *res)
> > +{
> > + bpf_prog_put(res->prog);
> > +
> > + if (res->st_ops_assoc)
> > + bpf_map_put(res->st_ops_assoc);
> > +}
> > +
> > +static int bpf_async_res_get(struct bpf_async_res *res, struct bpf_prog *prog)
> > +{
> > + struct bpf_map *st_ops_assoc = NULL;
> > + int err;
> > +
> > + prog = bpf_prog_inc_not_zero(prog);
> > + if (IS_ERR(prog))
> > + return PTR_ERR(prog);
> > +
> > + st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
>
> I think in about a month we'll forget why we inc_not_zero only for
> STRUCT_OPS programs, so I'd add comment here that non-struct_ops
> programs have explicit refcount on st_ops_assoc, so as long as we have
> that inc_not_zero(prog) above, we don't need to also bump map refcount
Will document this in the comment.
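For reference, the comment could read something like (sketch):

	/*
	 * Non-struct_ops programs hold an explicit refcount on the
	 * associated struct_ops map, taken at association time and tied
	 * to the prog's lifetime, so the bpf_prog_inc_not_zero() above
	 * already keeps st_ops_assoc alive. Only struct_ops programs
	 * need to bump the map refcount here.
	 */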
>
> > + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
> > + st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
> > + st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
> > + if (IS_ERR(st_ops_assoc)) {
> > + err = PTR_ERR(st_ops_assoc);
> > + goto put_prog;
>
> nit: might be a bit premature to structure code with goto put_prog. As of now:
>
>
> bpf_prog_put(prog);
> return PTR_ERR(st_ops_assoc);
>
> is short and sweet and good enough?
Yes. We can change it to this style once there are more fields in bpf_async_ctx.
>
> > + }
> > + }
> > +
> > + res->prog = prog;
> > + res->st_ops_assoc = st_ops_assoc;
>
> question: do we want to assign BPF_PTR_POISON to res->st_ops_assoc or
> is it better to keep it as NULL in such a case? I'm not sure, just
> bringing up the possibility
As this doesn't make a difference to what
bpf_prog_get_assoc_struct_ops() returns, I'd keep it as NULL to
simplify things.
>
> > + return 0;
> > +put_prog:
> > + bpf_prog_put(prog);
> > + return err;
> > +}
> > +
> > static int __bpf_async_set_callback(struct bpf_async_kern *async, void *callback_fn,
> > struct bpf_prog_aux *aux, unsigned int flags,
> > enum bpf_async_type type)
> > {
> > struct bpf_prog *prev, *prog = aux->prog;
> > + struct bpf_async_res res;
> > struct bpf_async_cb *cb;
> > int ret = 0;
> >
> > @@ -1376,20 +1417,18 @@ static int __bpf_async_set_callback(struct bpf_async_kern *async, void *callback
> > ret = -EPERM;
> > goto out;
> > }
> > - prev = cb->prog;
> > + prev = cb->res.prog;
> > if (prev != prog) {
> > - /* Bump prog refcnt once. Every bpf_timer_set_callback()
> > + /* Get prog and related resources once. Every bpf_timer_set_callback()
> > * can pick different callback_fn-s within the same prog.
> > */
> > - prog = bpf_prog_inc_not_zero(prog);
> > - if (IS_ERR(prog)) {
> > - ret = PTR_ERR(prog);
> > + ret = bpf_async_res_get(&res, prog);
> > + if (ret)
> > goto out;
> > - }
> > if (prev)
> > - /* Drop prev prog refcnt when swapping with new prog */
> > - bpf_prog_put(prev);
> > - cb->prog = prog;
> > + /* Put prev prog and related resources when swapping with new prog */
> > + bpf_async_res_put(&cb->res);
> > + cb->res = res;
> > }
>
> we discussed this offline, but I'll summarize here:
>
> I think we need to abstract this away as bpf_async_ctx_update(), which
> will accept a new prog pointer, and internally will deal with
> necessary ref count inc/put as necessary, depending on whether prog
> changed or not. With st_ops_assoc, prog pointer may not change, but
> the underlying st_ops_assoc might (it can go from NULL to valid
> assoc). And the implementation will be careful to leave previous async
> ctx as it was if anything goes wrong (just like existing code
> behaves).
How about three APIs like below? First, bump the refcounts
unconditionally with prepare(). Then xchg the local bpf_async_ctx
with the one embedded in the callback with update(), and drop the
refcounts in cleanup().
This will have some overhead as there are unnecessary atomic ops, but
it can make update() much more straightforward.
static void bpf_async_ctx_cleanup(struct bpf_async_ctx *ctx)
{
	bpf_prog_put(ctx->prog);

	if (ctx->st_ops_assoc)
		bpf_map_put(ctx->st_ops_assoc);

	memset(ctx, 0, sizeof(*ctx));
}

static int bpf_async_ctx_prepare(struct bpf_async_ctx *ctx, struct bpf_prog *prog)
{
	struct bpf_map *st_ops_assoc = NULL;

	prog = bpf_prog_inc_not_zero(prog);
	if (IS_ERR(prog))
		return PTR_ERR(prog);

	st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
	if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
	    st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
		st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
		if (IS_ERR(st_ops_assoc)) {
			bpf_prog_put(prog);
			return PTR_ERR(st_ops_assoc);
		}
	}

	ctx->prog = prog;
	ctx->st_ops_assoc = st_ops_assoc;
	return 0;
}

static void bpf_async_ctx_update(struct bpf_async_ctx *ctx, struct bpf_async_ctx *new)
{
	struct bpf_async_ctx old;

	old = *ctx;
	*ctx = *new;

	bpf_async_ctx_cleanup(&old);
}
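To illustrate, a hypothetical call site in __bpf_async_set_callback()
under this proposal (assuming cb->res gets renamed to cb->ctx):

	struct bpf_async_ctx new_ctx;
	int ret;

	/* take fresh prog/map references, even when prog is unchanged */
	ret = bpf_async_ctx_prepare(&new_ctx, prog);
	if (ret)
		goto out;

	/* publish new_ctx; drops the references held by the previous ctx */
	bpf_async_ctx_update(&cb->ctx, &new_ctx);
	rcu_assign_pointer(cb->callback_fn, callback_fn);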
>
> > rcu_assign_pointer(cb->callback_fn, callback_fn);
> > out:
> > @@ -1423,7 +1462,7 @@ BPF_CALL_3(bpf_timer_start, struct bpf_async_kern *, timer, u64, nsecs, u64, fla
> > return -EINVAL;
> > __bpf_spin_lock_irqsave(&timer->lock);
> > t = timer->timer;
> > - if (!t || !t->cb.prog) {
> > + if (!t || !t->cb.res.prog) {
> > ret = -EINVAL;
> > goto out;
> > }
> > @@ -1451,14 +1490,14 @@ static const struct bpf_func_proto bpf_timer_start_proto = {
> > .arg3_type = ARG_ANYTHING,
> > };
> >
> > -static void drop_prog_refcnt(struct bpf_async_cb *async)
> > +static void bpf_async_cb_reset(struct bpf_async_cb *cb)
> > {
> > - struct bpf_prog *prog = async->prog;
> > + struct bpf_prog *prog = cb->res.prog;
> >
> > if (prog) {
> > - bpf_prog_put(prog);
> > - async->prog = NULL;
> > - rcu_assign_pointer(async->callback_fn, NULL);
> > + bpf_async_res_put(&cb->res);
> > + memset(&cb->res, 0, sizeof(cb->res));
>
> shouldn't bpf_async_res_put() leave cb->res in zeroed out state? why
> extra memset(0)?
Will move memset(0) into bpf_async_res_put().
>
> > + rcu_assign_pointer(cb->callback_fn, NULL);
> > }
> > }
> >
>
> [...]
* Re: [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-04 17:26 ` [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops Amery Hung
` (2 preceding siblings ...)
2025-11-04 22:47 ` Andrii Nakryiko
@ 2025-11-06 0:57 ` Martin KaFai Lau
2025-11-06 1:01 ` Amery Hung
3 siblings, 1 reply; 30+ messages in thread
From: Martin KaFai Lau @ 2025-11-06 0:57 UTC (permalink / raw)
To: Amery Hung
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team, bpf
On 11/4/25 9:26 AM, Amery Hung wrote:
> +void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux)
> +{
> + struct bpf_map *st_ops_assoc = READ_ONCE(aux->st_ops_assoc);
> + struct bpf_struct_ops_map *st_map;
> +
> + if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
> + return NULL;
> +
> + st_map = (struct bpf_struct_ops_map *)st_ops_assoc;
> +
> + if (smp_load_acquire(&st_map->kvalue.common.state) == BPF_STRUCT_OPS_STATE_INIT) {
> + bpf_map_put(st_ops_assoc);
hmm... why is bpf_map_put needed?
Should the state be checked only once during assoc time instead of
checking it every time bpf_prog_get_assoc_struct_ops is called?
> + return NULL;
> + }
> +
> + return &st_map->kvalue.data;
> +}
* Re: [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-06 0:57 ` Martin KaFai Lau
@ 2025-11-06 1:01 ` Amery Hung
2025-11-06 2:17 ` Martin KaFai Lau
0 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2025-11-06 1:01 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team, bpf
On Wed, Nov 5, 2025 at 4:57 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
>
>
> On 11/4/25 9:26 AM, Amery Hung wrote:
> > +void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux)
> > +{
> > + struct bpf_map *st_ops_assoc = READ_ONCE(aux->st_ops_assoc);
> > + struct bpf_struct_ops_map *st_map;
> > +
> > + if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
> > + return NULL;
> > +
> > + st_map = (struct bpf_struct_ops_map *)st_ops_assoc;
> > +
> > + if (smp_load_acquire(&st_map->kvalue.common.state) == BPF_STRUCT_OPS_STATE_INIT) {
> > + bpf_map_put(st_ops_assoc);
>
> hmm... why is bpf_map_put needed?
>
AI also caught this. This is not needed. I overlooked it when changing
from v4, where bpf_prog_get_assoc_struct_ops() used to first bump the
refcount.
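For reference, with the leftover put dropped the helper reduces to
(sketch):

void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux)
{
	struct bpf_map *st_ops_assoc = READ_ONCE(aux->st_ops_assoc);
	struct bpf_struct_ops_map *st_map;

	if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
		return NULL;

	st_map = (struct bpf_struct_ops_map *)st_ops_assoc;

	/* the map exists but is not fully initialized yet */
	if (smp_load_acquire(&st_map->kvalue.common.state) == BPF_STRUCT_OPS_STATE_INIT)
		return NULL;

	return &st_map->kvalue.data;
}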
> Should the state be checked only once during assoc time instead of
> checking it every time bpf_prog_get_assoc_struct_ops is called?
>
> > + return NULL;
> > + }
> > +
> > + return &st_map->kvalue.data;
> > +}
>
* Re: [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback
2025-11-04 17:26 ` [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback Amery Hung
2025-11-04 18:03 ` bot+bpf-ci
2025-11-04 23:20 ` Andrii Nakryiko
@ 2025-11-06 2:13 ` Martin KaFai Lau
2025-11-06 17:57 ` Amery Hung
2 siblings, 1 reply; 30+ messages in thread
From: Martin KaFai Lau @ 2025-11-06 2:13 UTC (permalink / raw)
To: Amery Hung
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team, bpf
On 11/4/25 9:26 AM, Amery Hung wrote:
> Take a refcount of the associated struct_ops map to prevent the map from
> being freed when an async callback scheduled from a struct_ops program
> runs.
>
> Since struct_ops programs do not take refcounts on the struct_ops map,
> it is possible for a struct_ops map to be freed when an async callback
> scheduled from it runs. To prevent this, take a refcount on prog->aux->
> st_ops_assoc and save it in a newly created struct bpf_async_res for
> every async mechanism. The reference needs to be preserved in
> bpf_async_res since prog->aux->st_ops_assoc can be poisoned anytime
> and a reference leak could happen.
>
> bpf_async_res will contain an async callback's BPF program and resources
> related to the BPF program. The resources will be acquired when
> registering a callback and released when cancelled or when the map
> associated with the callback is freed.
>
> Also rename drop_prog_refcnt to bpf_async_cb_reset to better reflect
> what it now does.
>
[ ... ]
> +static int bpf_async_res_get(struct bpf_async_res *res, struct bpf_prog *prog)
> +{
> + struct bpf_map *st_ops_assoc = NULL;
> + int err;
> +
> + prog = bpf_prog_inc_not_zero(prog);
> + if (IS_ERR(prog))
> + return PTR_ERR(prog);
> +
> + st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
> + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
> + st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
> + st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
READ_ONCE combined with inc_not_zero is an unusual combo. Should it be
rcu_dereference, with prog->aux->st_ops_assoc tagged "__rcu"?
If prog->aux->st_ops_assoc is protected by rcu, can the user (kfunc?)
use prog->aux->st_ops_assoc depending on the rcu grace period alone,
without bpf_map_inc_not_zero? Does it matter if prog->aux->st_ops_assoc
is changed? But this patch does not seem to consider the changing case either.
* Re: [PATCH bpf-next v5 2/7] bpf: Support associating BPF program with struct_ops
2025-11-06 1:01 ` Amery Hung
@ 2025-11-06 2:17 ` Martin KaFai Lau
0 siblings, 0 replies; 30+ messages in thread
From: Martin KaFai Lau @ 2025-11-06 2:17 UTC (permalink / raw)
To: Amery Hung
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team, bpf
On 11/5/25 5:01 PM, Amery Hung wrote:
> On Wed, Nov 5, 2025 at 4:57 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>>
>>
>> On 11/4/25 9:26 AM, Amery Hung wrote:
>>> +void *bpf_prog_get_assoc_struct_ops(const struct bpf_prog_aux *aux)
>>> +{
>>> + struct bpf_map *st_ops_assoc = READ_ONCE(aux->st_ops_assoc);
>>> + struct bpf_struct_ops_map *st_map;
>>> +
>>> + if (!st_ops_assoc || st_ops_assoc == BPF_PTR_POISON)
>>> + return NULL;
>>> +
>>> + st_map = (struct bpf_struct_ops_map *)st_ops_assoc;
>>> +
>>> + if (smp_load_acquire(&st_map->kvalue.common.state) == BPF_STRUCT_OPS_STATE_INIT) {
>>> + bpf_map_put(st_ops_assoc);
>>
>> hmm... why is bpf_map_put needed?
>>
>
> AI also caught this. This is not needed. I overlooked it when changing
> from v4, where bpf_prog_get_assoc_struct_ops() used to first bump the
> refcount.
ah. My bad for the noise. Fixed my mail client. Somehow only that one
email got filtered.
>
>> Should the state be checked only once during assoc time instead of
>> checking it every time bpf_prog_get_assoc_struct_ops is called?
>>
>>> + return NULL;
>>> + }
>>> +
>>> + return &st_map->kvalue.data;
>>> +}
>>
* Re: [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback
2025-11-05 23:03 ` Amery Hung
@ 2025-11-06 16:54 ` Andrii Nakryiko
0 siblings, 0 replies; 30+ messages in thread
From: Andrii Nakryiko @ 2025-11-06 16:54 UTC (permalink / raw)
To: Amery Hung
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team
On Wed, Nov 5, 2025 at 3:03 PM Amery Hung <ameryhung@gmail.com> wrote:
>
> On Tue, Nov 4, 2025 at 3:21 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Nov 4, 2025 at 9:27 AM Amery Hung <ameryhung@gmail.com> wrote:
> > >
> > > Take a refcount of the associated struct_ops map to prevent the map from
> > > being freed when an async callback scheduled from a struct_ops program
> > > runs.
> > >
> > > Since struct_ops programs do not take refcounts on the struct_ops map,
> > > it is possible for a struct_ops map to be freed when an async callback
> > > scheduled from it runs. To prevent this, take a refcount on prog->aux->
> > > st_ops_assoc and save it in a newly created struct bpf_async_res for
> > > every async mechanism. The reference needs to be preserved in
> > > bpf_async_res since prog->aux->st_ops_assoc can be poisoned anytime
> > > and a reference leak could happen.
> > >
> > > bpf_async_res will contain an async callback's BPF program and resources
> > > related to the BPF program. The resources will be acquired when
> > > registering a callback and released when cancelled or when the map
> > > associated with the callback is freed.
> > >
> > > Also rename drop_prog_refcnt to bpf_async_cb_reset to better reflect
> > > what it now does.
> > >
> > > Signed-off-by: Amery Hung <ameryhung@gmail.com>
> > > ---
> > > kernel/bpf/helpers.c | 105 +++++++++++++++++++++++++++++--------------
> > > 1 file changed, 72 insertions(+), 33 deletions(-)
> > >
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index 930e132f440f..5c081cd604d5 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -1092,9 +1092,14 @@ static void *map_key_from_value(struct bpf_map *map, void *value, u32 *arr_idx)
> > > return (void *)value - round_up(map->key_size, 8);
> > > }
> > >
> > > +struct bpf_async_res {
> >
> > "res" has a strong "result" meaning, which is a distraction here.
> > Maybe "bpf_async_ctx"? And then we can use prep and put (reset?)
> > terminology?
> >
> > > + struct bpf_prog *prog;
> > > + struct bpf_map *st_ops_assoc;
> > > +};
> > > +
> > > struct bpf_async_cb {
> > > struct bpf_map *map;
> > > - struct bpf_prog *prog;
> > > + struct bpf_async_res res;
> > > void __rcu *callback_fn;
> > > void *value;
> > > union {
> > > @@ -1299,8 +1304,8 @@ static int __bpf_async_init(struct bpf_async_kern *async, struct bpf_map *map, u
> > > break;
> > > }
> > > cb->map = map;
> > > - cb->prog = NULL;
> > > cb->flags = flags;
> > > + memset(&cb->res, 0, sizeof(cb->res));
> > > rcu_assign_pointer(cb->callback_fn, NULL);
> > >
> > > WRITE_ONCE(async->cb, cb);
> > > @@ -1351,11 +1356,47 @@ static const struct bpf_func_proto bpf_timer_init_proto = {
> > > .arg3_type = ARG_ANYTHING,
> > > };
> > >
> > > +static void bpf_async_res_put(struct bpf_async_res *res)
> > > +{
> > > + bpf_prog_put(res->prog);
> > > +
> > > + if (res->st_ops_assoc)
> > > + bpf_map_put(res->st_ops_assoc);
> > > +}
> > > +
> > > +static int bpf_async_res_get(struct bpf_async_res *res, struct bpf_prog *prog)
> > > +{
> > > + struct bpf_map *st_ops_assoc = NULL;
> > > + int err;
> > > +
> > > + prog = bpf_prog_inc_not_zero(prog);
> > > + if (IS_ERR(prog))
> > > + return PTR_ERR(prog);
> > > +
> > > + st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
> >
> > I think in about a month we'll forget why we inc_not_zero only for
> > STRUCT_OPS programs, so I'd add comment here that non-struct_ops
> > programs have explicit refcount on st_ops_assoc, so as long as we have
> > that inc_not_zero(prog) above, we don't need to also bump map refcount
>
> Will document this in the comment.
>
> >
> > > + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
> > > + st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
> > > + st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
> > > + if (IS_ERR(st_ops_assoc)) {
> > > + err = PTR_ERR(st_ops_assoc);
> > > + goto put_prog;
> >
> > nit: might be a bit premature to structure code with goto put_prog. As of now:
> >
> >
> > bpf_prog_put(prog);
> > return PTR_ERR(st_ops_assoc);
> >
> > is short and sweet and good enough?
>
> Yes. We can change it to this style once there are more fields in bpf_async_ctx.
>
> >
> > > + }
> > > + }
> > > +
> > > + res->prog = prog;
> > > + res->st_ops_assoc = st_ops_assoc;
> >
> > question: do we want to assign BPF_PTR_POISON to res->st_ops_assoc or
> > is it better to keep it as NULL in such a case? I'm not sure, just
> > bringing up the possibility
>
> As this doesn't make a difference to what
> bpf_prog_get_assoc_struct_ops() returns, I'd keep it as NULL to
> simplify things.
>
> >
> > > + return 0;
> > > +put_prog:
> > > + bpf_prog_put(prog);
> > > + return err;
> > > +}
> > > +
> > > static int __bpf_async_set_callback(struct bpf_async_kern *async, void *callback_fn,
> > > struct bpf_prog_aux *aux, unsigned int flags,
> > > enum bpf_async_type type)
> > > {
> > > struct bpf_prog *prev, *prog = aux->prog;
> > > + struct bpf_async_res res;
> > > struct bpf_async_cb *cb;
> > > int ret = 0;
> > >
> > > @@ -1376,20 +1417,18 @@ static int __bpf_async_set_callback(struct bpf_async_kern *async, void *callback
> > > ret = -EPERM;
> > > goto out;
> > > }
> > > - prev = cb->prog;
> > > + prev = cb->res.prog;
> > > if (prev != prog) {
> > > - /* Bump prog refcnt once. Every bpf_timer_set_callback()
> > > + /* Get prog and related resources once. Every bpf_timer_set_callback()
> > > * can pick different callback_fn-s within the same prog.
> > > */
> > > - prog = bpf_prog_inc_not_zero(prog);
> > > - if (IS_ERR(prog)) {
> > > - ret = PTR_ERR(prog);
> > > + ret = bpf_async_res_get(&res, prog);
> > > + if (ret)
> > > goto out;
> > > - }
> > > if (prev)
> > > - /* Drop prev prog refcnt when swapping with new prog */
> > > - bpf_prog_put(prev);
> > > - cb->prog = prog;
> > > + /* Put prev prog and related resources when swapping with new prog */
> > > + bpf_async_res_put(&cb->res);
> > > + cb->res = res;
> > > }
> >
> > we discussed this offline, but I'll summarize here:
> >
> > I think we need to abstract this away as bpf_async_ctx_update(), which
> > will accept a new prog pointer, and internally will deal with
> > necessary ref count inc/put as necessary, depending on whether prog
> > changed or not. With st_ops_assoc, prog pointer may not change, but
> > the underlying st_ops_assoc might (it can go from NULL to valid
> > assoc). And the implementation will be careful to leave previous async
> > ctx as it was if anything goes wrong (just like existing code
> > behaves).
>
> How about three APIs like below? First, bump the refcounts
> unconditionally with prepare(). Then xchg the local bpf_async_ctx
> with the one embedded in the callback with update(), and drop the
> refcounts in cleanup().
>
Do we need separate prepare and update steps? Wouldn't they always go
together anyways?
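One possible shape for folding them together, reusing the helpers from
the proposal above (sketch; the name bpf_async_ctx_swap is made up, and
on failure the previous ctx is left untouched, matching the existing
behavior):

static int bpf_async_ctx_swap(struct bpf_async_ctx *ctx, struct bpf_prog *prog)
{
	struct bpf_async_ctx new_ctx, old_ctx;
	int err;

	err = bpf_async_ctx_prepare(&new_ctx, prog);
	if (err)
		return err;

	old_ctx = *ctx;
	*ctx = new_ctx;
	bpf_async_ctx_cleanup(&old_ctx);
	return 0;
}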
> This will have some overhead as there are unnecessary atomic ops, but
> it can make update() much more straightforward.
>
> static void bpf_async_ctx_cleanup(struct bpf_async_ctx *ctx)
> {
> 	bpf_prog_put(ctx->prog);
>
> 	if (ctx->st_ops_assoc)
> 		bpf_map_put(ctx->st_ops_assoc);
>
> 	memset(ctx, 0, sizeof(*ctx));
> }
>
> static int bpf_async_ctx_prepare(struct bpf_async_ctx *ctx, struct bpf_prog *prog)
> {
> 	struct bpf_map *st_ops_assoc = NULL;
>
> 	prog = bpf_prog_inc_not_zero(prog);
> 	if (IS_ERR(prog))
> 		return PTR_ERR(prog);
>
> 	st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
> 	if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
> 	    st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
> 		st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
> 		if (IS_ERR(st_ops_assoc)) {
> 			bpf_prog_put(prog);
> 			return PTR_ERR(st_ops_assoc);
> 		}
> 	}
>
> 	ctx->prog = prog;
> 	ctx->st_ops_assoc = st_ops_assoc;
> 	return 0;
> }
>
> static void bpf_async_ctx_update(struct bpf_async_ctx *ctx, struct bpf_async_ctx *new)
> {
> 	struct bpf_async_ctx old;
>
> 	old = *ctx;
> 	*ctx = *new;
>
> 	bpf_async_ctx_cleanup(&old);
> }
>
>
> >
> > > rcu_assign_pointer(cb->callback_fn, callback_fn);
> > > out:
> > > @@ -1423,7 +1462,7 @@ BPF_CALL_3(bpf_timer_start, struct bpf_async_kern *, timer, u64, nsecs, u64, fla
> > > return -EINVAL;
> > > __bpf_spin_lock_irqsave(&timer->lock);
> > > t = timer->timer;
> > > - if (!t || !t->cb.prog) {
> > > + if (!t || !t->cb.res.prog) {
> > > ret = -EINVAL;
> > > goto out;
> > > }
> > > @@ -1451,14 +1490,14 @@ static const struct bpf_func_proto bpf_timer_start_proto = {
> > > .arg3_type = ARG_ANYTHING,
> > > };
> > >
> > > -static void drop_prog_refcnt(struct bpf_async_cb *async)
> > > +static void bpf_async_cb_reset(struct bpf_async_cb *cb)
> > > {
> > > - struct bpf_prog *prog = async->prog;
> > > + struct bpf_prog *prog = cb->res.prog;
> > >
> > > if (prog) {
> > > - bpf_prog_put(prog);
> > > - async->prog = NULL;
> > > - rcu_assign_pointer(async->callback_fn, NULL);
> > > + bpf_async_res_put(&cb->res);
> > > + memset(&cb->res, 0, sizeof(cb->res));
> >
> > shouldn't bpf_async_res_put() leave cb->res in zeroed out state? why
> > extra memset(0)?
>
> Will move memset(0) into bpf_async_res_put().
I'd rather have explicit NULL assignments for each field that needs to
be cleared. Having "generic" memset(0) will make it easier to forget
to clear stuff (especially if clearing involves some extra operation
like a refcount decrement or whatnot), IMO. I'd do ctx->prog = NULL;
ctx->st_ops_assoc = NULL.
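i.e., roughly (sketch of the explicit-clear style):

static void bpf_async_ctx_cleanup(struct bpf_async_ctx *ctx)
{
	bpf_prog_put(ctx->prog);
	ctx->prog = NULL;

	if (ctx->st_ops_assoc) {
		bpf_map_put(ctx->st_ops_assoc);
		ctx->st_ops_assoc = NULL;
	}
}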
>
> >
> > > + rcu_assign_pointer(cb->callback_fn, NULL);
> > > }
> > > }
> > >
> >
> > [...]
* Re: [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback
2025-11-06 2:13 ` Martin KaFai Lau
@ 2025-11-06 17:57 ` Amery Hung
2025-11-06 19:37 ` Martin KaFai Lau
0 siblings, 1 reply; 30+ messages in thread
From: Amery Hung @ 2025-11-06 17:57 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team, bpf
On Wed, Nov 5, 2025 at 6:13 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 11/4/25 9:26 AM, Amery Hung wrote:
> > Take a refcount of the associated struct_ops map to prevent the map from
> > being freed when an async callback scheduled from a struct_ops program
> > runs.
> >
> > Since struct_ops programs do not take refcounts on the struct_ops map,
> > it is possible for a struct_ops map to be freed when an async callback
> > scheduled from it runs. To prevent this, take a refcount on prog->aux->
> > st_ops_assoc and save it in a newly created struct bpf_async_res for
> > every async mechanism. The reference needs to be preserved in
> > bpf_async_res since prog->aux->st_ops_assoc can be poisoned anytime
> > and a reference leak could happen.
> >
> > bpf_async_res will contain an async callback's BPF program and resources
> > related to the BPF program. The resources will be acquired when
> > registering a callback and released when cancelled or when the map
> > associated with the callback is freed.
> >
> > Also rename drop_prog_refcnt to bpf_async_cb_reset to better reflect
> > what it now does.
> >
>
> [ ... ]
>
> > +static int bpf_async_res_get(struct bpf_async_res *res, struct bpf_prog *prog)
> > +{
> > + struct bpf_map *st_ops_assoc = NULL;
> > + int err;
> > +
> > + prog = bpf_prog_inc_not_zero(prog);
> > + if (IS_ERR(prog))
> > + return PTR_ERR(prog);
> > +
> > + st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
> > + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
> > + st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
> > + st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
>
> READ_ONCE combined with inc_not_zero is an unusual combo. Should it be
> rcu_dereference, with prog->aux->st_ops_assoc tagged "__rcu"?
>
Understood. The underlying struct_ops map is protected by RCU, but
prog->aux->st_ops_assoc itself is not RCU-protected and can change at
any time.
> If prog->aux->st_ops_assoc is protected by rcu, can the user (kfunc?)
> use prog->aux->st_ops_assoc depending on the rcu grace period alone,
> without bpf_map_inc_not_zero? Does it matter if prog->aux->st_ops_assoc
> is changed? But this patch does not seem to consider the changing case either.
>
I think bumping the refcount makes bpf_prog_get_assoc_struct_ops() easier
to use: users do not need to worry about the lifetime of the returned
kdata, RCU, or the execution context.
The main problem is that async callbacks do not run in an RCU
read-side critical section, so it would require callers of
bpf_prog_get_assoc_struct_ops() to do rcu_read_lock{_trace}().
The case of st_ops_assoc changing is indeed missed here: st_ops_assoc can
change from NULL to a valid kdata. Will fix this in the next respin.
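For example, without the refcount an async callback would need something
along these lines (sketch; assumes the struct_ops map is freed under an
RCU-tasks-trace grace period):

	void *st_ops;

	rcu_read_lock_trace();
	st_ops = bpf_prog_get_assoc_struct_ops(prog->aux);
	if (st_ops) {
		/* use st_ops here; it must not be used past the unlock */
	}
	rcu_read_unlock_trace();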
* Re: [PATCH bpf-next v5 3/7] bpf: Pin associated struct_ops when registering async callback
2025-11-06 17:57 ` Amery Hung
@ 2025-11-06 19:37 ` Martin KaFai Lau
0 siblings, 0 replies; 30+ messages in thread
From: Martin KaFai Lau @ 2025-11-06 19:37 UTC (permalink / raw)
To: Amery Hung
Cc: netdev, alexei.starovoitov, andrii, daniel, tj, martin.lau,
kernel-team, bpf
On 11/6/25 9:57 AM, Amery Hung wrote:
> On Wed, Nov 5, 2025 at 6:13 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 11/4/25 9:26 AM, Amery Hung wrote:
>>> Take a refcount of the associated struct_ops map to prevent the map from
>>> being freed when an async callback scheduled from a struct_ops program
>>> runs.
>>>
>>> Since struct_ops programs do not take refcounts on the struct_ops map,
>>> it is possible for a struct_ops map to be freed when an async callback
>>> scheduled from it runs. To prevent this, take a refcount on prog->aux->
>>> st_ops_assoc and save it in a newly created struct bpf_async_res for
>>> every async mechanism. The reference needs to be preserved in
>>> bpf_async_res since prog->aux->st_ops_assoc can be poisoned anytime
>>> and a reference leak could happen.
>>>
>>> bpf_async_res will contain an async callback's BPF program and resources
>>> related to the BPF program. The resources will be acquired when
>>> registering a callback and released when cancelled or when the map
>>> associated with the callback is freed.
>>>
>>> Also rename drop_prog_refcnt to bpf_async_cb_reset to better reflect
>>> what it now does.
>>>
>>
>> [ ... ]
>>
>>> +static int bpf_async_res_get(struct bpf_async_res *res, struct bpf_prog *prog)
>>> +{
>>> + struct bpf_map *st_ops_assoc = NULL;
>>> + int err;
>>> +
>>> + prog = bpf_prog_inc_not_zero(prog);
>>> + if (IS_ERR(prog))
>>> + return PTR_ERR(prog);
>>> +
>>> + st_ops_assoc = READ_ONCE(prog->aux->st_ops_assoc);
>>> + if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
>>> + st_ops_assoc && st_ops_assoc != BPF_PTR_POISON) {
>>> + st_ops_assoc = bpf_map_inc_not_zero(st_ops_assoc);
>>
>> The READ_ONCE and inc_not_zero is an unusual combo. Should it be
>> rcu_dereference and prog->aux->st_ops_assoc should be "__rcu" tagged?
>>
>
> Understood. The underlying struct_ops map is protected by RCU, but
> prog->aux->st_ops_assoc itself is not RCU-protected and can change at
> any time.
hmm... at least for BPF_PROG_TYPE_STRUCT_OPS, the struct_ops map refcnt
is not taken in patch 2; prog->aux->st_ops_assoc can only be used safely
because of the rcu gp.
Another thing I am likely missing: the refcounted st_ops_assoc is saved
in res->st_ops_assoc, but if I read it correctly, the kfunc uses
bpf_prog_get_assoc_struct_ops(), which reads from
[prog->]aux->st_ops_assoc instead of the saved res->st_ops_assoc. Can
aux->st_ops_assoc end up pointing to a struct_ops map different from
res->st_ops_assoc?