BPF List
* [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support
@ 2023-12-20 23:31 Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 1/8] libbpf: make uniform use of btf__fd() accessor inside libbpf Andrii Nakryiko
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2023-12-20 23:31 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

Support __arg_ctx global function argument tag semantics even on older kernels
that don't natively support it through btf_decl_tag("arg:ctx").

Patches #2-#6 are preparatory work that allows postponing BTF loading into
the kernel until after all the BPF program relocations (including appending
global funcs to main programs) are done. Patch #4 is perhaps the most
important one, establishing pre-created stable placeholder FDs, so that
relocations can embed valid map FDs into ldimm64 instructions.

Once BTF loading happens after relocation, what's left is to adjust BTF
information so that each main program's copy of each used global subprog
points to its own adjusted FUNC -> FUNC_PROTO type chain (if it uses
__arg_ctx), in such a way as to satisfy the BPF verifier's type expectations
regarding the PTR_TO_CTX argument definition. See patch #7 for details.

Patch #8 adds a few more __arg_ctx use cases (edge cases like multiple
arguments having __arg_ctx, etc.) to test_global_func_ctx_args.c, to make it
simple to validate that this logic indeed works on old kernels. It does.

Andrii Nakryiko (8):
  libbpf: make uniform use of btf__fd() accessor inside libbpf
  libbpf: use explicit map reuse flag to skip map creation steps
  libbpf: don't rely on map->fd as an indicator of map being created
  libbpf: use stable map placeholder FDs
  libbpf: move exception callbacks assignment logic into relocation step
  libbpf: move BTF loading step after relocation step
  libbpf: implement __arg_ctx fallback logic
  selftests/bpf: add arg:ctx cases to test_global_funcs tests

 tools/lib/bpf/libbpf.c                        | 548 +++++++++++++-----
 tools/lib/bpf/libbpf_internal.h               |  24 +
 .../bpf/progs/test_global_func_ctx_args.c     |  49 ++
 3 files changed, 474 insertions(+), 147 deletions(-)

-- 
2.34.1


* [PATCH bpf-next 1/8] libbpf: make uniform use of btf__fd() accessor inside libbpf
  2023-12-20 23:31 [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support Andrii Nakryiko
@ 2023-12-20 23:31 ` Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 2/8] libbpf: use explicit map reuse flag to skip map creation steps Andrii Nakryiko
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2023-12-20 23:31 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

It makes future grepping and code analysis a bit easier.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index ac54ebc0629f..e4fe79d5693f 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -7048,7 +7048,7 @@ static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog
 	load_attr.prog_ifindex = prog->prog_ifindex;
 
 	/* specify func_info/line_info only if kernel supports them */
-	btf_fd = bpf_object__btf_fd(obj);
+	btf_fd = btf__fd(obj->btf);
 	if (btf_fd >= 0 && kernel_supports(obj, FEAT_BTF_FUNC)) {
 		load_attr.prog_btf_fd = btf_fd;
 		load_attr.func_info = prog->func_info;
-- 
2.34.1


* [PATCH bpf-next 2/8] libbpf: use explicit map reuse flag to skip map creation steps
  2023-12-20 23:31 [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 1/8] libbpf: make uniform use of btf__fd() accessor inside libbpf Andrii Nakryiko
@ 2023-12-20 23:31 ` Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 3/8] libbpf: don't rely on map->fd as an indicator of map being created Andrii Nakryiko
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2023-12-20 23:31 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

Instead of inferring whether the map already points to a previously
created/pinned BPF map (which the user can set up through the
bpf_map__reuse_fd() or bpf_object__reuse_map() APIs), use the explicit
map->reused flag that is set in such cases.
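
For reference, the reuse path this flag tracks looks roughly like the
following from the user's side (a hedged sketch; the pin path and map name
are made up, error handling omitted):

  struct bpf_object *obj = bpf_object__open_file("prog.bpf.o", NULL);
  struct bpf_map *map = bpf_object__find_map_by_name(obj, "my_map");
  int pinned_fd = bpf_obj_get("/sys/fs/bpf/my_map");

  bpf_map__reuse_fd(map, pinned_fd);  /* map->reused is set internally */
  bpf_object__load(obj);              /* creation of "my_map" is skipped */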

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e4fe79d5693f..2ead8fc1e344 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -5463,7 +5463,7 @@ bpf_object__create_maps(struct bpf_object *obj)
 			}
 		}
 
-		if (map->fd >= 0) {
+		if (map->reused) {
 			pr_debug("map '%s': skipping creation (preset fd=%d)\n",
 				 map->name, map->fd);
 		} else {
-- 
2.34.1


* [PATCH bpf-next 3/8] libbpf: don't rely on map->fd as an indicator of map being created
  2023-12-20 23:31 [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 1/8] libbpf: make uniform use of btf__fd() accessor inside libbpf Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 2/8] libbpf: use explicit map reuse flag to skip map creation steps Andrii Nakryiko
@ 2023-12-20 23:31 ` Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 4/8] libbpf: use stable map placeholder FDs Andrii Nakryiko
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2023-12-20 23:31 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

With the upcoming switch to preallocated placeholder FDs for maps,
switch various getters/setters away from checking map->fd. Use the
map_is_created() helper, which detects whether a BPF map can be modified
based on map->obj->loaded state, with a special provision for maps set up
with bpf_map__reuse_fd().

For backwards compatibility, we take map_is_created() into account in
the bpf_map__fd() getter as well. This way, before the bpf_object__load()
phase bpf_map__fd() will always return -1, just as it did before the
changes in subsequent patches that add stable map->fd placeholders.

We also get rid of all internal uses of the bpf_map__fd() getter, as it's
oriented towards uses external to libbpf. The above map_is_created()
check actually interferes with some of the internal uses if the map FD is
fetched through bpf_map__fd().
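
A hedged sketch of the user-visible behavior preserved here (object and map
names are made up, error handling omitted):

  struct bpf_object *obj = bpf_object__open_file("prog.bpf.o", NULL);
  struct bpf_map *map = bpf_object__find_map_by_name(obj, "my_map");

  int fd = bpf_map__fd(map);      /* -1: the map is not created yet */
  bpf_object__load(obj);
  fd = bpf_map__fd(map);          /* now a real kernel map FD */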

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c | 42 +++++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2ead8fc1e344..467bd187b67d 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -5198,6 +5198,11 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 
 static void bpf_map__destroy(struct bpf_map *map);
 
+static bool map_is_created(const struct bpf_map *map)
+{
+	return map->obj->loaded || map->reused;
+}
+
 static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, bool is_inner)
 {
 	LIBBPF_OPTS(bpf_map_create_opts, create_attr);
@@ -5229,7 +5234,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
 					map->name, err);
 				return err;
 			}
-			map->inner_map_fd = bpf_map__fd(map->inner_map);
+			map->inner_map_fd = map->inner_map->fd;
 		}
 		if (map->inner_map_fd >= 0)
 			create_attr.inner_map_fd = map->inner_map_fd;
@@ -5312,7 +5317,7 @@ static int init_map_in_map_slots(struct bpf_object *obj, struct bpf_map *map)
 			continue;
 
 		targ_map = map->init_slots[i];
-		fd = bpf_map__fd(targ_map);
+		fd = targ_map->fd;
 
 		if (obj->gen_loader) {
 			bpf_gen__populate_outer_map(obj->gen_loader,
@@ -7133,7 +7138,7 @@ static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog
 				if (map->libbpf_type != LIBBPF_MAP_RODATA)
 					continue;
 
-				if (bpf_prog_bind_map(ret, bpf_map__fd(map), NULL)) {
+				if (bpf_prog_bind_map(ret, map->fd, NULL)) {
 					cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg));
 					pr_warn("prog '%s': failed to bind map '%s': %s\n",
 						prog->name, map->real_name, cp);
@@ -9599,7 +9604,11 @@ int libbpf_attach_type_by_name(const char *name,
 
 int bpf_map__fd(const struct bpf_map *map)
 {
-	return map ? map->fd : libbpf_err(-EINVAL);
+	if (!map)
+		return libbpf_err(-EINVAL);
+	if (!map_is_created(map))
+		return -1;
+	return map->fd;
 }
 
 static bool map_uses_real_name(const struct bpf_map *map)
@@ -9635,7 +9644,7 @@ enum bpf_map_type bpf_map__type(const struct bpf_map *map)
 
 int bpf_map__set_type(struct bpf_map *map, enum bpf_map_type type)
 {
-	if (map->fd >= 0)
+	if (map_is_created(map))
 		return libbpf_err(-EBUSY);
 	map->def.type = type;
 	return 0;
@@ -9648,7 +9657,7 @@ __u32 bpf_map__map_flags(const struct bpf_map *map)
 
 int bpf_map__set_map_flags(struct bpf_map *map, __u32 flags)
 {
-	if (map->fd >= 0)
+	if (map_is_created(map))
 		return libbpf_err(-EBUSY);
 	map->def.map_flags = flags;
 	return 0;
@@ -9661,7 +9670,7 @@ __u64 bpf_map__map_extra(const struct bpf_map *map)
 
 int bpf_map__set_map_extra(struct bpf_map *map, __u64 map_extra)
 {
-	if (map->fd >= 0)
+	if (map_is_created(map))
 		return libbpf_err(-EBUSY);
 	map->map_extra = map_extra;
 	return 0;
@@ -9674,7 +9683,7 @@ __u32 bpf_map__numa_node(const struct bpf_map *map)
 
 int bpf_map__set_numa_node(struct bpf_map *map, __u32 numa_node)
 {
-	if (map->fd >= 0)
+	if (map_is_created(map))
 		return libbpf_err(-EBUSY);
 	map->numa_node = numa_node;
 	return 0;
@@ -9687,7 +9696,7 @@ __u32 bpf_map__key_size(const struct bpf_map *map)
 
 int bpf_map__set_key_size(struct bpf_map *map, __u32 size)
 {
-	if (map->fd >= 0)
+	if (map_is_created(map))
 		return libbpf_err(-EBUSY);
 	map->def.key_size = size;
 	return 0;
@@ -9771,7 +9780,7 @@ static int map_btf_datasec_resize(struct bpf_map *map, __u32 size)
 
 int bpf_map__set_value_size(struct bpf_map *map, __u32 size)
 {
-	if (map->fd >= 0)
+	if (map->obj->loaded || map->reused)
 		return libbpf_err(-EBUSY);
 
 	if (map->mmaped) {
@@ -9812,8 +9821,11 @@ __u32 bpf_map__btf_value_type_id(const struct bpf_map *map)
 int bpf_map__set_initial_value(struct bpf_map *map,
 			       const void *data, size_t size)
 {
+	if (map->obj->loaded || map->reused)
+		return libbpf_err(-EBUSY);
+
 	if (!map->mmaped || map->libbpf_type == LIBBPF_MAP_KCONFIG ||
-	    size != map->def.value_size || map->fd >= 0)
+	    size != map->def.value_size)
 		return libbpf_err(-EINVAL);
 
 	memcpy(map->mmaped, data, size);
@@ -9840,7 +9852,7 @@ __u32 bpf_map__ifindex(const struct bpf_map *map)
 
 int bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex)
 {
-	if (map->fd >= 0)
+	if (map_is_created(map))
 		return libbpf_err(-EBUSY);
 	map->map_ifindex = ifindex;
 	return 0;
@@ -9945,7 +9957,7 @@ bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name)
 static int validate_map_op(const struct bpf_map *map, size_t key_sz,
 			   size_t value_sz, bool check_value_sz)
 {
-	if (map->fd <= 0)
+	if (!map_is_created(map)) /* map is not yet created */
 		return -ENOENT;
 
 	if (map->def.key_size != key_sz) {
@@ -12398,7 +12410,7 @@ int bpf_link__update_map(struct bpf_link *link, const struct bpf_map *map)
 	__u32 zero = 0;
 	int err;
 
-	if (!bpf_map__is_struct_ops(map) || map->fd < 0)
+	if (!bpf_map__is_struct_ops(map) || !map_is_created(map))
 		return -EINVAL;
 
 	st_ops_link = container_of(link, struct bpf_link_struct_ops, link);
@@ -13302,7 +13314,7 @@ int bpf_object__load_skeleton(struct bpf_object_skeleton *s)
 	for (i = 0; i < s->map_cnt; i++) {
 		struct bpf_map *map = *s->maps[i].map;
 		size_t mmap_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries);
-		int prot, map_fd = bpf_map__fd(map);
+		int prot, map_fd = map->fd;
 		void **mmaped = s->maps[i].mmaped;
 
 		if (!mmaped)
-- 
2.34.1


* [PATCH bpf-next 4/8] libbpf: use stable map placeholder FDs
  2023-12-20 23:31 [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2023-12-20 23:31 ` [PATCH bpf-next 3/8] libbpf: don't rely on map->fd as an indicator of map being created Andrii Nakryiko
@ 2023-12-20 23:31 ` Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 5/8] libbpf: move exception callbacks assignment logic into relocation step Andrii Nakryiko
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2023-12-20 23:31 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

Move map creation to later during BPF object loading by pre-creating
stable placeholder FDs (initially pointing to /dev/null). Then use the
dup2() syscall to atomically make those placeholder FDs point to real
kernel BPF map objects.
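
The FD-retargeting trick in isolation looks roughly like this (a standalone
userspace sketch, not libbpf code; the opened file merely stands in for a
real BPF map created later):

  #include <fcntl.h>
  #include <unistd.h>

  int placeholder = open("/dev/null", O_WRONLY | O_CLOEXEC);

  /* ... hand out 'placeholder' to anything that needs a stable FD number ... */

  int real_fd = open("some_object", O_RDONLY); /* stands in for bpf_map_create() */

  /* atomically retarget the placeholder at the real object; every user of
   * the 'placeholder' FD number now refers to the real object
   */
  dup2(real_fd, placeholder);
  close(real_fd); /* 'placeholder' keeps the object alive */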

This change allows delaying BPF map creation until after all the BPF
program relocations. That, in turn, allows delaying BTF finalization and
loading into the kernel until after all the relocations as well. We'll take
advantage of the latter in subsequent patches to allow libbpf to adjust
BTF in a way that helps with BPF global function usage.

Clean up a few places where we close map->fd, which now shouldn't
happen, because map->fd should be a valid FD regardless of whether the map
was created or not. Surprisingly, this nicely simplifies a bunch of
error handling code. If this change doesn't backfire, I'm tempted to
pre-create such stable FDs for other entities (progs, maybe even BTF).
We previously did some manipulations to make gen_loader work; with
stable map FDs this hack is not necessary for maps (we still have it for
BTF, but I left that as is for now).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c          | 94 +++++++++++++++++++--------------
 tools/lib/bpf/libbpf_internal.h | 24 +++++++++
 2 files changed, 78 insertions(+), 40 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 467bd187b67d..98bec2f5fe39 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1515,7 +1515,21 @@ static struct bpf_map *bpf_object__add_map(struct bpf_object *obj)
 
 	map = &obj->maps[obj->nr_maps++];
 	map->obj = obj;
-	map->fd = -1;
+	/* Preallocate map FD without actually creating BPF map just yet.
+	 * These map FD "placeholders" will be reused later without changing
+	 * FD value when map is actually created in the kernel.
+	 *
+	 * This is useful to be able to perform BPF program relocations
+	 * without having to create BPF maps before that step. This allows us
+	 * to finalize and load BTF very late in BPF object's loading phase,
+	 * right before BPF maps have to be created and BPF programs have to
+	 * be loaded. By having these map FD placeholders we can perform all
+	 * the sanitizations, relocations, and any other adjustments before we
+	 * start creating actual BPF kernel objects (BTF, maps, progs).
+	 */
+	map->fd = create_placeholder_fd();
+	if (map->fd < 0)
+		return ERR_PTR(map->fd);
 	map->inner_map_fd = -1;
 	map->autocreate = true;
 
@@ -2607,7 +2621,9 @@ static int bpf_object__init_user_btf_map(struct bpf_object *obj,
 		map->inner_map = calloc(1, sizeof(*map->inner_map));
 		if (!map->inner_map)
 			return -ENOMEM;
-		map->inner_map->fd = -1;
+		map->inner_map->fd = create_placeholder_fd();
+		if (map->inner_map->fd < 0)
+			return map->inner_map->fd;
 		map->inner_map->sec_idx = sec_idx;
 		map->inner_map->name = malloc(strlen(map_name) + sizeof(".inner") + 1);
 		if (!map->inner_map->name)
@@ -4547,14 +4563,12 @@ int bpf_map__reuse_fd(struct bpf_map *map, int fd)
 		goto err_free_new_name;
 	}
 
-	err = zclose(map->fd);
-	if (err) {
-		err = -errno;
-		goto err_close_new_fd;
-	}
+	err = reuse_fd(map->fd, new_fd);
+	if (err)
+		goto err_free_new_name;
+
 	free(map->name);
 
-	map->fd = new_fd;
 	map->name = new_name;
 	map->def.type = info.type;
 	map->def.key_size = info.key_size;
@@ -4568,8 +4582,6 @@ int bpf_map__reuse_fd(struct bpf_map *map, int fd)
 
 	return 0;
 
-err_close_new_fd:
-	close(new_fd);
 err_free_new_name:
 	free(new_name);
 	return libbpf_err(err);
@@ -5208,7 +5220,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
 	LIBBPF_OPTS(bpf_map_create_opts, create_attr);
 	struct bpf_map_def *def = &map->def;
 	const char *map_name = NULL;
-	int err = 0;
+	int err = 0, map_fd;
 
 	if (kernel_supports(obj, FEAT_PROG_NAME))
 		map_name = map->name;
@@ -5267,17 +5279,19 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
 		bpf_gen__map_create(obj->gen_loader, def->type, map_name,
 				    def->key_size, def->value_size, def->max_entries,
 				    &create_attr, is_inner ? -1 : map - obj->maps);
-		/* Pretend to have valid FD to pass various fd >= 0 checks.
-		 * This fd == 0 will not be used with any syscall and will be reset to -1 eventually.
+		/* We keep pretending we have a valid FD to pass various fd >= 0
+		 * checks by just keeping original placeholder FDs in place.
+		 * See bpf_object_prepare_maps() comments.
+		 * This placeholder fd will not be used with any syscall and
+		 * will be reset to -1 eventually.
 		 */
-		map->fd = 0;
+		map_fd = map->fd;
 	} else {
-		map->fd = bpf_map_create(def->type, map_name,
-					 def->key_size, def->value_size,
-					 def->max_entries, &create_attr);
+		map_fd = bpf_map_create(def->type, map_name,
+					def->key_size, def->value_size,
+					def->max_entries, &create_attr);
 	}
-	if (map->fd < 0 && (create_attr.btf_key_type_id ||
-			    create_attr.btf_value_type_id)) {
+	if (map_fd < 0 && (create_attr.btf_key_type_id || create_attr.btf_value_type_id)) {
 		char *cp, errmsg[STRERR_BUFSIZE];
 
 		err = -errno;
@@ -5289,13 +5303,11 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
 		create_attr.btf_value_type_id = 0;
 		map->btf_key_type_id = 0;
 		map->btf_value_type_id = 0;
-		map->fd = bpf_map_create(def->type, map_name,
-					 def->key_size, def->value_size,
-					 def->max_entries, &create_attr);
+		map_fd = bpf_map_create(def->type, map_name,
+					def->key_size, def->value_size,
+					def->max_entries, &create_attr);
 	}
 
-	err = map->fd < 0 ? -errno : 0;
-
 	if (bpf_map_type__is_map_in_map(def->type) && map->inner_map) {
 		if (obj->gen_loader)
 			map->inner_map->fd = -1;
@@ -5303,7 +5315,19 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
 		zfree(&map->inner_map);
 	}
 
-	return err;
+	if (map_fd < 0)
+		return -errno;
+
+	/* obj->gen_loader case, prevent reuse_fd() from closing map_fd */
+	if (map->fd == map_fd)
+		return 0;
+
+	/* Keep placeholder FD value but now point it to the BPF map object.
+	 * This way everything that relied on this map's FD (e.g., relocated
+	 * ldimm64 instructions) will stay valid and won't need adjustments.
+	 * map->fd stays valid but now points to what map_fd points to.
+	 */
+	return reuse_fd(map->fd, map_fd);
 }
 
 static int init_map_in_map_slots(struct bpf_object *obj, struct bpf_map *map)
@@ -5387,10 +5411,8 @@ static int bpf_object_init_prog_arrays(struct bpf_object *obj)
 			continue;
 
 		err = init_prog_array_slots(obj, map);
-		if (err < 0) {
-			zclose(map->fd);
+		if (err < 0)
 			return err;
-		}
 	}
 	return 0;
 }
@@ -5413,8 +5435,7 @@ static int map_set_def_max_entries(struct bpf_map *map)
 	return 0;
 }
 
-static int
-bpf_object__create_maps(struct bpf_object *obj)
+static int bpf_object_create_maps(struct bpf_object *obj)
 {
 	struct bpf_map *map;
 	char *cp, errmsg[STRERR_BUFSIZE];
@@ -5481,25 +5502,20 @@ bpf_object__create_maps(struct bpf_object *obj)
 
 			if (bpf_map__is_internal(map)) {
 				err = bpf_object__populate_internal_map(obj, map);
-				if (err < 0) {
-					zclose(map->fd);
+				if (err < 0)
 					goto err_out;
-				}
 			}
 
 			if (map->init_slots_sz && map->def.type != BPF_MAP_TYPE_PROG_ARRAY) {
 				err = init_map_in_map_slots(obj, map);
-				if (err < 0) {
-					zclose(map->fd);
+				if (err < 0)
 					goto err_out;
-				}
 			}
 		}
 
 		if (map->pin_path && !map->pinned) {
 			err = bpf_map__pin(map, NULL);
 			if (err) {
-				zclose(map->fd);
 				if (!retried && err == -EEXIST) {
 					retried = true;
 					goto retry;
@@ -8073,8 +8089,8 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
 	err = err ? : bpf_object__sanitize_and_load_btf(obj);
 	err = err ? : bpf_object__sanitize_maps(obj);
 	err = err ? : bpf_object__init_kern_struct_ops_maps(obj);
-	err = err ? : bpf_object__create_maps(obj);
 	err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : target_btf_path);
+	err = err ? : bpf_object_create_maps(obj);
 	err = err ? : bpf_object__load_progs(obj, extra_log_level);
 	err = err ? : bpf_object_init_prog_arrays(obj);
 	err = err ? : bpf_object_prepare_struct_ops(obj);
@@ -8083,8 +8099,6 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
 		/* reset FDs */
 		if (obj->btf)
 			btf__set_fd(obj->btf, -1);
-		for (i = 0; i < obj->nr_maps; i++)
-			obj->maps[i].fd = -1;
 		if (!err)
 			err = bpf_gen__finish(obj->gen_loader, obj->nr_programs, obj->nr_maps);
 	}
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index b5d334754e5d..662a3df1e29f 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -555,6 +555,30 @@ static inline int ensure_good_fd(int fd)
 	return fd;
 }
 
+static inline int create_placeholder_fd(void)
+{
+	int fd;
+
+	fd = ensure_good_fd(open("/dev/null", O_WRONLY | O_CLOEXEC));
+	if (fd < 0)
+		return -errno;
+	return fd;
+}
+
+/* Point *fixed_fd* to the same file that *tmp_fd* points to.
+ * Regardless of success, *tmp_fd* is closed.
+ * Whatever *fixed_fd* pointed to is closed silently.
+ */
+static inline int reuse_fd(int fixed_fd, int tmp_fd)
+{
+	int err;
+
+	err = dup2(tmp_fd, fixed_fd);
+	err = err < 0 ? -errno : 0;
+	close(tmp_fd); /* clean up temporary FD */
+	return err;
+}
+
 /* The following two functions are exposed to bpftool */
 int bpf_core_add_cands(struct bpf_core_cand *local_cand,
 		       size_t local_essent_len,
-- 
2.34.1


* [PATCH bpf-next 5/8] libbpf: move exception callbacks assignment logic into relocation step
  2023-12-20 23:31 [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2023-12-20 23:31 ` [PATCH bpf-next 4/8] libbpf: use stable map placeholder FDs Andrii Nakryiko
@ 2023-12-20 23:31 ` Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 6/8] libbpf: move BTF loading step after " Andrii Nakryiko
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2023-12-20 23:31 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

Move the logic of finding and assigning exception callback indices from
the BTF sanitization step to the program relocation step, which seems more
logical and will unblock moving BTF loading to after the relocation step.

Exception callback discovery and assignment has no dependency on BTF
being loaded into the kernel; it only uses BTF information. It does need
to happen before subprogram relocations, though, which is why the
split.
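
For orientation, the shape this moved logic looks for is roughly the
following (a hedged sketch; names are made up, the selftests wrap the decl
tag in an __exception_cb() convenience macro from bpf_experimental.h, and
the callback signature shown is an assumption):

  /* global (non-static, non-hidden) subprog named by the tag value */
  __weak int my_exc_cb(u64 cookie)
  {
  	return 0;
  }

  SEC("?tc")
  __attribute__((btf_decl_tag("exception_callback:my_exc_cb")))
  int main_prog(struct __sk_buff *ctx)
  {
  	return 0;
  }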

No functional changes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c | 165 +++++++++++++++++++++--------------------
 1 file changed, 85 insertions(+), 80 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 98bec2f5fe39..26a0869626bf 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -3182,86 +3182,6 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
 		}
 	}
 
-	if (!kernel_supports(obj, FEAT_BTF_DECL_TAG))
-		goto skip_exception_cb;
-	for (i = 0; i < obj->nr_programs; i++) {
-		struct bpf_program *prog = &obj->programs[i];
-		int j, k, n;
-
-		if (prog_is_subprog(obj, prog))
-			continue;
-		n = btf__type_cnt(obj->btf);
-		for (j = 1; j < n; j++) {
-			const char *str = "exception_callback:", *name;
-			size_t len = strlen(str);
-			struct btf_type *t;
-
-			t = btf_type_by_id(obj->btf, j);
-			if (!btf_is_decl_tag(t) || btf_decl_tag(t)->component_idx != -1)
-				continue;
-
-			name = btf__str_by_offset(obj->btf, t->name_off);
-			if (strncmp(name, str, len))
-				continue;
-
-			t = btf_type_by_id(obj->btf, t->type);
-			if (!btf_is_func(t) || btf_func_linkage(t) != BTF_FUNC_GLOBAL) {
-				pr_warn("prog '%s': exception_callback:<value> decl tag not applied to the main program\n",
-					prog->name);
-				return -EINVAL;
-			}
-			if (strcmp(prog->name, btf__str_by_offset(obj->btf, t->name_off)))
-				continue;
-			/* Multiple callbacks are specified for the same prog,
-			 * the verifier will eventually return an error for this
-			 * case, hence simply skip appending a subprog.
-			 */
-			if (prog->exception_cb_idx >= 0) {
-				prog->exception_cb_idx = -1;
-				break;
-			}
-
-			name += len;
-			if (str_is_empty(name)) {
-				pr_warn("prog '%s': exception_callback:<value> decl tag contains empty value\n",
-					prog->name);
-				return -EINVAL;
-			}
-
-			for (k = 0; k < obj->nr_programs; k++) {
-				struct bpf_program *subprog = &obj->programs[k];
-
-				if (!prog_is_subprog(obj, subprog))
-					continue;
-				if (strcmp(name, subprog->name))
-					continue;
-				/* Enforce non-hidden, as from verifier point of
-				 * view it expects global functions, whereas the
-				 * mark_btf_static fixes up linkage as static.
-				 */
-				if (!subprog->sym_global || subprog->mark_btf_static) {
-					pr_warn("prog '%s': exception callback %s must be a global non-hidden function\n",
-						prog->name, subprog->name);
-					return -EINVAL;
-				}
-				/* Let's see if we already saw a static exception callback with the same name */
-				if (prog->exception_cb_idx >= 0) {
-					pr_warn("prog '%s': multiple subprogs with same name as exception callback '%s'\n",
-					        prog->name, subprog->name);
-					return -EINVAL;
-				}
-				prog->exception_cb_idx = k;
-				break;
-			}
-
-			if (prog->exception_cb_idx >= 0)
-				continue;
-			pr_warn("prog '%s': cannot find exception callback '%s'\n", prog->name, name);
-			return -ENOENT;
-		}
-	}
-skip_exception_cb:
-
 	sanitize = btf_needs_sanitization(obj);
 	if (sanitize) {
 		const void *raw_data;
@@ -6648,6 +6568,88 @@ static void bpf_object__sort_relos(struct bpf_object *obj)
 	}
 }
 
+static int bpf_prog_assign_exc_cb(struct bpf_object *obj, struct bpf_program *prog)
+{
+	const char *str = "exception_callback:";
+	size_t pfx_len = strlen(str);
+	int i, j, n;
+
+	if (!obj->btf || !kernel_supports(obj, FEAT_BTF_DECL_TAG))
+		return 0;
+
+	n = btf__type_cnt(obj->btf);
+	for (i = 1; i < n; i++) {
+		const char *name;
+		struct btf_type *t;
+
+		t = btf_type_by_id(obj->btf, i);
+		if (!btf_is_decl_tag(t) || btf_decl_tag(t)->component_idx != -1)
+			continue;
+
+		name = btf__str_by_offset(obj->btf, t->name_off);
+		if (strncmp(name, str, pfx_len) != 0)
+			continue;
+
+		t = btf_type_by_id(obj->btf, t->type);
+		if (!btf_is_func(t) || btf_func_linkage(t) != BTF_FUNC_GLOBAL) {
+			pr_warn("prog '%s': exception_callback:<value> decl tag not applied to the main program\n",
+				prog->name);
+			return -EINVAL;
+		}
+		if (strcmp(prog->name, btf__str_by_offset(obj->btf, t->name_off)) != 0)
+			continue;
+		/* Multiple callbacks are specified for the same prog,
+		 * the verifier will eventually return an error for this
+		 * case, hence simply skip appending a subprog.
+		 */
+		if (prog->exception_cb_idx >= 0) {
+			prog->exception_cb_idx = -1;
+			break;
+		}
+
+		name += pfx_len;
+		if (str_is_empty(name)) {
+			pr_warn("prog '%s': exception_callback:<value> decl tag contains empty value\n",
+				prog->name);
+			return -EINVAL;
+		}
+
+		for (j = 0; j < obj->nr_programs; j++) {
+			struct bpf_program *subprog = &obj->programs[j];
+
+			if (!prog_is_subprog(obj, subprog))
+				continue;
+			if (strcmp(name, subprog->name) != 0)
+				continue;
+			/* Enforce non-hidden, as from verifier point of
+			 * view it expects global functions, whereas the
+			 * mark_btf_static fixes up linkage as static.
+			 */
+			if (!subprog->sym_global || subprog->mark_btf_static) {
+				pr_warn("prog '%s': exception callback %s must be a global non-hidden function\n",
+					prog->name, subprog->name);
+				return -EINVAL;
+			}
+			/* Let's see if we already saw a static exception callback with the same name */
+			if (prog->exception_cb_idx >= 0) {
+				pr_warn("prog '%s': multiple subprogs with same name as exception callback '%s'\n",
+					prog->name, subprog->name);
+				return -EINVAL;
+			}
+			prog->exception_cb_idx = j;
+			break;
+		}
+
+		if (prog->exception_cb_idx >= 0)
+			continue;
+
+		pr_warn("prog '%s': cannot find exception callback '%s'\n", prog->name, name);
+		return -ENOENT;
+	}
+
+	return 0;
+}
+
 static int
 bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 {
@@ -6708,6 +6710,9 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 			return err;
 		}
 
+		err = bpf_prog_assign_exc_cb(obj, prog);
+		if (err)
+			return err;
 		/* Now, also append exception callback if it has not been done already. */
 		if (prog->exception_cb_idx >= 0) {
 			struct bpf_program *subprog = &obj->programs[prog->exception_cb_idx];
-- 
2.34.1


* [PATCH bpf-next 6/8] libbpf: move BTF loading step after relocation step
  2023-12-20 23:31 [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support Andrii Nakryiko
                   ` (4 preceding siblings ...)
  2023-12-20 23:31 ` [PATCH bpf-next 5/8] libbpf: move exception callbacks assignment logic into relocation step Andrii Nakryiko
@ 2023-12-20 23:31 ` Andrii Nakryiko
  2023-12-22  1:13   ` Alexei Starovoitov
  2023-12-20 23:31 ` [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic Andrii Nakryiko
  2023-12-20 23:31 ` [PATCH bpf-next 8/8] selftests/bpf: add arg:ctx cases to test_global_funcs tests Andrii Nakryiko
  7 siblings, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2023-12-20 23:31 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

With all the preparations in previous patches done, we are ready to
postpone the BTF loading and sanitization step until after all the
relocations are performed.

While at it, simplify the step name from bpf_object__sanitize_and_load_btf
to bpf_object_load_btf. The map creation and program loading steps also
perform sanitization, but they don't explicitly reflect it in an overly
verbose function name, so keep naming and approach consistent here.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 26a0869626bf..92171bcf4c25 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -3132,7 +3132,7 @@ static int bpf_object__load_vmlinux_btf(struct bpf_object *obj, bool force)
 	return 0;
 }
 
-static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
+static int bpf_object_load_btf(struct bpf_object *obj)
 {
 	struct btf *kern_btf = obj->btf;
 	bool btf_mandatory, sanitize;
@@ -8091,10 +8091,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
 	err = bpf_object__probe_loading(obj);
 	err = err ? : bpf_object__load_vmlinux_btf(obj, false);
 	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
-	err = err ? : bpf_object__sanitize_and_load_btf(obj);
 	err = err ? : bpf_object__sanitize_maps(obj);
 	err = err ? : bpf_object__init_kern_struct_ops_maps(obj);
 	err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : target_btf_path);
+	err = err ? : bpf_object_load_btf(obj);
 	err = err ? : bpf_object_create_maps(obj);
 	err = err ? : bpf_object__load_progs(obj, extra_log_level);
 	err = err ? : bpf_object_init_prog_arrays(obj);
-- 
2.34.1


* [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic
  2023-12-20 23:31 [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support Andrii Nakryiko
                   ` (5 preceding siblings ...)
  2023-12-20 23:31 ` [PATCH bpf-next 6/8] libbpf: move BTF loading step after " Andrii Nakryiko
@ 2023-12-20 23:31 ` Andrii Nakryiko
  2023-12-22  1:26   ` Alexei Starovoitov
  2023-12-22 13:15   ` Jiri Olsa
  2023-12-20 23:31 ` [PATCH bpf-next 8/8] selftests/bpf: add arg:ctx cases to test_global_funcs tests Andrii Nakryiko
  7 siblings, 2 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2023-12-20 23:31 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

Out of all the special global func arg tag annotations, __arg_ctx is
practically the most immediately useful and most critical to have
working across a multitude of kernel versions, if possible. This would
allow end users to write much simpler code if __arg_ctx semantics worked
for older kernels that don't natively understand btf_decl_tag("arg:ctx")
in verifier logic.

Luckily, it is possible to ensure __arg_ctx works on old kernels through
a bit of extra work done by libbpf, at least in a lot of common cases.

To explain the overall idea, we need to go back to how the context argument
was supported in global funcs before __arg_ctx support was added. This
was done based on special struct name checks in the kernel. E.g., for
BPF_PROG_TYPE_PERF_EVENT the expectation is that an argument of type `struct
bpf_perf_event_data *` marks that argument as PTR_TO_CTX. This is all
good as long as the global function is used only from BPF programs of a
single type, which is often not the case. If the same subprog has to be
called from, say, kprobe and perf_event program types, there is no single
definition that would satisfy the BPF verifier. The subprog will have a
context argument either for kprobe (if using the bpf_user_pt_regs_t struct
name) or for perf_event (with the bpf_perf_event_data struct name), but not both.

This limitation was the reason to add btf_decl_tag("arg:ctx"), making
the actual argument type unimportant, so that the user can just define
a "generic" signature:

  __noinline int global_subprog(void *ctx __arg_ctx) { ... }

I won't belabor how libbpf implements subprograms; see the huge
comment next to the bpf_object__relocate_calls() function. The idea is that
each main/entry BPF program gets its own copy of global_subprog's code
appended.

This per-program copy of the global subprog code *and* the associated
func_info .BTF.ext information, pointing to the FUNC -> FUNC_PROTO BTF type
chain, allows libbpf to simulate __arg_ctx behavior transparently, even if
the kernel doesn't yet support the __arg_ctx annotation natively.

The idea is straightforward: each time we append a global subprog's code
and func_info information, we adjust its FUNC -> FUNC_PROTO type
information, if necessary (that is, libbpf can detect the presence of
btf_decl_tag("arg:ctx") just like the BPF verifier would).

The rest is just mechanical and somewhat painful BTF manipulation code.
It's painful because we need to clone FUNC -> FUNC_PROTO instead of
reusing it, as the same FUNC -> FUNC_PROTO chain might be used by another
main BPF program within the same BPF object, so we can't just modify it
in place (and cloning BTF types within the same struct btf object is
painful due to constant memory invalidation; see comments in the code).
The uploaded BPF object's BTF information has to work for all BPF
programs at the same time.

Once we have FUNC -> FUNC_PROTO clones, we make sure that instead of
some `void *ctx` parameter definition, we have the expected `struct
bpf_perf_event_data *ctx` definition (as far as the BPF verifier and kernel
are concerned), which will mark it as context for the BPF verifier. The same
global subprog relocated and copied into another main BPF program will
get different type information according to that main program's type. It all
works out in the end in a completely transparent way for the end user.
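
Schematically, the per-program adjustment boils down to the following BTF
operations (a condensed sketch of the approach using libbpf's BTF APIs, not
the actual implementation; variables like btf, ret_type_id, arg_idx, and
func_rec stand for values the real code derives from the original types, and
string-offset bookkeeping is omitted):

  /* clone FUNC_PROTO (re-adding original params), then a FUNC for this
   * program's private copy of the subprog
   */
  int new_proto_id = btf__add_func_proto(btf, ret_type_id);
  /* ... btf__add_func_param() for each original parameter ... */
  int fn_id = btf__add_func(btf, "global_subprog", BTF_FUNC_GLOBAL, new_proto_id);

  /* create a PTR -> STRUCT chain naming the context struct expected for
   * this program type (e.g., "bpf_perf_event_data" for perf_event)
   */
  int struct_id = btf__add_struct(btf, "bpf_perf_event_data", 0);
  int ptr_id = btf__add_ptr(btf, struct_id);

  /* point the __arg_ctx parameter of the cloned FUNC_PROTO at it and
   * re-point this program's func_info record at the cloned FUNC
   */
  struct btf_type *proto = btf_type_by_id(btf, new_proto_id);
  btf_params(proto)[arg_idx].type = ptr_id;
  func_rec->type_id = fn_id;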

Libbpf maintains a program type -> expected context struct name
mapping internally. Note that not all BPF program types have a named context
struct, so this approach won't work for such programs (just like it
didn't before __arg_ctx). So native __arg_ctx support is still important to
have in the kernel to get generic context support across all BPF program types.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c | 239 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 231 insertions(+), 8 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 92171bcf4c25..1a7354b6a289 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -6168,7 +6168,7 @@ reloc_prog_func_and_line_info(const struct bpf_object *obj,
 	int err;
 
 	/* no .BTF.ext relocation if .BTF.ext is missing or kernel doesn't
-	 * supprot func/line info
+	 * support func/line info
 	 */
 	if (!obj->btf_ext || !kernel_supports(obj, FEAT_BTF_FUNC))
 		return 0;
@@ -6650,8 +6650,223 @@ static int bpf_prog_assign_exc_cb(struct bpf_object *obj, struct bpf_program *pr
 	return 0;
 }
 
-static int
-bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
+static struct {
+	enum bpf_prog_type prog_type;
+	const char *ctx_name;
+} global_ctx_map[] = {
+	{ BPF_PROG_TYPE_CGROUP_DEVICE,           "bpf_cgroup_dev_ctx" },
+	{ BPF_PROG_TYPE_CGROUP_SKB,              "__sk_buff" },
+	{ BPF_PROG_TYPE_CGROUP_SOCK,             "bpf_sock" },
+	{ BPF_PROG_TYPE_CGROUP_SOCK_ADDR,        "bpf_sock_addr" },
+	{ BPF_PROG_TYPE_CGROUP_SOCKOPT,          "bpf_sockopt" },
+	{ BPF_PROG_TYPE_CGROUP_SYSCTL,           "bpf_sysctl" },
+	{ BPF_PROG_TYPE_FLOW_DISSECTOR,          "__sk_buff" },
+	{ BPF_PROG_TYPE_KPROBE,                  "bpf_user_pt_regs_t" },
+	{ BPF_PROG_TYPE_LWT_IN,                  "__sk_buff" },
+	{ BPF_PROG_TYPE_LWT_OUT,                 "__sk_buff" },
+	{ BPF_PROG_TYPE_LWT_SEG6LOCAL,           "__sk_buff" },
+	{ BPF_PROG_TYPE_LWT_XMIT,                "__sk_buff" },
+	{ BPF_PROG_TYPE_NETFILTER,               "bpf_nf_ctx" },
+	{ BPF_PROG_TYPE_PERF_EVENT,              "bpf_perf_event_data" },
+	{ BPF_PROG_TYPE_RAW_TRACEPOINT,          "bpf_raw_tracepoint_args" },
+	{ BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, "bpf_raw_tracepoint_args" },
+	{ BPF_PROG_TYPE_SCHED_ACT,               "__sk_buff" },
+	{ BPF_PROG_TYPE_SCHED_CLS,               "__sk_buff" },
+	{ BPF_PROG_TYPE_SK_LOOKUP,               "bpf_sk_lookup" },
+	{ BPF_PROG_TYPE_SK_MSG,                  "sk_msg_md" },
+	{ BPF_PROG_TYPE_SK_REUSEPORT,            "sk_reuseport_md" },
+	{ BPF_PROG_TYPE_SK_SKB,                  "__sk_buff" },
+	{ BPF_PROG_TYPE_SOCK_OPS,                "bpf_sock_ops" },
+	{ BPF_PROG_TYPE_SOCKET_FILTER,           "__sk_buff" },
+	{ BPF_PROG_TYPE_XDP,                     "xdp_md" },
+	/* all other program types don't have "named" context structs */
+};
+
+/* Check if main program or global subprog's function prototype has `arg:ctx`
+ * argument tags, and, if necessary, substitute correct type to match what BPF
+ * verifier would expect, taking into account specific program type. This
+ * allows to support __arg_ctx tag transparently on old kernels that don't yet
+ * have a native support for it in the verifier, making user's life much
+ * easier.
+ */
+static int bpf_program_fixup_func_info(struct bpf_object *obj, struct bpf_program *prog)
+{
+	const char *ctx_name = NULL, *ctx_tag = "arg:ctx";
+	struct bpf_func_info_min *func_rec;
+	struct btf_type *fn_t, *fn_proto_t;
+	struct btf *btf = obj->btf;
+	const struct btf_type *t;
+	struct btf_param *p;
+	int ptr_id = 0, struct_id, tag_id, orig_fn_id;
+	int i, j, n, arg_idx, arg_cnt, err, name_off, rec_idx;
+	int *orig_ids;
+
+	/* no .BTF.ext, no problem */
+	if (!obj->btf_ext || !prog->func_info)
+		return 0;
+
+	/* some BPF program types just don't have named context structs, so
+	 * this fallback mechanism doesn't work for them
+	 */
+	for (i = 0; i < ARRAY_SIZE(global_ctx_map); i++) {
+		if (global_ctx_map[i].prog_type != prog->type)
+			continue;
+		ctx_name = global_ctx_map[i].ctx_name;
+		break;
+	}
+	if (!ctx_name)
+		return 0;
+
+	/* remember original func BTF IDs to detect if we already cloned them */
+	orig_ids = calloc(prog->func_info_cnt, sizeof(*orig_ids));
+	if (!orig_ids)
+		return -ENOMEM;
+	for (i = 0; i < prog->func_info_cnt; i++) {
+		func_rec = prog->func_info + prog->func_info_rec_size * i;
+		orig_ids[i] = func_rec->type_id;
+	}
+
+	/* go through each DECL_TAG with "arg:ctx" and see if it points to one
+	 * of our subprogs; if yes and subprog is global and needs adjustment,
+	 * clone and adjust FUNC -> FUNC_PROTO combo
+	 */
+	for (i = 1, n = btf__type_cnt(btf); i < n; i++) {
+		/* only DECL_TAG with "arg:ctx" value are interesting */
+		t = btf__type_by_id(btf, i);
+		if (!btf_is_decl_tag(t))
+			continue;
+		if (strcmp(btf__str_by_offset(btf, t->name_off), ctx_tag) != 0)
+			continue;
+
+		/* only global funcs need adjustment, if at all */
+		orig_fn_id = t->type;
+		fn_t = btf_type_by_id(btf, orig_fn_id);
+		if (!btf_is_func(fn_t) || btf_func_linkage(fn_t) != BTF_FUNC_GLOBAL)
+			continue;
+
+		/* sanity check FUNC -> FUNC_PROTO chain, just in case */
+		fn_proto_t = btf_type_by_id(btf, fn_t->type);
+		if (!fn_proto_t || !btf_is_func_proto(fn_proto_t))
+			continue;
+
+		/* find corresponding func_info record */
+		func_rec = NULL;
+		for (rec_idx = 0; rec_idx < prog->func_info_cnt; rec_idx++) {
+			if (orig_ids[rec_idx] == t->type) {
+				func_rec = prog->func_info + prog->func_info_rec_size * rec_idx;
+				break;
+			}
+		}
+		/* current main program doesn't call into this subprog */
+		if (!func_rec)
+			continue;
+
+		/* some more sanity checking of DECL_TAG */
+		arg_cnt = btf_vlen(fn_proto_t);
+		arg_idx = btf_decl_tag(t)->component_idx;
+		if (arg_idx < 0 || arg_idx >= arg_cnt)
+			continue;
+
+		/* check if existing parameter already matches verifier expectations */
+		p = &btf_params(fn_proto_t)[arg_idx];
+		t = skip_mods_and_typedefs(btf, p->type, NULL);
+		if (btf_is_ptr(t) &&
+		    (t = skip_mods_and_typedefs(btf, t->type, NULL)) &&
+		    btf_is_struct(t) &&
+		    strcmp(btf__str_by_offset(btf, t->name_off), ctx_name) == 0) {
+			continue; /* no need for fix up */
+		}
+
+		/* clone fn/fn_proto, unless we already did it for another arg */
+		if (func_rec->type_id == orig_fn_id) {
+			int fn_id, fn_proto_id, ret_type_id, orig_proto_id;
+
+			/* Note that each btf__add_xxx() operation invalidates
+			 * all btf_type and string pointers, so we need to be
+			 * very careful when cloning BTF types. BTF type
+			 * pointers have to be always refetched. And to avoid
+			 * problems with invalidated string pointers, we
+			 * add empty strings initially, then just fix up
+			 * name_off offsets in place. Offsets are stable for
+			 * existing strings, so that works out.
+			 */
+			name_off = fn_t->name_off; /* we are about to invalidate fn_t */
+			ret_type_id = fn_proto_t->type; /* and fn_proto_t as well */
+			orig_proto_id = fn_t->type; /* original FUNC_PROTO ID */
+
+			/* clone FUNC first, btf__add_func() enforces
+			 * non-empty name, so use entry program's name as
+			 * a placeholder, which we replace immediately
+			 */
+			fn_id = btf__add_func(btf, prog->name, btf_func_linkage(fn_t), fn_t->type);
+			if (fn_id < 0)
+				return -EINVAL;
+			fn_t = btf_type_by_id(btf, fn_id);
+			fn_t->name_off = name_off; /* reuse original string */
+			fn_t->type = fn_id + 1; /* we can predict FUNC_PROTO ID */
+
+			/* clone FUNC_PROTO and its params now */
+			fn_proto_id = btf__add_func_proto(btf, ret_type_id);
+			if (fn_proto_id < 0) {
+				err = -EINVAL;
+				goto err_out;
+			}
+			for (j = 0; j < arg_cnt; j++) {
+				/* copy original parameter data */
+				t = btf_type_by_id(btf, orig_proto_id);
+				p = &btf_params(t)[j];
+				name_off = p->name_off;
+
+				err = btf__add_func_param(btf, "", p->type);
+				if (err)
+					goto err_out;
+				fn_proto_t = btf_type_by_id(btf, fn_proto_id);
+				p = &btf_params(fn_proto_t)[j];
+				p->name_off = name_off; /* use remembered str offset */
+			}
+
+			/* point func_info record to a cloned FUNC type */
+			func_rec->type_id = fn_id;
+		}
+
+		/* create PTR -> STRUCT type chain to mark PTR_TO_CTX argument;
+		 * we do it just once per main BPF program, as all global
+		 * funcs share the same program type, so need only PTR ->
+		 * STRUCT type chain
+		 */
+		if (ptr_id == 0) {
+			struct_id = btf__add_struct(btf, ctx_name, 0);
+			ptr_id = btf__add_ptr(btf, struct_id);
+			if (ptr_id < 0 || struct_id < 0) {
+				err = -EINVAL;
+				goto err_out;
+			}
+		}
+
+		/* for completeness, clone DECL_TAG and point it to cloned param */
+		tag_id = btf__add_decl_tag(btf, ctx_tag, func_rec->type_id, arg_idx);
+		if (tag_id < 0) {
+			err = -EINVAL;
+			goto err_out;
+		}
+
+		/* all the BTF manipulations invalidated pointers, refetch them */
+		fn_t = btf_type_by_id(btf, func_rec->type_id);
+		fn_proto_t = btf_type_by_id(btf, fn_t->type);
+
+		/* fix up type ID pointed to by param */
+		p = &btf_params(fn_proto_t)[arg_idx];
+		p->type = ptr_id;
+	}
+
+	free(orig_ids);
+	return 0;
+err_out:
+	free(orig_ids);
+	return err;
+}
+
+static int bpf_object_relocate(struct bpf_object *obj, const char *targ_btf_path)
 {
 	struct bpf_program *prog;
 	size_t i, j;
@@ -6732,19 +6947,28 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 			}
 		}
 	}
-	/* Process data relos for main programs */
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
 		if (prog_is_subprog(obj, prog))
 			continue;
 		if (!prog->autoload)
 			continue;
+
+		/* Process data relos for main programs */
 		err = bpf_object__relocate_data(obj, prog);
 		if (err) {
 			pr_warn("prog '%s': failed to relocate data references: %d\n",
 				prog->name, err);
 			return err;
 		}
+
+		/* Fix up .BTF.ext information, if necessary */
+		err = bpf_program_fixup_func_info(obj, prog);
+		if (err) {
+			pr_warn("prog '%s': failed to perform .BTF.ext fix ups: %d\n",
+				prog->name, err);
+			return err;
+		}
 	}
 
 	return 0;
@@ -7456,8 +7680,7 @@ static int bpf_program_record_relos(struct bpf_program *prog)
 	return 0;
 }
 
-static int
-bpf_object__load_progs(struct bpf_object *obj, int log_level)
+static int bpf_object_load_progs(struct bpf_object *obj, int log_level)
 {
 	struct bpf_program *prog;
 	size_t i;
@@ -8093,10 +8316,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
 	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
 	err = err ? : bpf_object__sanitize_maps(obj);
 	err = err ? : bpf_object__init_kern_struct_ops_maps(obj);
-	err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : target_btf_path);
+	err = err ? : bpf_object_relocate(obj, obj->btf_custom_path ? : target_btf_path);
 	err = err ? : bpf_object_load_btf(obj);
 	err = err ? : bpf_object_create_maps(obj);
-	err = err ? : bpf_object__load_progs(obj, extra_log_level);
+	err = err ? : bpf_object_load_progs(obj, extra_log_level);
 	err = err ? : bpf_object_init_prog_arrays(obj);
 	err = err ? : bpf_object_prepare_struct_ops(obj);
 
-- 
2.34.1


* [PATCH bpf-next 8/8] selftests/bpf: add arg:ctx cases to test_global_funcs tests
  2023-12-20 23:31 [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support Andrii Nakryiko
                   ` (6 preceding siblings ...)
  2023-12-20 23:31 ` [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic Andrii Nakryiko
@ 2023-12-20 23:31 ` Andrii Nakryiko
  7 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2023-12-20 23:31 UTC (permalink / raw)
  To: bpf, ast, daniel, martin.lau; +Cc: andrii, kernel-team

Add a few extra cases of global funcs with context arguments. This time
rely on the "arg:ctx" decl_tag (the __arg_ctx macro), but put them next to
"classic" cases where the context argument has to be of the exact type that
the BPF verifier expects (e.g., bpf_user_pt_regs_t for kprobe/uprobe).

Keeping all these cases together, separate from the other global func arg
cases that rely on arg:xxx decl tags (in verifier_global_subprogs.c), allows
for simpler backwards compatibility testing on old kernels. All the cases in
test_global_func_ctx_args.c are supposed to work on older kernels, which
was manually validated during development.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../bpf/progs/test_global_func_ctx_args.c     | 49 +++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
index 7faa8eef0598..9a06e5eb1fbe 100644
--- a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
+++ b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c
@@ -102,3 +102,52 @@ int perf_event_ctx(void *ctx)
 {
 	return perf_event_ctx_subprog(ctx);
 }
+
+/* this global subprog can now be called from many types of entry progs, each
+ * with different context type
+ */
+__weak int subprog_ctx_tag(void *ctx __arg_ctx)
+{
+	return bpf_get_stack(ctx, stack, sizeof(stack), 0);
+}
+
+struct my_struct { int x; };
+
+__weak int subprog_multi_ctx_tags(void *ctx1 __arg_ctx,
+				  struct my_struct *mem,
+				  void *ctx2 __arg_ctx)
+{
+	if (!mem)
+		return 0;
+
+	return bpf_get_stack(ctx1, stack, sizeof(stack), 0) +
+	       mem->x +
+	       bpf_get_stack(ctx2, stack, sizeof(stack), 0);
+}
+
+SEC("?raw_tp")
+__success __log_level(2)
+int arg_tag_ctx_raw_tp(void *ctx)
+{
+	struct my_struct x = { .x = 123 };
+
+	return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
+}
+
+SEC("?perf_event")
+__success __log_level(2)
+int arg_tag_ctx_perf(void *ctx)
+{
+	struct my_struct x = { .x = 123 };
+
+	return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
+}
+
+SEC("?kprobe")
+__success __log_level(2)
+int arg_tag_ctx_kprobe(void *ctx)
+{
+	struct my_struct x = { .x = 123 };
+
+	return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx);
+}
-- 
2.34.1


* Re: [PATCH bpf-next 6/8] libbpf: move BTF loading step after relocation step
  2023-12-20 23:31 ` [PATCH bpf-next 6/8] libbpf: move BTF loading step after " Andrii Nakryiko
@ 2023-12-22  1:13   ` Alexei Starovoitov
  2024-01-02 16:49     ` Andrii Nakryiko
  0 siblings, 1 reply; 16+ messages in thread
From: Alexei Starovoitov @ 2023-12-22  1:13 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, martin.lau, kernel-team

On Wed, Dec 20, 2023 at 03:31:25PM -0800, Andrii Nakryiko wrote:
>  
> -static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
> +static int bpf_object_load_btf(struct bpf_object *obj)
>  {
>  	struct btf *kern_btf = obj->btf;
>  	bool btf_mandatory, sanitize;
> @@ -8091,10 +8091,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
>  	err = bpf_object__probe_loading(obj);
>  	err = err ? : bpf_object__load_vmlinux_btf(obj, false);
>  	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
> -	err = err ? : bpf_object__sanitize_and_load_btf(obj);
>  	err = err ? : bpf_object__sanitize_maps(obj);
>  	err = err ? : bpf_object__init_kern_struct_ops_maps(obj);
>  	err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : target_btf_path);
> +	err = err ? : bpf_object_load_btf(obj);

Here and in the previous patch:
-bpf_object__create_maps(struct bpf_object *obj)
+static int bpf_object_create_maps(struct bpf_object *obj)

Let's keep the __ convention. No need to deviate for these two methods.
Otherwise the above loading sequence looks odd, and questions pop up about
why one method has a single underscore while the others have two.

pw-bot: cr

* Re: [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic
  2023-12-20 23:31 ` [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic Andrii Nakryiko
@ 2023-12-22  1:26   ` Alexei Starovoitov
  2024-01-02 17:06     ` Andrii Nakryiko
  2023-12-22 13:15   ` Jiri Olsa
  1 sibling, 1 reply; 16+ messages in thread
From: Alexei Starovoitov @ 2023-12-22  1:26 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, martin.lau, kernel-team

On Wed, Dec 20, 2023 at 03:31:26PM -0800, Andrii Nakryiko wrote:
> This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> the actual argument type not important, so that user can just define
> "generic" signature:
> 
>   __noinline int global_subprog(void *ctx __arg_ctx) { ... }

Just realized that we probably need to enforce, both in libbpf doing the
rewrite and in the kernel, that __arg_ctx is either a valid
'struct correct_type_for_bpf_prog *' or 'void *'.

Otherwise the user will get surprising behavior when
int foo(struct __sk_buff *ctx __arg_ctx)
{
  ctx->len;
}
will get rewritten to 'struct pt_regs *ctx' based on the prog type,
while all asm instructions inside the prog were compiled with 'struct __sk_buff'
and CO-RE performs relocations against that type.
 
> +static struct {
> +	enum bpf_prog_type prog_type;
> +	const char *ctx_name;
> +} global_ctx_map[] = {
> +	{ BPF_PROG_TYPE_CGROUP_DEVICE,           "bpf_cgroup_dev_ctx" },
> +	{ BPF_PROG_TYPE_CGROUP_SKB,              "__sk_buff" },
> +	{ BPF_PROG_TYPE_CGROUP_SOCK,             "bpf_sock" },
> +	{ BPF_PROG_TYPE_CGROUP_SOCK_ADDR,        "bpf_sock_addr" },
> +	{ BPF_PROG_TYPE_CGROUP_SOCKOPT,          "bpf_sockopt" },
> +	{ BPF_PROG_TYPE_CGROUP_SYSCTL,           "bpf_sysctl" },
> +	{ BPF_PROG_TYPE_FLOW_DISSECTOR,          "__sk_buff" },
> +	{ BPF_PROG_TYPE_KPROBE,                  "bpf_user_pt_regs_t" },
> +	{ BPF_PROG_TYPE_LWT_IN,                  "__sk_buff" },
> +	{ BPF_PROG_TYPE_LWT_OUT,                 "__sk_buff" },
> +	{ BPF_PROG_TYPE_LWT_SEG6LOCAL,           "__sk_buff" },
> +	{ BPF_PROG_TYPE_LWT_XMIT,                "__sk_buff" },
> +	{ BPF_PROG_TYPE_NETFILTER,               "bpf_nf_ctx" },
> +	{ BPF_PROG_TYPE_PERF_EVENT,              "bpf_perf_event_data" },
> +	{ BPF_PROG_TYPE_RAW_TRACEPOINT,          "bpf_raw_tracepoint_args" },
> +	{ BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, "bpf_raw_tracepoint_args" },
> +	{ BPF_PROG_TYPE_SCHED_ACT,               "__sk_buff" },
> +	{ BPF_PROG_TYPE_SCHED_CLS,               "__sk_buff" },
> +	{ BPF_PROG_TYPE_SK_LOOKUP,               "bpf_sk_lookup" },
> +	{ BPF_PROG_TYPE_SK_MSG,                  "sk_msg_md" },
> +	{ BPF_PROG_TYPE_SK_REUSEPORT,            "sk_reuseport_md" },
> +	{ BPF_PROG_TYPE_SK_SKB,                  "__sk_buff" },
> +	{ BPF_PROG_TYPE_SOCK_OPS,                "bpf_sock_ops" },
> +	{ BPF_PROG_TYPE_SOCKET_FILTER,           "__sk_buff" },
> +	{ BPF_PROG_TYPE_XDP,                     "xdp_md" },

We already share .c files (like relo_core.c) between the kernel and libbpf,
so let's share here as well to avoid copy-paste.
All of the above is available in include/linux/bpf_types.h.

> +		/* clone fn/fn_proto, unless we already did it for another arg */
> +		if (func_rec->type_id == orig_fn_id) {

It feels like the body of this 'if' could be factored out into a separate helper function.

> -static int
> -bpf_object__load_progs(struct bpf_object *obj, int log_level)
> +static int bpf_object_load_progs(struct bpf_object *obj, int log_level)

pls keep __ convention.

* Re: [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic
  2023-12-20 23:31 ` [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic Andrii Nakryiko
  2023-12-22  1:26   ` Alexei Starovoitov
@ 2023-12-22 13:15   ` Jiri Olsa
  2024-01-02 17:11     ` Andrii Nakryiko
  1 sibling, 1 reply; 16+ messages in thread
From: Jiri Olsa @ 2023-12-22 13:15 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, martin.lau, kernel-team

On Wed, Dec 20, 2023 at 03:31:26PM -0800, Andrii Nakryiko wrote:
> Out of all the special global func arg tag annotations, __arg_ctx is
> practically the most immediately useful and most critical to have
> working across a multitude of kernel versions, if possible. This would
> allow end users to write much simpler code if __arg_ctx semantics worked
> for older kernels that don't natively understand btf_decl_tag("arg:ctx")
> in verifier logic.

curious what the workaround is now.. having a separate function for each
program type instead of just one global function? I wonder if the ebpf/cilium
library could do the same thing

whole patchset lgtm:

Acked-by: Jiri Olsa <jolsa@kernel.org>

jirka

> 
> Luckily, it is possible to ensure __arg_ctx works on old kernels through
> a bit of extra work done by libbpf, at least in a lot of common cases.
> 
> To explain the overall idea, we need to go back to how the context argument
> was supported in global funcs before __arg_ctx support was added. This
> was done based on special struct name checks in kernel. E.g., for
> BPF_PROG_TYPE_PERF_EVENT the expectation is that argument type `struct
> bpf_perf_event_data *` marks that argument as PTR_TO_CTX. This is all
> good as long as global function is used from the same BPF program types
> only, which is often not the case. If the same subprog has to be called
> from, say, kprobe and perf_event program types, there is no single
> definition that would satisfy BPF verifier. Subprog will have context
> argument either for kprobe (if using bpf_user_pt_regs_t struct name) or
> perf_event (with bpf_perf_event_data struct name), but not both.
> 
> This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> the actual argument type not important, so that user can just define
> "generic" signature:
> 
>   __noinline int global_subprog(void *ctx __arg_ctx) { ... }
> 
> I won't belabor how libbpf is implementing subprograms, see a huge
> comment next to bpf_object__relocate_calls() function. The idea is that
> each main/entry BPF program gets its own copy of global_subprog's code
> appended.
> 
> This per-program copy of global subprog code *and* associated func_info
> .BTF.ext information, pointing to FUNC -> FUNC_PROTO BTF type chain
> allows libbpf to simulate __arg_ctx behavior transparently, even if the
> kernel doesn't yet support __arg_ctx annotation natively.
> 
> The idea is straightforward: each time we append global subprog's code
> and func_info information, we adjust its FUNC -> FUNC_PROTO type
> information, if necessary (that is, libbpf can detect the presence of
> btf_decl_tag("arg:ctx") just like BPF verifier would do it).
> 
> The rest is just mechanical and somewhat painful BTF manipulation code.
> It's painful because we need to clone FUNC -> FUNC_PROTO, instead of
> reusing it, as same FUNC -> FUNC_PROTO chain might be used by another
> main BPF program within the same BPF object, so we can't just modify it
> in-place (and cloning BTF types within the same struct btf object is
> painful due to constant memory invalidation, see comments in code).
> Uploaded BPF object's BTF information has to work for all BPF
> programs at the same time.
> 
> Once we have FUNC -> FUNC_PROTO clones, we make sure that instead of
> using some `void *ctx` parameter definition, we have an expected `struct
> bpf_perf_event_data *ctx` definition (as far as BPF verifier and kernel
> is concerned), which will mark it as context for BPF verifier. Same
> global subprog relocated and copied into another main BPF program will
> get different type information according to main program's type. It all
> works out in the end in a completely transparent way for end user.
> 
> Libbpf maintains the program type -> expected context struct name
> mapping internally. Note, not all BPF program types have a named context
> struct, so this approach won't work for such programs (just like it
> didn't before __arg_ctx). So native __arg_ctx support is still important
> to have in the kernel to provide generic context support across all BPF
> program types.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>  tools/lib/bpf/libbpf.c | 239 +++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 231 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 92171bcf4c25..1a7354b6a289 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -6168,7 +6168,7 @@ reloc_prog_func_and_line_info(const struct bpf_object *obj,
>  	int err;
>  
>  	/* no .BTF.ext relocation if .BTF.ext is missing or kernel doesn't
> -	 * supprot func/line info
> +	 * support func/line info
>  	 */
>  	if (!obj->btf_ext || !kernel_supports(obj, FEAT_BTF_FUNC))
>  		return 0;
> @@ -6650,8 +6650,223 @@ static int bpf_prog_assign_exc_cb(struct bpf_object *obj, struct bpf_program *pr
>  	return 0;
>  }
>  
> -static int
> -bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
> +static struct {
> +	enum bpf_prog_type prog_type;
> +	const char *ctx_name;
> +} global_ctx_map[] = {
> +	{ BPF_PROG_TYPE_CGROUP_DEVICE,           "bpf_cgroup_dev_ctx" },
> +	{ BPF_PROG_TYPE_CGROUP_SKB,              "__sk_buff" },
> +	{ BPF_PROG_TYPE_CGROUP_SOCK,             "bpf_sock" },
> +	{ BPF_PROG_TYPE_CGROUP_SOCK_ADDR,        "bpf_sock_addr" },
> +	{ BPF_PROG_TYPE_CGROUP_SOCKOPT,          "bpf_sockopt" },
> +	{ BPF_PROG_TYPE_CGROUP_SYSCTL,           "bpf_sysctl" },
> +	{ BPF_PROG_TYPE_FLOW_DISSECTOR,          "__sk_buff" },
> +	{ BPF_PROG_TYPE_KPROBE,                  "bpf_user_pt_regs_t" },
> +	{ BPF_PROG_TYPE_LWT_IN,                  "__sk_buff" },
> +	{ BPF_PROG_TYPE_LWT_OUT,                 "__sk_buff" },
> +	{ BPF_PROG_TYPE_LWT_SEG6LOCAL,           "__sk_buff" },
> +	{ BPF_PROG_TYPE_LWT_XMIT,                "__sk_buff" },
> +	{ BPF_PROG_TYPE_NETFILTER,               "bpf_nf_ctx" },
> +	{ BPF_PROG_TYPE_PERF_EVENT,              "bpf_perf_event_data" },
> +	{ BPF_PROG_TYPE_RAW_TRACEPOINT,          "bpf_raw_tracepoint_args" },
> +	{ BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, "bpf_raw_tracepoint_args" },
> +	{ BPF_PROG_TYPE_SCHED_ACT,               "__sk_buff" },
> +	{ BPF_PROG_TYPE_SCHED_CLS,               "__sk_buff" },
> +	{ BPF_PROG_TYPE_SK_LOOKUP,               "bpf_sk_lookup" },
> +	{ BPF_PROG_TYPE_SK_MSG,                  "sk_msg_md" },
> +	{ BPF_PROG_TYPE_SK_REUSEPORT,            "sk_reuseport_md" },
> +	{ BPF_PROG_TYPE_SK_SKB,                  "__sk_buff" },
> +	{ BPF_PROG_TYPE_SOCK_OPS,                "bpf_sock_ops" },
> +	{ BPF_PROG_TYPE_SOCKET_FILTER,           "__sk_buff" },
> +	{ BPF_PROG_TYPE_XDP,                     "xdp_md" },
> +	/* all other program types don't have "named" context structs */
> +};
> +
> +/* Check if main program or global subprog's function prototype has `arg:ctx`
> + * argument tags, and, if necessary, substitute correct type to match what BPF
> + * verifier would expect, taking into account specific program type. This
> + * allows to support __arg_ctx tag transparently on old kernels that don't yet
> + * have a native support for it in the verifier, making user's life much
> + * easier.
> + */
> +static int bpf_program_fixup_func_info(struct bpf_object *obj, struct bpf_program *prog)
> +{
> +	const char *ctx_name = NULL, *ctx_tag = "arg:ctx";
> +	struct bpf_func_info_min *func_rec;
> +	struct btf_type *fn_t, *fn_proto_t;
> +	struct btf *btf = obj->btf;
> +	const struct btf_type *t;
> +	struct btf_param *p;
> +	int ptr_id = 0, struct_id, tag_id, orig_fn_id;
> +	int i, j, n, arg_idx, arg_cnt, err, name_off, rec_idx;
> +	int *orig_ids;
> +
> +	/* no .BTF.ext, no problem */
> +	if (!obj->btf_ext || !prog->func_info)
> +		return 0;
> +
> +	/* some BPF program types just don't have named context structs, so
> +	 * this fallback mechanism doesn't work for them
> +	 */
> +	for (i = 0; i < ARRAY_SIZE(global_ctx_map); i++) {
> +		if (global_ctx_map[i].prog_type != prog->type)
> +			continue;
> +		ctx_name = global_ctx_map[i].ctx_name;
> +		break;
> +	}
> +	if (!ctx_name)
> +		return 0;
> +
> +	/* remember original func BTF IDs to detect if we already cloned them */
> +	orig_ids = calloc(prog->func_info_cnt, sizeof(*orig_ids));
> +	if (!orig_ids)
> +		return -ENOMEM;
> +	for (i = 0; i < prog->func_info_cnt; i++) {
> +		func_rec = prog->func_info + prog->func_info_rec_size * i;
> +		orig_ids[i] = func_rec->type_id;
> +	}
> +
> +	/* go through each DECL_TAG with "arg:ctx" and see if it points to one
> +	 * of our subprogs; if yes and subprog is global and needs adjustment,
> +	 * clone and adjust FUNC -> FUNC_PROTO combo
> +	 */
> +	for (i = 1, n = btf__type_cnt(btf); i < n; i++) {
> +		/* only DECL_TAG with "arg:ctx" value are interesting */
> +		t = btf__type_by_id(btf, i);
> +		if (!btf_is_decl_tag(t))
> +			continue;
> +		if (strcmp(btf__str_by_offset(btf, t->name_off), ctx_tag) != 0)
> +			continue;
> +
> +		/* only global funcs need adjustment, if at all */
> +		orig_fn_id = t->type;
> +		fn_t = btf_type_by_id(btf, orig_fn_id);
> +		if (!btf_is_func(fn_t) || btf_func_linkage(fn_t) != BTF_FUNC_GLOBAL)
> +			continue;
> +
> +		/* sanity check FUNC -> FUNC_PROTO chain, just in case */
> +		fn_proto_t = btf_type_by_id(btf, fn_t->type);
> +		if (!fn_proto_t || !btf_is_func_proto(fn_proto_t))
> +			continue;
> +
> +		/* find corresponding func_info record */
> +		func_rec = NULL;
> +		for (rec_idx = 0; rec_idx < prog->func_info_cnt; rec_idx++) {
> +			if (orig_ids[rec_idx] == t->type) {
> +				func_rec = prog->func_info + prog->func_info_rec_size * rec_idx;
> +				break;
> +			}
> +		}
> +		/* current main program doesn't call into this subprog */
> +		if (!func_rec)
> +			continue;
> +
> +		/* some more sanity checking of DECL_TAG */
> +		arg_cnt = btf_vlen(fn_proto_t);
> +		arg_idx = btf_decl_tag(t)->component_idx;
> +		if (arg_idx < 0 || arg_idx >= arg_cnt)
> +			continue;
> +
> +		/* check if existing parameter already matches verifier expectations */
> +		p = &btf_params(fn_proto_t)[arg_idx];
> +		t = skip_mods_and_typedefs(btf, p->type, NULL);
> +		if (btf_is_ptr(t) &&
> +		    (t = skip_mods_and_typedefs(btf, t->type, NULL)) &&
> +		    btf_is_struct(t) &&
> +		    strcmp(btf__str_by_offset(btf, t->name_off), ctx_name) == 0) {
> +			continue; /* no need for fix up */
> +		}
> +
> +		/* clone fn/fn_proto, unless we already did it for another arg */
> +		if (func_rec->type_id == orig_fn_id) {
> +			int fn_id, fn_proto_id, ret_type_id, orig_proto_id;
> +
> +			/* Note that each btf__add_xxx() operation invalidates
> +			 * all btf_type and string pointers, so we need to be
> +			 * very careful when cloning BTF types. BTF type
> +			 * pointers have to be always refetched. And to avoid
> +			 * problems with invalidated string pointers, we
> +			 * add empty strings initially, then just fix up
> +			 * name_off offsets in place. Offsets are stable for
> +			 * existing strings, so that works out.
> +			 */
> +			name_off = fn_t->name_off; /* we are about to invalidate fn_t */
> +			ret_type_id = fn_proto_t->type; /* and fn_proto_t as well */
> +			orig_proto_id = fn_t->type; /* original FUNC_PROTO ID */
> +
> +			/* clone FUNC first, btf__add_func() enforces
> +			 * non-empty name, so use entry program's name as
> +			 * a placeholder, which we replace immediately
> +			 */
> +			fn_id = btf__add_func(btf, prog->name, btf_func_linkage(fn_t), fn_t->type);
> +			if (fn_id < 0)
> +				return -EINVAL;
> +			fn_t = btf_type_by_id(btf, fn_id);
> +			fn_t->name_off = name_off; /* reuse original string */
> +			fn_t->type = fn_id + 1; /* we can predict FUNC_PROTO ID */
> +
> +			/* clone FUNC_PROTO and its params now */
> +			fn_proto_id = btf__add_func_proto(btf, ret_type_id);
> +			if (fn_proto_id < 0) {
> +				err = -EINVAL;
> +				goto err_out;
> +			}
> +			for (j = 0; j < arg_cnt; j++) {
> +				/* copy original parameter data */
> +				t = btf_type_by_id(btf, orig_proto_id);
> +				p = &btf_params(t)[j];
> +				name_off = p->name_off;
> +
> +				err = btf__add_func_param(btf, "", p->type);
> +				if (err)
> +					goto err_out;
> +				fn_proto_t = btf_type_by_id(btf, fn_proto_id);
> +				p = &btf_params(fn_proto_t)[j];
> +				p->name_off = name_off; /* use remembered str offset */
> +			}
> +
> +			/* point func_info record to a cloned FUNC type */
> +			func_rec->type_id = fn_id;
> +		}
> +
> +		/* create PTR -> STRUCT type chain to mark PTR_TO_CTX argument;
> +		 * we do it just once per main BPF program, as all global
> +		 * funcs share the same program type, so need only PTR ->
> +		 * STRUCT type chain
> +		 */
> +		if (ptr_id == 0) {
> +			struct_id = btf__add_struct(btf, ctx_name, 0);
> +			ptr_id = btf__add_ptr(btf, struct_id);
> +			if (ptr_id < 0 || struct_id < 0) {
> +				err = -EINVAL;
> +				goto err_out;
> +			}
> +		}
> +
> +		/* for completeness, clone DECL_TAG and point it to cloned param */
> +		tag_id = btf__add_decl_tag(btf, ctx_tag, func_rec->type_id, arg_idx);
> +		if (tag_id < 0) {
> +			err = -EINVAL;
> +			goto err_out;
> +		}
> +
> +		/* all the BTF manipulations invalidated pointers, refetch them */
> +		fn_t = btf_type_by_id(btf, func_rec->type_id);
> +		fn_proto_t = btf_type_by_id(btf, fn_t->type);
> +
> +		/* fix up type ID pointed to by param */
> +		p = &btf_params(fn_proto_t)[arg_idx];
> +		p->type = ptr_id;
> +	}
> +
> +	free(orig_ids);
> +	return 0;
> +err_out:
> +	free(orig_ids);
> +	return err;
> +}
> +
> +static int bpf_object_relocate(struct bpf_object *obj, const char *targ_btf_path)
>  {
>  	struct bpf_program *prog;
>  	size_t i, j;
> @@ -6732,19 +6947,28 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
>  			}
>  		}
>  	}
> -	/* Process data relos for main programs */
>  	for (i = 0; i < obj->nr_programs; i++) {
>  		prog = &obj->programs[i];
>  		if (prog_is_subprog(obj, prog))
>  			continue;
>  		if (!prog->autoload)
>  			continue;
> +
> +		/* Process data relos for main programs */
>  		err = bpf_object__relocate_data(obj, prog);
>  		if (err) {
>  			pr_warn("prog '%s': failed to relocate data references: %d\n",
>  				prog->name, err);
>  			return err;
>  		}
> +
> +		/* Fix up .BTF.ext information, if necessary */
> +		err = bpf_program_fixup_func_info(obj, prog);
> +		if (err) {
> +			pr_warn("prog '%s': failed to perform .BTF.ext fix ups: %d\n",
> +				prog->name, err);
> +			return err;
> +		}
>  	}
>  
>  	return 0;
> @@ -7456,8 +7680,7 @@ static int bpf_program_record_relos(struct bpf_program *prog)
>  	return 0;
>  }
>  
> -static int
> -bpf_object__load_progs(struct bpf_object *obj, int log_level)
> +static int bpf_object_load_progs(struct bpf_object *obj, int log_level)
>  {
>  	struct bpf_program *prog;
>  	size_t i;
> @@ -8093,10 +8316,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
>  	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
>  	err = err ? : bpf_object__sanitize_maps(obj);
>  	err = err ? : bpf_object__init_kern_struct_ops_maps(obj);
> -	err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : target_btf_path);
> +	err = err ? : bpf_object_relocate(obj, obj->btf_custom_path ? : target_btf_path);
>  	err = err ? : bpf_object_load_btf(obj);
>  	err = err ? : bpf_object_create_maps(obj);
> -	err = err ? : bpf_object__load_progs(obj, extra_log_level);
> +	err = err ? : bpf_object_load_progs(obj, extra_log_level);
>  	err = err ? : bpf_object_init_prog_arrays(obj);
>  	err = err ? : bpf_object_prepare_struct_ops(obj);
>  
> -- 
> 2.34.1
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next 6/8] libbpf: move BTF loading step after relocation step
  2023-12-22  1:13   ` Alexei Starovoitov
@ 2024-01-02 16:49     ` Andrii Nakryiko
  0 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2024-01-02 16:49 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, kernel-team

On Thu, Dec 21, 2023 at 5:13 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Dec 20, 2023 at 03:31:25PM -0800, Andrii Nakryiko wrote:
> >
> > -static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
> > +static int bpf_object_load_btf(struct bpf_object *obj)
> >  {
> >       struct btf *kern_btf = obj->btf;
> >       bool btf_mandatory, sanitize;
> > @@ -8091,10 +8091,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
> >       err = bpf_object__probe_loading(obj);
> >       err = err ? : bpf_object__load_vmlinux_btf(obj, false);
> >       err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
> > -     err = err ? : bpf_object__sanitize_and_load_btf(obj);
> >       err = err ? : bpf_object__sanitize_maps(obj);
> >       err = err ? : bpf_object__init_kern_struct_ops_maps(obj);
> >       err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : target_btf_path);
> > +     err = err ? : bpf_object_load_btf(obj);
>
> Here and in the previous patch:
> -bpf_object__create_maps(struct bpf_object *obj)
> +static int bpf_object_create_maps(struct bpf_object *obj)
>
> Let's keep the __ convention. No need to deviate for these two methods.
> Otherwise the above loading sequence looks odd and questions pop up on why
> one method has a single underscore and the others have two.

I agree about consistency, but these are internal "methods", and since
at least libbpf 1.0 we established a convention that only public APIs
will use double-underscore naming. We've been lazily converting old
code as we touched it, but I'll just include a pre-patch that will
rename all the remaining inconsistently named ones in one go to not
have to worry about this going forward.


>
> pw-bot: cr

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic
  2023-12-22  1:26   ` Alexei Starovoitov
@ 2024-01-02 17:06     ` Andrii Nakryiko
  2024-01-03  0:30       ` Andrii Nakryiko
  0 siblings, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2024-01-02 17:06 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, kernel-team

On Thu, Dec 21, 2023 at 5:26 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Dec 20, 2023 at 03:31:26PM -0800, Andrii Nakryiko wrote:
> > This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> > the actual argument type not important, so that user can just define
> > "generic" signature:
> >
> >   __noinline int global_subprog(void *ctx __arg_ctx) { ... }
>
> Just realized that we probably need to enforce, both in libbpf when doing
> the rewrite and in the kernel, that __arg_ctx is either a valid
> 'struct correct_type_for_bpf_prog *' or 'void *'.
>
> Otherwise the user will get surprising behavior when
> int foo(struct __sk_buff *ctx __arg_ctx)
> {
>   ctx->len;
> }
> will get rewritten to 'struct pt_regs *ctx' based on prog type
> while all asm instructions inside prog were compiled with 'struct __sk_buff'
> and CO_RE performs relocations against that type.

Nothing really prevents users from misusing types even today, so it
doesn't seem like a common problem, luckily.

But really the problem is that some program types don't have an
associated struct name at all, but still have a valid context. Like for
LSM/TRACING programs it's a u64[]/u64 *. For tracepoints context is
defined as just plain u64 (according to bpf_ctx_convert), and so on.

Oh, and there is KPROBE program type, where it's (typedef)
bpf_user_pt_regs_t, and for backwards compatibility reasons we also
allow basically non-existing `struct bpf_user_pt_regs_t`.

So it gets messy. Either way, I have a patch set coming up for
kernel-side __arg_xxx tags support, so let's discuss it there?

>
> > +static struct {
> > +     enum bpf_prog_type prog_type;
> > +     const char *ctx_name;
> > +} global_ctx_map[] = {
> > +     { BPF_PROG_TYPE_CGROUP_DEVICE,           "bpf_cgroup_dev_ctx" },
> > +     { BPF_PROG_TYPE_CGROUP_SKB,              "__sk_buff" },
> > +     { BPF_PROG_TYPE_CGROUP_SOCK,             "bpf_sock" },
> > +     { BPF_PROG_TYPE_CGROUP_SOCK_ADDR,        "bpf_sock_addr" },
> > +     { BPF_PROG_TYPE_CGROUP_SOCKOPT,          "bpf_sockopt" },
> > +     { BPF_PROG_TYPE_CGROUP_SYSCTL,           "bpf_sysctl" },
> > +     { BPF_PROG_TYPE_FLOW_DISSECTOR,          "__sk_buff" },
> > +     { BPF_PROG_TYPE_KPROBE,                  "bpf_user_pt_regs_t" },
> > +     { BPF_PROG_TYPE_LWT_IN,                  "__sk_buff" },
> > +     { BPF_PROG_TYPE_LWT_OUT,                 "__sk_buff" },
> > +     { BPF_PROG_TYPE_LWT_SEG6LOCAL,           "__sk_buff" },
> > +     { BPF_PROG_TYPE_LWT_XMIT,                "__sk_buff" },
> > +     { BPF_PROG_TYPE_NETFILTER,               "bpf_nf_ctx" },
> > +     { BPF_PROG_TYPE_PERF_EVENT,              "bpf_perf_event_data" },
> > +     { BPF_PROG_TYPE_RAW_TRACEPOINT,          "bpf_raw_tracepoint_args" },
> > +     { BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, "bpf_raw_tracepoint_args" },
> > +     { BPF_PROG_TYPE_SCHED_ACT,               "__sk_buff" },
> > +     { BPF_PROG_TYPE_SCHED_CLS,               "__sk_buff" },
> > +     { BPF_PROG_TYPE_SK_LOOKUP,               "bpf_sk_lookup" },
> > +     { BPF_PROG_TYPE_SK_MSG,                  "sk_msg_md" },
> > +     { BPF_PROG_TYPE_SK_REUSEPORT,            "sk_reuseport_md" },
> > +     { BPF_PROG_TYPE_SK_SKB,                  "__sk_buff" },
> > +     { BPF_PROG_TYPE_SOCK_OPS,                "bpf_sock_ops" },
> > +     { BPF_PROG_TYPE_SOCKET_FILTER,           "__sk_buff" },
> > +     { BPF_PROG_TYPE_XDP,                     "xdp_md" },
>
> We already share the .c files (like relo_core.c) between kernel and libbpf;
> let's share here as well to avoid copy-paste.
> All of the above is available in include/linux/bpf_types.h

True, but libbpf sources are built both as part of the kernel repo and
separately on GitHub, so we'll need to start syncing
include/linux/bpf_types.h into tools/include, which is a bit of an
inconvenience.

But most of all I don't want to do it for a few other reasons.

This table was manually constructed by inspecting struct bpf_ctx_convert with:

  bpftool btf dump file /sys/kernel/btf/vmlinux format c | \
    rg bpf_ctx_convert -A65 | rg _prog

And it has to be manual because of other program types that don't have
an associated struct for context. So even if there was bpf_types.h, we
couldn't use it as is.

But, if your concern is maintainability of this, I don't think that's
a problem at all. Even if we add a new program type with its own
struct name for context, we don't even have to extend this table
(though we might, if we want to), because at that point kernel is
guaranteed to have in-kernel native support for __arg_ctx, so libbpf
doesn't have to do type rewriting.

Also, this probably is the first explicit table that shows which
struct names should be used for global subprog context argument (if
not using __arg_ctx, of course). Which is not really a reason per se,
but it beats reading kernel code, and (non-trivially) figuring out
that one needs to look how struct bpf_ctx_convert is generated, etc.
Having this explicit table is much easier to link as a reference for
those special context type names.

>
> > +             /* clone fn/fn_proto, unless we already did it for another arg */
> > +             if (func_rec->type_id == orig_fn_id) {
>
> It feels that body of this 'if' can be factored out as a separate helper function.
>

Sure, I'll try to factor it out.

> > -static int
> > -bpf_object__load_progs(struct bpf_object *obj, int log_level)
> > +static int bpf_object_load_progs(struct bpf_object *obj, int log_level)
>
> pls keep __ convention.

replied on another patch, I'll do a conversion to consistent naming in
the next revision

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic
  2023-12-22 13:15   ` Jiri Olsa
@ 2024-01-02 17:11     ` Andrii Nakryiko
  0 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2024-01-02 17:11 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, kernel-team

On Fri, Dec 22, 2023 at 5:15 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Wed, Dec 20, 2023 at 03:31:26PM -0800, Andrii Nakryiko wrote:
> > Out of all special global func arg tag annotations, __arg_ctx is
> > practically the most immediately useful and most critical to have
> > working across a multitude of kernel versions, if possible. This would allow
> > end users to write much simpler code if __arg_ctx semantics worked for
> > older kernels that don't natively understand btf_decl_tag("arg:ctx") in
> > verifier logic.
>
> curious what the workaround is now.. having a separate function for each
> program type instead of just one global function? I wonder if the ebpf/cilium
> library could do the same thing

You mean what users do today? Something like this:

/* static, so types don't matter */

static int common_logic(void *ctx, ...) { ... }

/* global */ int kprobe_logic(struct bpf_user_pt_regs_t *ctx)
{
    return common_logic(ctx);
}

/* global */ int perf_event_logic(struct bpf_perf_event_data *ctx)
{
    return common_logic(ctx);
}

And so on. So it's not great, but it works.

The problem arises when you have nested global functions that need to
pass context.


/* global */ int kprobe_logic_1(struct bpf_user_pt_regs_t *ctx)
{
    ...
}

/* global */ int kprobe_logic_2(struct bpf_user_pt_regs_t *ctx)
{
    int x;

    x = kprobe_logic_1(ctx);
    ...
}


With this nesting of global funcs the above trick no longer works,
because common_logic() can't call a per-program global function.
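
For contrast, with __arg_ctx (or this patch's fallback on older kernels)
the context type no longer has to encode the program type, so both the
sharing and the nesting just work. A rough sketch, treating __arg_ctx as
the btf_decl_tag("arg:ctx") attribute macro from the commit message:

__noinline int common_logic(void *ctx __arg_ctx)
{
    return 0;
}

__noinline int shared_logic_1(void *ctx __arg_ctx)
{
    return common_logic(ctx);
}

__noinline int shared_logic_2(void *ctx __arg_ctx)
{
    return shared_logic_1(ctx);
}

and the same definitions can be called from kprobe and perf_event entry
programs alike, directly or through each other.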

>
> whole patchset lgtm:
>
> Acked-by: Jiri Olsa <jolsa@kernel.org>
>

Thanks!


> jirka
>
> >
> > Luckily, it is possible to ensure __arg_ctx works on old kernels through
> > a bit of extra work done by libbpf, at least in a lot of common cases.
> >
> > To explain the overall idea, we need to go back to how the context argument
> > was supported in global funcs before __arg_ctx support was added. This
> > was done based on special struct name checks in kernel. E.g., for
> > BPF_PROG_TYPE_PERF_EVENT the expectation is that argument type `struct
> > bpf_perf_event_data *` marks that argument as PTR_TO_CTX. This is all
> > good as long as global function is used from the same BPF program types
> > only, which is often not the case. If the same subprog has to be called
> > from, say, kprobe and perf_event program types, there is no single
> > definition that would satisfy BPF verifier. Subprog will have context
> > argument either for kprobe (if using bpf_user_pt_regs_t struct name) or
> > perf_event (with bpf_perf_event_data struct name), but not both.
> >
> > This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> > the actual argument type not important, so that user can just define
> > "generic" signature:
> >
> >   __noinline int global_subprog(void *ctx __arg_ctx) { ... }
> >
> > I won't belabor how libbpf is implementing subprograms, see a huge
> > comment next to bpf_object__relocate_calls() function. The idea is that
> > each main/entry BPF program gets its own copy of global_subprog's code
> > appended.
> >
> > This per-program copy of global subprog code *and* associated func_info
> > .BTF.ext information, pointing to FUNC -> FUNC_PROTO BTF type chain
> > allows libbpf to simulate __arg_ctx behavior transparently, even if the
> > kernel doesn't yet support __arg_ctx annotation natively.
> >
> > The idea is straightforward: each time we append global subprog's code
> > and func_info information, we adjust its FUNC -> FUNC_PROTO type
> > information, if necessary (that is, libbpf can detect the presence of
> > btf_decl_tag("arg:ctx") just like BPF verifier would do it).
> >
> > The rest is just mechanical and somewhat painful BTF manipulation code.
> > It's painful because we need to clone FUNC -> FUNC_PROTO, instead of
> > reusing it, as same FUNC -> FUNC_PROTO chain might be used by another
> > main BPF program within the same BPF object, so we can't just modify it
> > in-place (and cloning BTF types within the same struct btf object is
> > painful due to constant memory invalidation, see comments in code).
> > Uploaded BPF object's BTF information has to work for all BPF
> > programs at the same time.
> >
> > Once we have FUNC -> FUNC_PROTO clones, we make sure that instead of
> > using some `void *ctx` parameter definition, we have an expected `struct
> > bpf_perf_event_data *ctx` definition (as far as BPF verifier and kernel
> > is concerned), which will mark it as context for BPF verifier. Same
> > global subprog relocated and copied into another main BPF program will
> > get different type information according to main program's type. It all
> > works out in the end in a completely transparent way for end user.
> >
> > Libbpf maintains the program type -> expected context struct name
> > mapping internally. Note, not all BPF program types have a named context
> > struct, so this approach won't work for such programs (just like it
> > didn't before __arg_ctx). So native __arg_ctx support is still important
> > to have in the kernel to provide generic context support across all BPF
> > program types.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> >  tools/lib/bpf/libbpf.c | 239 +++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 231 insertions(+), 8 deletions(-)
> >

please trim

[...]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic
  2024-01-02 17:06     ` Andrii Nakryiko
@ 2024-01-03  0:30       ` Andrii Nakryiko
  0 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2024-01-03  0:30 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, ast, daniel, martin.lau, kernel-team

On Tue, Jan 2, 2024 at 9:06 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Dec 21, 2023 at 5:26 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Dec 20, 2023 at 03:31:26PM -0800, Andrii Nakryiko wrote:
> > > This limitation was the reason to add btf_decl_tag("arg:ctx"), making
> > > the actual argument type not important, so that user can just define
> > > "generic" signature:
> > >
> > >   __noinline int global_subprog(void *ctx __arg_ctx) { ... }
> >
> > Just realized that we probably need to enforce, both in libbpf when doing
> > the rewrite and in the kernel, that __arg_ctx is either a valid
> > 'struct correct_type_for_bpf_prog *' or 'void *'.
> >
> > Otherwise the user will get surprising behavior when
> > int foo(struct __sk_buff *ctx __arg_ctx)
> > {
> >   ctx->len;
> > }
> > will get rewritten to 'struct pt_regs *ctx' based on prog type
> > while all asm instructions inside prog were compiled with 'struct __sk_buff'
> > and CO_RE performs relocations against that type.
>
> Nothing really prevents users from misusing types even today, so it
> doesn't seem like a common problem, luckily.
>
> But really the problem is that some program types don't have an
> associated struct name at all, but still have a valid context. Like for
> LSM/TRACING programs it's a u64[]/u64 *. For tracepoints context is
> defined as just plain u64 (according to bpf_ctx_convert), and so on.
>
> Oh, and there is KPROBE program type, where it's (typedef)
> bpf_user_pt_regs_t, and for backwards compatibility reasons we also
> allow basically non-existing `struct bpf_user_pt_regs_t`.
>
> So it gets messy. Either way, I have a patch set coming up for
> kernel-side __arg_xxx tags support, so let's discuss it there?
>
> >
> > > +static struct {
> > > +     enum bpf_prog_type prog_type;
> > > +     const char *ctx_name;
> > > +} global_ctx_map[] = {
> > > +     { BPF_PROG_TYPE_CGROUP_DEVICE,           "bpf_cgroup_dev_ctx" },
> > > +     { BPF_PROG_TYPE_CGROUP_SKB,              "__sk_buff" },
> > > +     { BPF_PROG_TYPE_CGROUP_SOCK,             "bpf_sock" },
> > > +     { BPF_PROG_TYPE_CGROUP_SOCK_ADDR,        "bpf_sock_addr" },
> > > +     { BPF_PROG_TYPE_CGROUP_SOCKOPT,          "bpf_sockopt" },
> > > +     { BPF_PROG_TYPE_CGROUP_SYSCTL,           "bpf_sysctl" },
> > > +     { BPF_PROG_TYPE_FLOW_DISSECTOR,          "__sk_buff" },
> > > +     { BPF_PROG_TYPE_KPROBE,                  "bpf_user_pt_regs_t" },
> > > +     { BPF_PROG_TYPE_LWT_IN,                  "__sk_buff" },
> > > +     { BPF_PROG_TYPE_LWT_OUT,                 "__sk_buff" },
> > > +     { BPF_PROG_TYPE_LWT_SEG6LOCAL,           "__sk_buff" },
> > > +     { BPF_PROG_TYPE_LWT_XMIT,                "__sk_buff" },
> > > +     { BPF_PROG_TYPE_NETFILTER,               "bpf_nf_ctx" },
> > > +     { BPF_PROG_TYPE_PERF_EVENT,              "bpf_perf_event_data" },
> > > +     { BPF_PROG_TYPE_RAW_TRACEPOINT,          "bpf_raw_tracepoint_args" },
> > > +     { BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, "bpf_raw_tracepoint_args" },
> > > +     { BPF_PROG_TYPE_SCHED_ACT,               "__sk_buff" },
> > > +     { BPF_PROG_TYPE_SCHED_CLS,               "__sk_buff" },
> > > +     { BPF_PROG_TYPE_SK_LOOKUP,               "bpf_sk_lookup" },
> > > +     { BPF_PROG_TYPE_SK_MSG,                  "sk_msg_md" },
> > > +     { BPF_PROG_TYPE_SK_REUSEPORT,            "sk_reuseport_md" },
> > > +     { BPF_PROG_TYPE_SK_SKB,                  "__sk_buff" },
> > > +     { BPF_PROG_TYPE_SOCK_OPS,                "bpf_sock_ops" },
> > > +     { BPF_PROG_TYPE_SOCKET_FILTER,           "__sk_buff" },
> > > +     { BPF_PROG_TYPE_XDP,                     "xdp_md" },
> >
> > We already share the .c files (like relo_core.c) between kernel and libbpf;
> > let's share here as well to avoid copy-paste.
> > All of the above is available in include/linux/bpf_types.h
>
> True, but libbpf sources are built both as part of the kernel repo and
> separately on GitHub, so we'll need to start syncing
> include/linux/bpf_types.h into tools/include, which is a bit of an
> inconvenience.
>
> But most of all I don't want to do it for a few other reasons.
>
> This table was manually constructed by inspecting struct bpf_ctx_convert with:
>
>   bpftool btf dump file /sys/kernel/btf/vmlinux format c | \
>     rg bpf_ctx_convert -A65 | rg _prog
>
> And it has to be manual because of other program types that don't have
> an associated struct for context. So even if there was bpf_types.h, we
> couldn't use it as is.

Another headache I realized as I was reading someone else's patch is
all the #ifdef CONFIG_xxx checks, which we'd need to fake to even get
a full list of program types. In short, it's more trouble than it's
worth.

>
> But, if your concern is maintainability of this, I don't think that's
> a problem at all. Even if we add a new program type with its own
> struct name for context, we don't even have to extend this table
> (though we might, if we want to), because at that point kernel is
> guaranteed to have in-kernel native support for __arg_ctx, so libbpf
> doesn't have to do type rewriting.
>
> Also, this probably is the first explicit table that shows which
> struct names should be used for global subprog context argument (if
> not using __arg_ctx, of course). Which is not really a reason per se,
> but it beats reading kernel code, and (non-trivially) figuring out
> that one needs to look how struct bpf_ctx_convert is generated, etc.
> Having this explicit table is much easier to link as a reference for
> those special context type names.
>
> >
> > > +             /* clone fn/fn_proto, unless we already did it for another arg */
> > > +             if (func_rec->type_id == orig_fn_id) {
> >
> > It feels that body of this 'if' can be factored out as a separate helper function.
> >
>
> Sure, I'll try to factor it out.
>
> > > -static int
> > > -bpf_object__load_progs(struct bpf_object *obj, int log_level)
> > > +static int bpf_object_load_progs(struct bpf_object *obj, int log_level)
> >
> > pls keep __ convention.
>
> replied on another patch, I'll do a conversion to consistent naming in
> the next revision

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-01-03  0:30 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-20 23:31 [PATCH bpf-next 0/8] Libbpf-side __arg_ctx fallback support Andrii Nakryiko
2023-12-20 23:31 ` [PATCH bpf-next 1/8] libbpf: make uniform use of btf__fd() accessor inside libbpf Andrii Nakryiko
2023-12-20 23:31 ` [PATCH bpf-next 2/8] libbpf: use explicit map reuse flag to skip map creation steps Andrii Nakryiko
2023-12-20 23:31 ` [PATCH bpf-next 3/8] libbpf: don't rely on map->fd as an indicator of map being created Andrii Nakryiko
2023-12-20 23:31 ` [PATCH bpf-next 4/8] libbpf: use stable map placeholder FDs Andrii Nakryiko
2023-12-20 23:31 ` [PATCH bpf-next 5/8] libbpf: move exception callbacks assignment logic into relocation step Andrii Nakryiko
2023-12-20 23:31 ` [PATCH bpf-next 6/8] libbpf: move BTF loading step after " Andrii Nakryiko
2023-12-22  1:13   ` Alexei Starovoitov
2024-01-02 16:49     ` Andrii Nakryiko
2023-12-20 23:31 ` [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic Andrii Nakryiko
2023-12-22  1:26   ` Alexei Starovoitov
2024-01-02 17:06     ` Andrii Nakryiko
2024-01-03  0:30       ` Andrii Nakryiko
2023-12-22 13:15   ` Jiri Olsa
2024-01-02 17:11     ` Andrii Nakryiko
2023-12-20 23:31 ` [PATCH bpf-next 8/8] selftests/bpf: add arg:ctx cases to test_global_funcs tests Andrii Nakryiko
