* [PATCH v2 bpf-next 0/6] Add fd_array_cnt attribute for BPF_PROG_LOAD
@ 2024-11-19 10:15 Anton Protopopov
2024-11-19 10:15 ` [PATCH v2 bpf-next 1/6] bpf: add a __btf_get_by_fd helper Anton Protopopov
` (5 more replies)
0 siblings, 6 replies; 24+ messages in thread
From: Anton Protopopov @ 2024-11-19 10:15 UTC (permalink / raw)
To: bpf; +Cc: Anton Protopopov
Add a new attribute to the bpf(BPF_PROG_LOAD) system call. If this
new attribute is non-zero, then the fd_array is considered to be a
contiguous array of length fd_array_cnt which contains only valid
map file descriptors, BTF file descriptors, or zeroes.
This change allows maps which aren't referenced directly by a BPF
program to be bound to the program _and_ to be present during
program verification (so BPF_PROG_BIND_MAP alone is not enough for
this use case).
The primary reason for this change is that it is a prerequisite for
adding "instruction set" maps, which are not referenced by the
program directly but must be present during program verification.
The first three commits add the new functionality, the fourth adds
corresponding self-tests, and the last two are small additional fixes.
v1 -> v2:
* rewrite the add_fd_from_fd_array() function (Eduard)
* a few cleanups in selftests (Eduard)
Anton Protopopov (6):
bpf: add a __btf_get_by_fd helper
bpf: move map/prog compatibility checks
bpf: add fd_array_cnt attribute for prog_load
selftests/bpf: Add tests for fd_array_cnt
bpf: fix potential error return
selftest/bpf: replace magic constants by macros
include/linux/btf.h | 13 +
include/uapi/linux/bpf.h | 10 +
kernel/bpf/btf.c | 13 +-
kernel/bpf/core.c | 9 +-
kernel/bpf/syscall.c | 2 +-
kernel/bpf/verifier.c | 203 +++++++---
tools/include/uapi/linux/bpf.h | 10 +
.../selftests/bpf/prog_tests/fd_array.c | 381 ++++++++++++++++++
tools/testing/selftests/bpf/progs/syscall.c | 6 +-
9 files changed, 566 insertions(+), 81 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/fd_array.c
--
2.34.1
* [PATCH v2 bpf-next 1/6] bpf: add a __btf_get_by_fd helper
2024-11-19 10:15 [PATCH v2 bpf-next 0/6] Add fd_array_cnt attribute for BPF_PROG_LOAD Anton Protopopov
@ 2024-11-19 10:15 ` Anton Protopopov
2024-11-26 1:31 ` Alexei Starovoitov
2024-11-19 10:15 ` [PATCH v2 bpf-next 2/6] bpf: move map/prog compatibility checks Anton Protopopov
` (4 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: Anton Protopopov @ 2024-11-19 10:15 UTC (permalink / raw)
To: bpf; +Cc: Anton Protopopov
Add a new helper to get a pointer to a struct btf from a file
descriptor. In contrast with btf_get_by_fd(), it doesn't increase
the BTF refcount.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
---
include/linux/btf.h | 13 +++++++++++++
kernel/bpf/btf.c | 13 ++++---------
2 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 4214e76c9168..050051a578a8 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -4,6 +4,7 @@
#ifndef _LINUX_BTF_H
#define _LINUX_BTF_H 1
+#include <linux/file.h>
#include <linux/types.h>
#include <linux/bpfptr.h>
#include <linux/bsearch.h>
@@ -143,6 +144,18 @@ void btf_get(struct btf *btf);
void btf_put(struct btf *btf);
const struct btf_header *btf_header(const struct btf *btf);
int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_sz);
+
+static inline struct btf *__btf_get_by_fd(struct fd f)
+{
+ if (fd_empty(f))
+ return ERR_PTR(-EBADF);
+
+ if (unlikely(fd_file(f)->f_op != &btf_fops))
+ return ERR_PTR(-EINVAL);
+
+ return fd_file(f)->private_data;
+}
+
struct btf *btf_get_by_fd(int fd);
int btf_get_info_by_fd(const struct btf *btf,
const union bpf_attr *attr,
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index e7a59e6462a9..ad5310fa1d3b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -7743,17 +7743,12 @@ int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
struct btf *btf_get_by_fd(int fd)
{
- struct btf *btf;
CLASS(fd, f)(fd);
+ struct btf *btf;
- if (fd_empty(f))
- return ERR_PTR(-EBADF);
-
- if (fd_file(f)->f_op != &btf_fops)
- return ERR_PTR(-EINVAL);
-
- btf = fd_file(f)->private_data;
- refcount_inc(&btf->refcnt);
+ btf = __btf_get_by_fd(f);
+ if (!IS_ERR(btf))
+ refcount_inc(&btf->refcnt);
return btf;
}
--
2.34.1
* [PATCH v2 bpf-next 2/6] bpf: move map/prog compatibility checks
2024-11-19 10:15 [PATCH v2 bpf-next 0/6] Add fd_array_cnt attribute for BPF_PROG_LOAD Anton Protopopov
2024-11-19 10:15 ` [PATCH v2 bpf-next 1/6] bpf: add a __btf_get_by_fd helper Anton Protopopov
@ 2024-11-19 10:15 ` Anton Protopopov
2024-11-26 18:44 ` Andrii Nakryiko
2024-11-19 10:15 ` [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load Anton Protopopov
` (3 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: Anton Protopopov @ 2024-11-19 10:15 UTC (permalink / raw)
To: bpf; +Cc: Anton Protopopov
Move some inlined map/prog compatibility checks from the
resolve_pseudo_ldimm64() function to the dedicated
check_map_prog_compatibility() function, and call the latter
directly from the add_used_map_from_fd() function.
This simplifies the code and optimizes the logic a bit: before
these changes, the check_map_prog_compatibility() function was
executed on every map usage, which is unnecessary, as it performs
no per-instruction checks and only validates map type against
program type.
(This patch also simplifies a subsequent patch which will call the
add_used_map_from_fd() function from another code path.)
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
---
kernel/bpf/verifier.c | 101 +++++++++++++++++++-----------------------
1 file changed, 46 insertions(+), 55 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1c4ebb326785..8e034a22aa2a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -19064,6 +19064,12 @@ static bool is_tracing_prog_type(enum bpf_prog_type type)
}
}
+static bool bpf_map_is_cgroup_storage(struct bpf_map *map)
+{
+ return (map->map_type == BPF_MAP_TYPE_CGROUP_STORAGE ||
+ map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE);
+}
+
static int check_map_prog_compatibility(struct bpf_verifier_env *env,
struct bpf_map *map,
struct bpf_prog *prog)
@@ -19142,25 +19148,48 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
return -EINVAL;
}
- return 0;
-}
+ if (bpf_map_is_cgroup_storage(map) &&
+ bpf_cgroup_storage_assign(env->prog->aux, map)) {
+ verbose(env, "only one cgroup storage of each type is allowed\n");
+ return -EBUSY;
+ }
-static bool bpf_map_is_cgroup_storage(struct bpf_map *map)
-{
- return (map->map_type == BPF_MAP_TYPE_CGROUP_STORAGE ||
- map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE);
+ if (map->map_type == BPF_MAP_TYPE_ARENA) {
+ if (env->prog->aux->arena) {
+ verbose(env, "Only one arena per program\n");
+ return -EBUSY;
+ }
+ if (!env->allow_ptr_leaks || !env->bpf_capable) {
+ verbose(env, "CAP_BPF and CAP_PERFMON are required to use arena\n");
+ return -EPERM;
+ }
+ if (!env->prog->jit_requested) {
+ verbose(env, "JIT is required to use arena\n");
+ return -EOPNOTSUPP;
+ }
+ if (!bpf_jit_supports_arena()) {
+ verbose(env, "JIT doesn't support arena\n");
+ return -EOPNOTSUPP;
+ }
+ env->prog->aux->arena = (void *)map;
+ if (!bpf_arena_get_user_vm_start(env->prog->aux->arena)) {
+ verbose(env, "arena's user address must be set via map_extra or mmap()\n");
+ return -EINVAL;
+ }
+ }
+
+ return 0;
}
/* Add map behind fd to used maps list, if it's not already there, and return
- * its index. Also set *reused to true if this map was already in the list of
- * used maps.
+ * its index.
* Returns <0 on error, or >= 0 index, on success.
*/
-static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd, bool *reused)
+static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
{
CLASS(fd, f)(fd);
struct bpf_map *map;
- int i;
+ int i, err;
map = __bpf_map_get(f);
if (IS_ERR(map)) {
@@ -19169,12 +19198,9 @@ static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd, bool *reus
}
/* check whether we recorded this map already */
- for (i = 0; i < env->used_map_cnt; i++) {
- if (env->used_maps[i] == map) {
- *reused = true;
+ for (i = 0; i < env->used_map_cnt; i++)
+ if (env->used_maps[i] == map)
return i;
- }
- }
if (env->used_map_cnt >= MAX_USED_MAPS) {
verbose(env, "The total number of maps per program has reached the limit of %u\n",
@@ -19182,6 +19208,10 @@ static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd, bool *reus
return -E2BIG;
}
+ err = check_map_prog_compatibility(env, map, env->prog);
+ if (err)
+ return err;
+
if (env->prog->sleepable)
atomic64_inc(&map->sleepable_refcnt);
@@ -19192,7 +19222,6 @@ static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd, bool *reus
*/
bpf_map_inc(map);
- *reused = false;
env->used_maps[env->used_map_cnt++] = map;
return env->used_map_cnt - 1;
@@ -19229,7 +19258,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
int map_idx;
u64 addr;
u32 fd;
- bool reused;
if (i == insn_cnt - 1 || insn[1].code != 0 ||
insn[1].dst_reg != 0 || insn[1].src_reg != 0 ||
@@ -19290,7 +19318,7 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
break;
}
- map_idx = add_used_map_from_fd(env, fd, &reused);
+ map_idx = add_used_map_from_fd(env, fd);
if (map_idx < 0)
return map_idx;
map = env->used_maps[map_idx];
@@ -19298,10 +19326,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
aux = &env->insn_aux_data[i];
aux->map_index = map_idx;
- err = check_map_prog_compatibility(env, map, env->prog);
- if (err)
- return err;
-
if (insn[0].src_reg == BPF_PSEUDO_MAP_FD ||
insn[0].src_reg == BPF_PSEUDO_MAP_IDX) {
addr = (unsigned long)map;
@@ -19332,39 +19356,6 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
insn[0].imm = (u32)addr;
insn[1].imm = addr >> 32;
- /* proceed with extra checks only if its newly added used map */
- if (reused)
- goto next_insn;
-
- if (bpf_map_is_cgroup_storage(map) &&
- bpf_cgroup_storage_assign(env->prog->aux, map)) {
- verbose(env, "only one cgroup storage of each type is allowed\n");
- return -EBUSY;
- }
- if (map->map_type == BPF_MAP_TYPE_ARENA) {
- if (env->prog->aux->arena) {
- verbose(env, "Only one arena per program\n");
- return -EBUSY;
- }
- if (!env->allow_ptr_leaks || !env->bpf_capable) {
- verbose(env, "CAP_BPF and CAP_PERFMON are required to use arena\n");
- return -EPERM;
- }
- if (!env->prog->jit_requested) {
- verbose(env, "JIT is required to use arena\n");
- return -EOPNOTSUPP;
- }
- if (!bpf_jit_supports_arena()) {
- verbose(env, "JIT doesn't support arena\n");
- return -EOPNOTSUPP;
- }
- env->prog->aux->arena = (void *)map;
- if (!bpf_arena_get_user_vm_start(env->prog->aux->arena)) {
- verbose(env, "arena's user address must be set via map_extra or mmap()\n");
- return -EINVAL;
- }
- }
-
next_insn:
insn++;
i++;
--
2.34.1
* [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-19 10:15 [PATCH v2 bpf-next 0/6] Add fd_array_cnt attribute for BPF_PROG_LOAD Anton Protopopov
2024-11-19 10:15 ` [PATCH v2 bpf-next 1/6] bpf: add a __btf_get_by_fd helper Anton Protopopov
2024-11-19 10:15 ` [PATCH v2 bpf-next 2/6] bpf: move map/prog compatibility checks Anton Protopopov
@ 2024-11-19 10:15 ` Anton Protopopov
2024-11-26 1:38 ` Alexei Starovoitov
2024-11-26 2:11 ` Hou Tao
2024-11-19 10:15 ` [PATCH v2 bpf-next 4/6] selftests/bpf: Add tests for fd_array_cnt Anton Protopopov
` (2 subsequent siblings)
5 siblings, 2 replies; 24+ messages in thread
From: Anton Protopopov @ 2024-11-19 10:15 UTC (permalink / raw)
To: bpf; +Cc: Anton Protopopov
The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
of file descriptors: maps or BTFs. This field was introduced as a
sparse array. Introduce a new attribute, fd_array_cnt, which, if
present, indicates that the fd_array is a contiguous array of the
corresponding length.
If fd_array_cnt is non-zero, then every map in the fd_array will be
bound to the program, as if it were used by the program. This
functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
maps can be used by the verifier during program load.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
---
include/uapi/linux/bpf.h | 10 ++++
kernel/bpf/syscall.c | 2 +-
kernel/bpf/verifier.c | 106 ++++++++++++++++++++++++++++-----
tools/include/uapi/linux/bpf.h | 10 ++++
4 files changed, 113 insertions(+), 15 deletions(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4162afc6b5d0..2acf9b336371 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1573,6 +1573,16 @@ union bpf_attr {
* If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
*/
__s32 prog_token_fd;
+ /* The fd_array_cnt can be used to pass the length of the
+ * fd_array array. In this case all the [map] file descriptors
+ * passed in this array will be bound to the program, even if
+ * the maps are not referenced directly. The functionality is
+ * similar to the BPF_PROG_BIND_MAP syscall, but maps can be
+ * used by the verifier during the program load. If provided,
+ * then the fd_array[0,...,fd_array_cnt-1] is expected to be
+ * contiguous.
+ */
+ __u32 fd_array_cnt;
};
struct { /* anonymous struct used by BPF_OBJ_* commands */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 58190ca724a2..7e3fbc23c742 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2729,7 +2729,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
}
/* last field in 'union bpf_attr' used by this command */
-#define BPF_PROG_LOAD_LAST_FIELD prog_token_fd
+#define BPF_PROG_LOAD_LAST_FIELD fd_array_cnt
static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
{
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8e034a22aa2a..a84ba93c0036 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -19181,22 +19181,10 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
return 0;
}
-/* Add map behind fd to used maps list, if it's not already there, and return
- * its index.
- * Returns <0 on error, or >= 0 index, on success.
- */
-static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
+static int add_used_map(struct bpf_verifier_env *env, struct bpf_map *map)
{
- CLASS(fd, f)(fd);
- struct bpf_map *map;
int i, err;
- map = __bpf_map_get(f);
- if (IS_ERR(map)) {
- verbose(env, "fd %d is not pointing to valid bpf_map\n", fd);
- return PTR_ERR(map);
- }
-
/* check whether we recorded this map already */
for (i = 0; i < env->used_map_cnt; i++)
if (env->used_maps[i] == map)
@@ -19227,6 +19215,24 @@ static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
return env->used_map_cnt - 1;
}
+/* Add map behind fd to used maps list, if it's not already there, and return
+ * its index.
+ * Returns <0 on error, or >= 0 index, on success.
+ */
+static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
+{
+ struct bpf_map *map;
+ CLASS(fd, f)(fd);
+
+ map = __bpf_map_get(f);
+ if (IS_ERR(map)) {
+ verbose(env, "fd %d is not pointing to valid bpf_map\n", fd);
+ return PTR_ERR(map);
+ }
+
+ return add_used_map(env, map);
+}
+
/* find and rewrite pseudo imm in ld_imm64 instructions:
*
* 1. if it accesses map FD, replace it with actual map pointer.
@@ -22526,6 +22532,75 @@ struct btf *bpf_get_btf_vmlinux(void)
return btf_vmlinux;
}
+/*
+ * The add_fd_from_fd_array() function is executed only if fd_array_cnt is
+ * given. In this case, every file descriptor in the array must be either a
+ * map fd, a BTF fd, or a hole (0). Everything else is considered to be trash.
+ */
+static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
+{
+ struct bpf_map *map;
+ CLASS(fd, f)(fd);
+ int ret;
+
+ map = __bpf_map_get(f);
+ if (!IS_ERR(map)) {
+ ret = add_used_map(env, map);
+ if (ret < 0)
+ return ret;
+ return 0;
+ }
+
+ if (!IS_ERR(__btf_get_by_fd(f)))
+ return 0;
+
+ if (!fd)
+ return 0;
+
+ verbose(env, "fd %d is not pointing to valid bpf_map or btf\n", fd);
+ return PTR_ERR(map);
+}
+
+static int env_init_fd_array(struct bpf_verifier_env *env, union bpf_attr *attr, bpfptr_t uattr)
+{
+ int size = sizeof(int) * attr->fd_array_cnt;
+ int *copy;
+ int ret;
+ int i;
+
+ if (attr->fd_array_cnt >= MAX_USED_MAPS)
+ return -E2BIG;
+
+ env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
+
+ /*
+ * The only difference between old (no fd_array_cnt is given) and new
+ * APIs is that in the latter case the fd_array is expected to be
+ * continuous and is scanned for map fds right away
+ */
+ if (!size)
+ return 0;
+
+ copy = kzalloc(size, GFP_KERNEL);
+ if (!copy)
+ return -ENOMEM;
+
+ if (copy_from_bpfptr_offset(copy, env->fd_array, 0, size)) {
+ ret = -EFAULT;
+ goto free_copy;
+ }
+
+ for (i = 0; i < attr->fd_array_cnt; i++) {
+ ret = add_fd_from_fd_array(env, copy[i]);
+ if (ret)
+ goto free_copy;
+ }
+
+free_copy:
+ kfree(copy);
+ return ret;
+}
+
int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size)
{
u64 start_time = ktime_get_ns();
@@ -22557,7 +22632,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
env->insn_aux_data[i].orig_idx = i;
env->prog = *prog;
env->ops = bpf_verifier_ops[env->prog->type];
- env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
+ ret = env_init_fd_array(env, attr, uattr);
+ if (ret)
+ goto err_free_aux_data;
env->allow_ptr_leaks = bpf_allow_ptr_leaks(env->prog->aux->token);
env->allow_uninit_stack = bpf_allow_uninit_stack(env->prog->aux->token);
@@ -22775,6 +22852,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
err_unlock:
if (!is_priv)
mutex_unlock(&bpf_verifier_lock);
+err_free_aux_data:
vfree(env->insn_aux_data);
kvfree(env->insn_hist);
err_free_env:
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4162afc6b5d0..2acf9b336371 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1573,6 +1573,16 @@ union bpf_attr {
* If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
*/
__s32 prog_token_fd;
+ /* The fd_array_cnt can be used to pass the length of the
+ * fd_array array. In this case all the [map] file descriptors
+ * passed in this array will be bound to the program, even if
+ * the maps are not referenced directly. The functionality is
+ * similar to the BPF_PROG_BIND_MAP syscall, but maps can be
+ * used by the verifier during the program load. If provided,
+ * then the fd_array[0,...,fd_array_cnt-1] is expected to be
+ * contiguous.
+ */
+ __u32 fd_array_cnt;
};
struct { /* anonymous struct used by BPF_OBJ_* commands */
--
2.34.1
* [PATCH v2 bpf-next 4/6] selftests/bpf: Add tests for fd_array_cnt
2024-11-19 10:15 [PATCH v2 bpf-next 0/6] Add fd_array_cnt attribute for BPF_PROG_LOAD Anton Protopopov
` (2 preceding siblings ...)
2024-11-19 10:15 ` [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load Anton Protopopov
@ 2024-11-19 10:15 ` Anton Protopopov
2024-11-26 18:54 ` Andrii Nakryiko
2024-11-19 10:15 ` [PATCH v2 bpf-next 5/6] bpf: fix potential error return Anton Protopopov
2024-11-19 10:15 ` [PATCH v2 bpf-next 6/6] selftest/bpf: replace magic constants by macros Anton Protopopov
5 siblings, 1 reply; 24+ messages in thread
From: Anton Protopopov @ 2024-11-19 10:15 UTC (permalink / raw)
To: bpf; +Cc: Anton Protopopov
Add a new set of tests for the new field in the PROG_LOAD-related
part of bpf_attr: fd_array_cnt.
Add the following test cases:
* fd_array_cnt/no-fd-array: program is loaded in a normal
way, without any fd_array present
* fd_array_cnt/fd-array-ok: pass two extra non-used maps,
check that they're bound to the program
* fd_array_cnt/fd-array-dup-input: pass a few extra maps,
only two of which are unique
* fd_array_cnt/fd-array-ref-maps-in-array: pass a map in
fd_array which is also referenced from within the program
* fd_array_cnt/fd-array-trash-input: pass an array with some trash
* fd_array_cnt/fd-array-with-holes: pass an array with holes (fd=0)
* fd_array_cnt/fd-array-2big: pass a too-large array
All the tests above use the bpf(2) syscall directly; no libbpf is
involved.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
---
.../selftests/bpf/prog_tests/fd_array.c | 381 ++++++++++++++++++
1 file changed, 381 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/fd_array.c
diff --git a/tools/testing/selftests/bpf/prog_tests/fd_array.c b/tools/testing/selftests/bpf/prog_tests/fd_array.c
new file mode 100644
index 000000000000..1b47386e66c3
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/fd_array.c
@@ -0,0 +1,381 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+
+#include <linux/btf.h>
+#include <sys/syscall.h>
+#include <bpf/bpf.h>
+
+#include "../test_btf.h"
+
+static inline int _bpf_map_create(void)
+{
+ static union bpf_attr attr = {
+ .map_type = BPF_MAP_TYPE_ARRAY,
+ .key_size = 4,
+ .value_size = 8,
+ .max_entries = 1,
+ };
+
+ return syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
+}
+
+static int _btf_create(void)
+{
+ struct btf_blob {
+ struct btf_header btf_hdr;
+ __u32 types[8];
+ __u32 str;
+ } raw_btf = {
+ .btf_hdr = {
+ .magic = BTF_MAGIC,
+ .version = BTF_VERSION,
+ .hdr_len = sizeof(struct btf_header),
+ .type_len = sizeof(raw_btf.types),
+ .str_off = offsetof(struct btf_blob, str) - offsetof(struct btf_blob, types),
+ .str_len = sizeof(raw_btf.str),
+ },
+ .types = {
+ /* long */
+ BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 64, 8), /* [1] */
+ /* unsigned long */
+ BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */
+ },
+ };
+ static union bpf_attr attr = {
+ .btf_size = sizeof(raw_btf),
+ };
+
+ attr.btf = (long)&raw_btf;
+
+ return syscall(__NR_bpf, BPF_BTF_LOAD, &attr, sizeof(attr));
+}
+
+static bool map_exists(__u32 id)
+{
+ int fd;
+
+ fd = bpf_map_get_fd_by_id(id);
+ if (fd >= 0) {
+ close(fd);
+ return true;
+ }
+ return false;
+}
+
+static inline int bpf_prog_get_map_ids(int prog_fd, __u32 *nr_map_ids, __u32 *map_ids)
+{
+ __u32 len = sizeof(struct bpf_prog_info);
+ struct bpf_prog_info info = {
+ .nr_map_ids = *nr_map_ids,
+ .map_ids = ptr_to_u64(map_ids),
+ };
+ int err;
+
+ err = bpf_prog_get_info_by_fd(prog_fd, &info, &len);
+ if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd"))
+ return -1;
+
+ *nr_map_ids = info.nr_map_ids;
+
+ return 0;
+}
+
+static int __load_test_prog(int map_fd, int *fd_array, int fd_array_cnt)
+{
+ /* A trivial program which uses one map */
+ struct bpf_insn insns[] = {
+ BPF_LD_MAP_FD(BPF_REG_1, map_fd),
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ };
+ union bpf_attr attr = {
+ .prog_type = BPF_PROG_TYPE_XDP, /* we don't care */
+ .insns = ptr_to_u64(insns),
+ .insn_cnt = ARRAY_SIZE(insns),
+ .license = ptr_to_u64("GPL"),
+ .fd_array = ptr_to_u64(fd_array),
+ .fd_array_cnt = fd_array_cnt,
+ };
+
+ return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
+}
+
+static int load_test_prog(int *fd_array, int fd_array_cnt)
+{
+ int map_fd;
+ int ret;
+
+ map_fd = _bpf_map_create();
+ if (!ASSERT_GE(map_fd, 0, "_bpf_map_create"))
+ return map_fd;
+
+ ret = __load_test_prog(map_fd, fd_array, fd_array_cnt);
+ close(map_fd);
+
+ /* on failure, return -errno so callers can check the exact error */
+ if (ret < 0)
+ return -errno;
+ return ret;
+}
+
+static bool check_expected_map_ids(int prog_fd, int expected, __u32 *map_ids, __u32 *nr_map_ids)
+{
+ int err;
+
+ err = bpf_prog_get_map_ids(prog_fd, nr_map_ids, map_ids);
+ if (!ASSERT_OK(err, "bpf_prog_get_map_ids"))
+ return false;
+ if (!ASSERT_EQ(*nr_map_ids, expected, "unexpected nr_map_ids"))
+ return false;
+
+ return true;
+}
+
+/*
+ * Load a program, which uses one map. No fd_array maps are present.
+ * On return only one map is expected to be bound to prog.
+ */
+static void check_fd_array_cnt__no_fd_array(void)
+{
+ __u32 map_ids[16];
+ __u32 nr_map_ids;
+ int prog_fd = -1;
+
+ prog_fd = load_test_prog(NULL, 0);
+ if (!ASSERT_GE(prog_fd, 0, "BPF_PROG_LOAD"))
+ return;
+ nr_map_ids = ARRAY_SIZE(map_ids);
+ check_expected_map_ids(prog_fd, 1, map_ids, &nr_map_ids);
+ close(prog_fd);
+}
+
+/*
+ * Load a program, which uses one map, and pass two extra, non-equal, maps in
+ * fd_array with fd_array_cnt=2. On return three maps are expected to be bound
+ * to the program.
+ */
+static void check_fd_array_cnt__fd_array_ok(void)
+{
+ int extra_fds[2] = { -1, -1 };
+ __u32 map_ids[16];
+ __u32 nr_map_ids;
+ int prog_fd = -1;
+
+ extra_fds[0] = _bpf_map_create();
+ if (!ASSERT_GE(extra_fds[0], 0, "_bpf_map_create"))
+ goto cleanup;
+ extra_fds[1] = _bpf_map_create();
+ if (!ASSERT_GE(extra_fds[1], 0, "_bpf_map_create"))
+ goto cleanup;
+ prog_fd = load_test_prog(extra_fds, 2);
+ if (!ASSERT_GE(prog_fd, 0, "BPF_PROG_LOAD"))
+ goto cleanup;
+ nr_map_ids = ARRAY_SIZE(map_ids);
+ if (!check_expected_map_ids(prog_fd, 3, map_ids, &nr_map_ids))
+ goto cleanup;
+
+ /* maps should still exist when original file descriptors are closed */
+ close(extra_fds[0]);
+ close(extra_fds[1]);
+ if (!ASSERT_EQ(map_exists(map_ids[0]), true, "map_ids[0] should exist"))
+ goto cleanup;
+ if (!ASSERT_EQ(map_exists(map_ids[1]), true, "map_ids[1] should exist"))
+ goto cleanup;
+
+ /* some fds might be invalid, so ignore return codes */
+cleanup:
+ close(extra_fds[1]);
+ close(extra_fds[0]);
+ close(prog_fd);
+}
+
+/*
+ * Load a program with a few extra maps duplicated in the fd_array.
+ * After the load maps should only be referenced once.
+ */
+static void check_fd_array_cnt__duplicated_maps(void)
+{
+ int extra_fds[4] = { -1, -1, -1, -1 };
+ __u32 map_ids[16];
+ __u32 nr_map_ids;
+ int prog_fd = -1;
+
+ extra_fds[0] = extra_fds[2] = _bpf_map_create();
+ if (!ASSERT_GE(extra_fds[0], 0, "_bpf_map_create"))
+ goto cleanup;
+ extra_fds[1] = extra_fds[3] = _bpf_map_create();
+ if (!ASSERT_GE(extra_fds[1], 0, "_bpf_map_create"))
+ goto cleanup;
+ prog_fd = load_test_prog(extra_fds, 4);
+ if (!ASSERT_GE(prog_fd, 0, "BPF_PROG_LOAD"))
+ goto cleanup;
+ nr_map_ids = ARRAY_SIZE(map_ids);
+ if (!check_expected_map_ids(prog_fd, 3, map_ids, &nr_map_ids))
+ goto cleanup;
+
+ /* maps should still exist when original file descriptors are closed */
+ close(extra_fds[0]);
+ close(extra_fds[1]);
+ if (!ASSERT_EQ(map_exists(map_ids[0]), true, "map should exist"))
+ goto cleanup;
+ if (!ASSERT_EQ(map_exists(map_ids[1]), true, "map should exist"))
+ goto cleanup;
+
+ /* some fds might be invalid, so ignore return codes */
+cleanup:
+ close(extra_fds[1]);
+ close(extra_fds[0]);
+ close(prog_fd);
+}
+
+/*
+ * Check that if maps which are referenced by a program are
+ * passed in fd_array, then they will be referenced only once
+ */
+static void check_fd_array_cnt__referenced_maps_in_fd_array(void)
+{
+ int extra_fds[1] = { -1 };
+ __u32 map_ids[16];
+ __u32 nr_map_ids;
+ int prog_fd = -1;
+
+ extra_fds[0] = _bpf_map_create();
+ if (!ASSERT_GE(extra_fds[0], 0, "_bpf_map_create"))
+ goto cleanup;
+ prog_fd = __load_test_prog(extra_fds[0], extra_fds, 1);
+ if (!ASSERT_GE(prog_fd, 0, "BPF_PROG_LOAD"))
+ goto cleanup;
+ nr_map_ids = ARRAY_SIZE(map_ids);
+ if (!check_expected_map_ids(prog_fd, 1, map_ids, &nr_map_ids))
+ goto cleanup;
+
+ /* map should still exist when original file descriptor is closed */
+ close(extra_fds[0]);
+ if (!ASSERT_EQ(map_exists(map_ids[0]), true, "map should exist"))
+ goto cleanup;
+
+ /* some fds might be invalid, so ignore return codes */
+cleanup:
+ close(extra_fds[0]);
+ close(prog_fd);
+}
+
+/*
+ * Test that a program with trash in fd_array can't be loaded:
+ * only map and BTF file descriptors should be accepted.
+ */
+static void check_fd_array_cnt__fd_array_with_trash(void)
+{
+ int extra_fds[3] = { -1, -1, -1 };
+ int prog_fd = -1;
+
+ extra_fds[0] = _bpf_map_create();
+ if (!ASSERT_GE(extra_fds[0], 0, "_bpf_map_create"))
+ goto cleanup;
+ extra_fds[1] = _btf_create();
+ if (!ASSERT_GE(extra_fds[1], 0, "_btf_create"))
+ goto cleanup;
+
+ /* trash 1: not a file descriptor */
+ extra_fds[2] = 0xbeef;
+ prog_fd = load_test_prog(extra_fds, 3);
+ if (!ASSERT_EQ(prog_fd, -EBADF, "prog should have been rejected with -EBADF"))
+ goto cleanup;
+
+ /* trash 2: not a map or btf */
+ extra_fds[2] = socket(AF_INET, SOCK_STREAM, 0);
+ if (!ASSERT_GE(extra_fds[2], 0, "socket"))
+ goto cleanup;
+
+ prog_fd = load_test_prog(extra_fds, 3);
+ if (!ASSERT_EQ(prog_fd, -EINVAL, "prog should have been rejected with -EINVAL"))
+ goto cleanup;
+
+ /* some fds might be invalid, so ignore return codes */
+cleanup:
+ close(extra_fds[2]);
+ close(extra_fds[1]);
+ close(extra_fds[0]);
+}
+
+/*
+ * Test that a program with zeroes (= holes) in fd_array can be loaded:
+ * zero entries are treated as holes and skipped.
+ */
+static void check_fd_array_cnt__fd_array_with_holes(void)
+{
+ int extra_fds[4] = { -1, -1, -1, -1 };
+ int prog_fd = -1;
+
+ extra_fds[0] = _bpf_map_create();
+ if (!ASSERT_GE(extra_fds[0], 0, "_bpf_map_create"))
+ goto cleanup;
+ /* punch a hole */
+ extra_fds[1] = 0;
+ extra_fds[2] = _btf_create();
+ if (!ASSERT_GE(extra_fds[2], 0, "_btf_create"))
+ goto cleanup;
+ /* punch a hole */
+ extra_fds[3] = 0;
+
+ prog_fd = load_test_prog(extra_fds, 4);
+ ASSERT_GE(prog_fd, 0, "prog with holes should have been loaded");
+
+ /* some fds might be invalid, so ignore return codes */
+cleanup:
+ close(extra_fds[2]);
+ close(extra_fds[0]);
+ close(prog_fd);
+}
+
+/*
+ * Test that a program with too big fd_array can't be loaded.
+ */
+static void check_fd_array_cnt__fd_array_too_big(void)
+{
+ int extra_fds[65];
+ int prog_fd = -1;
+ int i;
+
+ for (i = 0; i < 65; i++) {
+ extra_fds[i] = _bpf_map_create();
+ if (!ASSERT_GE(extra_fds[i], 0, "_bpf_map_create"))
+ goto cleanup_fds;
+ }
+
+ prog_fd = load_test_prog(extra_fds, 65);
+ ASSERT_EQ(prog_fd, -E2BIG, "prog should have been rejected with -E2BIG");
+
+cleanup_fds:
+ while (i > 0)
+ close(extra_fds[--i]);
+}
+
+void test_fd_array_cnt(void)
+{
+ if (test__start_subtest("no-fd-array"))
+ check_fd_array_cnt__no_fd_array();
+
+ if (test__start_subtest("fd-array-ok"))
+ check_fd_array_cnt__fd_array_ok();
+
+ if (test__start_subtest("fd-array-dup-input"))
+ check_fd_array_cnt__duplicated_maps();
+
+ if (test__start_subtest("fd-array-ref-maps-in-array"))
+ check_fd_array_cnt__referenced_maps_in_fd_array();
+
+ if (test__start_subtest("fd-array-trash-input"))
+ check_fd_array_cnt__fd_array_with_trash();
+
+ if (test__start_subtest("fd-array-with-holes"))
+ check_fd_array_cnt__fd_array_with_holes();
+
+ if (test__start_subtest("fd-array-2big"))
+ check_fd_array_cnt__fd_array_too_big();
+}
--
2.34.1
* [PATCH v2 bpf-next 5/6] bpf: fix potential error return
2024-11-19 10:15 [PATCH v2 bpf-next 0/6] Add fd_array_cnt attribute for BPF_PROG_LOAD Anton Protopopov
` (3 preceding siblings ...)
2024-11-19 10:15 ` [PATCH v2 bpf-next 4/6] selftests/bpf: Add tests for fd_array_cnt Anton Protopopov
@ 2024-11-19 10:15 ` Anton Protopopov
2024-11-26 1:43 ` Alexei Starovoitov
2024-11-19 10:15 ` [PATCH v2 bpf-next 6/6] selftest/bpf: replace magic constants by macros Anton Protopopov
5 siblings, 1 reply; 24+ messages in thread
From: Anton Protopopov @ 2024-11-19 10:15 UTC (permalink / raw)
To: bpf; +Cc: Anton Protopopov, Jiri Olsa
The bpf_remove_insns() function returns WARN_ON_ONCE(error), where
error is the result of bpf_adj_branches(), and thus should always be
0. However, if for any reason it is not 0, then it will be converted
to a boolean by WARN_ON_ONCE and returned to user space as 1, not as
an actual error value. Fix this by returning the original err after
the WARN check.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
---
kernel/bpf/core.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 14d9288441f2..a15059918768 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -539,6 +539,8 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt)
{
+ int err;
+
/* Branch offsets can't overflow when program is shrinking, no need
* to call bpf_adj_branches(..., true) here
*/
@@ -546,7 +548,12 @@ int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt)
sizeof(struct bpf_insn) * (prog->len - off - cnt));
prog->len -= cnt;
- return WARN_ON_ONCE(bpf_adj_branches(prog, off, off + cnt, off, false));
+ err = bpf_adj_branches(prog, off, off + cnt, off, false);
+ WARN_ON_ONCE(err);
+ if (err)
+ return err;
+
+ return 0;
}
static void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp)
--
2.34.1
* [PATCH v2 bpf-next 6/6] selftest/bpf: replace magic constants by macros
2024-11-19 10:15 [PATCH v2 bpf-next 0/6] Add fd_array_cnt attribute for BPF_PROG_LOAD Anton Protopopov
` (4 preceding siblings ...)
2024-11-19 10:15 ` [PATCH v2 bpf-next 5/6] bpf: fix potential error return Anton Protopopov
@ 2024-11-19 10:15 ` Anton Protopopov
5 siblings, 0 replies; 24+ messages in thread
From: Anton Protopopov @ 2024-11-19 10:15 UTC (permalink / raw)
To: bpf; +Cc: Anton Protopopov, Eduard Zingerman
Replace the magic constants in the BTF structure initialization code
with proper macros, as is done in other similar selftests.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
---
tools/testing/selftests/bpf/progs/syscall.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/syscall.c b/tools/testing/selftests/bpf/progs/syscall.c
index 0f4dfb770c32..b698cc62a371 100644
--- a/tools/testing/selftests/bpf/progs/syscall.c
+++ b/tools/testing/selftests/bpf/progs/syscall.c
@@ -76,9 +76,9 @@ static int btf_load(void)
.magic = BTF_MAGIC,
.version = BTF_VERSION,
.hdr_len = sizeof(struct btf_header),
- .type_len = sizeof(__u32) * 8,
- .str_off = sizeof(__u32) * 8,
- .str_len = sizeof(__u32),
+ .type_len = sizeof(raw_btf.types),
+ .str_off = offsetof(struct btf_blob, str) - offsetof(struct btf_blob, types),
+ .str_len = sizeof(raw_btf.str),
},
.types = {
/* long */
--
2.34.1
* Re: [PATCH v2 bpf-next 1/6] bpf: add a __btf_get_by_fd helper
2024-11-19 10:15 ` [PATCH v2 bpf-next 1/6] bpf: add a __btf_get_by_fd helper Anton Protopopov
@ 2024-11-26 1:31 ` Alexei Starovoitov
2024-11-26 16:33 ` Anton Protopopov
0 siblings, 1 reply; 24+ messages in thread
From: Alexei Starovoitov @ 2024-11-26 1:31 UTC (permalink / raw)
To: Anton Protopopov; +Cc: bpf
On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
>
> Add a new helper to get a pointer to a struct btf from a file
> descriptor which doesn't increase a refcount.
>
> Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> ---
> include/linux/btf.h | 13 +++++++++++++
> kernel/bpf/btf.c | 13 ++++---------
> 2 files changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/btf.h b/include/linux/btf.h
> index 4214e76c9168..050051a578a8 100644
> --- a/include/linux/btf.h
> +++ b/include/linux/btf.h
> @@ -4,6 +4,7 @@
> #ifndef _LINUX_BTF_H
> #define _LINUX_BTF_H 1
>
> +#include <linux/file.h>
> #include <linux/types.h>
> #include <linux/bpfptr.h>
> #include <linux/bsearch.h>
> @@ -143,6 +144,18 @@ void btf_get(struct btf *btf);
> void btf_put(struct btf *btf);
> const struct btf_header *btf_header(const struct btf *btf);
> int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_sz);
> +
> +static inline struct btf *__btf_get_by_fd(struct fd f)
> +{
> + if (fd_empty(f))
> + return ERR_PTR(-EBADF);
> +
> + if (unlikely(fd_file(f)->f_op != &btf_fops))
> + return ERR_PTR(-EINVAL);
> +
> + return fd_file(f)->private_data;
> +}
Maybe let's call it __btf_get() and place it next to __bpf_map_get() ?
So names and function bodies are directly comparable?
* Re: [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-19 10:15 ` [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load Anton Protopopov
@ 2024-11-26 1:38 ` Alexei Starovoitov
2024-11-26 17:05 ` Anton Protopopov
2024-11-26 2:11 ` Hou Tao
1 sibling, 1 reply; 24+ messages in thread
From: Alexei Starovoitov @ 2024-11-26 1:38 UTC (permalink / raw)
To: Anton Protopopov; +Cc: bpf
On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
>
> The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
> of file descriptors: maps or btfs. This field was introduced as a
> sparse array. Introduce a new attribute, fd_array_cnt, which, if
> present, indicates that the fd_array is a continuous array of the
> corresponding length.
>
> If fd_array_cnt is non-zero, then every map in the fd_array will be
> bound to the program, as if it was used by the program. This
> functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
> maps can be used by the verifier during the program load.
>
> Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> ---
> include/uapi/linux/bpf.h | 10 ++++
> kernel/bpf/syscall.c | 2 +-
> kernel/bpf/verifier.c | 106 ++++++++++++++++++++++++++++-----
> tools/include/uapi/linux/bpf.h | 10 ++++
> 4 files changed, 113 insertions(+), 15 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 4162afc6b5d0..2acf9b336371 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1573,6 +1573,16 @@ union bpf_attr {
> * If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
> */
> __s32 prog_token_fd;
> + /* The fd_array_cnt can be used to pass the length of the
> + * fd_array array. In this case all the [map] file descriptors
> + * passed in this array will be bound to the program, even if
> + * the maps are not referenced directly. The functionality is
> + * similar to the BPF_PROG_BIND_MAP syscall, but maps can be
> + * used by the verifier during the program load. If provided,
> + * then the fd_array[0,...,fd_array_cnt-1] is expected to be
> + * continuous.
> + */
> + __u32 fd_array_cnt;
> };
>
> struct { /* anonymous struct used by BPF_OBJ_* commands */
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 58190ca724a2..7e3fbc23c742 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -2729,7 +2729,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
> }
>
> /* last field in 'union bpf_attr' used by this command */
> -#define BPF_PROG_LOAD_LAST_FIELD prog_token_fd
> +#define BPF_PROG_LOAD_LAST_FIELD fd_array_cnt
>
> static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
> {
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 8e034a22aa2a..a84ba93c0036 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -19181,22 +19181,10 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
> return 0;
> }
>
> -/* Add map behind fd to used maps list, if it's not already there, and return
> - * its index.
> - * Returns <0 on error, or >= 0 index, on success.
> - */
> -static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
> +static int add_used_map(struct bpf_verifier_env *env, struct bpf_map *map)
_from_fd suffix is imo too verbose.
Let's use __add_used_map() here instead?
> {
> - CLASS(fd, f)(fd);
> - struct bpf_map *map;
> int i, err;
>
> - map = __bpf_map_get(f);
> - if (IS_ERR(map)) {
> - verbose(env, "fd %d is not pointing to valid bpf_map\n", fd);
> - return PTR_ERR(map);
> - }
> -
> /* check whether we recorded this map already */
> for (i = 0; i < env->used_map_cnt; i++)
> if (env->used_maps[i] == map)
> @@ -19227,6 +19215,24 @@ static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
> return env->used_map_cnt - 1;
> }
>
> +/* Add map behind fd to used maps list, if it's not already there, and return
> + * its index.
> + * Returns <0 on error, or >= 0 index, on success.
> + */
> +static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
and keep this one as add_used_map() ?
> +{
> + struct bpf_map *map;
> + CLASS(fd, f)(fd);
> +
> + map = __bpf_map_get(f);
> + if (IS_ERR(map)) {
> + verbose(env, "fd %d is not pointing to valid bpf_map\n", fd);
> + return PTR_ERR(map);
> + }
> +
> + return add_used_map(env, map);
> +}
> +
> /* find and rewrite pseudo imm in ld_imm64 instructions:
> *
> * 1. if it accesses map FD, replace it with actual map pointer.
> @@ -22526,6 +22532,75 @@ struct btf *bpf_get_btf_vmlinux(void)
> return btf_vmlinux;
> }
>
> +/*
> + * The add_fd_from_fd_array() is executed only if fd_array_cnt is given. In
> + * this case expect that every file descriptor in the array is either a map or
> + * a BTF, or a hole (0). Everything else is considered to be trash.
> + */
> +static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
> +{
> + struct bpf_map *map;
> + CLASS(fd, f)(fd);
> + int ret;
> +
> + map = __bpf_map_get(f);
> + if (!IS_ERR(map)) {
> + ret = add_used_map(env, map);
> + if (ret < 0)
> + return ret;
> + return 0;
> + }
> +
> + if (!IS_ERR(__btf_get_by_fd(f)))
> + return 0;
> +
> + if (!fd)
> + return 0;
This is not allowed in new apis.
zero cannot be special.
> +
> + verbose(env, "fd %d is not pointing to valid bpf_map or btf\n", fd);
> + return PTR_ERR(map);
> +}
> +
> +static int env_init_fd_array(struct bpf_verifier_env *env, union bpf_attr *attr, bpfptr_t uattr)
What an odd name... why is 'env_' there?
> +{
> + int size = sizeof(int) * attr->fd_array_cnt;
int overflow.
> + int *copy;
> + int ret;
> + int i;
> +
> + if (attr->fd_array_cnt >= MAX_USED_MAPS)
> + return -E2BIG;
> +
> + env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
> +
> + /*
> + * The only difference between old (no fd_array_cnt is given) and new
> + * APIs is that in the latter case the fd_array is expected to be
> + * continuous and is scanned for map fds right away
> + */
> + if (!size)
> + return 0;
> +
> + copy = kzalloc(size, GFP_KERNEL);
the slab will WARN with dmesg splat on large sizes.
> + if (!copy)
> + return -ENOMEM;
> +
> + if (copy_from_bpfptr_offset(copy, env->fd_array, 0, size)) {
> + ret = -EFAULT;
> + goto free_copy;
> + }
> +
> + for (i = 0; i < attr->fd_array_cnt; i++) {
> + ret = add_fd_from_fd_array(env, copy[i]);
> + if (ret)
> + goto free_copy;
> + }
I don't get this feature.
Why bother copying and checking for validity?
What does it buy ?
pw-bot: cr
* Re: [PATCH v2 bpf-next 5/6] bpf: fix potential error return
2024-11-19 10:15 ` [PATCH v2 bpf-next 5/6] bpf: fix potential error return Anton Protopopov
@ 2024-11-26 1:43 ` Alexei Starovoitov
2024-11-26 16:36 ` Anton Protopopov
0 siblings, 1 reply; 24+ messages in thread
From: Alexei Starovoitov @ 2024-11-26 1:43 UTC (permalink / raw)
To: Anton Protopopov; +Cc: bpf, Jiri Olsa
On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
>
> The bpf_remove_insns() function returns WARN_ON_ONCE(error), where
> error is the result of bpf_adj_branches(), and thus should always be 0.
> However, if for any reason it is not 0, then it will be converted to a
> boolean by WARN_ON_ONCE and returned to user space as 1, not an actual
> error value. Fix this by returning the original err after the WARN check.
>
> Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> Acked-by: Jiri Olsa <jolsa@kernel.org>
> ---
> kernel/bpf/core.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 14d9288441f2..a15059918768 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -539,6 +539,8 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
>
> int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt)
> {
> + int err;
> +
> /* Branch offsets can't overflow when program is shrinking, no need
> * to call bpf_adj_branches(..., true) here
> */
> @@ -546,7 +548,12 @@ int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt)
> sizeof(struct bpf_insn) * (prog->len - off - cnt));
> prog->len -= cnt;
>
> - return WARN_ON_ONCE(bpf_adj_branches(prog, off, off + cnt, off, false));
> + err = bpf_adj_branches(prog, off, off + cnt, off, false);
> + WARN_ON_ONCE(err);
> + if (err)
> + return err;
> +
> + return 0;
That looks very odd. Just return err ?
* Re: [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-19 10:15 ` [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load Anton Protopopov
2024-11-26 1:38 ` Alexei Starovoitov
@ 2024-11-26 2:11 ` Hou Tao
2024-11-27 6:44 ` Anton Protopopov
1 sibling, 1 reply; 24+ messages in thread
From: Hou Tao @ 2024-11-26 2:11 UTC (permalink / raw)
To: Anton Protopopov, bpf
Hi,
On 11/19/2024 6:15 PM, Anton Protopopov wrote:
> The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
> of file descriptors: maps or btfs. This field was introduced as a
> sparse array. Introduce a new attribute, fd_array_cnt, which, if
> present, indicates that the fd_array is a continuous array of the
> corresponding length.
>
> If fd_array_cnt is non-zero, then every map in the fd_array will be
> bound to the program, as if it was used by the program. This
> functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
> maps can be used by the verifier during the program load.
>
> Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> ---
> include/uapi/linux/bpf.h | 10 ++++
> kernel/bpf/syscall.c | 2 +-
> kernel/bpf/verifier.c | 106 ++++++++++++++++++++++++++++-----
> tools/include/uapi/linux/bpf.h | 10 ++++
> 4 files changed, 113 insertions(+), 15 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 4162afc6b5d0..2acf9b336371 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1573,6 +1573,16 @@ union bpf_attr {
> * If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
> */
> __s32 prog_token_fd;
> + /* The fd_array_cnt can be used to pass the length of the
> + * fd_array array. In this case all the [map] file descriptors
> + * passed in this array will be bound to the program, even if
> + * the maps are not referenced directly. The functionality is
> + * similar to the BPF_PROG_BIND_MAP syscall, but maps can be
> + * used by the verifier during the program load. If provided,
> + * then the fd_array[0,...,fd_array_cnt-1] is expected to be
> + * continuous.
> + */
> + __u32 fd_array_cnt;
> };
>
> struct { /* anonymous struct used by BPF_OBJ_* commands */
SNIP
> +/*
> + * The add_fd_from_fd_array() is executed only if fd_array_cnt is given. In
> + * this case expect that every file descriptor in the array is either a map or
> + * a BTF, or a hole (0). Everything else is considered to be trash.
> + */
> +static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
> +{
> + struct bpf_map *map;
> + CLASS(fd, f)(fd);
> + int ret;
> +
> + map = __bpf_map_get(f);
> + if (!IS_ERR(map)) {
> + ret = add_used_map(env, map);
> + if (ret < 0)
> + return ret;
> + return 0;
> + }
> +
> + if (!IS_ERR(__btf_get_by_fd(f)))
> + return 0;
For the fd_array_cnt > 0 case, does it need to handle the BTF fd case? If it
does, these returned BTFs should be saved somewhere, otherwise
they will be leaked.
> +
> + if (!fd)
> + return 0;
> +
> + verbose(env, "fd %d is not pointing to valid bpf_map or btf\n", fd);
> + return PTR_ERR(map);
> +}
> +
> +static int env_init_fd_array(struct bpf_verifier_env *env, union bpf_attr *attr, bpfptr_t uattr)
> +{
> + int size = sizeof(int) * attr->fd_array_cnt;
> + int *copy;
> + int ret;
> + int i;
> +
> + if (attr->fd_array_cnt >= MAX_USED_MAPS)
> + return -E2BIG;
> +
> + env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
> +
> + /*
> + * The only difference between old (no fd_array_cnt is given) and new
> + * APIs is that in the latter case the fd_array is expected to be
> + * continuous and is scanned for map fds right away
> + */
> + if (!size)
> + return 0;
> +
> + copy = kzalloc(size, GFP_KERNEL);
> + if (!copy)
> + return -ENOMEM;
> +
> + if (copy_from_bpfptr_offset(copy, env->fd_array, 0, size)) {
> + ret = -EFAULT;
> + goto free_copy;
> + }
It is better to use kvmemdup_bpfptr() instead.
> +
> + for (i = 0; i < attr->fd_array_cnt; i++) {
> + ret = add_fd_from_fd_array(env, copy[i]);
> + if (ret)
> + goto free_copy;
> + }
> +
> +free_copy:
> + kfree(copy);
> + return ret;
> +}
> +
> int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size)
> {
> u64 start_time = ktime_get_ns();
> @@ -22557,7 +22632,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
> env->insn_aux_data[i].orig_idx = i;
> env->prog = *prog;
> env->ops = bpf_verifier_ops[env->prog->type];
> - env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
> + ret = env_init_fd_array(env, attr, uattr);
> + if (ret)
> + goto err_free_aux_data;
These maps saved in env->used_map will also be leaked.
>
> env->allow_ptr_leaks = bpf_allow_ptr_leaks(env->prog->aux->token);
> env->allow_uninit_stack = bpf_allow_uninit_stack(env->prog->aux->token);
> @@ -22775,6 +22852,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
> err_unlock:
> if (!is_priv)
> mutex_unlock(&bpf_verifier_lock);
> +err_free_aux_data:
> vfree(env->insn_aux_data);
> kvfree(env->insn_hist);
> err_free_env:
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 4162afc6b5d0..2acf9b336371 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -1573,6 +1573,16 @@ union bpf_attr {
> * If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
> */
> __s32 prog_token_fd;
> + /* The fd_array_cnt can be used to pass the length of the
> + * fd_array array. In this case all the [map] file descriptors
> + * passed in this array will be bound to the program, even if
> + * the maps are not referenced directly. The functionality is
> + * similar to the BPF_PROG_BIND_MAP syscall, but maps can be
> + * used by the verifier during the program load. If provided,
> + * then the fd_array[0,...,fd_array_cnt-1] is expected to be
> + * continuous.
> + */
> + __u32 fd_array_cnt;
> };
>
> struct { /* anonymous struct used by BPF_OBJ_* commands */
* Re: [PATCH v2 bpf-next 1/6] bpf: add a __btf_get_by_fd helper
2024-11-26 1:31 ` Alexei Starovoitov
@ 2024-11-26 16:33 ` Anton Protopopov
2024-11-26 16:52 ` Alexei Starovoitov
0 siblings, 1 reply; 24+ messages in thread
From: Anton Protopopov @ 2024-11-26 16:33 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf
On 24/11/25 05:31PM, Alexei Starovoitov wrote:
> On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > Add a new helper to get a pointer to a struct btf from a file
> > descriptor which doesn't increase a refcount.
> >
> > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > ---
> > include/linux/btf.h | 13 +++++++++++++
> > kernel/bpf/btf.c | 13 ++++---------
> > 2 files changed, 17 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/btf.h b/include/linux/btf.h
> > index 4214e76c9168..050051a578a8 100644
> > --- a/include/linux/btf.h
> > +++ b/include/linux/btf.h
> > @@ -4,6 +4,7 @@
> > #ifndef _LINUX_BTF_H
> > #define _LINUX_BTF_H 1
> >
> > +#include <linux/file.h>
> > #include <linux/types.h>
> > #include <linux/bpfptr.h>
> > #include <linux/bsearch.h>
> > @@ -143,6 +144,18 @@ void btf_get(struct btf *btf);
> > void btf_put(struct btf *btf);
> > const struct btf_header *btf_header(const struct btf *btf);
> > int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_sz);
> > +
> > +static inline struct btf *__btf_get_by_fd(struct fd f)
> > +{
> > + if (fd_empty(f))
> > + return ERR_PTR(-EBADF);
> > +
> > + if (unlikely(fd_file(f)->f_op != &btf_fops))
> > + return ERR_PTR(-EINVAL);
> > +
> > + return fd_file(f)->private_data;
> > +}
>
> Maybe let's call it __btf_get() and place it next to __bpf_map_get() ?
> So names and function bodies are directly comparable?
I named it so because the corresponding helper which takes a ref is named
btf_get_by_fd(), and btf_get() actually increases a refcnt. In the bpf_map
case the naming is a bit different (and also not super-consistent:
bpf_map_inc/bpf_map_put to inc/dec the refcnt). Do you want me to make names
more consistent, globally?
* Re: [PATCH v2 bpf-next 5/6] bpf: fix potential error return
2024-11-26 1:43 ` Alexei Starovoitov
@ 2024-11-26 16:36 ` Anton Protopopov
0 siblings, 0 replies; 24+ messages in thread
From: Anton Protopopov @ 2024-11-26 16:36 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf, Jiri Olsa
On 24/11/25 05:43PM, Alexei Starovoitov wrote:
> On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > The bpf_remove_insns() function returns WARN_ON_ONCE(error), where
> > error is the result of bpf_adj_branches(), and thus should always be 0.
> > However, if for any reason it is not 0, then it will be converted to a
> > boolean by WARN_ON_ONCE and returned to user space as 1, not an actual
> > error value. Fix this by returning the original err after the WARN check.
> >
> > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > Acked-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> > kernel/bpf/core.c | 9 ++++++++-
> > 1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index 14d9288441f2..a15059918768 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -539,6 +539,8 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
> >
> > int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt)
> > {
> > + int err;
> > +
> > /* Branch offsets can't overflow when program is shrinking, no need
> > * to call bpf_adj_branches(..., true) here
> > */
> > @@ -546,7 +548,12 @@ int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt)
> > sizeof(struct bpf_insn) * (prog->len - off - cnt));
> > prog->len -= cnt;
> >
> > - return WARN_ON_ONCE(bpf_adj_branches(prog, off, off + cnt, off, false));
> > + err = bpf_adj_branches(prog, off, off + cnt, off, false);
> > + WARN_ON_ONCE(err);
> > + if (err)
> > + return err;
> > +
> > + return 0;
>
> That looks very odd. Just return err ?
Ah, yes, thanks. This was supposed to be followed up by a patch that
adds code in between, but as that patch is out of scope for this set,
I will just return err here.
* Re: [PATCH v2 bpf-next 1/6] bpf: add a __btf_get_by_fd helper
2024-11-26 16:33 ` Anton Protopopov
@ 2024-11-26 16:52 ` Alexei Starovoitov
0 siblings, 0 replies; 24+ messages in thread
From: Alexei Starovoitov @ 2024-11-26 16:52 UTC (permalink / raw)
To: Anton Protopopov; +Cc: bpf
On Tue, Nov 26, 2024 at 8:30 AM Anton Protopopov <aspsk@isovalent.com> wrote:
>
> On 24/11/25 05:31PM, Alexei Starovoitov wrote:
> > On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > >
> > > Add a new helper to get a pointer to a struct btf from a file
> > > descriptor which doesn't increase a refcount.
> > >
> > > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > > ---
> > > include/linux/btf.h | 13 +++++++++++++
> > > kernel/bpf/btf.c | 13 ++++---------
> > > 2 files changed, 17 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/include/linux/btf.h b/include/linux/btf.h
> > > index 4214e76c9168..050051a578a8 100644
> > > --- a/include/linux/btf.h
> > > +++ b/include/linux/btf.h
> > > @@ -4,6 +4,7 @@
> > > #ifndef _LINUX_BTF_H
> > > #define _LINUX_BTF_H 1
> > >
> > > +#include <linux/file.h>
> > > #include <linux/types.h>
> > > #include <linux/bpfptr.h>
> > > #include <linux/bsearch.h>
> > > @@ -143,6 +144,18 @@ void btf_get(struct btf *btf);
> > > void btf_put(struct btf *btf);
> > > const struct btf_header *btf_header(const struct btf *btf);
> > > int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_sz);
> > > +
> > > +static inline struct btf *__btf_get_by_fd(struct fd f)
> > > +{
> > > + if (fd_empty(f))
> > > + return ERR_PTR(-EBADF);
> > > +
> > > + if (unlikely(fd_file(f)->f_op != &btf_fops))
> > > + return ERR_PTR(-EINVAL);
> > > +
> > > + return fd_file(f)->private_data;
> > > +}
> >
> > Maybe let's call it __btf_get() and place it next to __bpf_map_get() ?
> > So names and function bodies are directly comparable?
>
> I named it so because the corresponding helper which takes a ref is named
> btf_get_by_fd(), and btf_get() actually increases a refcnt. In the bpf_map
> case the naming is a bit different (and also not super-consistent:
> bpf_map_inc/bpf_map_put to inc/dec the refcnt).
I see. __btf_get_by_fd() is fine then.
Just place it next to __bpf_map_get() and add a comment that the
double-underscore helpers don't inc the refcnt, while the corresponding
bpf_map_get() and btf_get_by_fd() do.
> Do you want me to make names more
> consistent, globally?
No. let's avoid the churn.
* Re: [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-26 1:38 ` Alexei Starovoitov
@ 2024-11-26 17:05 ` Anton Protopopov
2024-11-26 18:51 ` Andrii Nakryiko
0 siblings, 1 reply; 24+ messages in thread
From: Anton Protopopov @ 2024-11-26 17:05 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: bpf
On 24/11/25 05:38PM, Alexei Starovoitov wrote:
> On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
> > of file descriptors: maps or btfs. This field was introduced as a
> > sparse array. Introduce a new attribute, fd_array_cnt, which, if
> > present, indicates that the fd_array is a continuous array of the
> > corresponding length.
> >
> > If fd_array_cnt is non-zero, then every map in the fd_array will be
> > bound to the program, as if it was used by the program. This
> > functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
> > maps can be used by the verifier during the program load.
> >
> > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > ---
> > include/uapi/linux/bpf.h | 10 ++++
> > kernel/bpf/syscall.c | 2 +-
> > kernel/bpf/verifier.c | 106 ++++++++++++++++++++++++++++-----
> > tools/include/uapi/linux/bpf.h | 10 ++++
> > 4 files changed, 113 insertions(+), 15 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 4162afc6b5d0..2acf9b336371 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1573,6 +1573,16 @@ union bpf_attr {
> > * If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
> > */
> > __s32 prog_token_fd;
> > + /* The fd_array_cnt can be used to pass the length of the
> > + * fd_array array. In this case all the [map] file descriptors
> > + * passed in this array will be bound to the program, even if
> > + * the maps are not referenced directly. The functionality is
> > + * similar to the BPF_PROG_BIND_MAP syscall, but maps can be
> > + * used by the verifier during the program load. If provided,
> > + * then the fd_array[0,...,fd_array_cnt-1] is expected to be
> > + * continuous.
> > + */
> > + __u32 fd_array_cnt;
> > };
> >
> > struct { /* anonymous struct used by BPF_OBJ_* commands */
> > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > index 58190ca724a2..7e3fbc23c742 100644
> > --- a/kernel/bpf/syscall.c
> > +++ b/kernel/bpf/syscall.c
> > @@ -2729,7 +2729,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
> > }
> >
> > /* last field in 'union bpf_attr' used by this command */
> > -#define BPF_PROG_LOAD_LAST_FIELD prog_token_fd
> > +#define BPF_PROG_LOAD_LAST_FIELD fd_array_cnt
> >
> > static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
> > {
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 8e034a22aa2a..a84ba93c0036 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -19181,22 +19181,10 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
> > return 0;
> > }
> >
> > -/* Add map behind fd to used maps list, if it's not already there, and return
> > - * its index.
> > - * Returns <0 on error, or >= 0 index, on success.
> > - */
> > -static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
> > +static int add_used_map(struct bpf_verifier_env *env, struct bpf_map *map)
>
> _from_fd suffix is imo too verbose.
> Let's use __add_used_map() here instead?
Will do.
> > {
> > - CLASS(fd, f)(fd);
> > - struct bpf_map *map;
> > int i, err;
> >
> > - map = __bpf_map_get(f);
> > - if (IS_ERR(map)) {
> > - verbose(env, "fd %d is not pointing to valid bpf_map\n", fd);
> > - return PTR_ERR(map);
> > - }
> > -
> > /* check whether we recorded this map already */
> > for (i = 0; i < env->used_map_cnt; i++)
> > if (env->used_maps[i] == map)
> > @@ -19227,6 +19215,24 @@ static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
> > return env->used_map_cnt - 1;
> > }
> >
> > +/* Add map behind fd to used maps list, if it's not already there, and return
> > + * its index.
> > + * Returns <0 on error, or >= 0 index, on success.
> > + */
> > +static int add_used_map_from_fd(struct bpf_verifier_env *env, int fd)
>
> and keep this one as add_used_map() ?
Will do.
> > +{
> > + struct bpf_map *map;
> > + CLASS(fd, f)(fd);
> > +
> > + map = __bpf_map_get(f);
> > + if (IS_ERR(map)) {
> > + verbose(env, "fd %d is not pointing to valid bpf_map\n", fd);
> > + return PTR_ERR(map);
> > + }
> > +
> > + return add_used_map(env, map);
> > +}
> > +
> > /* find and rewrite pseudo imm in ld_imm64 instructions:
> > *
> > * 1. if it accesses map FD, replace it with actual map pointer.
> > @@ -22526,6 +22532,75 @@ struct btf *bpf_get_btf_vmlinux(void)
> > return btf_vmlinux;
> > }
> >
> > +/*
> > + * The add_fd_from_fd_array() is executed only if fd_array_cnt is given. In
> > + * this case expect that every file descriptor in the array is either a map or
> > + * a BTF, or a hole (0). Everything else is considered to be trash.
> > + */
> > +static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
> > +{
> > + struct bpf_map *map;
> > + CLASS(fd, f)(fd);
> > + int ret;
> > +
> > + map = __bpf_map_get(f);
> > + if (!IS_ERR(map)) {
> > + ret = add_used_map(env, map);
> > + if (ret < 0)
> > + return ret;
> > + return 0;
> > + }
> > +
> > + if (!IS_ERR(__btf_get_by_fd(f)))
> > + return 0;
> > +
> > + if (!fd)
> > + return 0;
>
> This is not allowed in new apis.
> zero cannot be special.
I thought that this was ok, since all possible valid FDs are checked by this
point: if fd=0 is a valid map or btf, then it will already have been returned
above.
The reason I wanted to keep 0 as a valid value, even if it doesn't point to any
map/btf, is the case when, as in the libbpf gen, fd_array is populated with map
fds starting from index 0 and with btf fds from some offset, so the entries in
between may be 0s. It is probably better to disallow this and, if
fd_array_cnt != 0, to check that all [0...fd_array_cnt) elements are valid.
> > +
> > + verbose(env, "fd %d is not pointing to valid bpf_map or btf\n", fd);
> > + return PTR_ERR(map);
> > +}
> > +
> > +static int env_init_fd_array(struct bpf_verifier_env *env, union bpf_attr *attr, bpfptr_t uattr)
>
> What an odd name... why is 'env_' there?
will remove
>
> > +{
> > + int size = sizeof(int) * attr->fd_array_cnt;
>
> int overflow.
Thanks, will fix.
> > + int *copy;
> > + int ret;
> > + int i;
> > +
> > + if (attr->fd_array_cnt >= MAX_USED_MAPS)
> > + return -E2BIG;
(This actually should be MAX_USED_MAPS + MAX_USED_BTFS)
> > +
> > + env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
> > +
> > + /*
> > + * The only difference between old (no fd_array_cnt is given) and new
> > + * APIs is that in the latter case the fd_array is expected to be
> > + * continuous and is scanned for map fds right away
> > + */
> > + if (!size)
> > + return 0;
> > +
> > + copy = kzalloc(size, GFP_KERNEL);
>
> the slab will WARN with dmesg splat on large sizes.
The size would actually be smaller than 4*(MAX_USED_MAPS + MAX_USED_BTFS),
as we check for this above.
>
> > + if (!copy)
> > + return -ENOMEM;
> > +
> > + if (copy_from_bpfptr_offset(copy, env->fd_array, 0, size)) {
> > + ret = -EFAULT;
> > + goto free_copy;
> > + }
> > +
> > + for (i = 0; i < attr->fd_array_cnt; i++) {
> > + ret = add_fd_from_fd_array(env, copy[i]);
> > + if (ret)
> > + goto free_copy;
> > + }
>
> I don't get this feature.
> Why bother copying and checking for validity?
> What does it buy ?
So, the main reason for this whole change is to allow unrelated maps, which
aren't referenced by a program directly, to be noticed and made available during
verification. Thus, I want to go through the array here and add them to
used_maps. (In a subsequent patch, "instruction set" maps are treated
at this point as well.)
The reason for discarding the copy here is that the "old API", where fd_array_cnt
is 0, is still valid, and in that case we can't copy fd_array in full. Later during
verification, fd_array elements are accessed individually by copying from the
bpfptr. I thought that freeing this copy here was more readable than adding
a check at every such occasion.
> pw-bot: cr
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 bpf-next 2/6] bpf: move map/prog compatibility checks
2024-11-19 10:15 ` [PATCH v2 bpf-next 2/6] bpf: move map/prog compatibility checks Anton Protopopov
@ 2024-11-26 18:44 ` Andrii Nakryiko
0 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2024-11-26 18:44 UTC (permalink / raw)
To: Anton Protopopov; +Cc: bpf
On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
>
> Move some inlined map/prog compatibility checks from the
> resolve_pseudo_ldimm64() function to the dedicated
> check_map_prog_compatibility() function. Call the latter function
> from the add_used_map_from_fd() function directly.
>
> This simplifies code and optimizes logic a bit, as before these
> changes the check_map_prog_compatibility() function was executed on
> every map usage, which doesn't make sense, as it doesn't include any
> per-instruction checks, only map type vs. prog type.
>
> (This patch also simplifies a subsequent patch which will call the
> add_used_map_from_fd() function from another code path.)
>
> Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> ---
> kernel/bpf/verifier.c | 101 +++++++++++++++++++-----------------------
> 1 file changed, 46 insertions(+), 55 deletions(-)
>
LGTM, I think this is a good improvement. As far as I can tell, the
laziness of all those checks is preserved and shouldn't interfere
with dead code elimination and CO-RE.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 1c4ebb326785..8e034a22aa2a 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -19064,6 +19064,12 @@ static bool is_tracing_prog_type(enum bpf_prog_type type)
> }
> }
>
[...]
* Re: [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-26 17:05 ` Anton Protopopov
@ 2024-11-26 18:51 ` Andrii Nakryiko
2024-11-26 20:40 ` Alexei Starovoitov
2024-11-27 6:49 ` Anton Protopopov
0 siblings, 2 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2024-11-26 18:51 UTC (permalink / raw)
To: Anton Protopopov; +Cc: Alexei Starovoitov, bpf
On Tue, Nov 26, 2024 at 9:27 AM Anton Protopopov <aspsk@isovalent.com> wrote:
>
> On 24/11/25 05:38PM, Alexei Starovoitov wrote:
> > On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > >
> > > The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
> > > of file descriptors: maps or btfs. This field was introduced as a
> > > sparse array. Introduce a new attribute, fd_array_cnt, which, if
> > > present, indicates that the fd_array is a continuous array of the
> > > corresponding length.
> > >
> > > If fd_array_cnt is non-zero, then every map in the fd_array will be
> > > bound to the program, as if it was used by the program. This
> > > functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
> > > maps can be used by the verifier during the program load.
> > >
> > > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > > ---
> > > include/uapi/linux/bpf.h | 10 ++++
> > > kernel/bpf/syscall.c | 2 +-
> > > kernel/bpf/verifier.c | 106 ++++++++++++++++++++++++++++-----
> > > tools/include/uapi/linux/bpf.h | 10 ++++
> > > 4 files changed, 113 insertions(+), 15 deletions(-)
> > >
[...]
> > > +/*
> > > + * The add_fd_from_fd_array() is executed only if fd_array_cnt is given. In
> > > + * this case expect that every file descriptor in the array is either a map or
> > > + * a BTF, or a hole (0). Everything else is considered to be trash.
> > > + */
> > > +static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
> > > +{
> > > + struct bpf_map *map;
> > > + CLASS(fd, f)(fd);
> > > + int ret;
> > > +
> > > + map = __bpf_map_get(f);
> > > + if (!IS_ERR(map)) {
> > > + ret = add_used_map(env, map);
> > > + if (ret < 0)
> > > + return ret;
> > > + return 0;
> > > + }
> > > +
> > > + if (!IS_ERR(__btf_get_by_fd(f)))
> > > + return 0;
> > > +
> > > + if (!fd)
> > > + return 0;
> >
> > This is not allowed in new apis.
> > zero cannot be special.
>
> I thought that this is ok since I check for all possible valid FDs by this
> point: if fd=0 is a valid map or btf, then we will return it by this point.
>
> Why I wanted to keep 0 as a valid value, even if it is not pointing to any
> map/btf is for case when, like in libbpf gen, fd_array is populated with map
> fds starting from 0, and with btf fds from some offset, so in between there may
> be 0s. This is probably better to disallow this, and, if fd_array_cnt != 0,
> then to check if all [0...fd_array_cnt) elements are valid.
If fd_array_cnt != 0 we can define that fd_array isn't sparse anymore
and every entry has to be valid. Let's do that.
>
> > > +
> > > + verbose(env, "fd %d is not pointing to valid bpf_map or btf\n", fd);
> > > + return PTR_ERR(map);
> > > +}
> > > +
> > > +static int env_init_fd_array(struct bpf_verifier_env *env, union bpf_attr *attr, bpfptr_t uattr)
> >
> > What an odd name... why is 'env_' there?
>
[...]
> > I don't get this feature.
> > Why bother copying and checking for validity?
> > What does it buy ?
>
> So, the main reason for this whole change is to allow unrelated maps, which
> aren't referenced by a program directly, to be noticed and available during the
> verification. Thus, I want to go through the array here and add them to
> used_maps. (In a subsequent patch, "instruction sets" maps are treated
> additionally at this point as well.)
>
> The reason to discard that copy here was that "old api" when fd_array_cnt is 0
> is still valid and in this case we can't copy fd_array in full. Later during
> the verification fd_array elements are accessed individually by copying from
> bpfptr. I thought that freeing this copy here is more readable than to add
> a check at every such occasion.
I think Alexei meant why you need to make an entire copy of fd_array,
if you can just read one element at a time (still with
copy_from_bpfptr_offset()), validate/add map/btf from that FD, and
move to the next element. No need to make a copy of an array.
>
> > pw-bot: cr
>
* Re: [PATCH v2 bpf-next 4/6] selftests/bpf: Add tests for fd_array_cnt
2024-11-19 10:15 ` [PATCH v2 bpf-next 4/6] selftests/bpf: Add tests for fd_array_cnt Anton Protopopov
@ 2024-11-26 18:54 ` Andrii Nakryiko
2024-11-27 6:45 ` Anton Protopopov
0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2024-11-26 18:54 UTC (permalink / raw)
To: Anton Protopopov; +Cc: bpf
On Tue, Nov 19, 2024 at 2:13 AM Anton Protopopov <aspsk@isovalent.com> wrote:
>
> Add a new set of tests to test the new field in PROG_LOAD-related
> part of bpf_attr: fd_array_cnt.
>
> Add the following test cases:
>
> * fd_array_cnt/no-fd-array: program is loaded in a normal
> way, without any fd_array present
>
> * fd_array_cnt/fd-array-ok: pass two extra non-used maps,
> check that they're bound to the program
>
> * fd_array_cnt/fd-array-dup-input: pass a few extra maps,
> only two of which are unique
>
> * fd_array_cnt/fd-array-ref-maps-in-array: pass a map in
> fd_array which is also referenced from within the program
>
> * fd_array_cnt/fd-array-trash-input: pass array with some trash
>
> * fd_array_cnt/fd-array-with-holes: pass an array with holes (fd=0)
>
> * fd_array_cnt/fd-array-2big: pass too large array
>
> All the tests above are using the bpf(2) syscall directly,
> no libbpf involved.
>
> Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> ---
> .../selftests/bpf/prog_tests/fd_array.c | 381 ++++++++++++++++++
> 1 file changed, 381 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/fd_array.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/fd_array.c b/tools/testing/selftests/bpf/prog_tests/fd_array.c
> new file mode 100644
> index 000000000000..1b47386e66c3
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/fd_array.c
> @@ -0,0 +1,381 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <test_progs.h>
> +
> +#include <linux/btf.h>
> +#include <sys/syscall.h>
> +#include <bpf/bpf.h>
> +
> +#include "../test_btf.h"
> +
> +static inline int _bpf_map_create(void)
> +{
> + static union bpf_attr attr = {
> + .map_type = BPF_MAP_TYPE_ARRAY,
> + .key_size = 4,
> + .value_size = 8,
> + .max_entries = 1,
> + };
> +
> + return syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
> +}
libbpf provides bpf_map_create() API. Please use that (and make sure
it supports the new field as well), don't re-define your own wrappers.
> +
> +static int _btf_create(void)
> +{
> + struct btf_blob {
> + struct btf_header btf_hdr;
> + __u32 types[8];
> + __u32 str;
> + } raw_btf = {
> + .btf_hdr = {
> + .magic = BTF_MAGIC,
> + .version = BTF_VERSION,
> + .hdr_len = sizeof(struct btf_header),
> + .type_len = sizeof(raw_btf.types),
> + .str_off = offsetof(struct btf_blob, str) - offsetof(struct btf_blob, types),
> + .str_len = sizeof(raw_btf.str),
> + },
> + .types = {
> + /* long */
> + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 64, 8), /* [1] */
> + /* unsigned long */
> + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */
> + },
> + };
> + static union bpf_attr attr = {
> + .btf_size = sizeof(raw_btf),
> + };
> +
> + attr.btf = (long)&raw_btf;
> +
> + return syscall(__NR_bpf, BPF_BTF_LOAD, &attr, sizeof(attr));
ditto, libbpf provides low-level API wrappers for a reason, let's stick to them
> +}
> +
> +static bool map_exists(__u32 id)
> +{
> + int fd;
> +
> + fd = bpf_map_get_fd_by_id(id);
> + if (fd >= 0) {
> + close(fd);
> + return true;
> + }
> + return false;
> +}
> +
> +static inline int bpf_prog_get_map_ids(int prog_fd, __u32 *nr_map_ids, __u32 *map_ids)
> +{
> + __u32 len = sizeof(struct bpf_prog_info);
> + struct bpf_prog_info info = {
> + .nr_map_ids = *nr_map_ids,
> + .map_ids = ptr_to_u64(map_ids),
> + };
> + int err;
> +
> + err = bpf_prog_get_info_by_fd(prog_fd, &info, &len);
> + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd"))
> + return -1;
> +
> + *nr_map_ids = info.nr_map_ids;
> +
> + return 0;
> +}
> +
> +static int __load_test_prog(int map_fd, int *fd_array, int fd_array_cnt)
> +{
> + /* A trivial program which uses one map */
> + struct bpf_insn insns[] = {
> + BPF_LD_MAP_FD(BPF_REG_1, map_fd),
> + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
> + BPF_MOV64_IMM(BPF_REG_0, 0),
> + BPF_EXIT_INSN(),
> + };
> + union bpf_attr attr = {
> + .prog_type = BPF_PROG_TYPE_XDP, /* we don't care */
> + .insns = ptr_to_u64(insns),
> + .insn_cnt = ARRAY_SIZE(insns),
> + .license = ptr_to_u64("GPL"),
> + .fd_array = ptr_to_u64(fd_array),
> + .fd_array_cnt = fd_array_cnt,
> + };
> +
> + return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
bpf_prog_load() API
> +}
> +
> +static int load_test_prog(int *fd_array, int fd_array_cnt)
> +{
> + int map_fd;
> + int ret;
> +
> + map_fd = _bpf_map_create();
> + if (!ASSERT_GE(map_fd, 0, "_bpf_map_create"))
> + return map_fd;
> +
> + ret = __load_test_prog(map_fd, fd_array, fd_array_cnt);
> + close(map_fd);
> +
> + /* switch back to returning the actual value */
> + if (ret < 0)
> + return -errno;
> + return ret;
> +}
> +
[...]
* Re: [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-26 18:51 ` Andrii Nakryiko
@ 2024-11-26 20:40 ` Alexei Starovoitov
2024-11-27 6:54 ` Anton Protopopov
2024-11-27 6:49 ` Anton Protopopov
1 sibling, 1 reply; 24+ messages in thread
From: Alexei Starovoitov @ 2024-11-26 20:40 UTC (permalink / raw)
To: Andrii Nakryiko; +Cc: Anton Protopopov, bpf
On Tue, Nov 26, 2024 at 10:51 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Nov 26, 2024 at 9:27 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > On 24/11/25 05:38PM, Alexei Starovoitov wrote:
> > > On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > >
> > > > The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
> > > > of file descriptors: maps or btfs. This field was introduced as a
> > > > sparse array. Introduce a new attribute, fd_array_cnt, which, if
> > > > present, indicates that the fd_array is a continuous array of the
> > > > corresponding length.
> > > >
> > > > If fd_array_cnt is non-zero, then every map in the fd_array will be
> > > > bound to the program, as if it was used by the program. This
> > > > functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
> > > > maps can be used by the verifier during the program load.
> > > >
> > > > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > > > ---
> > > > include/uapi/linux/bpf.h | 10 ++++
> > > > kernel/bpf/syscall.c | 2 +-
> > > > kernel/bpf/verifier.c | 106 ++++++++++++++++++++++++++++-----
> > > > tools/include/uapi/linux/bpf.h | 10 ++++
> > > > 4 files changed, 113 insertions(+), 15 deletions(-)
> > > >
>
> [...]
>
> > > > +/*
> > > > + * The add_fd_from_fd_array() is executed only if fd_array_cnt is given. In
> > > > + * this case expect that every file descriptor in the array is either a map or
> > > > + * a BTF, or a hole (0). Everything else is considered to be trash.
> > > > + */
> > > > +static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
> > > > +{
> > > > + struct bpf_map *map;
> > > > + CLASS(fd, f)(fd);
> > > > + int ret;
> > > > +
> > > > + map = __bpf_map_get(f);
> > > > + if (!IS_ERR(map)) {
> > > > + ret = add_used_map(env, map);
> > > > + if (ret < 0)
> > > > + return ret;
> > > > + return 0;
> > > > + }
> > > > +
> > > > + if (!IS_ERR(__btf_get_by_fd(f)))
> > > > + return 0;
> > > > +
> > > > + if (!fd)
> > > > + return 0;
> > >
> > > This is not allowed in new apis.
> > > zero cannot be special.
> >
> > I thought that this is ok since I check for all possible valid FDs by this
> > point: if fd=0 is a valid map or btf, then we will return it by this point.
> >
> > Why I wanted to keep 0 as a valid value, even if it is not pointing to any
> > map/btf is for case when, like in libbpf gen, fd_array is populated with map
> > fds starting from 0, and with btf fds from some offset, so in between there may
> > be 0s. This is probably better to disallow this, and, if fd_array_cnt != 0,
> > then to check if all [0...fd_array_cnt) elements are valid.
>
> If fd_array_cnt != 0 we can define that fd_array isn't sparse anymore
> and every entry has to be valid. Let's do that.
Exactly.
libbpf gen_loader has a very simplistic implementation of
add_map_fd() and add_kfunc_btf_fd().
It leaves gaps only to keep the implementation simple.
It can, and probably should, be changed to make the array contiguous.
* Re: [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-26 2:11 ` Hou Tao
@ 2024-11-27 6:44 ` Anton Protopopov
2024-11-28 4:15 ` Hou Tao
0 siblings, 1 reply; 24+ messages in thread
From: Anton Protopopov @ 2024-11-27 6:44 UTC (permalink / raw)
To: Hou Tao; +Cc: bpf
On 24/11/26 10:11AM, Hou Tao wrote:
> Hi,
>
> On 11/19/2024 6:15 PM, Anton Protopopov wrote:
> > The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
> > of file descriptors: maps or btfs. This field was introduced as a
> > sparse array. Introduce a new attribute, fd_array_cnt, which, if
> > present, indicates that the fd_array is a continuous array of the
> > corresponding length.
> >
> > If fd_array_cnt is non-zero, then every map in the fd_array will be
> > bound to the program, as if it was used by the program. This
> > functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
> > maps can be used by the verifier during the program load.
> >
> > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > ---
> > include/uapi/linux/bpf.h | 10 ++++
> > kernel/bpf/syscall.c | 2 +-
> > kernel/bpf/verifier.c | 106 ++++++++++++++++++++++++++++-----
> > tools/include/uapi/linux/bpf.h | 10 ++++
> > 4 files changed, 113 insertions(+), 15 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 4162afc6b5d0..2acf9b336371 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1573,6 +1573,16 @@ union bpf_attr {
> > * If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
> > */
> > __s32 prog_token_fd;
> > + /* The fd_array_cnt can be used to pass the length of the
> > + * fd_array array. In this case all the [map] file descriptors
> > + * passed in this array will be bound to the program, even if
> > + * the maps are not referenced directly. The functionality is
> > + * similar to the BPF_PROG_BIND_MAP syscall, but maps can be
> > + * used by the verifier during the program load. If provided,
> > + * then the fd_array[0,...,fd_array_cnt-1] is expected to be
> > + * continuous.
> > + */
> > + __u32 fd_array_cnt;
> > };
> >
> > struct { /* anonymous struct used by BPF_OBJ_* commands */
>
> SNIP
> > +/*
> > + * The add_fd_from_fd_array() is executed only if fd_array_cnt is given. In
> > + * this case expect that every file descriptor in the array is either a map or
> > + * a BTF, or a hole (0). Everything else is considered to be trash.
> > + */
> > +static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
> > +{
> > + struct bpf_map *map;
> > + CLASS(fd, f)(fd);
> > + int ret;
> > +
> > + map = __bpf_map_get(f);
> > + if (!IS_ERR(map)) {
> > + ret = add_used_map(env, map);
> > + if (ret < 0)
> > + return ret;
> > + return 0;
> > + }
> > +
> > + if (!IS_ERR(__btf_get_by_fd(f)))
> > + return 0;
>
> For fd_array_cnt > 0 case, does it need to handle BTF fd case ? If it
> does, these returned BTFs should be saved in somewhere, otherewise,
> these BTFs will be leaked.
ATM we don't actually store BTFs here. The __btf_get_by_fd() helper doesn't
increase the refcnt, so there are no leaks.
> > + if (!fd)
> > + return 0;
> > +
> > + verbose(env, "fd %d is not pointing to valid bpf_map or btf\n", fd);
> > + return PTR_ERR(map);
> > +}
> > +
> > +static int env_init_fd_array(struct bpf_verifier_env *env, union bpf_attr *attr, bpfptr_t uattr)
> > +{
> > + int size = sizeof(int) * attr->fd_array_cnt;
> > + int *copy;
> > + int ret;
> > + int i;
> > +
> > + if (attr->fd_array_cnt >= MAX_USED_MAPS)
> > + return -E2BIG;
> > +
> > + env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
> > +
> > + /*
> > + * The only difference between old (no fd_array_cnt is given) and new
> > + * APIs is that in the latter case the fd_array is expected to be
> > + * continuous and is scanned for map fds right away
> > + */
> > + if (!size)
> > + return 0;
> > +
> > + copy = kzalloc(size, GFP_KERNEL);
> > + if (!copy)
> > + return -ENOMEM;
> > +
> > + if (copy_from_bpfptr_offset(copy, env->fd_array, 0, size)) {
> > + ret = -EFAULT;
> > + goto free_copy;
> > + }
>
> It is better to use kvmemdup_bpfptr() instead.
Thanks for the hint. As suggested by Alexei, I will remove the memory
allocation here altogether.
> > +
> > + for (i = 0; i < attr->fd_array_cnt; i++) {
> > + ret = add_fd_from_fd_array(env, copy[i]);
> > + if (ret)
> > + goto free_copy;
> > + }
> > +
> > +free_copy:
> > + kfree(copy);
> > + return ret;
> > +}
> > +
> > int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size)
> > {
> > u64 start_time = ktime_get_ns();
> > @@ -22557,7 +22632,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
> > env->insn_aux_data[i].orig_idx = i;
> > env->prog = *prog;
> > env->ops = bpf_verifier_ops[env->prog->type];
> > - env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
> > + ret = env_init_fd_array(env, attr, uattr);
> > + if (ret)
> > + goto err_free_aux_data;
>
> These maps saved in env->used_map will also be leaked.
Yeah, thanks. Actually, the env->used_maps contents will be leaked (if an
error occurs here, or anywhere before we get past `goto err_unlock`), so
I will rewrite the init/error path.
> >
> > env->allow_ptr_leaks = bpf_allow_ptr_leaks(env->prog->aux->token);
> > env->allow_uninit_stack = bpf_allow_uninit_stack(env->prog->aux->token);
> > @@ -22775,6 +22852,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
> > err_unlock:
> > if (!is_priv)
> > mutex_unlock(&bpf_verifier_lock);
> > +err_free_aux_data:
> > vfree(env->insn_aux_data);
> > kvfree(env->insn_hist);
> > err_free_env:
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index 4162afc6b5d0..2acf9b336371 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -1573,6 +1573,16 @@ union bpf_attr {
> > * If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
> > */
> > __s32 prog_token_fd;
> > + /* The fd_array_cnt can be used to pass the length of the
> > + * fd_array array. In this case all the [map] file descriptors
> > + * passed in this array will be bound to the program, even if
> > + * the maps are not referenced directly. The functionality is
> > + * similar to the BPF_PROG_BIND_MAP syscall, but maps can be
> > + * used by the verifier during the program load. If provided,
> > + * then the fd_array[0,...,fd_array_cnt-1] is expected to be
> > + * continuous.
> > + */
> > + __u32 fd_array_cnt;
> > };
> >
> > struct { /* anonymous struct used by BPF_OBJ_* commands */
>
* Re: [PATCH v2 bpf-next 4/6] selftests/bpf: Add tests for fd_array_cnt
2024-11-26 18:54 ` Andrii Nakryiko
@ 2024-11-27 6:45 ` Anton Protopopov
0 siblings, 0 replies; 24+ messages in thread
From: Anton Protopopov @ 2024-11-27 6:45 UTC (permalink / raw)
To: Andrii Nakryiko; +Cc: bpf
On 24/11/26 10:54AM, Andrii Nakryiko wrote:
> On Tue, Nov 19, 2024 at 2:13 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > Add a new set of tests to test the new field in PROG_LOAD-related
> > part of bpf_attr: fd_array_cnt.
> >
> > Add the following test cases:
> >
> > * fd_array_cnt/no-fd-array: program is loaded in a normal
> > way, without any fd_array present
> >
> > * fd_array_cnt/fd-array-ok: pass two extra non-used maps,
> > check that they're bound to the program
> >
> > * fd_array_cnt/fd-array-dup-input: pass a few extra maps,
> > only two of which are unique
> >
> > * fd_array_cnt/fd-array-ref-maps-in-array: pass a map in
> > fd_array which is also referenced from within the program
> >
> > * fd_array_cnt/fd-array-trash-input: pass array with some trash
> >
> > * fd_array_cnt/fd-array-with-holes: pass an array with holes (fd=0)
> >
> > * fd_array_cnt/fd-array-2big: pass too large array
> >
> > All the tests above are using the bpf(2) syscall directly,
> > no libbpf involved.
> >
> > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > ---
> > .../selftests/bpf/prog_tests/fd_array.c | 381 ++++++++++++++++++
> > 1 file changed, 381 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/fd_array.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/fd_array.c b/tools/testing/selftests/bpf/prog_tests/fd_array.c
> > new file mode 100644
> > index 000000000000..1b47386e66c3
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/fd_array.c
> > @@ -0,0 +1,381 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <test_progs.h>
> > +
> > +#include <linux/btf.h>
> > +#include <sys/syscall.h>
> > +#include <bpf/bpf.h>
> > +
> > +#include "../test_btf.h"
> > +
> > +static inline int _bpf_map_create(void)
> > +{
> > + static union bpf_attr attr = {
> > + .map_type = BPF_MAP_TYPE_ARRAY,
> > + .key_size = 4,
> > + .value_size = 8,
> > + .max_entries = 1,
> > + };
> > +
> > + return syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
> > +}
>
> libbpf provides bpf_map_create() API. Please use that (and make sure
> it supports the new field as well), don't re-define your own wrappers.
Thanks, ack (for here and your comments below)
>
> > +
> > +static int _btf_create(void)
> > +{
> > + struct btf_blob {
> > + struct btf_header btf_hdr;
> > + __u32 types[8];
> > + __u32 str;
> > + } raw_btf = {
> > + .btf_hdr = {
> > + .magic = BTF_MAGIC,
> > + .version = BTF_VERSION,
> > + .hdr_len = sizeof(struct btf_header),
> > + .type_len = sizeof(raw_btf.types),
> > + .str_off = offsetof(struct btf_blob, str) - offsetof(struct btf_blob, types),
> > + .str_len = sizeof(raw_btf.str),
> > + },
> > + .types = {
> > + /* long */
> > + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 64, 8), /* [1] */
> > + /* unsigned long */
> > + BTF_TYPE_INT_ENC(0, 0, 0, 64, 8), /* [2] */
> > + },
> > + };
> > + static union bpf_attr attr = {
> > + .btf_size = sizeof(raw_btf),
> > + };
> > +
> > + attr.btf = (long)&raw_btf;
> > +
> > + return syscall(__NR_bpf, BPF_BTF_LOAD, &attr, sizeof(attr));
>
> ditto, libbpf provides low-level API wrappers for a reason, let's tick to them
>
> > +}
> > +
> > +static bool map_exists(__u32 id)
> > +{
> > + int fd;
> > +
> > + fd = bpf_map_get_fd_by_id(id);
> > + if (fd >= 0) {
> > + close(fd);
> > + return true;
> > + }
> > + return false;
> > +}
> > +
> > +static inline int bpf_prog_get_map_ids(int prog_fd, __u32 *nr_map_ids, __u32 *map_ids)
> > +{
> > + __u32 len = sizeof(struct bpf_prog_info);
> > + struct bpf_prog_info info = {
> > + .nr_map_ids = *nr_map_ids,
> > + .map_ids = ptr_to_u64(map_ids),
> > + };
> > + int err;
> > +
> > + err = bpf_prog_get_info_by_fd(prog_fd, &info, &len);
> > + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd"))
> > + return -1;
> > +
> > + *nr_map_ids = info.nr_map_ids;
> > +
> > + return 0;
> > +}
> > +
> > +static int __load_test_prog(int map_fd, int *fd_array, int fd_array_cnt)
> > +{
> > + /* A trivial program which uses one map */
> > + struct bpf_insn insns[] = {
> > + BPF_LD_MAP_FD(BPF_REG_1, map_fd),
> > + BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
> > + BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> > + BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> > + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
> > + BPF_MOV64_IMM(BPF_REG_0, 0),
> > + BPF_EXIT_INSN(),
> > + };
> > + union bpf_attr attr = {
> > + .prog_type = BPF_PROG_TYPE_XDP, /* we don't care */
> > + .insns = ptr_to_u64(insns),
> > + .insn_cnt = ARRAY_SIZE(insns),
> > + .license = ptr_to_u64("GPL"),
> > + .fd_array = ptr_to_u64(fd_array),
> > + .fd_array_cnt = fd_array_cnt,
> > + };
> > +
> > + return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
>
> bpf_prog_load() API
>
> > +}
> > +
> > +static int load_test_prog(int *fd_array, int fd_array_cnt)
> > +{
> > + int map_fd;
> > + int ret;
> > +
> > + map_fd = _bpf_map_create();
> > + if (!ASSERT_GE(map_fd, 0, "_bpf_map_create"))
> > + return map_fd;
> > +
> > + ret = __load_test_prog(map_fd, fd_array, fd_array_cnt);
> > + close(map_fd);
> > +
> > + /* switch back to returning the actual value */
> > + if (ret < 0)
> > + return -errno;
> > + return ret;
> > +}
> > +
>
> [...]
* Re: [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-26 18:51 ` Andrii Nakryiko
2024-11-26 20:40 ` Alexei Starovoitov
@ 2024-11-27 6:49 ` Anton Protopopov
1 sibling, 0 replies; 24+ messages in thread
From: Anton Protopopov @ 2024-11-27 6:49 UTC (permalink / raw)
To: Andrii Nakryiko; +Cc: Alexei Starovoitov, bpf
On 24/11/26 10:51AM, Andrii Nakryiko wrote:
> On Tue, Nov 26, 2024 at 9:27 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> >
> > On 24/11/25 05:38PM, Alexei Starovoitov wrote:
> > > On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > >
> > > > The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
> > > > of file descriptors: maps or btfs. This field was introduced as a
> > > > sparse array. Introduce a new attribute, fd_array_cnt, which, if
> > > > present, indicates that the fd_array is a continuous array of the
> > > > corresponding length.
> > > >
> > > > If fd_array_cnt is non-zero, then every map in the fd_array will be
> > > > bound to the program, as if it was used by the program. This
> > > > functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
> > > > maps can be used by the verifier during the program load.
> > > >
> > > > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > > > ---
> > > > include/uapi/linux/bpf.h | 10 ++++
> > > > kernel/bpf/syscall.c | 2 +-
> > > > kernel/bpf/verifier.c | 106 ++++++++++++++++++++++++++++-----
> > > > tools/include/uapi/linux/bpf.h | 10 ++++
> > > > 4 files changed, 113 insertions(+), 15 deletions(-)
> > > >
>
> [...]
>
> > > > +/*
> > > > + * The add_fd_from_fd_array() is executed only if fd_array_cnt is given. In
> > > > + * this case expect that every file descriptor in the array is either a map or
> > > > + * a BTF, or a hole (0). Everything else is considered to be trash.
> > > > + */
> > > > +static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
> > > > +{
> > > > + struct bpf_map *map;
> > > > + CLASS(fd, f)(fd);
> > > > + int ret;
> > > > +
> > > > + map = __bpf_map_get(f);
> > > > + if (!IS_ERR(map)) {
> > > > + ret = add_used_map(env, map);
> > > > + if (ret < 0)
> > > > + return ret;
> > > > + return 0;
> > > > + }
> > > > +
> > > > + if (!IS_ERR(__btf_get_by_fd(f)))
> > > > + return 0;
> > > > +
> > > > + if (!fd)
> > > > + return 0;
> > >
> > > This is not allowed in new apis.
> > > zero cannot be special.
> >
> > I thought that this is ok since I check for all possible valid FDs by this
> > point: if fd=0 is a valid map or btf, then we will return it by this point.
> >
> > Why I wanted to keep 0 as a valid value, even if it is not pointing to any
> > map/btf is for case when, like in libbpf gen, fd_array is populated with map
> > fds starting from 0, and with btf fds from some offset, so in between there may
> > be 0s. This is probably better to disallow this, and, if fd_array_cnt != 0,
> > then to check if all [0...fd_array_cnt) elements are valid.
>
> If fd_array_cnt != 0 we can define that fd_array isn't sparse anymore
> and every entry has to be valid. Let's do that.
Yes, makes sense
> >
> > > > +
> > > > + verbose(env, "fd %d is not pointing to valid bpf_map or btf\n", fd);
> > > > + return PTR_ERR(map);
> > > > +}
> > > > +
> > > > +static int env_init_fd_array(struct bpf_verifier_env *env, union bpf_attr *attr, bpfptr_t uattr)
> > >
> > > What an odd name... why is 'env_' there?
> >
>
> [...]
>
> > > I don't get this feature.
> > > Why bother copying and checking for validity?
> > > What does it buy ?
> >
> > So, the main reason for this whole change is to allow unrelated maps, which
> > aren't referenced by a program directly, to be noticed and available during
> > verification. Thus, I want to go through the array here and add them to
> > used_maps. (In a subsequent patch, "instruction set" maps are treated
> > additionally at this point as well.)
> >
> > The reason for discarding the copy here is that the "old API" (when
> > fd_array_cnt is 0) is still valid, and in that case we can't copy fd_array
> > in full. Later during verification, fd_array elements are accessed
> > individually by copying from the bpfptr. I thought that freeing this copy
> > here is more readable than adding a check at every such occasion.
>
> I think Alexei meant why you need to make an entire copy of fd_array,
> if you can just read one element at a time (still with
> copy_from_bpfptr_offset()), validate/add map/btf from that FD, and
> move to the next element. No need to make a copy of an array.
>
> >
> > > pw-bot: cr
> >
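The element-at-a-time approach suggested above (reading each fd with copy_from_bpfptr_offset() instead of duplicating the whole array) can be sketched in plain userspace C. This is only an illustration: `is_valid_map_fd`/`is_valid_btf_fd` are hypothetical stand-ins for the kernel's `__bpf_map_get()` and `__btf_get_by_fd()`, and `copy_one_fd()` stands in for `copy_from_bpfptr_offset()`:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel-side FD lookups. */
static bool is_valid_map_fd(int fd) { return fd >= 100 && fd < 200; }
static bool is_valid_btf_fd(int fd) { return fd >= 200 && fd < 300; }

/* Simulates copy_from_bpfptr_offset() for a single element. */
static int copy_one_fd(int *dst, const int *user_array, size_t idx)
{
	memcpy(dst, user_array + idx, sizeof(*dst));
	return 0;
}

/*
 * Walk fd_array one element at a time: no kzalloc'ed full copy, and
 * every entry must be a valid map or BTF FD (no holes allowed).
 */
static int process_fd_array(const int *user_array, size_t cnt)
{
	size_t i;
	int fd;

	for (i = 0; i < cnt; i++) {
		if (copy_one_fd(&fd, user_array, i))
			return -EFAULT;
		if (is_valid_map_fd(fd) || is_valid_btf_fd(fd))
			continue;
		return -EBADF;	/* zero is not special anymore */
	}
	return 0;
}
```

With this shape, a zero entry fails validation like any other invalid fd, which matches the decision to make fd_array non-sparse whenever fd_array_cnt is given.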
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-26 20:40 ` Alexei Starovoitov
@ 2024-11-27 6:54 ` Anton Protopopov
0 siblings, 0 replies; 24+ messages in thread
From: Anton Protopopov @ 2024-11-27 6:54 UTC (permalink / raw)
To: Alexei Starovoitov; +Cc: Andrii Nakryiko, bpf
On 24/11/26 12:40PM, Alexei Starovoitov wrote:
> On Tue, Nov 26, 2024 at 10:51 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Nov 26, 2024 at 9:27 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > >
> > > On 24/11/25 05:38PM, Alexei Starovoitov wrote:
> > > > On Tue, Nov 19, 2024 at 2:17 AM Anton Protopopov <aspsk@isovalent.com> wrote:
> > > > >
> > > > > The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
> > > > > of file descriptors: maps or btfs. This field was introduced as a
> > > > > sparse array. Introduce a new attribute, fd_array_cnt, which, if
> > > > > present, indicates that the fd_array is a continuous array of the
> > > > > corresponding length.
> > > > >
> > > > > If fd_array_cnt is non-zero, then every map in the fd_array will be
> > > > > bound to the program, as if it was used by the program. This
> > > > > functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
> > > > > maps can be used by the verifier during the program load.
> > > > >
> > > > > Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
> > > > > ---
> > > > > include/uapi/linux/bpf.h | 10 ++++
> > > > > kernel/bpf/syscall.c | 2 +-
> > > > > kernel/bpf/verifier.c | 106 ++++++++++++++++++++++++++++-----
> > > > > tools/include/uapi/linux/bpf.h | 10 ++++
> > > > > 4 files changed, 113 insertions(+), 15 deletions(-)
> > > > >
> >
> > [...]
> >
> > > > > +/*
> > > > > + * The add_fd_from_fd_array() is executed only if fd_array_cnt is given. In
> > > > > + * this case expect that every file descriptor in the array is either a map or
> > > > > + * a BTF, or a hole (0). Everything else is considered to be trash.
> > > > > + */
> > > > > +static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
> > > > > +{
> > > > > + struct bpf_map *map;
> > > > > + CLASS(fd, f)(fd);
> > > > > + int ret;
> > > > > +
> > > > > + map = __bpf_map_get(f);
> > > > > + if (!IS_ERR(map)) {
> > > > > + ret = add_used_map(env, map);
> > > > > + if (ret < 0)
> > > > > + return ret;
> > > > > + return 0;
> > > > > + }
> > > > > +
> > > > > + if (!IS_ERR(__btf_get_by_fd(f)))
> > > > > + return 0;
> > > > > +
> > > > > + if (!fd)
> > > > > + return 0;
> > > >
> > > > This is not allowed in new apis.
> > > > zero cannot be special.
> > >
> > > I thought that this is ok since all possible valid FDs are checked by this
> > > point: if fd=0 is a valid map or btf, then we will have returned it already.
> > >
> > > The reason I wanted to keep 0 as a valid value, even when it does not point
> > > to any map/btf, is the case when, as in libbpf gen, fd_array is populated
> > > with map fds starting from 0, and with btf fds from some offset, so in
> > > between there may be 0s. It is probably better to disallow this and, if
> > > fd_array_cnt != 0, to check that all [0...fd_array_cnt) elements are valid.
> >
> > If fd_array_cnt != 0 we can define that fd_array isn't sparse anymore
> > and every entry has to be valid. Let's do that.
>
> Exactly.
> libbpf gen_loader has a very simplistic implementation of
> add_map_fd() and add_kfunc_btf_fd().
> It leaves gaps only to keep implementation simple.
> It can and probably should be changed to make it contiguous.
Ok, makes sense. I will send v3 based on all the comments.
For the libbpf gen_loader part, I can address this in one of the follow-ups
(when fd_array_cnt is actually set in libbpf).
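The contiguous gen_loader layout Alexei describes can be illustrated with a small userspace sketch: instead of fixed regions (map fds from index 0, BTF fds from some offset, zero-filled gaps in between), fds are simply appended, so the array stays hole-free and its length can be passed as fd_array_cnt. All names here are hypothetical, not the actual libbpf implementation:

```c
#include <stddef.h>

#define FD_ARRAY_MAX 64

struct fd_array_builder {
	int fds[FD_ARRAY_MAX];
	size_t cnt;	/* would become attr->fd_array_cnt */
};

/*
 * Append an FD and return its index into fd_array; BPF instructions
 * reference the index, so no fixed map/BTF regions are needed and the
 * array never contains holes.
 */
static int fd_array_add(struct fd_array_builder *b, int fd)
{
	if (b->cnt >= FD_ARRAY_MAX)
		return -1;
	b->fds[b->cnt] = fd;
	return (int)b->cnt++;
}
```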
* Re: [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load
2024-11-27 6:44 ` Anton Protopopov
@ 2024-11-28 4:15 ` Hou Tao
0 siblings, 0 replies; 24+ messages in thread
From: Hou Tao @ 2024-11-28 4:15 UTC (permalink / raw)
To: Anton Protopopov; +Cc: bpf
Hi,
On 11/27/2024 2:44 PM, Anton Protopopov wrote:
> On 24/11/26 10:11AM, Hou Tao wrote:
>> Hi,
>>
>> On 11/19/2024 6:15 PM, Anton Protopopov wrote:
>>> The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
>>> of file descriptors: maps or btfs. This field was introduced as a
>>> sparse array. Introduce a new attribute, fd_array_cnt, which, if
>>> present, indicates that the fd_array is a continuous array of the
>>> corresponding length.
>>>
>>> If fd_array_cnt is non-zero, then every map in the fd_array will be
>>> bound to the program, as if it was used by the program. This
>>> functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
>>> maps can be used by the verifier during the program load.
>>>
>>> Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
>>> ---
SNIP
>>> +static int add_fd_from_fd_array(struct bpf_verifier_env *env, int fd)
>>> +{
>>> + struct bpf_map *map;
>>> + CLASS(fd, f)(fd);
>>> + int ret;
>>> +
>>> + map = __bpf_map_get(f);
>>> + if (!IS_ERR(map)) {
>>> + ret = add_used_map(env, map);
>>> + if (ret < 0)
>>> + return ret;
>>> + return 0;
>>> + }
>>> +
>>> + if (!IS_ERR(__btf_get_by_fd(f)))
>>> + return 0;
>> For the fd_array_cnt > 0 case, does it need to handle the BTF fd case? If
>> it does, these returned BTFs should be saved somewhere; otherwise, these
>> BTFs will be leaked.
> ATM we don't actually store BTFs here. The __btf_get_by_fd doesn't
> increase the refcnt, so no leaks.
Yes. You are right, I just mis-read the implementation of
__btf_get_by_fd().
>
>>> + if (!fd)
>>> + return 0;
>>> +
>>> + verbose(env, "fd %d is not pointing to valid bpf_map or btf\n", fd);
>>> + return PTR_ERR(map);
>>> +}
>>> +
>>> +static int env_init_fd_array(struct bpf_verifier_env *env, union bpf_attr *attr, bpfptr_t uattr)
>>> +{
>>> + int size = sizeof(int) * attr->fd_array_cnt;
>>> + int *copy;
>>> + int ret;
>>> + int i;
>>> +
>>> + if (attr->fd_array_cnt >= MAX_USED_MAPS)
>>> + return -E2BIG;
>>> +
>>> + env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
>>> +
>>> + /*
>>> + * The only difference between old (no fd_array_cnt is given) and new
>>> + * APIs is that in the latter case the fd_array is expected to be
>>> + * continuous and is scanned for map fds right away
>>> + */
>>> + if (!size)
>>> + return 0;
>>> +
>>> + copy = kzalloc(size, GFP_KERNEL);
>>> + if (!copy)
>>> + return -ENOMEM;
>>> +
>>> + if (copy_from_bpfptr_offset(copy, env->fd_array, 0, size)) {
>>> + ret = -EFAULT;
>>> + goto free_copy;
>>> + }
>> It is better to use kvmemdup_bpfptr() instead.
> Thanks for the hint. As suggested by Alexei, I will remove the memory
> allocation here altogether.
I see.
>
>>> +
>>> + for (i = 0; i < attr->fd_array_cnt; i++) {
>>> + ret = add_fd_from_fd_array(env, copy[i]);
>>> + if (ret)
>>> + goto free_copy;
>>> + }
>>> +
>>> +free_copy:
>>> + kfree(copy);
>>> + return ret;
>>> +}
>>> +
>>> int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size)
>>> {
>>> u64 start_time = ktime_get_ns();
>>> @@ -22557,7 +22632,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>>> env->insn_aux_data[i].orig_idx = i;
>>> env->prog = *prog;
>>> env->ops = bpf_verifier_ops[env->prog->type];
>>> - env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
>>> + ret = env_init_fd_array(env, attr, uattr);
>>> + if (ret)
>>> + goto err_free_aux_data;
>> These maps saved in env->used_maps will also be leaked.
> Yeah, thanks, actually the env->used_maps contents will be leaked (if an
> error occurs here, or anywhere before we get past `goto err_unlock`), so
> I will rewrite the init/error path.
Glad to hear that.
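The leak discussed above (references held via env->used_maps surviving an error in the init path) is the usual acquire-then-unwind problem. A hedged userspace sketch of the shape of such a fix, with plain refcounts standing in for the kernel's bpf_map_put() and hypothetical names throughout:

```c
#include <stddef.h>

#define MAX_USED_MAPS 64

struct fake_map { int refcnt; };

struct fake_env {
	struct fake_map *used_maps[MAX_USED_MAPS];
	size_t used_map_cnt;
};

/* Stand-in for bpf_map_put(): drop one reference. */
static void map_put(struct fake_map *map) { map->refcnt--; }

/* On any init failure, drop every reference taken so far. */
static void release_used_maps(struct fake_env *env)
{
	size_t i;

	for (i = 0; i < env->used_map_cnt; i++)
		map_put(env->used_maps[i]);
	env->used_map_cnt = 0;
}
```

The point is simply that the error path must walk whatever was already added to used_maps and release it, rather than bailing out and leaving the references held.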
end of thread, other threads:[~2024-11-28 4:15 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-19 10:15 [PATCH v2 bpf-next 0/6] Add fd_array_cnt attribute for BPF_PROG_LOAD Anton Protopopov
2024-11-19 10:15 ` [PATCH v2 bpf-next 1/6] bpf: add a __btf_get_by_fd helper Anton Protopopov
2024-11-26 1:31 ` Alexei Starovoitov
2024-11-26 16:33 ` Anton Protopopov
2024-11-26 16:52 ` Alexei Starovoitov
2024-11-19 10:15 ` [PATCH v2 bpf-next 2/6] bpf: move map/prog compatibility checks Anton Protopopov
2024-11-26 18:44 ` Andrii Nakryiko
2024-11-19 10:15 ` [PATCH v2 bpf-next 3/6] bpf: add fd_array_cnt attribute for prog_load Anton Protopopov
2024-11-26 1:38 ` Alexei Starovoitov
2024-11-26 17:05 ` Anton Protopopov
2024-11-26 18:51 ` Andrii Nakryiko
2024-11-26 20:40 ` Alexei Starovoitov
2024-11-27 6:54 ` Anton Protopopov
2024-11-27 6:49 ` Anton Protopopov
2024-11-26 2:11 ` Hou Tao
2024-11-27 6:44 ` Anton Protopopov
2024-11-28 4:15 ` Hou Tao
2024-11-19 10:15 ` [PATCH v2 bpf-next 4/6] selftests/bpf: Add tests for fd_array_cnt Anton Protopopov
2024-11-26 18:54 ` Andrii Nakryiko
2024-11-27 6:45 ` Anton Protopopov
2024-11-19 10:15 ` [PATCH v2 bpf-next 5/6] bpf: fix potential error return Anton Protopopov
2024-11-26 1:43 ` Alexei Starovoitov
2024-11-26 16:36 ` Anton Protopopov
2024-11-19 10:15 ` [PATCH v2 bpf-next 6/6] selftest/bpf: replace magic constants by macros Anton Protopopov