From: Andrii Nakryiko <andrii@kernel.org>
To: <bpf@vger.kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>
Cc: <andrii@kernel.org>, <kernel-team@fb.com>, Tejun Heo <tj@kernel.org>
Subject: [PATCH bpf-next 14/17] bpf: implement number iterator
Date: Thu, 2 Mar 2023 15:50:12 -0800 [thread overview]
Message-ID: <20230302235015.2044271-15-andrii@kernel.org> (raw)
In-Reply-To: <20230302235015.2044271-1-andrii@kernel.org>
Implement first open-coded iterator type over a range of integers.
It's public API consists of:
- bpf_iter_num_new() constructor, which accepts [start, end) range
(that is, start is inclusive, while end is exclusive).
- bpf_iter_num_next() which will keep returning 4-byte read-only
pointer to int until the range is exhausted, at which point NULL will
be returned. If bpf_iter_num_next() is kept calling after this, NULL
will be persistently returned.
- bpf_iter_num_destroy() destructor, that needs to be called at some
point to clean up iterator state. BPF verifier enforces that iterator
destructor is called at some point before BPF program exits.
Note that `start = end = X` is a valid combination to setup empty
iterator. bpf_iter_num_new() will return 0 (success) for any such
combination.
If bpf_iter_num_new() detects invalid combination of input arguments, it
returns error, resets iterator state to, effectively, empty iterator, so
any subsequent call to bpf_iter_num_next() will keep returning NULL.
BPF verifier has no knowledge that returned integers are in the [start,
end) value range, as both `start` and `end` are not statically
known/enforced, they are runtime values only.
While implementation is pretty trivial, some care needs to be taken to
avoid overflows and underflows. Subsequent selftests will validate
correctness of [start, end) semantics, especially around extremes
(INT_MIN and INT_MAX).
Similarly to bpf_loop(), we enforce that no more than BPF_MAX_LOOPS can
be specified.
bpf_iter_num_{new,next,destroy}() is a logical evolution from bounded
BPF loops and bpf_loop() helper and is the basis for implementing
ergonomic BPF loops with no statically known and verified bounds.
Subsequent patches implement bpf_for() macro, demonstrating how this can
be wrapped into something that works and feels like a normal for() loop
in C language.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
include/linux/bpf.h | 14 +++++++--
kernel/bpf/bpf_iter.c | 71 +++++++++++++++++++++++++++++++++++++++++++
kernel/bpf/helpers.c | 3 ++
kernel/bpf/verifier.c | 24 +++++++++++++--
4 files changed, 107 insertions(+), 5 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a968282ba324..2a730759a471 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -613,6 +613,9 @@ enum bpf_type_flag {
/* DYNPTR points to xdp_buff */
DYNPTR_TYPE_XDP = BIT(16 + BPF_BASE_TYPE_BITS),
+ /* ITER of integers */
+ ITER_TYPE_NUM = BIT(17 + BPF_BASE_TYPE_BITS),
+
__BPF_TYPE_FLAG_MAX,
__BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1,
};
@@ -620,7 +623,7 @@ enum bpf_type_flag {
#define DYNPTR_TYPE_FLAG_MASK (DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF | DYNPTR_TYPE_SKB \
| DYNPTR_TYPE_XDP)
-#define ITER_TYPE_FLAG_MASK (0)
+#define ITER_TYPE_FLAG_MASK (ITER_TYPE_NUM)
/* Max number of base types. */
#define BPF_BASE_TYPE_LIMIT (1UL << BPF_BASE_TYPE_BITS)
@@ -1167,6 +1170,7 @@ u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr);
enum bpf_iter_type {
BPF_ITER_TYPE_INVALID,
+ BPF_ITER_TYPE_NUM,
};
#ifdef CONFIG_BPF_JIT
@@ -1622,8 +1626,12 @@ struct bpf_array {
#define BPF_COMPLEXITY_LIMIT_INSNS 1000000 /* yes. 1M insns */
#define MAX_TAIL_CALL_CNT 33
-/* Maximum number of loops for bpf_loop */
-#define BPF_MAX_LOOPS BIT(23)
+/* Maximum number of loops for bpf_loop and bpf_iter_num.
+ * It's enum to expose it (and thus make it discoverable) through BTF.
+ */
+enum {
+ BPF_MAX_LOOPS = 8 * 1024 * 1024,
+};
#define BPF_F_ACCESS_MASK (BPF_F_RDONLY | \
BPF_F_RDONLY_PROG | \
diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
index 5dc307bdeaeb..504189a3b474 100644
--- a/kernel/bpf/bpf_iter.c
+++ b/kernel/bpf/bpf_iter.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2020 Facebook */
+#include "linux/build_bug.h"
#include <linux/fs.h>
#include <linux/anon_inodes.h>
#include <linux/filter.h>
@@ -776,3 +777,73 @@ const struct bpf_func_proto bpf_loop_proto = {
.arg3_type = ARG_PTR_TO_STACK_OR_NULL,
.arg4_type = ARG_ANYTHING,
};
+
+struct bpf_iter_num_kern {
+ int cur; /* current value, inclusive */
+ int end; /* final value, exclusive */
+ __u64 :64;
+ __u64 :64;
+} __aligned(8);
+
+__diag_push();
+__diag_ignore_all("-Wmissing-prototypes",
+ "Global functions as their definitions will be in vmlinux BTF");
+
+__bpf_kfunc int bpf_iter_num_new(struct bpf_iter *it__uninit, int start, int end)
+{
+ struct bpf_iter_num_kern *s = (void *)it__uninit;
+
+ BUILD_BUG_ON(sizeof(struct bpf_iter_num_kern) != sizeof(struct bpf_iter));
+ BUILD_BUG_ON(__alignof__(struct bpf_iter_num_kern) != __alignof__(struct bpf_iter));
+
+ /* start == end is legit, it's an empty range and we'll just get NULL
+ * on first (and any subsequent) bpf_iter_num_next() call
+ */
+ if (start > end) {
+ s->cur = s->end = 0;
+ return -EINVAL;
+ }
+
+ /* avoid overflows, e.g., if start == INT_MIN and end == INT_MAX */
+ if ((s64)end - (s64)start > BPF_MAX_LOOPS) {
+ s->cur = s->end = 0;
+ return -E2BIG;
+ }
+
+ /* user will call bpf_iter_num_next() first,
+ * which will set s->cur to exactly start value;
+ * underflow shouldn't matter
+ */
+ s->cur = start - 1;
+ s->end = end;
+
+ return 0;
+}
+
+__bpf_kfunc int *bpf_iter_num_next(struct bpf_iter* it)
+{
+ struct bpf_iter_num_kern *s = (void *)it;
+
+ /* check failed initialization or if we are done (same behavior);
+ * need to be careful about overflow, so convert to s64 for checks,
+ * e.g., if s->cur == s->end == INT_MAX, we can't just do
+ * s->cur + 1 >= s->end
+ */
+ if ((s64)(s->cur + 1) >= s->end) {
+ s->cur = s->end = 0;
+ return NULL;
+ }
+
+ s->cur++;
+
+ return &s->cur;
+}
+
+__bpf_kfunc void bpf_iter_num_destroy(struct bpf_iter *it)
+{
+ struct bpf_iter_num_kern *s = (void *)it;
+
+ s->cur = s->end = 0;
+}
+
+__diag_pop();
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index de9ef8476e29..23c8f2313d5a 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2398,6 +2398,9 @@ BTF_ID_FLAGS(func, bpf_rcu_read_lock)
BTF_ID_FLAGS(func, bpf_rcu_read_unlock)
BTF_ID_FLAGS(func, bpf_dynptr_slice, KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_num_new)
+BTF_ID_FLAGS(func, bpf_iter_num_next, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_num_destroy)
BTF_SET8_END(common_btf_ids)
static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 58754929ee33..9671b4f354e9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -779,6 +779,8 @@ static const char *dynptr_type_str(enum bpf_dynptr_type type)
static const char *iter_type_str(enum bpf_iter_type type)
{
switch (type) {
+ case BPF_ITER_TYPE_NUM:
+ return "num";
case BPF_ITER_TYPE_INVALID:
return "<invalid>";
default:
@@ -1157,6 +1159,8 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg
static enum bpf_dynptr_type arg_to_iter_type(enum bpf_arg_type arg_type)
{
switch (arg_type & ITER_TYPE_FLAG_MASK) {
+ case ITER_TYPE_NUM:
+ return BPF_ITER_TYPE_NUM;
default:
return BPF_ITER_TYPE_INVALID;
}
@@ -9445,6 +9449,9 @@ enum special_kfunc_type {
KF_bpf_dynptr_from_xdp,
KF_bpf_dynptr_slice,
KF_bpf_dynptr_slice_rdwr,
+ KF_bpf_iter_num_new,
+ KF_bpf_iter_num_next,
+ KF_bpf_iter_num_destroy,
};
BTF_SET_START(special_kfunc_set)
@@ -9483,6 +9490,9 @@ BTF_ID(func, bpf_dynptr_from_skb)
BTF_ID(func, bpf_dynptr_from_xdp)
BTF_ID(func, bpf_dynptr_slice)
BTF_ID(func, bpf_dynptr_slice_rdwr)
+BTF_ID(func, bpf_iter_num_new)
+BTF_ID(func, bpf_iter_num_next)
+BTF_ID(func, bpf_iter_num_destroy)
static bool is_kfunc_bpf_rcu_read_lock(struct bpf_kfunc_call_arg_meta *meta)
{
@@ -9496,7 +9506,7 @@ static bool is_kfunc_bpf_rcu_read_unlock(struct bpf_kfunc_call_arg_meta *meta)
static bool is_iter_next_kfunc(int btf_id)
{
- return false;
+ return btf_id == special_kfunc_list[KF_bpf_iter_num_next];
}
static enum kfunc_ptr_arg_type
@@ -10278,7 +10288,17 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
if (is_kfunc_arg_uninit(btf, &args[i]))
iter_arg_type |= MEM_UNINIT;
- ret = process_iter_arg(env, regno, insn_idx, iter_arg_type, meta);
+ if (meta->func_id == special_kfunc_list[KF_bpf_iter_num_new] ||
+ meta->func_id == special_kfunc_list[KF_bpf_iter_num_next]) {
+ iter_arg_type |= ITER_TYPE_NUM;
+ } else if (meta->func_id == special_kfunc_list[KF_bpf_iter_num_destroy]) {
+ iter_arg_type |= ITER_TYPE_NUM | OBJ_RELEASE;
+ } else {
+ verbose(env, "verifier internal error: unrecognized iterator kfunc\n");
+ return -EFAULT;
+ }
+
+ ret = process_iter_arg(env, regno, insn_idx, iter_arg_type, meta);
if (ret < 0)
return ret;
break;
--
2.30.2
next prev parent reply other threads:[~2023-03-02 23:50 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-02 23:49 [PATCH bpf-next 00/17] BPF open-coded iterators Andrii Nakryiko
2023-03-02 23:49 ` [PATCH bpf-next 01/17] bpf: improve stack slot state printing Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 02/17] bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER} Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 03/17] selftests/bpf: enhance align selftest's expected log matching Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 04/17] bpf: honor env->test_state_freq flag in is_state_visited() Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 05/17] selftests/bpf: adjust log_fixup's buffer size for proper truncation Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 06/17] bpf: clean up visit_insn()'s instruction processing Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 07/17] bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 08/17] bpf: ensure that r0 is marked scratched after any function call Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 09/17] bpf: move kfunc_call_arg_meta higher in the file Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 10/17] bpf: mark PTR_TO_MEM as non-null register type Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 11/17] bpf: generalize dynptr_get_spi to be usable for iters Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 12/17] bpf: add support for fixed-size memory pointer returns for kfuncs Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 13/17] bpf: add support for open-coded iterator loops Andrii Nakryiko
2023-03-04 20:02 ` Alexei Starovoitov
2023-03-04 23:27 ` Andrii Nakryiko
2023-03-05 23:46 ` Alexei Starovoitov
2023-03-07 21:54 ` Andrii Nakryiko
2023-03-02 23:50 ` Andrii Nakryiko [this message]
2023-03-04 20:21 ` [PATCH bpf-next 14/17] bpf: implement number iterator Alexei Starovoitov
2023-03-04 23:27 ` Andrii Nakryiko
2023-03-05 23:49 ` Alexei Starovoitov
2023-03-02 23:50 ` [PATCH bpf-next 15/17] selftests/bpf: add bpf_for_each(), bpf_for(), and bpf_repeat() macros Andrii Nakryiko
2023-03-04 20:34 ` Alexei Starovoitov
2023-03-04 23:28 ` Andrii Nakryiko
2023-03-06 0:12 ` Alexei Starovoitov
2023-03-07 21:54 ` Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 16/17] selftests/bpf: add iterators tests Andrii Nakryiko
2023-03-04 20:39 ` Alexei Starovoitov
2023-03-04 23:29 ` Andrii Nakryiko
2023-03-06 0:14 ` Alexei Starovoitov
2023-03-04 21:09 ` Jiri Olsa
2023-03-04 23:29 ` Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 17/17] selftests/bpf: add number iterator tests Andrii Nakryiko
2023-03-04 19:30 ` [PATCH bpf-next 00/17] BPF open-coded iterators patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230302235015.2044271-15-andrii@kernel.org \
--to=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=kernel-team@fb.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox