* [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break
@ 2024-03-01 3:37 Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
` (5 more replies)
0 siblings, 6 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-01 3:37 UTC (permalink / raw)
To: bpf; +Cc: daniel, andrii, martin.lau, memxor, eddyz87, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
v2 -> v3: Major change
- drop bpf_can_loop() kfunc and introduce may_goto instruction instead
a kfunc is a function call, while may_goto doesn't consume any registers,
so LLVM can produce much better code due to lower register pressure.
- instead of counting from zero up to BPF_MAX_LOOPS, start from BPF_MAX_LOOPS
and break out of the loop when the count reaches zero
- use may_goto instruction in cond_break macro
- recognize that 'exact' state comparison doesn't need to be truly exact.
regsafe() should ignore precision and liveness marks, but range_within
logic is safe to use while evaluating open coded iterators.
Alexei Starovoitov (4):
bpf: Introduce may_goto instruction
bpf: Recognize that two registers are safe when their ranges match
bpf: Add cond_break macro
selftests/bpf: Test may_goto
include/linux/bpf_verifier.h | 2 +
include/uapi/linux/bpf.h | 1 +
kernel/bpf/core.c | 1 +
kernel/bpf/disasm.c | 3 +
kernel/bpf/verifier.c | 269 +++++++++++++-----
tools/include/uapi/linux/bpf.h | 1 +
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../testing/selftests/bpf/bpf_experimental.h | 12 +
.../bpf/progs/verifier_iterating_callbacks.c | 72 ++++-
9 files changed, 291 insertions(+), 71 deletions(-)
--
2.34.1
* [PATCH v3 bpf-next 1/4] bpf: Introduce may_goto instruction
2024-03-01 3:37 [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
@ 2024-03-01 3:37 ` Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match Alexei Starovoitov
` (4 subsequent siblings)
5 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-01 3:37 UTC (permalink / raw)
To: bpf; +Cc: daniel, andrii, martin.lau, memxor, eddyz87, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Introduce the may_goto instruction, which acts on a hidden bpf_iter_num, so that
bpf_iter_num_new() and bpf_iter_num_destroy() don't need to be called explicitly.
It can be used in any normal "for" or "while" loop, like
for (i = zero; i < cnt; cond_break, i++) {
The verifier recognizes that may_goto is used in the program,
reserves additional 8 bytes of stack, initializes them in subprog
prologue, and replaces may_goto instruction with:
aux_reg = *(u64 *)(fp - 40)
if aux_reg == 0 goto pc+off
aux_reg -= 1
*(u64 *)(fp - 40) = aux_reg
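The injected sequence amounts to a count-down loop budget. A rough plain-C model of how a program behaves once the rewrite is applied (a sketch, not the in-kernel implementation; run_with_may_goto and loop_budget are illustrative names):

```c
#include <assert.h>

#define BPF_MAX_LOOPS (8 * 1024 * 1024) /* 0x800000, as in the kernel */

/* Plain-C model of the code the verifier injects for may_goto:
 * the hidden fp-relative stack slot becomes 'loop_budget', initialized
 * in the subprog prologue, decremented on every may_goto execution,
 * and the branch to pc+off is taken once the budget reaches zero.
 */
static long run_with_may_goto(void)
{
	unsigned long loop_budget = BPF_MAX_LOOPS; /* prologue init */
	long cnt = 0;

	for (;;) {
		/* expansion of the may_goto instruction */
		if (loop_budget == 0)
			break;		/* goto pc+off */
		loop_budget -= 1;

		cnt++;			/* loop body */
	}
	return cnt;
}
```

In this model an otherwise infinite loop terminates after exactly BPF_MAX_LOOPS (0x800000) iterations, matching the __retval(0x800000) expectation of the cond_break4 selftest in patch 4.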
The may_goto instruction can be used by LLVM to implement __builtin_memcpy
and __builtin_strcmp.
may_goto is not a full substitute for the bpf_for() macro.
With bpf_for() the verifier doesn't see a precise induction variable,
so 'i' in bpf_for(i, 0, 100) is seen as imprecise and bounded.
But when the code is written as:
for (i = 0; i < 100; cond_break, i++)
the verifier sees 'i' as a precise constant zero,
hence cond_break (aka may_goto) doesn't help the loop converge.
A static or global variable can be used as a workaround:
static int zero = 0;
for (i = zero; i < 100; cond_break, i++) // works!
may_goto works well with arena pointers that don't need to be bounds-checked
on every iteration. Loads/stores from the arena return imprecise unbounded scalars.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
include/linux/bpf_verifier.h | 2 +
include/uapi/linux/bpf.h | 1 +
kernel/bpf/core.c | 1 +
kernel/bpf/disasm.c | 3 +
kernel/bpf/verifier.c | 235 +++++++++++++++++++++++++--------
tools/include/uapi/linux/bpf.h | 1 +
6 files changed, 189 insertions(+), 54 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 84365e6dd85d..8bd8bb32bb28 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -449,6 +449,7 @@ struct bpf_verifier_state {
u32 jmp_history_cnt;
u32 dfs_depth;
u32 callback_unroll_depth;
+ struct bpf_reg_state may_goto_reg;
};
#define bpf_get_spilled_reg(slot, frame, mask) \
@@ -619,6 +620,7 @@ struct bpf_subprog_info {
u32 start; /* insn idx of function entry point */
u32 linfo_idx; /* The idx to the main_prog->aux->linfo */
u16 stack_depth; /* max. stack depth used by this function */
+ u16 stack_extra;
bool has_tail_call: 1;
bool tail_call_reachable: 1;
bool has_ld_abs: 1;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d2e6c5fcec01..8cf86566ad6d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -42,6 +42,7 @@
#define BPF_JSGE 0x70 /* SGE is signed '>=', GE in x86 */
#define BPF_JSLT 0xc0 /* SLT is signed, '<' */
#define BPF_JSLE 0xd0 /* SLE is signed, '<=' */
+#define BPF_JMA 0xe0 /* may_goto */
#define BPF_CALL 0x80 /* function call */
#define BPF_EXIT 0x90 /* function return */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 71c459a51d9e..ba6101447b49 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1675,6 +1675,7 @@ bool bpf_opcode_in_insntable(u8 code)
[BPF_LD | BPF_IND | BPF_B] = true,
[BPF_LD | BPF_IND | BPF_H] = true,
[BPF_LD | BPF_IND | BPF_W] = true,
+ [BPF_JMP | BPF_JMA] = true,
};
#undef BPF_INSN_3_TBL
#undef BPF_INSN_2_TBL
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index 49940c26a227..598cd38af84c 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -322,6 +322,9 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
} else if (insn->code == (BPF_JMP | BPF_JA)) {
verbose(cbs->private_data, "(%02x) goto pc%+d\n",
insn->code, insn->off);
+ } else if (insn->code == (BPF_JMP | BPF_JMA)) {
+ verbose(cbs->private_data, "(%02x) may_goto pc%+d\n",
+ insn->code, insn->off);
} else if (insn->code == (BPF_JMP32 | BPF_JA)) {
verbose(cbs->private_data, "(%02x) gotol pc%+d\n",
insn->code, insn->imm);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1c34b91b9583..a50395872d58 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1441,6 +1441,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
if (err)
return err;
}
+ dst_state->may_goto_reg = src->may_goto_reg;
return 0;
}
@@ -7878,6 +7879,43 @@ static int widen_imprecise_scalars(struct bpf_verifier_env *env,
return 0;
}
+static bool is_may_goto_insn(struct bpf_verifier_env *env, int insn_idx)
+{
+ return env->prog->insnsi[insn_idx].code == (BPF_JMP | BPF_JMA);
+}
+
+static struct bpf_reg_state *get_iter_reg_meta(struct bpf_verifier_state *st,
+ struct bpf_kfunc_call_arg_meta *meta)
+{
+ int iter_frameno = meta->iter.frameno;
+ int iter_spi = meta->iter.spi;
+
+ return &st->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
+}
+
+static struct bpf_reg_state *get_iter_reg(struct bpf_verifier_env *env,
+ struct bpf_verifier_state *st, int insn_idx)
+{
+ struct bpf_reg_state *iter_reg;
+ struct bpf_func_state *frame;
+ int spi;
+
+ if (is_may_goto_insn(env, insn_idx))
+ return &st->may_goto_reg;
+
+ frame = st->frame[st->curframe];
+ /* btf_check_iter_kfuncs() enforces that
+ * iter state pointer is always the first arg
+ */
+ iter_reg = &frame->regs[BPF_REG_1];
+ /* current state is valid due to states_equal(),
+ * so we can assume valid iter and reg state,
+ * no need for extra (re-)validations
+ */
+ spi = __get_spi(iter_reg->off + (s32)iter_reg->var_off.value);
+ return &st->frame[iter_reg->frameno]->stack[spi].spilled_ptr;
+}
+
/* process_iter_next_call() is called when verifier gets to iterator's next
* "method" (e.g., bpf_iter_num_next() for numbers iterator) call. We'll refer
* to it as just "iter_next()" in comments below.
@@ -7957,17 +7995,18 @@ static int widen_imprecise_scalars(struct bpf_verifier_env *env,
* bpf_iter_num_destroy(&it);
*/
static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
- struct bpf_kfunc_call_arg_meta *meta)
+ struct bpf_kfunc_call_arg_meta *meta, bool may_goto)
{
struct bpf_verifier_state *cur_st = env->cur_state, *queued_st, *prev_st;
struct bpf_func_state *cur_fr = cur_st->frame[cur_st->curframe], *queued_fr;
struct bpf_reg_state *cur_iter, *queued_iter;
- int iter_frameno = meta->iter.frameno;
- int iter_spi = meta->iter.spi;
BTF_TYPE_EMIT(struct bpf_iter);
- cur_iter = &env->cur_state->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
+ if (may_goto)
+ cur_iter = &cur_st->may_goto_reg;
+ else
+ cur_iter = get_iter_reg_meta(cur_st, meta);
if (cur_iter->iter.state != BPF_ITER_STATE_ACTIVE &&
cur_iter->iter.state != BPF_ITER_STATE_DRAINED) {
@@ -7990,25 +8029,32 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
* right at this instruction.
*/
prev_st = find_prev_entry(env, cur_st->parent, insn_idx);
+
/* branch out active iter state */
queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
if (!queued_st)
return -ENOMEM;
- queued_iter = &queued_st->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
+ if (may_goto)
+ queued_iter = &queued_st->may_goto_reg;
+ else
+ queued_iter = get_iter_reg_meta(queued_st, meta);
queued_iter->iter.state = BPF_ITER_STATE_ACTIVE;
queued_iter->iter.depth++;
if (prev_st)
widen_imprecise_scalars(env, prev_st, queued_st);
- queued_fr = queued_st->frame[queued_st->curframe];
- mark_ptr_not_null_reg(&queued_fr->regs[BPF_REG_0]);
+ if (!may_goto) {
+ queued_fr = queued_st->frame[queued_st->curframe];
+ mark_ptr_not_null_reg(&queued_fr->regs[BPF_REG_0]);
+ }
}
/* switch to DRAINED state, but keep the depth unchanged */
/* mark current iter state as drained and assume returned NULL */
cur_iter->iter.state = BPF_ITER_STATE_DRAINED;
- __mark_reg_const_zero(env, &cur_fr->regs[BPF_REG_0]);
+ if (!may_goto)
+ __mark_reg_const_zero(env, &cur_fr->regs[BPF_REG_0]);
return 0;
}
@@ -12433,7 +12479,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
}
if (is_iter_next_kfunc(&meta)) {
- err = process_iter_next_call(env, insn_idx, &meta);
+ err = process_iter_next_call(env, insn_idx, &meta, false);
if (err)
return err;
}
@@ -14869,11 +14915,24 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
int err;
/* Only conditional jumps are expected to reach here. */
- if (opcode == BPF_JA || opcode > BPF_JSLE) {
+ if (opcode == BPF_JA || opcode > BPF_JMA) {
verbose(env, "invalid BPF_JMP/JMP32 opcode %x\n", opcode);
return -EINVAL;
}
+ if (opcode == BPF_JMA) {
+ if (insn->code != (BPF_JMP | BPF_JMA) ||
+ insn->src_reg || insn->dst_reg) {
+ verbose(env, "invalid may_goto\n");
+ return -EINVAL;
+ }
+ err = process_iter_next_call(env, *insn_idx, NULL, true);
+ if (err)
+ return err;
+ *insn_idx += insn->off;
+ return 0;
+ }
+
/* check src2 operand */
err = check_reg_arg(env, insn->dst_reg, SRC_OP);
if (err)
@@ -15657,6 +15716,8 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
default:
/* conditional jump with two edges */
mark_prune_point(env, t);
+ if (insn->code == (BPF_JMP | BPF_JMA))
+ mark_force_checkpoint(env, t);
ret = push_insn(t, t + 1, FALLTHROUGH, env);
if (ret)
@@ -16767,6 +16828,9 @@ static bool states_equal(struct bpf_verifier_env *env,
if (old->active_rcu_lock != cur->active_rcu_lock)
return false;
+ if (old->may_goto_reg.iter.state != cur->may_goto_reg.iter.state)
+ return false;
+
/* for states to be equal callsites have to be the same
* and all frame states need to be equivalent
*/
@@ -17005,6 +17069,9 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
struct bpf_func_state *state;
int i, fr;
+ if (old->may_goto_reg.iter.depth != cur->may_goto_reg.iter.depth)
+ return true;
+
for (fr = old->curframe; fr >= 0; fr--) {
state = old->frame[fr];
for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) {
@@ -17109,23 +17176,11 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
* comparison would discard current state with r7=-32
* => unsafe memory access at 11 would not be caught.
*/
- if (is_iter_next_insn(env, insn_idx)) {
+ if (is_iter_next_insn(env, insn_idx) || is_may_goto_insn(env, insn_idx)) {
if (states_equal(env, &sl->state, cur, true)) {
- struct bpf_func_state *cur_frame;
- struct bpf_reg_state *iter_state, *iter_reg;
- int spi;
+ struct bpf_reg_state *iter_state;
- cur_frame = cur->frame[cur->curframe];
- /* btf_check_iter_kfuncs() enforces that
- * iter state pointer is always the first arg
- */
- iter_reg = &cur_frame->regs[BPF_REG_1];
- /* current state is valid due to states_equal(),
- * so we can assume valid iter and reg state,
- * no need for extra (re-)validations
- */
- spi = __get_spi(iter_reg->off + iter_reg->var_off.value);
- iter_state = &func(env, iter_reg)->stack[spi].spilled_ptr;
+ iter_state = get_iter_reg(env, cur, insn_idx);
if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE) {
update_loop_entry(cur, &sl->state);
goto hit;
@@ -19406,7 +19461,10 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
struct bpf_insn insn_buf[16];
struct bpf_prog *new_prog;
struct bpf_map *map_ptr;
- int i, ret, cnt, delta = 0;
+ int i, ret, cnt, delta = 0, cur_subprog = 0;
+ struct bpf_subprog_info *subprogs = env->subprog_info;
+ u16 stack_depth = subprogs[cur_subprog].stack_depth;
+ u16 stack_depth_extra = 0;
if (env->seen_exception && !env->exception_callback_subprog) {
struct bpf_insn patch[] = {
@@ -19426,7 +19484,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
mark_subprog_exc_cb(env, env->exception_callback_subprog);
}
- for (i = 0; i < insn_cnt; i++, insn++) {
+ for (i = 0; i < insn_cnt;) {
/* Make divide-by-zero exceptions impossible. */
if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) ||
insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) ||
@@ -19465,7 +19523,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement LD_ABS and LD_IND with a rewrite, if supported by the program type. */
@@ -19485,7 +19543,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Rewrite pointer arithmetic to mitigate speculation attacks. */
@@ -19500,7 +19558,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
aux = &env->insn_aux_data[i + delta];
if (!aux->alu_state ||
aux->alu_state == BPF_ALU_NON_POINTER)
- continue;
+ goto next_insn;
isneg = aux->alu_state & BPF_ALU_NEG_VALUE;
issrc = (aux->alu_state & BPF_ALU_SANITIZE) ==
@@ -19538,19 +19596,39 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
+ }
+
+ if (insn->code == (BPF_JMP | BPF_JMA)) {
+ int stack_off = -stack_depth - 8;
+
+ stack_depth_extra = 8;
+ insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_AX, BPF_REG_10, stack_off);
+ insn_buf[1] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, insn->off + 2);
+ insn_buf[2] = BPF_ALU64_IMM(BPF_SUB, BPF_REG_AX, 1);
+ insn_buf[3] = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_AX, stack_off);
+ cnt = 4;
+
+ new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+ if (!new_prog)
+ return -ENOMEM;
+
+ delta += cnt - 1;
+ env->prog = prog = new_prog;
+ insn = new_prog->insnsi + i + delta;
+ goto next_insn;
}
if (insn->code != (BPF_JMP | BPF_CALL))
- continue;
+ goto next_insn;
if (insn->src_reg == BPF_PSEUDO_CALL)
- continue;
+ goto next_insn;
if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
if (ret)
return ret;
if (cnt == 0)
- continue;
+ goto next_insn;
new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
if (!new_prog)
@@ -19559,7 +19637,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
if (insn->imm == BPF_FUNC_get_route_realm)
@@ -19607,11 +19685,11 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
}
insn->imm = ret + 1;
- continue;
+ goto next_insn;
}
if (!bpf_map_ptr_unpriv(aux))
- continue;
+ goto next_insn;
/* instead of changing every JIT dealing with tail_call
* emit two extra insns:
@@ -19640,7 +19718,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
if (insn->imm == BPF_FUNC_timer_set_callback) {
@@ -19752,7 +19830,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
BUILD_BUG_ON(!__same_type(ops->map_lookup_elem,
@@ -19783,31 +19861,31 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
switch (insn->imm) {
case BPF_FUNC_map_lookup_elem:
insn->imm = BPF_CALL_IMM(ops->map_lookup_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_update_elem:
insn->imm = BPF_CALL_IMM(ops->map_update_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_delete_elem:
insn->imm = BPF_CALL_IMM(ops->map_delete_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_push_elem:
insn->imm = BPF_CALL_IMM(ops->map_push_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_pop_elem:
insn->imm = BPF_CALL_IMM(ops->map_pop_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_peek_elem:
insn->imm = BPF_CALL_IMM(ops->map_peek_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_redirect_map:
insn->imm = BPF_CALL_IMM(ops->map_redirect);
- continue;
+ goto next_insn;
case BPF_FUNC_for_each_map_elem:
insn->imm = BPF_CALL_IMM(ops->map_for_each_callback);
- continue;
+ goto next_insn;
case BPF_FUNC_map_lookup_percpu_elem:
insn->imm = BPF_CALL_IMM(ops->map_lookup_percpu_elem);
- continue;
+ goto next_insn;
}
goto patch_call_imm;
@@ -19835,7 +19913,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement bpf_get_func_arg inline. */
@@ -19860,7 +19938,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement bpf_get_func_ret inline. */
@@ -19888,7 +19966,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement get_func_arg_cnt inline. */
@@ -19903,7 +19981,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement bpf_get_func_ip inline. */
@@ -19918,7 +19996,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement bpf_kptr_xchg inline */
@@ -19936,7 +20014,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
patch_call_imm:
fn = env->ops->get_func_proto(insn->imm, env->prog);
@@ -19950,6 +20028,39 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
return -EFAULT;
}
insn->imm = fn->func - __bpf_call_base;
+next_insn:
+ if (subprogs[cur_subprog + 1].start == i + delta + 1) {
+ subprogs[cur_subprog].stack_depth += stack_depth_extra;
+ subprogs[cur_subprog].stack_extra = stack_depth_extra;
+ cur_subprog++;
+ stack_depth = subprogs[cur_subprog].stack_depth;
+ stack_depth_extra = 0;
+ }
+ i++; insn++;
+ }
+
+ env->prog->aux->stack_depth = subprogs[0].stack_depth;
+ for (i = 0; i < env->subprog_cnt; i++) {
+ int subprog_start = subprogs[i].start, j;
+ int stack_slots = subprogs[i].stack_extra / 8;
+
+ if (stack_slots >= ARRAY_SIZE(insn_buf)) {
+ verbose(env, "verifier bug: stack_extra is too large\n");
+ return -EFAULT;
+ }
+
+ /* Add insns to subprog prologue to zero init extra stack */
+ for (j = 0; j < stack_slots; j++)
+ insn_buf[j] = BPF_ST_MEM(BPF_DW, BPF_REG_FP,
+ -subprogs[i].stack_depth + j * 8, BPF_MAX_LOOPS);
+ if (j) {
+ insn_buf[j] = env->prog->insnsi[subprog_start];
+
+ new_prog = bpf_patch_insn_data(env, subprog_start, insn_buf, j + 1);
+ if (!new_prog)
+ return -ENOMEM;
+ env->prog = prog = new_prog;
+ }
}
/* Since poke tab is now finalized, publish aux to tracker. */
@@ -20140,6 +20251,21 @@ static void free_states(struct bpf_verifier_env *env)
}
}
+static void init_may_goto_reg(struct bpf_reg_state *st)
+{
+ __mark_reg_known_zero(st);
+ st->type = PTR_TO_STACK;
+ st->live |= REG_LIVE_WRITTEN;
+ st->ref_obj_id = 0;
+ st->iter.btf = NULL;
+ st->iter.btf_id = 0;
+ /* Init register state to sane values.
+ * Only iter.state and iter.depth are used during verification.
+ */
+ st->iter.state = BPF_ITER_STATE_ACTIVE;
+ st->iter.depth = 0;
+}
+
static int do_check_common(struct bpf_verifier_env *env, int subprog)
{
bool pop_log = !(env->log.level & BPF_LOG_LEVEL2);
@@ -20157,6 +20283,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
state->curframe = 0;
state->speculative = false;
state->branches = 1;
+ init_may_goto_reg(&state->may_goto_reg);
state->frame[0] = kzalloc(sizeof(struct bpf_func_state), GFP_KERNEL);
if (!state->frame[0]) {
kfree(state);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d2e6c5fcec01..8cf86566ad6d 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -42,6 +42,7 @@
#define BPF_JSGE 0x70 /* SGE is signed '>=', GE in x86 */
#define BPF_JSLT 0xc0 /* SLT is signed, '<' */
#define BPF_JSLE 0xd0 /* SLE is signed, '<=' */
+#define BPF_JMA 0xe0 /* may_goto */
#define BPF_CALL 0x80 /* function call */
#define BPF_EXIT 0x90 /* function return */
--
2.34.1
* [PATCH v3 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match
2024-03-01 3:37 [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
@ 2024-03-01 3:37 ` Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 3/4] bpf: Add cond_break macro Alexei Starovoitov
` (3 subsequent siblings)
5 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-01 3:37 UTC (permalink / raw)
To: bpf; +Cc: daniel, andrii, martin.lau, memxor, eddyz87, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
When open coded iterators, bpf_loop, or may_goto are used, the following two states
are equivalent and it is safe to prune the search:
cur state: fp-8_w=scalar(id=3,smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
old state: fp-8_rw=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
In other words "exact" state match should ignore liveness and precision marks,
since open coded iterator logic didn't complete their propagation,
but range_within logic that applies to scalars, ptr_to_mem, map_value, pkt_ptr
is safe to rely on.
Avoid doing such a comparison when the regular infinite loop detection logic is used,
otherwise the bounded loop logic will declare such an "infinite loop" a false
positive. One such example is not_an_inifinite_loop() in progs/verifier_loops1.c.
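The check itself is short. For illustration, a plain-C sketch of the unsigned half of the range_within() check (the kernel function compares the signed and 32-bit bounds the same way; reg_bounds is a simplified stand-in for struct bpf_reg_state), applied to the two states shown above:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified register state: only the unsigned 64-bit bounds.
 * The kernel's struct bpf_reg_state also carries signed and 32-bit
 * bounds, which range_within() compares in the same fashion.
 */
struct reg_bounds {
	unsigned long long umin_value;
	unsigned long long umax_value;
};

/* check %cur's range satisfies %old's, as in the patch */
static bool range_within(const struct reg_bounds *old,
			 const struct reg_bounds *cur)
{
	return old->umin_value <= cur->umin_value &&
	       old->umax_value >= cur->umax_value;
}
```

For the verifier log above, old has umin=1, umax=11 and cur has umin=2, umax=11, so cur's range lies within old's and the search can be pruned; the reverse direction does not hold.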
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
kernel/bpf/verifier.c | 32 +++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a50395872d58..f3b1ffc66ee6 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7830,6 +7830,11 @@ static struct bpf_verifier_state *find_prev_entry(struct bpf_verifier_env *env,
}
static void reset_idmap_scratch(struct bpf_verifier_env *env);
+enum exact_level {
+ NOT_EXACT,
+ EXACT,
+ RANGE_WITHIN
+};
static bool regs_exact(const struct bpf_reg_state *rold,
const struct bpf_reg_state *rcur,
struct bpf_idmap *idmap);
@@ -16281,8 +16286,8 @@ static int check_btf_info(struct bpf_verifier_env *env,
}
/* check %cur's range satisfies %old's */
-static bool range_within(struct bpf_reg_state *old,
- struct bpf_reg_state *cur)
+static bool range_within(const struct bpf_reg_state *old,
+ const struct bpf_reg_state *cur)
{
return old->umin_value <= cur->umin_value &&
old->umax_value >= cur->umax_value &&
@@ -16448,12 +16453,13 @@ static bool regs_exact(const struct bpf_reg_state *rold,
/* Returns true if (rold safe implies rcur safe) */
static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
- struct bpf_reg_state *rcur, struct bpf_idmap *idmap, bool exact)
+ struct bpf_reg_state *rcur, struct bpf_idmap *idmap,
+ enum exact_level exact)
{
- if (exact)
+ if (exact == EXACT)
return regs_exact(rold, rcur, idmap);
- if (!(rold->live & REG_LIVE_READ))
+ if (!(rold->live & REG_LIVE_READ) && exact != RANGE_WITHIN)
/* explored state didn't use this */
return true;
if (rold->type == NOT_INIT)
@@ -16495,7 +16501,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
check_scalar_ids(rold->id, rcur->id, idmap);
}
- if (!rold->precise)
+ if (!rold->precise && exact != RANGE_WITHIN)
return true;
/* Why check_ids() for scalar registers?
*
@@ -16606,7 +16612,7 @@ static struct bpf_reg_state *scalar_reg_for_stack(struct bpf_verifier_env *env,
}
static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
- struct bpf_func_state *cur, struct bpf_idmap *idmap, bool exact)
+ struct bpf_func_state *cur, struct bpf_idmap *idmap, enum exact_level exact)
{
int i, spi;
@@ -16770,7 +16776,7 @@ static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur,
* the current state will reach 'bpf_exit' instruction safely
*/
static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_state *old,
- struct bpf_func_state *cur, bool exact)
+ struct bpf_func_state *cur, enum exact_level exact)
{
int i;
@@ -16797,7 +16803,7 @@ static void reset_idmap_scratch(struct bpf_verifier_env *env)
static bool states_equal(struct bpf_verifier_env *env,
struct bpf_verifier_state *old,
struct bpf_verifier_state *cur,
- bool exact)
+ enum exact_level exact)
{
int i;
@@ -17177,7 +17183,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
* => unsafe memory access at 11 would not be caught.
*/
if (is_iter_next_insn(env, insn_idx) || is_may_goto_insn(env, insn_idx)) {
- if (states_equal(env, &sl->state, cur, true)) {
+ if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
struct bpf_reg_state *iter_state;
iter_state = get_iter_reg(env, cur, insn_idx);
@@ -17189,13 +17195,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
goto skip_inf_loop_check;
}
if (calls_callback(env, insn_idx)) {
- if (states_equal(env, &sl->state, cur, true))
+ if (states_equal(env, &sl->state, cur, RANGE_WITHIN))
goto hit;
goto skip_inf_loop_check;
}
/* attempt to detect infinite loop to avoid unnecessary doomed work */
if (states_maybe_looping(&sl->state, cur) &&
- states_equal(env, &sl->state, cur, true) &&
+ states_equal(env, &sl->state, cur, EXACT) &&
!iter_active_depths_differ(&sl->state, cur) &&
sl->state.callback_unroll_depth == cur->callback_unroll_depth) {
verbose_linfo(env, insn_idx, "; ");
@@ -17252,7 +17258,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
*/
loop_entry = get_loop_entry(&sl->state);
force_exact = loop_entry && loop_entry->branches > 0;
- if (states_equal(env, &sl->state, cur, force_exact)) {
+ if (states_equal(env, &sl->state, cur, force_exact ? EXACT : NOT_EXACT)) {
if (force_exact)
update_loop_entry(cur, loop_entry);
hit:
--
2.34.1
* [PATCH v3 bpf-next 3/4] bpf: Add cond_break macro
2024-03-01 3:37 [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match Alexei Starovoitov
@ 2024-03-01 3:37 ` Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto Alexei Starovoitov
` (2 subsequent siblings)
5 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-01 3:37 UTC (permalink / raw)
To: bpf; +Cc: daniel, andrii, martin.lau, memxor, eddyz87, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Use the may_goto instruction to implement the cond_break macro.
Ideally the macro should be written as:
asm volatile goto(".byte 0xe5;
.byte 0;
.short (%l[l_break] - . - 4) / 8;
.long 0;
but LLVM doesn't support a 2-byte PC-relative fixup yet.
Hence use
asm volatile goto(".byte 0xe5;
.byte 0;
.long (%l[l_break] - . - 4) / 8;
.short 0;
which produces correct asm on little-endian targets.
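As a cross-check of the encoding (a sketch using a local mirror of the uapi struct bpf_insn layout, not kernel code): the leading .byte 0xe5 is exactly BPF_JMP | BPF_JMA from patch 1, and the second zero byte covers the unused dst_reg/src_reg nibbles:

```c
#include <assert.h>
#include <stdint.h>

/* Opcode values as defined in include/uapi/linux/bpf.h,
 * plus the new BPF_JMA value added by patch 1 of this series.
 */
#define BPF_JMP 0x05
#define BPF_JMA 0xe0 /* may_goto */

/* Minimal local mirror of struct bpf_insn's layout, for illustration only */
struct insn {
	uint8_t code;		/* opcode: .byte 0xe5 in the macro */
	uint8_t dst_reg:4;	/* together with src_reg: .byte 0 */
	uint8_t src_reg:4;
	int16_t off;		/* jump offset, filled in by the relocation */
	int32_t imm;
};

/* may_goto with all-zero regs/off/imm, as cond_break emits it */
static const struct insn may_goto_insn = { .code = BPF_JMP | BPF_JMA };
```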
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/bpf_experimental.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index 0d749006d107..2d408d8b9b70 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -326,6 +326,18 @@ l_true: \
})
#endif
+#define cond_break \
+ ({ __label__ l_break, l_continue; \
+ asm volatile goto(".byte 0xe5; \
+ .byte 0; \
+ .long (%l[l_break] - . - 4) / 8; \
+ .short 0" \
+ :::: l_break); \
+ goto l_continue; \
+ l_break: break; \
+ l_continue:; \
+ })
+
#ifndef bpf_nop_mov
#define bpf_nop_mov(var) \
asm volatile("%[reg]=%[reg]"::[reg]"r"((short)var))
--
2.34.1
* [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto
2024-03-01 3:37 [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
` (2 preceding siblings ...)
2024-03-01 3:37 ` [PATCH v3 bpf-next 3/4] bpf: Add cond_break macro Alexei Starovoitov
@ 2024-03-01 3:37 ` Alexei Starovoitov
2024-03-01 19:47 ` John Fastabend
2024-03-01 21:22 ` Alexei Starovoitov
2024-03-01 5:24 ` [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break John Fastabend
2024-03-02 1:20 ` Eduard Zingerman
5 siblings, 2 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-01 3:37 UTC (permalink / raw)
To: bpf; +Cc: daniel, andrii, martin.lau, memxor, eddyz87, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Add tests for the may_goto instruction via the cond_break macro.
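As a sanity check on the __retval() values below (assuming each loop runs to completion, since every program executes far fewer than BPF_MAX_LOOPS may_goto checks), the first two expected sums can be reproduced in plain C:

```c
#include <assert.h>

#define ARR_SZ 1000000

/* cond_break1: two full passes over [0, ARR_SZ) with arr[] zero-initialized,
 * accumulated in an unsigned int, so the sum wraps mod 2^32.
 */
static unsigned int cond_break1_sum(void)
{
	unsigned int i, sum = 0;

	for (i = 0; i < ARR_SZ; i++)
		sum += i;
	for (i = 0; i < ARR_SZ; i++)
		sum += i;	/* + arr[i], which is all zeroes */
	return sum;
}

/* cond_break2: 1000x1000 nested sum of i + j */
static int cond_break2_sum(void)
{
	int i, j, sum = 0;

	for (i = 0; i < 1000; i++)
		for (j = 0; j < 1000; j++)
			sum += i + j;
	return sum;
}
```

These reproduce 0xd495cdc0 for cond_break1 and 999000000 for cond_break2; cond_break4 is the only test whose loop is actually cut short, after BPF_MAX_LOOPS iterations.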
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../bpf/progs/verifier_iterating_callbacks.c | 72 ++++++++++++++++++-
2 files changed, 70 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index 1a63996c0304..c6c31b960810 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -3,3 +3,4 @@
exceptions # JIT does not support calling kfunc bpf_throw (exceptions)
get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
+verifier_iter/cond_break
diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
index 5905e036e0ea..8476dc47623f 100644
--- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
+++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
@@ -1,8 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
-
-#include <linux/bpf.h>
-#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
+#include "bpf_experimental.h"
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
@@ -239,4 +237,72 @@ int bpf_loop_iter_limit_nested(void *unused)
return 1000 * a + b + c;
}
+#define ARR_SZ 1000000
+int zero;
+char arr[ARR_SZ];
+
+SEC("socket")
+__success __retval(0xd495cdc0)
+int cond_break1(const void *ctx)
+{
+ unsigned int i;
+ unsigned int sum = 0;
+
+ for (i = zero; i < ARR_SZ; cond_break, i++)
+ sum += i;
+ for (i = zero; i < ARR_SZ; i++) {
+ barrier_var(i);
+ sum += i + arr[i];
+ cond_break;
+ }
+
+ return sum;
+}
+
+SEC("socket")
+__success __retval(999000000)
+int cond_break2(const void *ctx)
+{
+ int i, j;
+ int sum = 0;
+
+ for (i = zero; i < 1000; cond_break, i++)
+ for (j = zero; j < 1000; j++) {
+ sum += i + j;
+ cond_break;
+ }
+
+ return sum;
+}
+
+static __noinline int loop(void)
+{
+ int i, sum = 0;
+
+ for (i = zero; i <= 1000000; i++, cond_break)
+ sum += i;
+
+ return sum;
+}
+
+SEC("socket")
+__success __retval(0x6a5a2920)
+int cond_break3(const void *ctx)
+{
+ return loop();
+}
+
+SEC("socket")
+__success __retval(0x800000) /* BPF_MAX_LOOPS */
+int cond_break4(const void *ctx)
+{
+ int cnt = 0;
+
+ for (;;) {
+ cond_break;
+ cnt++;
+ }
+ return cnt;
+}
+
char _license[] SEC("license") = "GPL";
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* RE: [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break
2024-03-01 3:37 [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
` (3 preceding siblings ...)
2024-03-01 3:37 ` [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto Alexei Starovoitov
@ 2024-03-01 5:24 ` John Fastabend
2024-03-02 1:20 ` Eduard Zingerman
5 siblings, 0 replies; 14+ messages in thread
From: John Fastabend @ 2024-03-01 5:24 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: daniel, andrii, martin.lau, memxor, eddyz87, kernel-team
Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> v2 -> v3: Major change
> - drop bpf_can_loop() kfunc and introduce may_goto instruction instead
> kfunc is a function call while may_goto doesn't consume any registers
> and LLVM can produce much better code due to less register pressure.
Nice, back to the original instruction idea for loops. I was walking
around thinking about this for the last day or so and had the same thought,
but you beat me to it.
The original troublesome part was jumps into the loop. But I'll read
on to see the solution.
> - instead of counting from zero to BPF_MAX_LOOPS start from it instead
> and break out of the loop when count reaches zero
> - use may_goto instruction in cond_break macro
> - recognize that 'exact' state comparison doesn't need to be truly exact.
> regsafe() should ignore precision and liveness marks, but range_within
> logic is safe to use while evaluating open coded iterators.
I will need to review the last bit; it is too dense for me to process right now.
I think this will be useful for lots of cases.
>
> Alexei Starovoitov (4):
> bpf: Introduce may_goto instruction
> bpf: Recognize that two registers are safe when their ranges match
> bpf: Add cond_break macro
> selftests/bpf: Test may_goto
>
> include/linux/bpf_verifier.h | 2 +
> include/uapi/linux/bpf.h | 1 +
> kernel/bpf/core.c | 1 +
> kernel/bpf/disasm.c | 3 +
> kernel/bpf/verifier.c | 269 +++++++++++++-----
> tools/include/uapi/linux/bpf.h | 1 +
> tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
> .../testing/selftests/bpf/bpf_experimental.h | 12 +
> .../bpf/progs/verifier_iterating_callbacks.c | 72 ++++-
> 9 files changed, 291 insertions(+), 71 deletions(-)
>
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto
2024-03-01 3:37 ` [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto Alexei Starovoitov
@ 2024-03-01 19:47 ` John Fastabend
2024-03-01 21:16 ` Alexei Starovoitov
2024-03-01 21:22 ` Alexei Starovoitov
1 sibling, 1 reply; 14+ messages in thread
From: John Fastabend @ 2024-03-01 19:47 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: daniel, andrii, martin.lau, memxor, eddyz87, kernel-team
Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> Add tests for may_goto instruction via cond_break macro.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
> tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
> .../bpf/progs/verifier_iterating_callbacks.c | 72 ++++++++++++++++++-
> 2 files changed, 70 insertions(+), 3 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> index 1a63996c0304..c6c31b960810 100644
> --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> @@ -3,3 +3,4 @@
> exceptions # JIT does not support calling kfunc bpf_throw (exceptions)
> get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
> stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
> +verifier_iter/cond_break
> diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> index 5905e036e0ea..8476dc47623f 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> @@ -1,8 +1,6 @@
> // SPDX-License-Identifier: GPL-2.0
> -
> -#include <linux/bpf.h>
> -#include <bpf/bpf_helpers.h>
> #include "bpf_misc.h"
> +#include "bpf_experimental.h"
>
> struct {
> __uint(type, BPF_MAP_TYPE_ARRAY);
> @@ -239,4 +237,72 @@ int bpf_loop_iter_limit_nested(void *unused)
> return 1000 * a + b + c;
> }
>
> +#define ARR_SZ 1000000
> +int zero;
> +char arr[ARR_SZ];
> +
> +SEC("socket")
> +__success __retval(0xd495cdc0)
> +int cond_break1(const void *ctx)
> +{
> + unsigned int i;
> + unsigned int sum = 0;
> +
> + for (i = zero; i < ARR_SZ; cond_break, i++)
> + sum += i;
> + for (i = zero; i < ARR_SZ; i++) {
> + barrier_var(i);
> + sum += i + arr[i];
> + cond_break;
> + }
> +
> + return sum;
> +}
> +
> +SEC("socket")
> +__success __retval(999000000)
> +int cond_break2(const void *ctx)
> +{
> + int i, j;
> + int sum = 0;
> +
> + for (i = zero; i < 1000; cond_break, i++)
> + for (j = zero; j < 1000; j++) {
> + sum += i + j;
> + cond_break;
> + }
> +
> + return sum;
> +}
> +
> +static __noinline int loop(void)
> +{
> + int i, sum = 0;
> +
> + for (i = zero; i <= 1000000; i++, cond_break)
> + sum += i;
> +
> + return sum;
> +}
> +
> +SEC("socket")
> +__success __retval(0x6a5a2920)
> +int cond_break3(const void *ctx)
> +{
> + return loop();
> +}
> +
> +SEC("socket")
> +__success __retval(0x800000) /* BPF_MAX_LOOPS */
> +int cond_break4(const void *ctx)
> +{
> + int cnt = 0;
> +
> + for (;;) {
> + cond_break;
> + cnt++;
> + }
> + return cnt;
> +}
I found this test illustrative: it shows how cond_break, which
to me "feels" like a global hidden iterator, appears not to
be reinitialized across calls?
static __noinline int full_loop(void)
{
int cnt = 0;
for (;;) {
cond_break;
cnt++;
}
for (;;) {
cond_break;
cnt++;
}
bpf_printk("cnt==%d\n", cnt);
return cnt;
}
SEC("socket")
__success __retval(16777216)
int cond_break5(const void *ctx)
{
int cnt = 0;
for (;;) {
cond_break;
cnt++;
}
cnt += full_loop();
for (;;) {
cond_break;
cnt++;
}
return cnt;
}
This fails with,
do_prog_test_run:PASS:bpf_prog_test_run 0 nsec
run_subtest:FAIL:654 Unexpected retval: 8388608 != 16777216
#430/15 verifier_iterating_callbacks/cond_break5:FAIL
#430 verifier_iterating_callbacks:FAIL
; cnt += full_loop();
118: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
120: b4 02 00 00 0d 00 00 00 w2 = 13
121: bc 73 00 00 00 00 00 00 w3 = w7
122: 85 00 00 00 06 00 00 00 call 6
;
I guess this is by design but I sort of expected each
call to have its own context. It does make some sense to
limit main and all calls to a max loop count so not
complaining. Maybe consider adding the test? I at least
thought it helped.
> +
> char _license[] SEC("license") = "GPL";
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto
2024-03-01 19:47 ` John Fastabend
@ 2024-03-01 21:16 ` Alexei Starovoitov
2024-03-01 21:47 ` John Fastabend
0 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-01 21:16 UTC (permalink / raw)
To: John Fastabend
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, Eddy Z, Kernel Team
On Fri, Mar 1, 2024 at 11:47 AM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Add tests for may_goto instruction via cond_break macro.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> > tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
> > .../bpf/progs/verifier_iterating_callbacks.c | 72 ++++++++++++++++++-
> > 2 files changed, 70 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> > index 1a63996c0304..c6c31b960810 100644
> > --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> > +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> > @@ -3,3 +3,4 @@
> > exceptions # JIT does not support calling kfunc bpf_throw (exceptions)
> > get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
> > stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
> > +verifier_iter/cond_break
> > diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > index 5905e036e0ea..8476dc47623f 100644
> > --- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > +++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > @@ -1,8 +1,6 @@
> > // SPDX-License-Identifier: GPL-2.0
> > -
> > -#include <linux/bpf.h>
> > -#include <bpf/bpf_helpers.h>
> > #include "bpf_misc.h"
> > +#include "bpf_experimental.h"
> >
> > struct {
> > __uint(type, BPF_MAP_TYPE_ARRAY);
> > @@ -239,4 +237,72 @@ int bpf_loop_iter_limit_nested(void *unused)
> > return 1000 * a + b + c;
> > }
> >
> > +#define ARR_SZ 1000000
> > +int zero;
> > +char arr[ARR_SZ];
> > +
> > +SEC("socket")
> > +__success __retval(0xd495cdc0)
> > +int cond_break1(const void *ctx)
> > +{
> > + unsigned int i;
> > + unsigned int sum = 0;
> > +
> > + for (i = zero; i < ARR_SZ; cond_break, i++)
> > + sum += i;
> > + for (i = zero; i < ARR_SZ; i++) {
> > + barrier_var(i);
> > + sum += i + arr[i];
> > + cond_break;
> > + }
> > +
> > + return sum;
> > +}
> > +
> > +SEC("socket")
> > +__success __retval(999000000)
> > +int cond_break2(const void *ctx)
> > +{
> > + int i, j;
> > + int sum = 0;
> > +
> > + for (i = zero; i < 1000; cond_break, i++)
> > + for (j = zero; j < 1000; j++) {
> > + sum += i + j;
> > + cond_break;
> > + }
> > +
> > + return sum;
> > +}
> > +
> > +static __noinline int loop(void)
> > +{
> > + int i, sum = 0;
> > +
> > + for (i = zero; i <= 1000000; i++, cond_break)
> > + sum += i;
> > +
> > + return sum;
> > +}
> > +
> > +SEC("socket")
> > +__success __retval(0x6a5a2920)
> > +int cond_break3(const void *ctx)
> > +{
> > + return loop();
> > +}
> > +
> > +SEC("socket")
> > +__success __retval(0x800000) /* BPF_MAX_LOOPS */
> > +int cond_break4(const void *ctx)
> > +{
> > + int cnt = 0;
> > +
> > + for (;;) {
> > + cond_break;
> > + cnt++;
> > + }
> > + return cnt;
> > +}
>
> I found this test illustrative: it shows how cond_break, which
ohh. I shouldn't have exposed this implementation detail
in the test. I'll adjust it in the next revision.
> to me "feels" like a global hidden iterator, appears not to
> be reinitialized across calls?
...
> I guess this is by design but I sort of expected each
> call to have its own context. It does make some sense to
> limit main and all calls to a max loop count so not
> complaining. Maybe consider adding the test? I at least
> thought it helped.
At the moment each subprog has its own hidden counter,
but we might have different limits per program type.
Like sleepable might be allowed to loop longer.
The actual limit of BPF_MAX_LOOPS is a random number.
The bpf prog shouldn't rely on any particular loop count.
Most likely we'll add a watchdog soon and will start cancelling
bpf progs that were on cpu for more than a second
regardless of number of iterations.
Arena faults will be causing loops to terminate too.
And so on.
In other words "cond_break" is a contract between
the verifier and the program. The verifier allows the
program to loop assuming it's behaving well,
but reserves the right to terminate it.
So bpf author can assume that cond_break is a nop
if their program is well formed.
The loops with discoverable iteration count like
for (i = 0; i < 1000; i++)
are not really a target use case for cond_break.
It's mainly for loops that may have unbounded looping,
but should terminate quickly when code is correct.
Like walking a linked list or strlen().
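That target use case can be modeled in plain C. This is only a sketch of the semantics: the 0x800000 BPF_MAX_LOOPS value comes from this series, while the explicit budget variable stands in for the hidden per-subprog counter the verifier injects.

```c
#include <assert.h>
#include <stddef.h>

#define BPF_MAX_LOOPS 0x800000	/* per-subprog budget in this series */

struct node { struct node *next; };

/* Walk a possibly unbounded list; the budget check models what
 * cond_break lowers to: decrement the hidden counter and force
 * the break once it reaches zero. */
static size_t guarded_list_len(const struct node *n)
{
	long budget = BPF_MAX_LOOPS;
	size_t len = 0;

	while (n) {
		if (budget-- <= 0)	/* models cond_break */
			break;
		len++;
		n = n->next;
	}
	return len;
}
```

A well-formed list terminates long before the budget runs out, so the check behaves like a nop; a corrupted (cyclic) list is cut off after BPF_MAX_LOOPS iterations instead of looping forever.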
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto
2024-03-01 3:37 ` [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto Alexei Starovoitov
2024-03-01 19:47 ` John Fastabend
@ 2024-03-01 21:22 ` Alexei Starovoitov
1 sibling, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-01 21:22 UTC (permalink / raw)
To: bpf
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, Eddy Z, Kernel Team
On Thu, Feb 29, 2024 at 7:37 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> +#define ARR_SZ 1000000
> +int zero;
> +char arr[ARR_SZ];
> +
> +SEC("socket")
> +__success __retval(0xd495cdc0)
> +int cond_break1(const void *ctx)
> +{
> + unsigned int i;
> + unsigned int sum = 0;
This is the reason for the CI -no_alu32 failure.
I'll fix it in the next revision with:
int cond_break1(const void *ctx)
{
- unsigned int i;
+ unsigned long i;
unsigned int sum = 0;
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto
2024-03-01 21:16 ` Alexei Starovoitov
@ 2024-03-01 21:47 ` John Fastabend
2024-03-01 22:06 ` John Fastabend
0 siblings, 1 reply; 14+ messages in thread
From: John Fastabend @ 2024-03-01 21:47 UTC (permalink / raw)
To: Alexei Starovoitov, John Fastabend
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, Eddy Z, Kernel Team
Alexei Starovoitov wrote:
> On Fri, Mar 1, 2024 at 11:47 AM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > Alexei Starovoitov wrote:
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > Add tests for may_goto instruction via cond_break macro.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > > tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
> > > .../bpf/progs/verifier_iterating_callbacks.c | 72 ++++++++++++++++++-
> > > 2 files changed, 70 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> > > index 1a63996c0304..c6c31b960810 100644
> > > --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> > > +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> > > @@ -3,3 +3,4 @@
> > > exceptions # JIT does not support calling kfunc bpf_throw (exceptions)
> > > get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
> > > stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
> > > +verifier_iter/cond_break
> > > diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > index 5905e036e0ea..8476dc47623f 100644
> > > --- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > +++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > @@ -1,8 +1,6 @@
> > > // SPDX-License-Identifier: GPL-2.0
> > > -
> > > -#include <linux/bpf.h>
> > > -#include <bpf/bpf_helpers.h>
> > > #include "bpf_misc.h"
> > > +#include "bpf_experimental.h"
> > >
> > > struct {
> > > __uint(type, BPF_MAP_TYPE_ARRAY);
> > > @@ -239,4 +237,72 @@ int bpf_loop_iter_limit_nested(void *unused)
> > > return 1000 * a + b + c;
> > > }
> > >
> > > +#define ARR_SZ 1000000
> > > +int zero;
> > > +char arr[ARR_SZ];
> > > +
> > > +SEC("socket")
> > > +__success __retval(0xd495cdc0)
> > > +int cond_break1(const void *ctx)
> > > +{
> > > + unsigned int i;
> > > + unsigned int sum = 0;
> > > +
> > > + for (i = zero; i < ARR_SZ; cond_break, i++)
> > > + sum += i;
> > > + for (i = zero; i < ARR_SZ; i++) {
> > > + barrier_var(i);
> > > + sum += i + arr[i];
> > > + cond_break;
> > > + }
> > > +
> > > + return sum;
> > > +}
> > > +
> > > +SEC("socket")
> > > +__success __retval(999000000)
> > > +int cond_break2(const void *ctx)
> > > +{
> > > + int i, j;
> > > + int sum = 0;
> > > +
> > > + for (i = zero; i < 1000; cond_break, i++)
> > > + for (j = zero; j < 1000; j++) {
> > > + sum += i + j;
> > > + cond_break;
> > > + }
> > > +
> > > + return sum;
> > > +}
> > > +
> > > +static __noinline int loop(void)
> > > +{
> > > + int i, sum = 0;
> > > +
> > > + for (i = zero; i <= 1000000; i++, cond_break)
> > > + sum += i;
> > > +
> > > + return sum;
> > > +}
> > > +
> > > +SEC("socket")
> > > +__success __retval(0x6a5a2920)
> > > +int cond_break3(const void *ctx)
> > > +{
> > > + return loop();
> > > +}
> > > +
> > > +SEC("socket")
> > > +__success __retval(0x800000) /* BPF_MAX_LOOPS */
> > > +int cond_break4(const void *ctx)
> > > +{
> > > + int cnt = 0;
> > > +
> > > + for (;;) {
> > > + cond_break;
> > > + cnt++;
> > > + }
> > > + return cnt;
> > > +}
> >
> > I found this test illustrative: it shows how cond_break, which
>
> ohh. I shouldn't have exposed this implementation detail
> in the test. I'll adjust it in the next revision.
>
> > to me "feels" like a global hidden iterator, appears not to
> > be reinitialized across calls?
> ...
> > I guess this is by design but I sort of expected each
> > call to have its own context. It does make some sense to
> > limit main and all calls to a max loop count so not
> > complaining. Maybe consider adding the test? I at least
> > thought it helped.
>
> At the moment each subprog has its own hidden counter,
Aha, that is how I read patch 1 as well. But I'm trying to follow
why I get two different answers here.
The below passes, all good: the total in break5 is 2x MAX_LOOPS, which
is what I expect from the above and from reading the patch. If I trace
the code I have two subprogs and each does the fixup,
insn_buf[j] = BPF_ST_MEM(BPF_DW, BPF_REG_FP,
-subprogs[i].stack_depth + j * 8, BPF_MAX_LOOPS);
This is the good one.
__noinline int full_loop(void)
{
int cnt = 0;
for (;;) {
cond_break;
cnt++;
}
for (;;) {
cond_break;
cnt++;
}
bpf_printk("cnt==%d\n", cnt);
return cnt;
}
SEC("socket")
__success __retval(16777216)
int cond_break5(const void *ctx)
{
int cnt = 0;
for (;;) {
cond_break;
cnt++;
}
cnt += full_loop();
for (;;) {
cond_break;
cnt++;
}
return cnt;
}
But adding static fails :( which I didn't expect. Is it obvious
why this is the case?
static __noinline int full_loop(void)
{
int cnt = 0;
for (;;) {
cond_break;
cnt++;
}
for (;;) {
cond_break;
cnt++;
}
bpf_printk("cnt==%d\n", cnt);
return cnt;
}
SEC("socket")
__success __retval(16777216)
int cond_break5(const void *ctx)
{
int cnt = 0;
for (;;) {
cond_break;
cnt++;
}
cnt += full_loop();
for (;;) {
cond_break;
cnt++;
}
return cnt;
}
From the verifier side the story is slightly different. There are still
two subprogs, but subprog[0] has stack_slots == 0? Debugging
now, but maybe it's obvious to you what that static is doing.
> but we might have different limits per program type.
> Like sleepable might be allowed to loop longer.
> The actual limit of BPF_MAX_LOOPS is a random number.
> The bpf prog shouldn't rely on any particular loop count.
> Most likely we'll add a watchdog soon and will start cancelling
> bpf progs that were on cpu for more than a second
> regardless of number of iterations.
> Arena faults will be causing loops to terminate too.
> And so on.
> In other words "cond_break" is a contract between
> the verifier and the program. The verifier allows the
> program to loop assuming it's behaving well,
> but reserves the right to terminate it.
> So bpf author can assume that cond_break is a nop
> if their program is well formed.
> The loops with discoverable iteration count like
> for (i = 0; i < 1000; i++)
> are not really a target use case for cond_break.
> It's mainly for loops that may have unbounded looping,
> but should terminate quickly when code is correct.
> Like walking a linked list or strlen().
Yep, we do this a lot and just create some artificial upper
bound, so this is nicer for sure. Lots of Tetragon code reads
for (i = 0; i < MAX_LOOP; i++) {
do_stuff
if (exit_cond)
break;
}
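A sketch of how that pattern might look with cond_break instead of an artificial bound. The cond_break here is a plain-C stand-in (a decrementing budget) so the sketch is runnable; the real macro from patch 3 emits a may_goto instruction and needs no visible counter.

```c
#include <assert.h>

#define MODEL_MAX_LOOPS 0x800000	/* stand-in for BPF_MAX_LOOPS */

/* Illustrative model only: the verifier materializes the counter
 * on the stack, so real BPF code never declares "budget". */
#define cond_break if (budget-- <= 0) break

static int walk_until(int exit_at)
{
	long budget = MODEL_MAX_LOOPS;
	int i = 0;

	for (;;) {			/* no artificial upper bound */
		cond_break;
		if (i == exit_at)	/* exit_cond */
			break;
		i++;			/* do_stuff */
	}
	return i;
}
```

When exit_cond fires first, the budget check is effectively a nop; when it never fires, the loop is still bounded.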
.John
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto
2024-03-01 21:47 ` John Fastabend
@ 2024-03-01 22:06 ` John Fastabend
2024-03-01 22:12 ` Alexei Starovoitov
0 siblings, 1 reply; 14+ messages in thread
From: John Fastabend @ 2024-03-01 22:06 UTC (permalink / raw)
To: John Fastabend, Alexei Starovoitov, John Fastabend
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, Eddy Z, Kernel Team
John Fastabend wrote:
> Alexei Starovoitov wrote:
> > On Fri, Mar 1, 2024 at 11:47 AM John Fastabend <john.fastabend@gmail.com> wrote:
> > >
> > > Alexei Starovoitov wrote:
> > > > From: Alexei Starovoitov <ast@kernel.org>
> > > >
> > > > Add tests for may_goto instruction via cond_break macro.
> > > >
> > > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > > ---
> > > > tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
> > > > .../bpf/progs/verifier_iterating_callbacks.c | 72 ++++++++++++++++++-
> > > > 2 files changed, 70 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> > > > index 1a63996c0304..c6c31b960810 100644
> > > > --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> > > > +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> > > > @@ -3,3 +3,4 @@
> > > > exceptions # JIT does not support calling kfunc bpf_throw (exceptions)
> > > > get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
> > > > stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
> > > > +verifier_iter/cond_break
> > > > diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > > index 5905e036e0ea..8476dc47623f 100644
> > > > --- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > > +++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
> > > > @@ -1,8 +1,6 @@
> > > > // SPDX-License-Identifier: GPL-2.0
> > > > -
> > > > -#include <linux/bpf.h>
> > > > -#include <bpf/bpf_helpers.h>
> > > > #include "bpf_misc.h"
> > > > +#include "bpf_experimental.h"
> > > >
> > > > struct {
> > > > __uint(type, BPF_MAP_TYPE_ARRAY);
> > > > @@ -239,4 +237,72 @@ int bpf_loop_iter_limit_nested(void *unused)
> > > > return 1000 * a + b + c;
> > > > }
> > > >
> > > > +#define ARR_SZ 1000000
> > > > +int zero;
> > > > +char arr[ARR_SZ];
> > > > +
> > > > +SEC("socket")
> > > > +__success __retval(0xd495cdc0)
> > > > +int cond_break1(const void *ctx)
> > > > +{
> > > > + unsigned int i;
> > > > + unsigned int sum = 0;
> > > > +
> > > > + for (i = zero; i < ARR_SZ; cond_break, i++)
> > > > + sum += i;
> > > > + for (i = zero; i < ARR_SZ; i++) {
> > > > + barrier_var(i);
> > > > + sum += i + arr[i];
> > > > + cond_break;
> > > > + }
> > > > +
> > > > + return sum;
> > > > +}
> > > > +
> > > > +SEC("socket")
> > > > +__success __retval(999000000)
> > > > +int cond_break2(const void *ctx)
> > > > +{
> > > > + int i, j;
> > > > + int sum = 0;
> > > > +
> > > > + for (i = zero; i < 1000; cond_break, i++)
> > > > + for (j = zero; j < 1000; j++) {
> > > > + sum += i + j;
> > > > + cond_break;
> > > > + }
> > > > +
> > > > + return sum;
> > > > +}
> > > > +
> > > > +static __noinline int loop(void)
> > > > +{
> > > > + int i, sum = 0;
> > > > +
> > > > + for (i = zero; i <= 1000000; i++, cond_break)
> > > > + sum += i;
> > > > +
> > > > + return sum;
> > > > +}
> > > > +
> > > > +SEC("socket")
> > > > +__success __retval(0x6a5a2920)
> > > > +int cond_break3(const void *ctx)
> > > > +{
> > > > + return loop();
> > > > +}
> > > > +
> > > > +SEC("socket")
> > > > +__success __retval(0x800000) /* BPF_MAX_LOOPS */
> > > > +int cond_break4(const void *ctx)
> > > > +{
> > > > + int cnt = 0;
> > > > +
> > > > + for (;;) {
> > > > + cond_break;
> > > > + cnt++;
> > > > + }
> > > > + return cnt;
> > > > +}
> > >
> > > I found this test illustrative: it shows how cond_break, which
> >
> > ohh. I shouldn't have exposed this implementation detail
> > in the test. I'll adjust it in the next revision.
> >
> > > to me "feels" like a global hidden iterator, appears not to
> > > be reinitialized across calls?
> > ...
> > > I guess this is by design but I sort of expected each
> > > call to have its own context. It does make some sense to
> > > limit main and all calls to a max loop count so not
> > > complaining. Maybe consider adding the test? I at least
> > > thought it helped.
> >
> > At the moment each subprog has its own hidden counter,
>
> Aha, that is how I read patch 1 as well. But I'm trying to follow
> why I get two different answers here.
>
> The below passes, all good: the total in break5 is 2x MAX_LOOPS, which
> is what I expect from the above and from reading the patch. If I trace
> the code I have two subprogs and each does the fixup,
>
> insn_buf[j] = BPF_ST_MEM(BPF_DW, BPF_REG_FP,
> -subprogs[i].stack_depth + j * 8, BPF_MAX_LOOPS);
>
> This is the good one.
>
> __noinline int full_loop(void)
> {
> int cnt = 0;
>
> for (;;) {
> cond_break;
> cnt++;
> }
>
> for (;;) {
> cond_break;
> cnt++;
> }
>
> bpf_printk("cnt==%d\n", cnt);
> return cnt;
> }
>
> SEC("socket")
> __success __retval(16777216)
> int cond_break5(const void *ctx)
> {
> int cnt = 0;
>
> for (;;) {
> cond_break;
> cnt++;
> }
>
> cnt += full_loop();
>
> for (;;) {
> cond_break;
> cnt++;
> }
> return cnt;
> }
>
> But adding static fails :( which I didn't expect. Is it obvious
> why this is the case?
>
> static __noinline int full_loop(void)
> {
> int cnt = 0;
>
> for (;;) {
> cond_break;
> cnt++;
> }
>
> for (;;) {
> cond_break;
> cnt++;
> }
>
> bpf_printk("cnt==%d\n", cnt);
> return cnt;
> }
>
> SEC("socket")
> __success __retval(16777216)
> int cond_break5(const void *ctx)
> {
> int cnt = 0;
>
> for (;;) {
> cond_break;
> cnt++;
> }
>
> cnt += full_loop();
>
> for (;;) {
> cond_break;
> cnt++;
> }
> return cnt;
> }
>
> From the verifier side the story is slightly different. There are still
> two subprogs, but subprog[0] has stack_slots == 0? Debugging
> now, but maybe it's obvious to you what that static is doing.
That was a typo: it's subprog[1] with stack_slots == 0. Also,
tracing insns, it seems in the non-static case we hit multiple
insn->code (BPF_JMP | BPF_JMA) but in the static case we only
find the first one. The object file seems to have multiples
though. I need to drop off for the rest of the afternoon most
likely, but will try to see what sort of silly thing I did
later today or worst case Monday.
> > but we might have different limits per program type.
> > Like sleepable might be allowed to loop longer.
> > The actual limit of BPF_MAX_LOOPS is a random number.
> > The bpf prog shouldn't rely on any particular loop count.
> > Most likely we'll add a watchdog soon and will start cancelling
> > bpf progs that were on cpu for more than a second
> > regardless of number of iterations.
> > Arena faults will be causing loops to terminate too.
> > And so on.
> > In other words "cond_break" is a contract between
> > the verifier and the program. The verifier allows the
> > program to loop assuming it's behaving well,
> > but reserves the right to terminate it.
> > So bpf author can assume that cond_break is a nop
> > if their program is well formed.
> > The loops with discoverable iteration count like
> > for (i = 0; i < 1000; i++)
> > are not really a target use case for cond_break.
> > It's mainly for loops that may have unbounded looping,
> > but should terminate quickly when code is correct.
> > Like walking a linked list or strlen().
>
> Yep, we do this a lot and just create some artificial upper
> bound, so this is nicer for sure. Lots of Tetragon code reads
>
>
> for (i = 0; i < MAX_LOOP; i++) {
> do_stuff
> if (exit_cond)
> break;
> }
>
> .John
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto
2024-03-01 22:06 ` John Fastabend
@ 2024-03-01 22:12 ` Alexei Starovoitov
0 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-01 22:12 UTC (permalink / raw)
To: John Fastabend
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, Eddy Z, Kernel Team
On Fri, Mar 1, 2024 at 2:07 PM John Fastabend <john.fastabend@gmail.com> wrote:
> >
> > SEC("socket")
> > __success __retval(16777216)
> > int cond_break5(const void *ctx)
> > {
> > int cnt = 0;
> >
> > for (;;) {
> > cond_break;
> > cnt++;
> > }
> >
> > cnt += full_loop();
> >
> > for (;;) {
> > cond_break;
> > cnt++;
> > }
> > return cnt;
> > }
> >
> > From verifier side story is slightly different. There are still
> > two subprogs, but for subprog[0] has stack_slots==0? Debugging
> > now but maybe its obvious what that static is doing to you.
>
> That was a typo its subprog[1] with stack_slots == 0. Also
> tracing insn it seems in nonstatic case we hit multiple
> insn->code (BPF_JMP| BPF_JMA) but in the static case only
> find the first one. Object file seems to have multiples
> though. I need to drop for the rest of the afternoon most
> likely, but will try to see what sort of silly thing I did
> later today or worse case Monday.
Thanks for the bug report.
For the static case:
$ bpftool p dump xlated id 36
int cond_break5(const void * ctx):
; int cond_break5(const void *ctx)
0: (7a) *(u64 *)(r10 -8) = 8388608
1: (b4) w6 = 0
; cond_break;
2: (79) r11 = *(u64 *)(r10 -8)
3: (15) if r11 == 0x0 goto pc+4
4: (17) r11 -= 1
5: (7b) *(u64 *)(r10 -8) = r11
; cnt++;
6: (04) w6 += 1
7: (05) goto pc-6
; cnt += full_loop();
8: (85) call pc+2#bpf_prog_270866f75dae27c8_full_loop
; for (;;) {
9: (0c) w0 += w6
; return cnt;
10: (95) exit
int full_loop():
; static __noinline int full_loop(void)
11: (b4) w6 = 0
; bpf_printk("cnt==%d\n", cnt);
12: (18) r1 = map[id:35][0]+0
14: (b4) w2 = 9
15: (bc) w3 = w6
16: (85) call bpf_trace_printk#-87376
; return cnt;
17: (bc) w0 = w6
18: (95) exit
Looks like I made a mistake in may_goto verification.
Only the first loop remains; the other loops were removed as dead code.
It's certainly a bug in patch 1. Will fix in the next revision.
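For reference, the intended semantics (one counter per subprog, initialized once at entry and shared by every cond_break site in that subprog) can be modeled in plain C. The 0x800000 constant is BPF_MAX_LOOPS from patch 1; the explicit budget variables are illustrative stand-ins for the hidden stack slot.

```c
#include <assert.h>

#define BUDGET 0x800000L	/* BPF_MAX_LOOPS */

/* Both loops below share the function's single counter, so the
 * second loop finds it already drained. */
static long model_full_loop(void)
{
	long budget = BUDGET, cnt = 0;

	for (;;) {
		if (budget-- <= 0)	/* cond_break, first loop */
			break;
		cnt++;
	}
	for (;;) {
		if (budget-- <= 0)	/* cond_break, budget already gone */
			break;
		cnt++;
	}
	return cnt;			/* BUDGET, not 2 * BUDGET */
}

static long model_cond_break5(void)
{
	long budget = BUDGET, cnt = 0;

	for (;;) {
		if (budget-- <= 0)
			break;
		cnt++;
	}
	cnt += model_full_loop();	/* callee has its own counter */
	for (;;) {
		if (budget-- <= 0)	/* caller's counter is drained */
			break;
		cnt++;
	}
	return cnt;			/* 2 * BUDGET == 16777216 */
}
```

This reproduces the 16777216 retval John expected for cond_break5, i.e. one full budget for the main prog plus one for the subprog.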
pw-bot: cr
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break
2024-03-01 3:37 [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
` (4 preceding siblings ...)
2024-03-01 5:24 ` [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break John Fastabend
@ 2024-03-02 1:20 ` Eduard Zingerman
2024-03-02 1:28 ` Alexei Starovoitov
5 siblings, 1 reply; 14+ messages in thread
From: Eduard Zingerman @ 2024-03-02 1:20 UTC (permalink / raw)
To: Alexei Starovoitov, bpf; +Cc: daniel, andrii, martin.lau, memxor, kernel-team
On Thu, 2024-02-29 at 19:37 -0800, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> v2 -> v3: Major change
> - drop bpf_can_loop() kfunc and introduce may_goto instruction instead
> kfunc is a function call while may_goto doesn't consume any registers
> and LLVM can produce much better code due to less register pressure.
> - instead of counting from zero to BPF_MAX_LOOPS start from it instead
> and break out of the loop when count reaches zero
> - use may_goto instruction in cond_break macro
> - recognize that 'exact' state comparison doesn't need to be truly exact.
> regsafe() should ignore precision and liveness marks, but range_within
> logic is safe to use while evaluating open coded iterators.
Sorry for the delay, I will look through this patch-set over the weekend.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break
2024-03-02 1:20 ` Eduard Zingerman
@ 2024-03-02 1:28 ` Alexei Starovoitov
0 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-02 1:28 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, Kernel Team
On Fri, Mar 1, 2024 at 5:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-02-29 at 19:37 -0800, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > v2 -> v3: Major change
> > - drop bpf_can_loop() kfunc and introduce may_goto instruction instead
> > kfunc is a function call while may_goto doesn't consume any registers
> > and LLVM can produce much better code due to less register pressure.
> > - instead of counting from zero to BPF_MAX_LOOPS start from it instead
> > and break out of the loop when count reaches zero
> > - use may_goto instruction in cond_break macro
> > - recognize that 'exact' state comparison doesn't need to be truly exact.
> > regsafe() should ignore precision and liveness marks, but range_within
> > logic is safe to use while evaluating open coded iterators.
>
> Sorry for the delay, I will look through this patch-set over the weekend.
I fixed the drain issue reported by John,
fixed no_alu32, and will resubmit soon.
Ignore this set.
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-03-02 1:28 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-01 3:37 [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 3/4] bpf: Add cond_break macro Alexei Starovoitov
2024-03-01 3:37 ` [PATCH v3 bpf-next 4/4] selftests/bpf: Test may_goto Alexei Starovoitov
2024-03-01 19:47 ` John Fastabend
2024-03-01 21:16 ` Alexei Starovoitov
2024-03-01 21:47 ` John Fastabend
2024-03-01 22:06 ` John Fastabend
2024-03-01 22:12 ` Alexei Starovoitov
2024-03-01 21:22 ` Alexei Starovoitov
2024-03-01 5:24 ` [PATCH v3 bpf-next 0/4] bpf: Introduce may_goto and cond_break John Fastabend
2024-03-02 1:20 ` Eduard Zingerman
2024-03-02 1:28 ` Alexei Starovoitov