* [PATCH v4 bpf-next 0/4] bpf: Introduce may_goto and cond_break
@ 2024-03-02 2:00 Alexei Starovoitov
From: Alexei Starovoitov @ 2024-03-02 2:00 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
kernel-team
From: Alexei Starovoitov <ast@kernel.org>
v3 -> v4:
- fix drained issue reported by John.
may_goto insn could be implemented with sticky state (once it
reaches 0 it stays 0), but the verifier shouldn't assume that.
It has to explore both branches.
Arguably the drained iterator state shouldn't be there at all.
bpf_iter_css_next() is not sticky. It can be fixed, but that would require
auditing all iterators for stickiness. That's an orthogonal discussion.
- explained JMA name reasons in patch 1
- fixed test_progs-no_alu32 issue and added another test
v2 -> v3: Major change
- drop bpf_can_loop() kfunc and introduce may_goto instruction instead
A kfunc is a function call, while may_goto doesn't consume any registers,
so LLVM can produce much better code due to lower register pressure.
- instead of counting up from zero to BPF_MAX_LOOPS, start from BPF_MAX_LOOPS
and break out of the loop when the count reaches zero
- use may_goto instruction in cond_break macro
- recognize that 'exact' state comparison doesn't need to be truly exact.
regsafe() should ignore precision and liveness marks, but range_within
logic is safe to use while evaluating open coded iterators.
Alexei Starovoitov (4):
bpf: Introduce may_goto instruction
bpf: Recognize that two registers are safe when their ranges match
bpf: Add cond_break macro
selftests/bpf: Test may_goto
include/linux/bpf_verifier.h | 2 +
include/uapi/linux/bpf.h | 1 +
kernel/bpf/core.c | 1 +
kernel/bpf/disasm.c | 3 +
kernel/bpf/verifier.c | 280 +++++++++++++-----
tools/include/uapi/linux/bpf.h | 1 +
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../testing/selftests/bpf/bpf_experimental.h | 12 +
.../bpf/progs/verifier_iterating_callbacks.c | 103 ++++++-
9 files changed, 330 insertions(+), 74 deletions(-)
--
2.34.1
* [PATCH v4 bpf-next 1/4] bpf: Introduce may_goto instruction
From: Alexei Starovoitov @ 2024-03-02 2:00 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Introduce may_goto instruction that acts on a hidden bpf_iter_num, so that
bpf_iter_num_new(), bpf_iter_num_destroy() don't need to be called explicitly.
It can be used in any normal "for" or "while" loop, like
for (i = zero; i < cnt; cond_break, i++) {
The verifier recognizes that may_goto is used in the program,
reserves additional 8 bytes of stack, initializes them in subprog
prologue, and replaces may_goto instruction with:
aux_reg = *(u64 *)(fp - 40)
if aux_reg == 0 goto pc+off
aux_reg -= 1
*(u64 *)(fp - 40) = aux_reg
may_goto instruction can be used by LLVM to implement __builtin_memcpy,
__builtin_strcmp.
may_goto is not a full substitute for bpf_for() macro.
bpf_for() doesn't have an induction variable that the verifier sees,
so 'i' in bpf_for(i, 0, 100) is seen as imprecise and bounded.
But when the code is written as:
for (i = 0; i < 100; cond_break, i++)
the verifier sees 'i' as a precise constant zero,
hence cond_break (aka may_goto) doesn't help to converge the loop.
A static or global variable can be used as a workaround:
static int zero = 0;
for (i = zero; i < 100; cond_break, i++) // works!
may_goto works well with arena pointers that don't need to be bounds-checked
on every iteration. Load/store from arena returns imprecise unbounded scalars.
Reserve new opcode BPF_JMP | BPF_JMA for may_goto insn.
JMA stands for "jump maybe"; it can also be read as "jump multipurpose" or
"jump multi always".
Since a goto_or_nop insn has also been proposed, it may use the same opcode.
may_goto vs goto_or_nop can be distinguished by src_reg:
code = BPF_JMP | BPF_JMA:
src_reg = 0 - may_goto
src_reg = 1 - goto_or_nop
We could have reused BPF_JMP | BPF_JA like:
src_reg = 0 - normal goto
src_reg = 1 - may_goto
src_reg = 2 - goto_or_nop
but JA is a real insn and it's unconditional, while may_goto and goto_or_nop
are pseudo instructions, and both are conditional. It's better to give them
a distinct opcode, hence BPF_JMA.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
include/linux/bpf_verifier.h | 2 +
include/uapi/linux/bpf.h | 1 +
kernel/bpf/core.c | 1 +
kernel/bpf/disasm.c | 3 +
kernel/bpf/verifier.c | 246 +++++++++++++++++++++++++--------
tools/include/uapi/linux/bpf.h | 1 +
6 files changed, 197 insertions(+), 57 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 84365e6dd85d..8bd8bb32bb28 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -449,6 +449,7 @@ struct bpf_verifier_state {
u32 jmp_history_cnt;
u32 dfs_depth;
u32 callback_unroll_depth;
+ struct bpf_reg_state may_goto_reg;
};
#define bpf_get_spilled_reg(slot, frame, mask) \
@@ -619,6 +620,7 @@ struct bpf_subprog_info {
u32 start; /* insn idx of function entry point */
u32 linfo_idx; /* The idx to the main_prog->aux->linfo */
u16 stack_depth; /* max. stack depth used by this function */
+ u16 stack_extra;
bool has_tail_call: 1;
bool tail_call_reachable: 1;
bool has_ld_abs: 1;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d2e6c5fcec01..8cf86566ad6d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -42,6 +42,7 @@
#define BPF_JSGE 0x70 /* SGE is signed '>=', GE in x86 */
#define BPF_JSLT 0xc0 /* SLT is signed, '<' */
#define BPF_JSLE 0xd0 /* SLE is signed, '<=' */
+#define BPF_JMA 0xe0 /* may_goto */
#define BPF_CALL 0x80 /* function call */
#define BPF_EXIT 0x90 /* function return */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 71c459a51d9e..ba6101447b49 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1675,6 +1675,7 @@ bool bpf_opcode_in_insntable(u8 code)
[BPF_LD | BPF_IND | BPF_B] = true,
[BPF_LD | BPF_IND | BPF_H] = true,
[BPF_LD | BPF_IND | BPF_W] = true,
+ [BPF_JMP | BPF_JMA] = true,
};
#undef BPF_INSN_3_TBL
#undef BPF_INSN_2_TBL
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index 49940c26a227..598cd38af84c 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -322,6 +322,9 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
} else if (insn->code == (BPF_JMP | BPF_JA)) {
verbose(cbs->private_data, "(%02x) goto pc%+d\n",
insn->code, insn->off);
+ } else if (insn->code == (BPF_JMP | BPF_JMA)) {
+ verbose(cbs->private_data, "(%02x) may_goto pc%+d\n",
+ insn->code, insn->off);
} else if (insn->code == (BPF_JMP32 | BPF_JA)) {
verbose(cbs->private_data, "(%02x) gotol pc%+d\n",
insn->code, insn->imm);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1c34b91b9583..63ef6a38726e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1441,6 +1441,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
if (err)
return err;
}
+ dst_state->may_goto_reg = src->may_goto_reg;
return 0;
}
@@ -7878,6 +7879,43 @@ static int widen_imprecise_scalars(struct bpf_verifier_env *env,
return 0;
}
+static bool is_may_goto_insn(struct bpf_verifier_env *env, int insn_idx)
+{
+ return env->prog->insnsi[insn_idx].code == (BPF_JMP | BPF_JMA);
+}
+
+static struct bpf_reg_state *get_iter_reg_meta(struct bpf_verifier_state *st,
+ struct bpf_kfunc_call_arg_meta *meta)
+{
+ int iter_frameno = meta->iter.frameno;
+ int iter_spi = meta->iter.spi;
+
+ return &st->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
+}
+
+static struct bpf_reg_state *get_iter_reg(struct bpf_verifier_env *env,
+ struct bpf_verifier_state *st, int insn_idx)
+{
+ struct bpf_reg_state *iter_reg;
+ struct bpf_func_state *frame;
+ int spi;
+
+ if (is_may_goto_insn(env, insn_idx))
+ return &st->may_goto_reg;
+
+ frame = st->frame[st->curframe];
+ /* btf_check_iter_kfuncs() enforces that
+ * iter state pointer is always the first arg
+ */
+ iter_reg = &frame->regs[BPF_REG_1];
+ /* current state is valid due to states_equal(),
+ * so we can assume valid iter and reg state,
+ * no need for extra (re-)validations
+ */
+ spi = __get_spi(iter_reg->off + (s32)iter_reg->var_off.value);
+ return &st->frame[iter_reg->frameno]->stack[spi].spilled_ptr;
+}
+
/* process_iter_next_call() is called when verifier gets to iterator's next
* "method" (e.g., bpf_iter_num_next() for numbers iterator) call. We'll refer
* to it as just "iter_next()" in comments below.
@@ -7957,17 +7995,18 @@ static int widen_imprecise_scalars(struct bpf_verifier_env *env,
* bpf_iter_num_destroy(&it);
*/
static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
- struct bpf_kfunc_call_arg_meta *meta)
+ struct bpf_kfunc_call_arg_meta *meta, bool may_goto)
{
struct bpf_verifier_state *cur_st = env->cur_state, *queued_st, *prev_st;
struct bpf_func_state *cur_fr = cur_st->frame[cur_st->curframe], *queued_fr;
struct bpf_reg_state *cur_iter, *queued_iter;
- int iter_frameno = meta->iter.frameno;
- int iter_spi = meta->iter.spi;
BTF_TYPE_EMIT(struct bpf_iter);
- cur_iter = &env->cur_state->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
+ if (may_goto)
+ cur_iter = &cur_st->may_goto_reg;
+ else
+ cur_iter = get_iter_reg_meta(cur_st, meta);
if (cur_iter->iter.state != BPF_ITER_STATE_ACTIVE &&
cur_iter->iter.state != BPF_ITER_STATE_DRAINED) {
@@ -7990,25 +8029,37 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
* right at this instruction.
*/
prev_st = find_prev_entry(env, cur_st->parent, insn_idx);
+
/* branch out active iter state */
queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
if (!queued_st)
return -ENOMEM;
- queued_iter = &queued_st->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
+ if (may_goto)
+ queued_iter = &queued_st->may_goto_reg;
+ else
+ queued_iter = get_iter_reg_meta(queued_st, meta);
queued_iter->iter.state = BPF_ITER_STATE_ACTIVE;
queued_iter->iter.depth++;
if (prev_st)
widen_imprecise_scalars(env, prev_st, queued_st);
- queued_fr = queued_st->frame[queued_st->curframe];
- mark_ptr_not_null_reg(&queued_fr->regs[BPF_REG_0]);
+ if (!may_goto) {
+ queued_fr = queued_st->frame[queued_st->curframe];
+ mark_ptr_not_null_reg(&queued_fr->regs[BPF_REG_0]);
+ }
}
- /* switch to DRAINED state, but keep the depth unchanged */
- /* mark current iter state as drained and assume returned NULL */
- cur_iter->iter.state = BPF_ITER_STATE_DRAINED;
- __mark_reg_const_zero(env, &cur_fr->regs[BPF_REG_0]);
+ if (!may_goto) {
+ /* mark current iter state as drained and assume returned NULL */
+ cur_iter->iter.state = BPF_ITER_STATE_DRAINED;
+ __mark_reg_const_zero(env, &cur_fr->regs[BPF_REG_0]);
+ }
+ /* else
+ * may_goto insn could be implemented with sticky state (once
+ * reaches 0 it stays 0), but the verifier shouldn't assume that.
+ * It has to explore both branches.
+ */
return 0;
}
@@ -12433,7 +12484,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
}
if (is_iter_next_kfunc(&meta)) {
- err = process_iter_next_call(env, insn_idx, &meta);
+ err = process_iter_next_call(env, insn_idx, &meta, false);
if (err)
return err;
}
@@ -14869,11 +14920,24 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
int err;
/* Only conditional jumps are expected to reach here. */
- if (opcode == BPF_JA || opcode > BPF_JSLE) {
+ if (opcode == BPF_JA || opcode > BPF_JMA) {
verbose(env, "invalid BPF_JMP/JMP32 opcode %x\n", opcode);
return -EINVAL;
}
+ if (opcode == BPF_JMA) {
+ if (insn->code != (BPF_JMP | BPF_JMA) ||
+ insn->src_reg || insn->dst_reg) {
+ verbose(env, "invalid may_goto\n");
+ return -EINVAL;
+ }
+ err = process_iter_next_call(env, *insn_idx, NULL, true);
+ if (err)
+ return err;
+ *insn_idx += insn->off;
+ return 0;
+ }
+
/* check src2 operand */
err = check_reg_arg(env, insn->dst_reg, SRC_OP);
if (err)
@@ -15657,6 +15721,8 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
default:
/* conditional jump with two edges */
mark_prune_point(env, t);
+ if (insn->code == (BPF_JMP | BPF_JMA))
+ mark_force_checkpoint(env, t);
ret = push_insn(t, t + 1, FALLTHROUGH, env);
if (ret)
@@ -16767,6 +16833,9 @@ static bool states_equal(struct bpf_verifier_env *env,
if (old->active_rcu_lock != cur->active_rcu_lock)
return false;
+ if (old->may_goto_reg.iter.state != cur->may_goto_reg.iter.state)
+ return false;
+
/* for states to be equal callsites have to be the same
* and all frame states need to be equivalent
*/
@@ -17005,6 +17074,9 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
struct bpf_func_state *state;
int i, fr;
+ if (old->may_goto_reg.iter.depth != cur->may_goto_reg.iter.depth)
+ return true;
+
for (fr = old->curframe; fr >= 0; fr--) {
state = old->frame[fr];
for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) {
@@ -17109,23 +17181,11 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
* comparison would discard current state with r7=-32
* => unsafe memory access at 11 would not be caught.
*/
- if (is_iter_next_insn(env, insn_idx)) {
+ if (is_iter_next_insn(env, insn_idx) || is_may_goto_insn(env, insn_idx)) {
if (states_equal(env, &sl->state, cur, true)) {
- struct bpf_func_state *cur_frame;
- struct bpf_reg_state *iter_state, *iter_reg;
- int spi;
+ struct bpf_reg_state *iter_state;
- cur_frame = cur->frame[cur->curframe];
- /* btf_check_iter_kfuncs() enforces that
- * iter state pointer is always the first arg
- */
- iter_reg = &cur_frame->regs[BPF_REG_1];
- /* current state is valid due to states_equal(),
- * so we can assume valid iter and reg state,
- * no need for extra (re-)validations
- */
- spi = __get_spi(iter_reg->off + iter_reg->var_off.value);
- iter_state = &func(env, iter_reg)->stack[spi].spilled_ptr;
+ iter_state = get_iter_reg(env, cur, insn_idx);
if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE) {
update_loop_entry(cur, &sl->state);
goto hit;
@@ -19406,7 +19466,10 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
struct bpf_insn insn_buf[16];
struct bpf_prog *new_prog;
struct bpf_map *map_ptr;
- int i, ret, cnt, delta = 0;
+ int i, ret, cnt, delta = 0, cur_subprog = 0;
+ struct bpf_subprog_info *subprogs = env->subprog_info;
+ u16 stack_depth = subprogs[cur_subprog].stack_depth;
+ u16 stack_depth_extra = 0;
if (env->seen_exception && !env->exception_callback_subprog) {
struct bpf_insn patch[] = {
@@ -19426,7 +19489,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
mark_subprog_exc_cb(env, env->exception_callback_subprog);
}
- for (i = 0; i < insn_cnt; i++, insn++) {
+ for (i = 0; i < insn_cnt;) {
/* Make divide-by-zero exceptions impossible. */
if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) ||
insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) ||
@@ -19465,7 +19528,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement LD_ABS and LD_IND with a rewrite, if supported by the program type. */
@@ -19485,7 +19548,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Rewrite pointer arithmetic to mitigate speculation attacks. */
@@ -19500,7 +19563,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
aux = &env->insn_aux_data[i + delta];
if (!aux->alu_state ||
aux->alu_state == BPF_ALU_NON_POINTER)
- continue;
+ goto next_insn;
isneg = aux->alu_state & BPF_ALU_NEG_VALUE;
issrc = (aux->alu_state & BPF_ALU_SANITIZE) ==
@@ -19538,19 +19601,39 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
+ }
+
+ if (insn->code == (BPF_JMP | BPF_JMA)) {
+ int stack_off = -stack_depth - 8;
+
+ stack_depth_extra = 8;
+ insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_AX, BPF_REG_10, stack_off);
+ insn_buf[1] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, insn->off + 2);
+ insn_buf[2] = BPF_ALU64_IMM(BPF_SUB, BPF_REG_AX, 1);
+ insn_buf[3] = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_AX, stack_off);
+ cnt = 4;
+
+ new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+ if (!new_prog)
+ return -ENOMEM;
+
+ delta += cnt - 1;
+ env->prog = prog = new_prog;
+ insn = new_prog->insnsi + i + delta;
+ goto next_insn;
}
if (insn->code != (BPF_JMP | BPF_CALL))
- continue;
+ goto next_insn;
if (insn->src_reg == BPF_PSEUDO_CALL)
- continue;
+ goto next_insn;
if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
if (ret)
return ret;
if (cnt == 0)
- continue;
+ goto next_insn;
new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
if (!new_prog)
@@ -19559,7 +19642,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
if (insn->imm == BPF_FUNC_get_route_realm)
@@ -19607,11 +19690,11 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
}
insn->imm = ret + 1;
- continue;
+ goto next_insn;
}
if (!bpf_map_ptr_unpriv(aux))
- continue;
+ goto next_insn;
/* instead of changing every JIT dealing with tail_call
* emit two extra insns:
@@ -19640,7 +19723,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
if (insn->imm == BPF_FUNC_timer_set_callback) {
@@ -19752,7 +19835,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
BUILD_BUG_ON(!__same_type(ops->map_lookup_elem,
@@ -19783,31 +19866,31 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
switch (insn->imm) {
case BPF_FUNC_map_lookup_elem:
insn->imm = BPF_CALL_IMM(ops->map_lookup_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_update_elem:
insn->imm = BPF_CALL_IMM(ops->map_update_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_delete_elem:
insn->imm = BPF_CALL_IMM(ops->map_delete_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_push_elem:
insn->imm = BPF_CALL_IMM(ops->map_push_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_pop_elem:
insn->imm = BPF_CALL_IMM(ops->map_pop_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_map_peek_elem:
insn->imm = BPF_CALL_IMM(ops->map_peek_elem);
- continue;
+ goto next_insn;
case BPF_FUNC_redirect_map:
insn->imm = BPF_CALL_IMM(ops->map_redirect);
- continue;
+ goto next_insn;
case BPF_FUNC_for_each_map_elem:
insn->imm = BPF_CALL_IMM(ops->map_for_each_callback);
- continue;
+ goto next_insn;
case BPF_FUNC_map_lookup_percpu_elem:
insn->imm = BPF_CALL_IMM(ops->map_lookup_percpu_elem);
- continue;
+ goto next_insn;
}
goto patch_call_imm;
@@ -19835,7 +19918,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement bpf_get_func_arg inline. */
@@ -19860,7 +19943,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement bpf_get_func_ret inline. */
@@ -19888,7 +19971,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement get_func_arg_cnt inline. */
@@ -19903,7 +19986,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement bpf_get_func_ip inline. */
@@ -19918,7 +20001,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
/* Implement bpf_kptr_xchg inline */
@@ -19936,7 +20019,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
delta += cnt - 1;
env->prog = prog = new_prog;
insn = new_prog->insnsi + i + delta;
- continue;
+ goto next_insn;
}
patch_call_imm:
fn = env->ops->get_func_proto(insn->imm, env->prog);
@@ -19950,6 +20033,39 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
return -EFAULT;
}
insn->imm = fn->func - __bpf_call_base;
+next_insn:
+ if (subprogs[cur_subprog + 1].start == i + delta + 1) {
+ subprogs[cur_subprog].stack_depth += stack_depth_extra;
+ subprogs[cur_subprog].stack_extra = stack_depth_extra;
+ cur_subprog++;
+ stack_depth = subprogs[cur_subprog].stack_depth;
+ stack_depth_extra = 0;
+ }
+ i++; insn++;
+ }
+
+ env->prog->aux->stack_depth = subprogs[0].stack_depth;
+ for (i = 0; i < env->subprog_cnt; i++) {
+ int subprog_start = subprogs[i].start, j;
+ int stack_slots = subprogs[i].stack_extra / 8;
+
+ if (stack_slots >= ARRAY_SIZE(insn_buf)) {
+ verbose(env, "verifier bug: stack_extra is too large\n");
+ return -EFAULT;
+ }
+
+ /* Add insns to subprog prologue to zero init extra stack */
+ for (j = 0; j < stack_slots; j++)
+ insn_buf[j] = BPF_ST_MEM(BPF_DW, BPF_REG_FP,
+ -subprogs[i].stack_depth + j * 8, BPF_MAX_LOOPS);
+ if (j) {
+ insn_buf[j] = env->prog->insnsi[subprog_start];
+
+ new_prog = bpf_patch_insn_data(env, subprog_start, insn_buf, j + 1);
+ if (!new_prog)
+ return -ENOMEM;
+ env->prog = prog = new_prog;
+ }
}
/* Since poke tab is now finalized, publish aux to tracker. */
@@ -20140,6 +20256,21 @@ static void free_states(struct bpf_verifier_env *env)
}
}
+static void init_may_goto_reg(struct bpf_reg_state *st)
+{
+ __mark_reg_known_zero(st);
+ st->type = PTR_TO_STACK;
+ st->live |= REG_LIVE_WRITTEN;
+ st->ref_obj_id = 0;
+ st->iter.btf = NULL;
+ st->iter.btf_id = 0;
+ /* Init register state to sane values.
+ * Only iter.state and iter.depth are used during verification.
+ */
+ st->iter.state = BPF_ITER_STATE_ACTIVE;
+ st->iter.depth = 0;
+}
+
static int do_check_common(struct bpf_verifier_env *env, int subprog)
{
bool pop_log = !(env->log.level & BPF_LOG_LEVEL2);
@@ -20157,6 +20288,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
state->curframe = 0;
state->speculative = false;
state->branches = 1;
+ init_may_goto_reg(&state->may_goto_reg);
state->frame[0] = kzalloc(sizeof(struct bpf_func_state), GFP_KERNEL);
if (!state->frame[0]) {
kfree(state);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d2e6c5fcec01..8cf86566ad6d 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -42,6 +42,7 @@
#define BPF_JSGE 0x70 /* SGE is signed '>=', GE in x86 */
#define BPF_JSLT 0xc0 /* SLT is signed, '<' */
#define BPF_JSLE 0xd0 /* SLE is signed, '<=' */
+#define BPF_JMA 0xe0 /* may_goto */
#define BPF_CALL 0x80 /* function call */
#define BPF_EXIT 0x90 /* function return */
--
2.34.1
* [PATCH v4 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match
From: Alexei Starovoitov @ 2024-03-02 2:00 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
kernel-team
From: Alexei Starovoitov <ast@kernel.org>
When open coded iterators, bpf_loop, or may_goto are used, the following two
states are equivalent and it is safe to prune the search:
cur state: fp-8_w=scalar(id=3,smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
old state: fp-8_rw=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
In other words, the "exact" state match should ignore liveness and precision
marks, since the open coded iterator logic hasn't completed their propagation,
but the range_within logic that applies to scalars, ptr_to_mem, map_value, pkt_ptr
is safe to rely on.
Avoid doing such a comparison when the regular infinite loop detection logic
is used, otherwise the bounded loop logic will flag such an "infinite loop" as
a false positive. One such example is not_an_inifinite_loop() in
progs/verifier_loops1.c.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
kernel/bpf/verifier.c | 32 +++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 63ef6a38726e..49ff76543adc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7830,6 +7830,11 @@ static struct bpf_verifier_state *find_prev_entry(struct bpf_verifier_env *env,
}
static void reset_idmap_scratch(struct bpf_verifier_env *env);
+enum exact_level {
+ NOT_EXACT,
+ EXACT,
+ RANGE_WITHIN
+};
static bool regs_exact(const struct bpf_reg_state *rold,
const struct bpf_reg_state *rcur,
struct bpf_idmap *idmap);
@@ -16286,8 +16291,8 @@ static int check_btf_info(struct bpf_verifier_env *env,
}
/* check %cur's range satisfies %old's */
-static bool range_within(struct bpf_reg_state *old,
- struct bpf_reg_state *cur)
+static bool range_within(const struct bpf_reg_state *old,
+ const struct bpf_reg_state *cur)
{
return old->umin_value <= cur->umin_value &&
old->umax_value >= cur->umax_value &&
@@ -16453,12 +16458,13 @@ static bool regs_exact(const struct bpf_reg_state *rold,
/* Returns true if (rold safe implies rcur safe) */
static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
- struct bpf_reg_state *rcur, struct bpf_idmap *idmap, bool exact)
+ struct bpf_reg_state *rcur, struct bpf_idmap *idmap,
+ enum exact_level exact)
{
- if (exact)
+ if (exact == EXACT)
return regs_exact(rold, rcur, idmap);
- if (!(rold->live & REG_LIVE_READ))
+ if (!(rold->live & REG_LIVE_READ) && exact != RANGE_WITHIN)
/* explored state didn't use this */
return true;
if (rold->type == NOT_INIT)
@@ -16500,7 +16506,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
check_scalar_ids(rold->id, rcur->id, idmap);
}
- if (!rold->precise)
+ if (!rold->precise && exact != RANGE_WITHIN)
return true;
/* Why check_ids() for scalar registers?
*
@@ -16611,7 +16617,7 @@ static struct bpf_reg_state *scalar_reg_for_stack(struct bpf_verifier_env *env,
}
static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
- struct bpf_func_state *cur, struct bpf_idmap *idmap, bool exact)
+ struct bpf_func_state *cur, struct bpf_idmap *idmap, enum exact_level exact)
{
int i, spi;
@@ -16775,7 +16781,7 @@ static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur,
* the current state will reach 'bpf_exit' instruction safely
*/
static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_state *old,
- struct bpf_func_state *cur, bool exact)
+ struct bpf_func_state *cur, enum exact_level exact)
{
int i;
@@ -16802,7 +16808,7 @@ static void reset_idmap_scratch(struct bpf_verifier_env *env)
static bool states_equal(struct bpf_verifier_env *env,
struct bpf_verifier_state *old,
struct bpf_verifier_state *cur,
- bool exact)
+ enum exact_level exact)
{
int i;
@@ -17182,7 +17188,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
* => unsafe memory access at 11 would not be caught.
*/
if (is_iter_next_insn(env, insn_idx) || is_may_goto_insn(env, insn_idx)) {
- if (states_equal(env, &sl->state, cur, true)) {
+ if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
struct bpf_reg_state *iter_state;
iter_state = get_iter_reg(env, cur, insn_idx);
@@ -17194,13 +17200,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
goto skip_inf_loop_check;
}
if (calls_callback(env, insn_idx)) {
- if (states_equal(env, &sl->state, cur, true))
+ if (states_equal(env, &sl->state, cur, RANGE_WITHIN))
goto hit;
goto skip_inf_loop_check;
}
/* attempt to detect infinite loop to avoid unnecessary doomed work */
if (states_maybe_looping(&sl->state, cur) &&
- states_equal(env, &sl->state, cur, true) &&
+ states_equal(env, &sl->state, cur, EXACT) &&
!iter_active_depths_differ(&sl->state, cur) &&
sl->state.callback_unroll_depth == cur->callback_unroll_depth) {
verbose_linfo(env, insn_idx, "; ");
@@ -17257,7 +17263,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
*/
loop_entry = get_loop_entry(&sl->state);
force_exact = loop_entry && loop_entry->branches > 0;
- if (states_equal(env, &sl->state, cur, force_exact)) {
+ if (states_equal(env, &sl->state, cur, force_exact ? EXACT : NOT_EXACT)) {
if (force_exact)
update_loop_entry(cur, loop_entry);
hit:
--
2.34.1
* [PATCH v4 bpf-next 3/4] bpf: Add cond_break macro
From: Alexei Starovoitov @ 2024-03-02 2:00 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Use the may_goto instruction to implement the cond_break macro.
Ideally the macro should be written as:
asm volatile goto(".byte 0xe5;
.byte 0;
.short (%l[l_break] - . - 4) / 8;
.long 0;
but LLVM doesn't recognize fixups for 2-byte PC-relative offsets yet.
Hence use
asm volatile goto(".byte 0xe5;
.byte 0;
.long (%l[l_break] - . - 4) / 8;
.short 0;
which produces correct asm on little endian.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/bpf_experimental.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index 0d749006d107..2d408d8b9b70 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -326,6 +326,18 @@ l_true: \
})
#endif
+#define cond_break \
+ ({ __label__ l_break, l_continue; \
+ asm volatile goto(".byte 0xe5; \
+ .byte 0; \
+ .long (%l[l_break] - . - 4) / 8; \
+ .short 0" \
+ :::: l_break); \
+ goto l_continue; \
+ l_break: break; \
+ l_continue:; \
+ })
+
#ifndef bpf_nop_mov
#define bpf_nop_mov(var) \
asm volatile("%[reg]=%[reg]"::[reg]"r"((short)var))
--
2.34.1
* [PATCH v4 bpf-next 4/4] selftests/bpf: Test may_goto
From: Alexei Starovoitov @ 2024-03-02 2:00 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Add tests for may_goto instruction via cond_break macro.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../bpf/progs/verifier_iterating_callbacks.c | 103 +++++++++++++++++-
2 files changed, 101 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index 1a63996c0304..9f579fc3fd55 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -3,3 +3,4 @@
exceptions # JIT does not support calling kfunc bpf_throw (exceptions)
get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
+verifier_iterating_callbacks/cond_break
diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
index 5905e036e0ea..04cdbce4652f 100644
--- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
+++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
@@ -1,8 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
-
-#include <linux/bpf.h>
-#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
+#include "bpf_experimental.h"
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
@@ -239,4 +237,103 @@ int bpf_loop_iter_limit_nested(void *unused)
return 1000 * a + b + c;
}
+#define ARR_SZ 1000000
+int zero;
+char arr[ARR_SZ];
+
+SEC("socket")
+__success __retval(0xd495cdc0)
+int cond_break1(const void *ctx)
+{
+ unsigned long i;
+ unsigned int sum = 0;
+
+ for (i = zero; i < ARR_SZ; cond_break, i++)
+ sum += i;
+ for (i = zero; i < ARR_SZ; i++) {
+ barrier_var(i);
+ sum += i + arr[i];
+ cond_break;
+ }
+
+ return sum;
+}
+
+SEC("socket")
+__success __retval(999000000)
+int cond_break2(const void *ctx)
+{
+ int i, j;
+ int sum = 0;
+
+ for (i = zero; i < 1000; cond_break, i++)
+ for (j = zero; j < 1000; j++) {
+ sum += i + j;
+ cond_break;
+ }
+
+ return sum;
+}
+
+static __noinline int loop(void)
+{
+ int i, sum = 0;
+
+ for (i = zero; i <= 1000000; i++, cond_break)
+ sum += i;
+
+ return sum;
+}
+
+SEC("socket")
+__success __retval(0x6a5a2920)
+int cond_break3(const void *ctx)
+{
+ return loop();
+}
+
+SEC("socket")
+__success __retval(1)
+int cond_break4(const void *ctx)
+{
+ int cnt = zero;
+
+ for (;;) {
+ /* should eventually break out of the loop */
+ cond_break;
+ cnt++;
+ }
+ /* if we looped a bit, it's a success */
+ return cnt > 1 ? 1 : 0;
+}
+
+static __noinline int static_subprog(void)
+{
+ int cnt = zero;
+
+ for (;;) {
+ cond_break;
+ cnt++;
+ }
+
+ return cnt;
+}
+
+SEC("socket")
+__success __retval(1)
+int cond_break5(const void *ctx)
+{
+ int cnt1 = zero, cnt2;
+
+ for (;;) {
+ cond_break;
+ cnt1++;
+ }
+
+ cnt2 = static_subprog();
+
+ /* main and subprog have to loop a bit */
+ return cnt1 > 1 && cnt2 > 1 ? 1 : 0;
+}
+
char _license[] SEC("license") = "GPL";
--
2.34.1
* Re: [PATCH v4 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match
2024-03-02 2:00 ` [PATCH v4 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match Alexei Starovoitov
@ 2024-03-03 14:12 ` Eduard Zingerman
2024-03-03 17:21 ` Alexei Starovoitov
0 siblings, 1 reply; 14+ messages in thread
From: Eduard Zingerman @ 2024-03-03 14:12 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: daniel, andrii, martin.lau, memxor, john.fastabend, kernel-team
On Fri, 2024-03-01 at 18:00 -0800, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> When open-coded iterators, bpf_loop or may_goto are used, the following two states
> are equivalent and it is safe to prune the search:
>
> cur state: fp-8_w=scalar(id=3,smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
> old state: fp-8_rw=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
>
> In other words, the "exact" state match should ignore liveness and precision marks,
> since the open-coded iterator logic hasn't completed their propagation,
> but the range_within logic that applies to scalars, ptr_to_mem, map_value, pkt_ptr
> is safe to rely on.
>
> Avoid doing such a comparison when the regular infinite loop detection logic is used,
> otherwise the bounded loop logic will declare such an "infinite loop" a false
> positive. Such an example is in progs/verifier_loops1.c not_an_inifinite_loop().
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
I think this makes sense; I don't see counter-examples at the moment.
One nit below.
Also, I'm curious whether there is any veristat impact;
it could be huge for some cases with bpf_loop().
[...]
> @@ -17257,7 +17263,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> */
> loop_entry = get_loop_entry(&sl->state);
> force_exact = loop_entry && loop_entry->branches > 0;
> - if (states_equal(env, &sl->state, cur, force_exact)) {
> + if (states_equal(env, &sl->state, cur, force_exact ? EXACT : NOT_EXACT)) {
Logically this checks the same condition as the checks for calls_callback() or
is_iter_next_insn() above: whether the current state is equivalent to some
old state that has not yet been tracked to 'exit' along every
possible path.
Thus, the 'exact' flag used in these checks should be the same:
"force_exact ? RANGE_WITHIN : NOT_EXACT".
> if (force_exact)
> update_loop_entry(cur, loop_entry);
> hit:
* Re: [PATCH v4 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match
2024-03-03 14:12 ` Eduard Zingerman
@ 2024-03-03 17:21 ` Alexei Starovoitov
2024-03-05 3:44 ` Alexei Starovoitov
0 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-03 17:21 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, John Fastabend, Kernel Team
On Sun, Mar 3, 2024 at 6:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2024-03-01 at 18:00 -0800, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > When open-coded iterators, bpf_loop or may_goto are used, the following two states
> > are equivalent and it is safe to prune the search:
> >
> > cur state: fp-8_w=scalar(id=3,smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
> > old state: fp-8_rw=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
> >
> > In other words, the "exact" state match should ignore liveness and precision marks,
> > since the open-coded iterator logic hasn't completed their propagation,
> > but the range_within logic that applies to scalars, ptr_to_mem, map_value, pkt_ptr
> > is safe to rely on.
> >
> > Avoid doing such a comparison when the regular infinite loop detection logic is used,
> > otherwise the bounded loop logic will declare such an "infinite loop" a false
> > positive. Such an example is in progs/verifier_loops1.c not_an_inifinite_loop().
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>
> I think this makes sense, don't see counter-examples at the moment.
> One nit below.
>
> Also, I'm curious if there is veristat results impact,
> could be huge for some cases with bpf_loop().
No doubt, there will be improvements here and there. For iters.bpf.o:
Insns (A)  Insns (B)  Insns (DIFF)   States (A)  States (B)  States (DIFF)  Program
---------  ---------  -------------  ----------  ----------  -------------  ----------------------
      858        823  -35 (-4.08%)           81          79  -2 (-2.47%)    iter_nested_iters
      137        110  -27 (-19.71%)          12          10  -2 (-16.67%)   iter_obfuscate_counter
     1161       1109  -52 (-4.48%)           92          88  -4 (-4.35%)    iter_subprog_iters
       62         35  -27 (-43.55%)           5           3  -2 (-40.00%)   triple_continue
the rest in this file didn't change. No change for pyperf either.
> [...]
>
> > @@ -17257,7 +17263,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> > */
> > loop_entry = get_loop_entry(&sl->state);
> > force_exact = loop_entry && loop_entry->branches > 0;
> > - if (states_equal(env, &sl->state, cur, force_exact)) {
> > + if (states_equal(env, &sl->state, cur, force_exact ? EXACT : NOT_EXACT)) {
>
> Logically this checks same condition as checks for calls_callback() or
> is_iter_next_insn() above: whether current state is equivalent to some
> old state, where old state had not been tracked to 'exit' for each
> possible path yet.
> Thus, 'exact' flags used in these checks should be the same:
> "force_exact ? RANGE_WITHIN : NOT_EXACT".
Good point. Will change. It should help as well.
* Re: [PATCH v4 bpf-next 1/4] bpf: Introduce may_goto instruction
2024-03-02 2:00 ` [PATCH v4 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
@ 2024-03-04 11:55 ` Eduard Zingerman
2024-03-04 16:59 ` Alexei Starovoitov
0 siblings, 1 reply; 14+ messages in thread
From: Eduard Zingerman @ 2024-03-04 11:55 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: daniel, andrii, martin.lau, memxor, john.fastabend, kernel-team
On Fri, 2024-03-01 at 18:00 -0800, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> Introduce a may_goto instruction that acts on a hidden bpf_iter_num, so that
> bpf_iter_num_new(), bpf_iter_num_destroy() don't need to be called explicitly.
> It can be used in any normal "for" or "while" loop, like
>
> for (i = zero; i < cnt; cond_break, i++) {
>
> The verifier recognizes that may_goto is used in the program,
> reserves an additional 8 bytes of stack, initializes them in the subprog
> prologue, and replaces the may_goto instruction with:
> aux_reg = *(u64 *)(fp - 40)
> if aux_reg == 0 goto pc+off
> aux_reg -= 1
> *(u64 *)(fp - 40) = aux_reg
[...]
Modulo the instruction validation issue you pointed out offlist, I don't
see any obvious issues with this patch. Two notes below.
[...]
> @@ -19406,7 +19466,10 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> struct bpf_insn insn_buf[16];
> struct bpf_prog *new_prog;
> struct bpf_map *map_ptr;
> - int i, ret, cnt, delta = 0;
> + int i, ret, cnt, delta = 0, cur_subprog = 0;
> + struct bpf_subprog_info *subprogs = env->subprog_info;
> + u16 stack_depth = subprogs[cur_subprog].stack_depth;
> + u16 stack_depth_extra = 0;
Note: optimize_bpf_loop() has very similar logic,
but there stack_depth is rounded up:
u16 stack_depth = subprogs[cur_subprog].stack_depth;
u16 stack_depth_roundup = round_up(stack_depth, 8) - stack_depth;
u16 stack_depth_extra = 0;
...
stack_depth_extra = BPF_REG_SIZE * 3 + stack_depth_roundup;
And stack base for tmp variables is computed as "-(stack_depth + stack_depth_extra)".
>
> if (env->seen_exception && !env->exception_callback_subprog) {
> struct bpf_insn patch[] = {
> @@ -19426,7 +19489,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> mark_subprog_exc_cb(env, env->exception_callback_subprog);
> }
>
> - for (i = 0; i < insn_cnt; i++, insn++) {
> + for (i = 0; i < insn_cnt;) {
> /* Make divide-by-zero exceptions impossible. */
> if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) ||
> insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) ||
[...]
> @@ -19950,6 +20033,39 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> return -EFAULT;
> }
> insn->imm = fn->func - __bpf_call_base;
> +next_insn:
> + if (subprogs[cur_subprog + 1].start == i + delta + 1) {
> + subprogs[cur_subprog].stack_depth += stack_depth_extra;
> + subprogs[cur_subprog].stack_extra = stack_depth_extra;
> + cur_subprog++;
> + stack_depth = subprogs[cur_subprog].stack_depth;
> + stack_depth_extra = 0;
> + }
> + i++; insn++;
> + }
> +
> + env->prog->aux->stack_depth = subprogs[0].stack_depth;
> + for (i = 0; i < env->subprog_cnt; i++) {
> + int subprog_start = subprogs[i].start, j;
> + int stack_slots = subprogs[i].stack_extra / 8;
> +
> + if (stack_slots >= ARRAY_SIZE(insn_buf)) {
> + verbose(env, "verifier bug: stack_extra is too large\n");
> + return -EFAULT;
> + }
> +
> + /* Add insns to subprog prologue to zero init extra stack */
> + for (j = 0; j < stack_slots; j++)
> + insn_buf[j] = BPF_ST_MEM(BPF_DW, BPF_REG_FP,
> + -subprogs[i].stack_depth + j * 8, BPF_MAX_LOOPS);
Nit: the comment says that stack is zero initialized,
while it is actually set to BPF_MAX_LOOPS.
> + if (j) {
> + insn_buf[j] = env->prog->insnsi[subprog_start];
> +
> + new_prog = bpf_patch_insn_data(env, subprog_start, insn_buf, j + 1);
> + if (!new_prog)
> + return -ENOMEM;
> + env->prog = prog = new_prog;
> + }
> }
>
> /* Since poke tab is now finalized, publish aux to tracker. */
* RE: [PATCH v4 bpf-next 0/4] bpf: Introduce may_goto and cond_break
2024-03-02 2:00 [PATCH v4 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
` (3 preceding siblings ...)
2024-03-02 2:00 ` [PATCH v4 bpf-next 4/4] selftests/bpf: Test may_goto Alexei Starovoitov
@ 2024-03-04 16:45 ` dthaler1968
2024-03-04 16:45 ` [Bpf] " dthaler1968@googlemail.com
2024-03-04 17:05 ` Alexei Starovoitov
4 siblings, 2 replies; 14+ messages in thread
From: dthaler1968 @ 2024-03-04 16:45 UTC (permalink / raw)
To: 'Alexei Starovoitov', bpf
Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
kernel-team, bpf
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
[...]
> Alexei Starovoitov (4):
> bpf: Introduce may_goto instruction
> bpf: Recognize that two registers are safe when their ranges match
> bpf: Add cond_break macro
> selftests/bpf: Test may_goto
>
> include/linux/bpf_verifier.h | 2 +
> include/uapi/linux/bpf.h | 1 +
> kernel/bpf/core.c | 1 +
> kernel/bpf/disasm.c | 3 +
> kernel/bpf/verifier.c | 280 +++++++++++++-----
> tools/include/uapi/linux/bpf.h | 1 +
> tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
> .../testing/selftests/bpf/bpf_experimental.h | 12 +
> .../bpf/progs/verifier_iterating_callbacks.c | 103 ++++++-
> 9 files changed, 330 insertions(+), 74 deletions(-)
Don't we also need to add may_goto to instruction-set.rst?
Dave
* Re: [PATCH v4 bpf-next 1/4] bpf: Introduce may_goto instruction
2024-03-04 11:55 ` Eduard Zingerman
@ 2024-03-04 16:59 ` Alexei Starovoitov
0 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-04 16:59 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, John Fastabend, Kernel Team
On Mon, Mar 4, 2024 at 3:55 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2024-03-01 at 18:00 -0800, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Introduce may_goto instruction that acts on a hidden bpf_iter_num, so that
> > bpf_iter_num_new(), bpf_iter_num_destroy() don't need to be called explicitly.
> > It can be used in any normal "for" or "while" loop, like
> >
> > for (i = zero; i < cnt; cond_break, i++) {
> >
> > The verifier recognizes that may_goto is used in the program,
> > reserves additional 8 bytes of stack, initializes them in subprog
> > prologue, and replaces may_goto instruction with:
> > aux_reg = *(u64 *)(fp - 40)
> > if aux_reg == 0 goto pc+off
> > aux_reg -= 1
> > *(u64 *)(fp - 40) = aux_reg
>
> [...]
>
> Modulo instruction validation issue you pointed out offlist I don't
> see any obvious issues with this patch. Two notes below.
>
> [...]
>
> > @@ -19406,7 +19466,10 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > struct bpf_insn insn_buf[16];
> > struct bpf_prog *new_prog;
> > struct bpf_map *map_ptr;
> > - int i, ret, cnt, delta = 0;
> > + int i, ret, cnt, delta = 0, cur_subprog = 0;
> > + struct bpf_subprog_info *subprogs = env->subprog_info;
> > + u16 stack_depth = subprogs[cur_subprog].stack_depth;
> > + u16 stack_depth_extra = 0;
>
> Note: optimize_bpf_loop() has very similar logic,
> but there stack_depth is rounded up:
>
> u16 stack_depth = subprogs[cur_subprog].stack_depth;
> u16 stack_depth_roundup = round_up(stack_depth, 8) - stack_depth;
> u16 stack_depth_extra = 0;
> ...
> stack_depth_extra = BPF_REG_SIZE * 3 + stack_depth_roundup;
>
> And stack base for tmp variables is computed as "-(stack_depth + stack_depth_extra)".
I did a sloppy review when I landed your
commit 1ade23711971 ("bpf: Inline calls to bpf_loop when callback is known"):
the round_up is unnecessary.
The stack depth is always incremented in multiples of 8.
See grow_stack_state().
> >
> > if (env->seen_exception && !env->exception_callback_subprog) {
> > struct bpf_insn patch[] = {
> > @@ -19426,7 +19489,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > mark_subprog_exc_cb(env, env->exception_callback_subprog);
> > }
> >
> > - for (i = 0; i < insn_cnt; i++, insn++) {
> > + for (i = 0; i < insn_cnt;) {
> > /* Make divide-by-zero exceptions impossible. */
> > if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) ||
> > insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) ||
>
> [...]
>
> > @@ -19950,6 +20033,39 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > return -EFAULT;
> > }
> > insn->imm = fn->func - __bpf_call_base;
> > +next_insn:
> > + if (subprogs[cur_subprog + 1].start == i + delta + 1) {
> > + subprogs[cur_subprog].stack_depth += stack_depth_extra;
> > + subprogs[cur_subprog].stack_extra = stack_depth_extra;
> > + cur_subprog++;
> > + stack_depth = subprogs[cur_subprog].stack_depth;
> > + stack_depth_extra = 0;
> > + }
> > + i++; insn++;
> > + }
> > +
> > + env->prog->aux->stack_depth = subprogs[0].stack_depth;
> > + for (i = 0; i < env->subprog_cnt; i++) {
> > + int subprog_start = subprogs[i].start, j;
> > + int stack_slots = subprogs[i].stack_extra / 8;
> > +
> > + if (stack_slots >= ARRAY_SIZE(insn_buf)) {
> > + verbose(env, "verifier bug: stack_extra is too large\n");
> > + return -EFAULT;
> > + }
> > +
> > + /* Add insns to subprog prologue to zero init extra stack */
> > + for (j = 0; j < stack_slots; j++)
> > + insn_buf[j] = BPF_ST_MEM(BPF_DW, BPF_REG_FP,
> > + -subprogs[i].stack_depth + j * 8, BPF_MAX_LOOPS);
>
> Nit: the comment says that stack is zero initialized,
> while it is actually set to BPF_MAX_LOOPS.
Good catch. Will fix.
* Re: [PATCH v4 bpf-next 0/4] bpf: Introduce may_goto and cond_break
2024-03-04 16:45 ` [PATCH v4 bpf-next 0/4] bpf: Introduce may_goto and cond_break dthaler1968
2024-03-04 16:45 ` [Bpf] " dthaler1968@googlemail.com
@ 2024-03-04 17:05 ` Alexei Starovoitov
2024-03-04 17:05 ` [Bpf] " Alexei Starovoitov
1 sibling, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-04 17:05 UTC (permalink / raw)
To: Dave Thaler
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, Eddy Z, John Fastabend, Kernel Team, bpf
On Mon, Mar 4, 2024 at 8:45 AM <dthaler1968@googlemail.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
> [...]
> > Alexei Starovoitov (4):
> > bpf: Introduce may_goto instruction
> > bpf: Recognize that two registers are safe when their ranges match
> > bpf: Add cond_break macro
> > selftests/bpf: Test may_goto
> >
> > include/linux/bpf_verifier.h | 2 +
> > include/uapi/linux/bpf.h | 1 +
> > kernel/bpf/core.c | 1 +
> > kernel/bpf/disasm.c | 3 +
> > kernel/bpf/verifier.c | 280 +++++++++++++-----
> > tools/include/uapi/linux/bpf.h | 1 +
> > tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
> > .../testing/selftests/bpf/bpf_experimental.h | 12 +
> > .../bpf/progs/verifier_iterating_callbacks.c | 103 ++++++-
> > 9 files changed, 330 insertions(+), 74 deletions(-)
>
> Don't we also need to add may_goto to instruction-set.rst?
If it were a real insn, it would be too early to update the standard,
since it's not in any kernel release yet.
But more importantly, it's a pseudo insn; CPUs/JITs won't see it.
I think we need a different document for such pseudo instructions.
Same thing for the upcoming nop_or_goto pseudo insn.
* Re: [PATCH v4 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match
2024-03-03 17:21 ` Alexei Starovoitov
@ 2024-03-05 3:44 ` Alexei Starovoitov
0 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-05 3:44 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Kumar Kartikeya Dwivedi, John Fastabend, Kernel Team
On Sun, Mar 3, 2024 at 9:21 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> > > loop_entry = get_loop_entry(&sl->state);
> > > force_exact = loop_entry && loop_entry->branches > 0;
> > > - if (states_equal(env, &sl->state, cur, force_exact)) {
> > > + if (states_equal(env, &sl->state, cur, force_exact ? EXACT : NOT_EXACT)) {
> >
> > Logically this checks same condition as checks for calls_callback() or
> > is_iter_next_insn() above: whether current state is equivalent to some
> > old state, where old state had not been tracked to 'exit' for each
> > possible path yet.
> > Thus, 'exact' flags used in these checks should be the same:
> > "force_exact ? RANGE_WITHIN : NOT_EXACT".
>
> Good point. Will change. It should help as well.
While working on that suggestion I realized that this patch
has a bug, which I'm fixing with an extra hunk:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 410ac8423cf8..1e823c40a6d2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16471,11 +16471,9 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
if (!(rold->live & REG_LIVE_READ) && exact != RANGE_WITHIN)
/* explored state didn't use this */
return true;
- if (rold->type == NOT_INIT)
+ if (rold->type == NOT_INIT && exact != RANGE_WITHIN)
/* explored state can't have used this */
return true;
- if (rcur->type == NOT_INIT)
- return false;
Kinda obvious in retrospect.
end of thread, other threads:[~2024-03-05 3:45 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-02 2:00 [PATCH v4 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
2024-03-02 2:00 ` [PATCH v4 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
2024-03-04 11:55 ` Eduard Zingerman
2024-03-04 16:59 ` Alexei Starovoitov
2024-03-02 2:00 ` [PATCH v4 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match Alexei Starovoitov
2024-03-03 14:12 ` Eduard Zingerman
2024-03-03 17:21 ` Alexei Starovoitov
2024-03-05 3:44 ` Alexei Starovoitov
2024-03-02 2:00 ` [PATCH v4 bpf-next 3/4] bpf: Add cond_break macro Alexei Starovoitov
2024-03-02 2:00 ` [PATCH v4 bpf-next 4/4] selftests/bpf: Test may_goto Alexei Starovoitov
2024-03-04 16:45 ` [PATCH v4 bpf-next 0/4] bpf: Introduce may_goto and cond_break dthaler1968
2024-03-04 16:45 ` [Bpf] " dthaler1968@googlemail.com
2024-03-04 17:05 ` Alexei Starovoitov
2024-03-04 17:05 ` [Bpf] " Alexei Starovoitov