* [PATCH v6 bpf-next 0/4] bpf: Introduce may_goto and cond_break
@ 2024-03-06 3:19 Alexei Starovoitov
2024-03-06 3:19 ` [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-06 3:19 UTC (permalink / raw)
To: bpf
Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
kernel-team
From: Alexei Starovoitov <ast@kernel.org>
v5 -> v6:
- Rename BPF_JMA to BPF_JCOND
- Addressed Andrii's review comments
v4 -> v5:
- rewrote patch 1 to avoid fake may_goto_reg and use 'u32 may_goto_cnt' instead.
This way may_goto handling is similar to bpf_loop() processing.
- fixed a bug in patch 2: RANGE_WITHIN should not use
  rold->type == NOT_INIT as a safe signal.
- patch 3: fixed the negative offset computation in the cond_break macro
- recompiled lib/glob.c as a bpf prog using bpf_arena and cond_break,
  and it works! It will be added as a selftest in the arena series.
v3 -> v4:
- fixed the drained-iterator issue reported by John.
  The may_goto insn could be implemented with sticky state (once the count
  reaches 0 it stays 0), but the verifier shouldn't assume that.
  It has to explore both branches.
  Arguably the drained iterator state shouldn't be there at all;
  bpf_iter_css_next() is not sticky. It can be fixed, but that requires
  auditing all iterators for stickiness, which is an orthogonal discussion.
- explained JMA name reasons in patch 1
- fixed test_progs-no_alu32 issue and added another test
v2 -> v3: Major change
- dropped the bpf_can_loop() kfunc and introduced the may_goto instruction
  instead. A kfunc is a function call, while may_goto doesn't consume any
  registers, so LLVM can produce much better code due to lower register
  pressure.
- instead of counting up from zero to BPF_MAX_LOOPS, start from
  BPF_MAX_LOOPS and break out of the loop when the count reaches zero
- use the may_goto instruction in the cond_break macro
- recognize that 'exact' state comparison doesn't need to be truly exact:
  regsafe() should ignore precision and liveness marks, but the
  range_within logic is safe to use while evaluating open coded iterators.
Alexei Starovoitov (4):
bpf: Introduce may_goto instruction
bpf: Recognize that two registers are safe when their ranges match
bpf: Add cond_break macro
selftests/bpf: Test may_goto
include/linux/bpf_verifier.h | 2 +
include/uapi/linux/bpf.h | 5 +
kernel/bpf/core.c | 1 +
kernel/bpf/disasm.c | 4 +
kernel/bpf/verifier.c | 212 +++++++++++++-----
tools/include/uapi/linux/bpf.h | 5 +
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../testing/selftests/bpf/bpf_experimental.h | 12 +
.../bpf/progs/verifier_iterating_callbacks.c | 103 ++++++++-
9 files changed, 292 insertions(+), 53 deletions(-)
--
2.43.0
* [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction
  2024-03-06 3:19 [PATCH v6 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
@ 2024-03-06 3:19 ` Alexei Starovoitov
  2024-03-06 13:12 ` Eduard Zingerman
  2024-03-06 18:33 ` Andrii Nakryiko
  2024-03-06 3:19 ` [PATCH v6 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match Alexei Starovoitov
  ` (3 subsequent siblings)
  4 siblings, 2 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-06 3:19 UTC (permalink / raw)
  To: bpf
  Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
      kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Introduce may_goto instruction that from the verifier pov is similar to
open coded iterators bpf_for()/bpf_repeat() and bpf_loop() helper, but it
doesn't iterate any objects.
In assembly 'may_goto' is a nop most of the time until bpf runtime has to
terminate the program for whatever reason. In the current implementation
may_goto has a hidden counter, but other mechanisms can be used.
For programs written in C the later patch introduces 'cond_break' macro
that combines 'may_goto' with 'break' statement and has similar semantics:
cond_break is a nop until bpf runtime has to break out of this loop.
It can be used in any normal "for" or "while" loop, like

  for (i = zero; i < cnt; cond_break, i++) {

The verifier recognizes that may_goto is used in the program, reserves
additional 8 bytes of stack, initializes them in subprog prologue, and
replaces may_goto instruction with:

  aux_reg = *(u64 *)(fp - 40)
  if aux_reg == 0 goto pc+off
  aux_reg -= 1
  *(u64 *)(fp - 40) = aux_reg

may_goto instruction can be used by LLVM to implement __builtin_memcpy,
__builtin_strcmp.

may_goto is not a full substitute for bpf_for() macro.
bpf_for() doesn't have an induction variable that the verifier sees,
so 'i' in bpf_for(i, 0, 100) is seen as imprecise and bounded.

But when the code is written as:
  for (i = 0; i < 100; cond_break, i++)
the verifier sees 'i' as precise constant zero,
hence cond_break (aka may_goto) doesn't help to converge the loop.
A static or global variable can be used as a workaround:
  static int zero = 0;
  for (i = zero; i < 100; cond_break, i++) // works!

may_goto works well with arena pointers that don't need to be bounds
checked on access. Load/store from arena returns imprecise unbounded
scalar and loops with may_goto pass the verifier.

Reserve new opcode BPF_JMP | BPF_JCOND for may_goto insn.
JCOND stands for conditional pseudo jump.
Since goto_or_nop insn was proposed, it may use the same opcode.
may_goto vs goto_or_nop can be distinguished by src_reg:
code = BPF_JMP | BPF_JCOND
src_reg = 0 - may_goto
src_reg = 1 - goto_or_nop

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf_verifier.h   |   2 +
 include/uapi/linux/bpf.h       |   5 +
 kernel/bpf/core.c              |   1 +
 kernel/bpf/disasm.c            |   4 +
 kernel/bpf/verifier.c          | 163 +++++++++++++++++++++++++++------
 tools/include/uapi/linux/bpf.h |   5 +
 6 files changed, 150 insertions(+), 30 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 84365e6dd85d..4b0f6600e499 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -449,6 +449,7 @@ struct bpf_verifier_state {
 	u32 jmp_history_cnt;
 	u32 dfs_depth;
 	u32 callback_unroll_depth;
+	u32 may_goto_depth;
 };

 #define bpf_get_spilled_reg(slot, frame, mask) \
@@ -619,6 +620,7 @@ struct bpf_subprog_info {
 	u32 start; /* insn idx of function entry point */
 	u32 linfo_idx; /* The idx to the main_prog->aux->linfo */
 	u16 stack_depth; /* max. stack depth used by this function */
+	u16 stack_extra;
 	bool has_tail_call: 1;
 	bool tail_call_reachable: 1;
 	bool has_ld_abs: 1;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index a241f407c234..85ec7fc799d7 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -42,6 +42,7 @@
 #define BPF_JSGE	0x70	/* SGE is signed '>=', GE in x86 */
 #define BPF_JSLT	0xc0	/* SLT is signed, '<' */
 #define BPF_JSLE	0xd0	/* SLE is signed, '<=' */
+#define BPF_JCOND	0xe0	/* conditional pseudo jumps: may_goto, goto_or_nop */
 #define BPF_CALL	0x80	/* function call */
 #define BPF_EXIT	0x90	/* function return */
@@ -50,6 +51,10 @@
 #define BPF_XCHG	(0xe0 | BPF_FETCH)	/* atomic exchange */
 #define BPF_CMPXCHG	(0xf0 | BPF_FETCH)	/* atomic compare-and-write */

+enum bpf_cond_pseudo_jmp {
+	BPF_MAY_GOTO = 0,
+};
+
 /* Register numbers */
 enum {
 	BPF_REG_0 = 0,
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 71c459a51d9e..9ee4536d0a09 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1675,6 +1675,7 @@ bool bpf_opcode_in_insntable(u8 code)
 		[BPF_LD | BPF_IND | BPF_B] = true,
 		[BPF_LD | BPF_IND | BPF_H] = true,
 		[BPF_LD | BPF_IND | BPF_W] = true,
+		[BPF_JMP | BPF_JCOND] = true,
 	};
 #undef BPF_INSN_3_TBL
 #undef BPF_INSN_2_TBL
diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index 49940c26a227..82b2dbdd048f 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -322,6 +322,10 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
 	} else if (insn->code == (BPF_JMP | BPF_JA)) {
 		verbose(cbs->private_data, "(%02x) goto pc%+d\n",
 			insn->code, insn->off);
+	} else if (insn->code == (BPF_JMP | BPF_JCOND) &&
+		   insn->src_reg == BPF_MAY_GOTO) {
+		verbose(cbs->private_data, "(%02x) may_goto pc%+d\n",
+			insn->code, insn->off);
 	} else if (insn->code == (BPF_JMP32 | BPF_JA)) {
 		verbose(cbs->private_data, "(%02x) gotol pc%+d\n",
 			insn->code, insn->imm);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4dd84e13bbfe..8030b50d3b45 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -533,6 +533,16 @@ static bool is_async_callback_calling_insn(struct bpf_insn *insn)
 	return bpf_helper_call(insn) && is_async_callback_calling_function(insn->imm);
 }

+static bool is_may_goto_insn(struct bpf_insn *insn)
+{
+	return insn->code == (BPF_JMP | BPF_JCOND) && insn->src_reg == BPF_MAY_GOTO;
+}
+
+static bool is_may_goto_insn_at(struct bpf_verifier_env *env, int insn_idx)
+{
+	return is_may_goto_insn(&env->prog->insnsi[insn_idx]);
+}
+
 static bool is_storage_get_function(enum bpf_func_id func_id)
 {
 	return func_id == BPF_FUNC_sk_storage_get ||
@@ -1429,6 +1439,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 	dst_state->dfs_depth = src->dfs_depth;
 	dst_state->callback_unroll_depth = src->callback_unroll_depth;
 	dst_state->used_as_loop_entry = src->used_as_loop_entry;
+	dst_state->may_goto_depth = src->may_goto_depth;
 	for (i = 0; i <= src->curframe; i++) {
 		dst = dst_state->frame[i];
 		if (!dst) {
@@ -14871,11 +14882,36 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 	int err;

 	/* Only conditional jumps are expected to reach here. */
-	if (opcode == BPF_JA || opcode > BPF_JSLE) {
+	if (opcode == BPF_JA || opcode > BPF_JCOND) {
 		verbose(env, "invalid BPF_JMP/JMP32 opcode %x\n", opcode);
 		return -EINVAL;
 	}

+	if (opcode == BPF_JCOND) {
+		struct bpf_verifier_state *cur_st = env->cur_state, *queued_st, *prev_st;
+		int idx = *insn_idx;
+
+		if (insn->code != (BPF_JMP | BPF_JCOND) ||
+		    insn->src_reg != BPF_MAY_GOTO ||
+		    insn->dst_reg || insn->imm || insn->off == 0) {
+			verbose(env, "invalid may_goto off %d imm %d\n",
+				insn->off, insn->imm);
+			return -EINVAL;
+		}
+		prev_st = find_prev_entry(env, cur_st->parent, idx);
+
+		/* branch out 'fallthrough' insn as a new state to explore */
+		queued_st = push_stack(env, idx + 1, idx, false);
+		if (!queued_st)
+			return -ENOMEM;
+
+		queued_st->may_goto_depth++;
+		if (prev_st)
+			widen_imprecise_scalars(env, prev_st, queued_st);
+		*insn_idx += insn->off;
+		return 0;
+	}
+
 	/* check src2 operand */
 	err = check_reg_arg(env, insn->dst_reg, SRC_OP);
 	if (err)
@@ -15659,6 +15695,8 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
 	default:
 		/* conditional jump with two edges */
 		mark_prune_point(env, t);
+		if (is_may_goto_insn(insn))
+			mark_force_checkpoint(env, t);

 		ret = push_insn(t, t + 1, FALLTHROUGH, env);
 		if (ret)
@@ -17135,6 +17173,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 				}
 				goto skip_inf_loop_check;
 			}
+			if (is_may_goto_insn_at(env, insn_idx)) {
+				if (states_equal(env, &sl->state, cur, true)) {
+					update_loop_entry(cur, &sl->state);
+					goto hit;
+				}
+				goto skip_inf_loop_check;
+			}
 			if (calls_callback(env, insn_idx)) {
 				if (states_equal(env, &sl->state, cur, true))
 					goto hit;
@@ -17144,6 +17189,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			if (states_maybe_looping(&sl->state, cur) &&
 			    states_equal(env, &sl->state, cur, true) &&
 			    !iter_active_depths_differ(&sl->state, cur) &&
+			    sl->state.may_goto_depth == cur->may_goto_depth &&
 			    sl->state.callback_unroll_depth == cur->callback_unroll_depth) {
 				verbose_linfo(env, insn_idx, "; ");
 				verbose(env, "infinite loop detected at insn %d\n", insn_idx);
@@ -19408,7 +19454,10 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 	struct bpf_insn insn_buf[16];
 	struct bpf_prog *new_prog;
 	struct bpf_map *map_ptr;
-	int i, ret, cnt, delta = 0;
+	int i, ret, cnt, delta = 0, cur_subprog = 0;
+	struct bpf_subprog_info *subprogs = env->subprog_info;
+	u16 stack_depth = subprogs[cur_subprog].stack_depth;
+	u16 stack_depth_extra = 0;

 	if (env->seen_exception && !env->exception_callback_subprog) {
 		struct bpf_insn patch[] = {
@@ -19428,7 +19477,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 		mark_subprog_exc_cb(env, env->exception_callback_subprog);
 	}

-	for (i = 0; i < insn_cnt; i++, insn++) {
+	for (i = 0; i < insn_cnt;) {
 		/* Make divide-by-zero exceptions impossible. */
 		if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) ||
 		    insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) ||
@@ -19467,7 +19516,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta    += cnt - 1;
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		/* Implement LD_ABS and LD_IND with a rewrite, if supported by the program type. */
@@ -19487,7 +19536,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta    += cnt - 1;
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		/* Rewrite pointer arithmetic to mitigate speculation attacks. */
@@ -19502,7 +19551,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			aux = &env->insn_aux_data[i + delta];
 			if (!aux->alu_state ||
 			    aux->alu_state == BPF_ALU_NON_POINTER)
-				continue;
+				goto next_insn;

 			isneg = aux->alu_state & BPF_ALU_NEG_VALUE;
 			issrc = (aux->alu_state & BPF_ALU_SANITIZE) ==
@@ -19540,19 +19589,39 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta    += cnt - 1;
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
+		}
+
+		if (is_may_goto_insn(insn)) {
+			int stack_off = -stack_depth - 8;
+
+			stack_depth_extra = 8;
+			insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_AX, BPF_REG_10, stack_off);
+			insn_buf[1] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_AX, 0, insn->off + 2);
+			insn_buf[2] = BPF_ALU64_IMM(BPF_SUB, BPF_REG_AX, 1);
+			insn_buf[3] = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_AX, stack_off);
+			cnt = 4;
+
+			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta += cnt - 1;
+			env->prog = prog = new_prog;
+			insn = new_prog->insnsi + i + delta;
+			goto next_insn;
 		}

 		if (insn->code != (BPF_JMP | BPF_CALL))
-			continue;
+			goto next_insn;
 		if (insn->src_reg == BPF_PSEUDO_CALL)
-			continue;
+			goto next_insn;
 		if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
 			ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
 			if (ret)
 				return ret;
 			if (cnt == 0)
-				continue;
+				goto next_insn;

 			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
 			if (!new_prog)
@@ -19561,7 +19630,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta	 += cnt - 1;
 			env->prog = prog = new_prog;
 			insn	  = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		if (insn->imm == BPF_FUNC_get_route_realm)
@@ -19609,11 +19678,11 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			}

 			insn->imm = ret + 1;
-			continue;
+			goto next_insn;
 		}

 		if (!bpf_map_ptr_unpriv(aux))
-			continue;
+			goto next_insn;

 		/* instead of changing every JIT dealing with tail_call
 		 * emit two extra insns:
@@ -19642,7 +19711,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta    += cnt - 1;
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		if (insn->imm == BPF_FUNC_timer_set_callback) {
@@ -19754,7 +19823,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta    += cnt - 1;
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		BUILD_BUG_ON(!__same_type(ops->map_lookup_elem,
@@ -19785,31 +19854,31 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			switch (insn->imm) {
 			case BPF_FUNC_map_lookup_elem:
 				insn->imm = BPF_CALL_IMM(ops->map_lookup_elem);
-				continue;
+				goto next_insn;
 			case BPF_FUNC_map_update_elem:
 				insn->imm = BPF_CALL_IMM(ops->map_update_elem);
-				continue;
+				goto next_insn;
 			case BPF_FUNC_map_delete_elem:
 				insn->imm = BPF_CALL_IMM(ops->map_delete_elem);
-				continue;
+				goto next_insn;
 			case BPF_FUNC_map_push_elem:
 				insn->imm = BPF_CALL_IMM(ops->map_push_elem);
-				continue;
+				goto next_insn;
 			case BPF_FUNC_map_pop_elem:
 				insn->imm = BPF_CALL_IMM(ops->map_pop_elem);
-				continue;
+				goto next_insn;
 			case BPF_FUNC_map_peek_elem:
 				insn->imm = BPF_CALL_IMM(ops->map_peek_elem);
-				continue;
+				goto next_insn;
 			case BPF_FUNC_redirect_map:
 				insn->imm = BPF_CALL_IMM(ops->map_redirect);
-				continue;
+				goto next_insn;
 			case BPF_FUNC_for_each_map_elem:
 				insn->imm = BPF_CALL_IMM(ops->map_for_each_callback);
-				continue;
+				goto next_insn;
 			case BPF_FUNC_map_lookup_percpu_elem:
 				insn->imm = BPF_CALL_IMM(ops->map_lookup_percpu_elem);
-				continue;
+				goto next_insn;
 			}

 			goto patch_call_imm;
@@ -19837,7 +19906,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta    += cnt - 1;
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		/* Implement bpf_get_func_arg inline. */
@@ -19862,7 +19931,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta    += cnt - 1;
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		/* Implement bpf_get_func_ret inline. */
@@ -19890,7 +19959,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta    += cnt - 1;
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		/* Implement get_func_arg_cnt inline. */
@@ -19905,7 +19974,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		/* Implement bpf_get_func_ip inline. */
@@ -19920,7 +19989,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}

 		/* Implement bpf_kptr_xchg inline */
@@ -19938,7 +20007,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			delta    += cnt - 1;
 			env->prog = prog = new_prog;
 			insn      = new_prog->insnsi + i + delta;
-			continue;
+			goto next_insn;
 		}
 patch_call_imm:
 		fn = env->ops->get_func_proto(insn->imm, env->prog);
@@ -19952,6 +20021,40 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			return -EFAULT;
 		}
 		insn->imm = fn->func - __bpf_call_base;
+next_insn:
+		if (subprogs[cur_subprog + 1].start == i + delta + 1) {
+			subprogs[cur_subprog].stack_depth += stack_depth_extra;
+			subprogs[cur_subprog].stack_extra = stack_depth_extra;
+			cur_subprog++;
+			stack_depth = subprogs[cur_subprog].stack_depth;
+			stack_depth_extra = 0;
+		}
+		i++;
+		insn++;
+	}
+
+	env->prog->aux->stack_depth = subprogs[0].stack_depth;
+	for (i = 0; i < env->subprog_cnt; i++) {
+		int subprog_start = subprogs[i].start;
+		int stack_slots = subprogs[i].stack_extra / 8;
+
+		if (!stack_slots)
+			continue;
+		if (stack_slots > 1) {
+			verbose(env, "verifier bug: stack_slots supports may_goto only\n");
+			return -EFAULT;
+		}
+
+		/* Add ST insn to subprog prologue to init extra stack */
+		insn_buf[0] = BPF_ST_MEM(BPF_DW, BPF_REG_FP,
+					 -subprogs[i].stack_depth, BPF_MAX_LOOPS);
+		/* Copy first actual insn to preserve it */
+		insn_buf[1] = env->prog->insnsi[subprog_start];
+
+		new_prog = bpf_patch_insn_data(env, subprog_start, insn_buf, 2);
+		if (!new_prog)
+			return -ENOMEM;
+		env->prog = prog = new_prog;
 	}

 	/* Since poke tab is now finalized, publish aux to tracker. */
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index a241f407c234..85ec7fc799d7 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -42,6 +42,7 @@
 #define BPF_JSGE	0x70	/* SGE is signed '>=', GE in x86 */
 #define BPF_JSLT	0xc0	/* SLT is signed, '<' */
 #define BPF_JSLE	0xd0	/* SLE is signed, '<=' */
+#define BPF_JCOND	0xe0	/* conditional pseudo jumps: may_goto, goto_or_nop */
 #define BPF_CALL	0x80	/* function call */
 #define BPF_EXIT	0x90	/* function return */
@@ -50,6 +51,10 @@
 #define BPF_XCHG	(0xe0 | BPF_FETCH)	/* atomic exchange */
 #define BPF_CMPXCHG	(0xf0 | BPF_FETCH)	/* atomic compare-and-write */

+enum bpf_cond_pseudo_jmp {
+	BPF_MAY_GOTO = 0,
+};
+
 /* Register numbers */
 enum {
 	BPF_REG_0 = 0,
-- 
2.43.0
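As an illustration of the runtime semantics above (an editorial sketch, not
part of the patch): the prologue seeds a hidden 8-byte stack slot with
BPF_MAX_LOOPS and every may_goto decrements it, branching out once it hits
zero. The plain user-space C model below assumes BPF_MAX_LOOPS is 8M, its
value in the kernel at the time of writing; 'hidden_cnt' is an invented name
standing in for the verifier-reserved slot at fp - 40.

/* Model of the may_goto expansion:
 *   aux_reg = *(u64 *)(fp - 40)
 *   if aux_reg == 0 goto pc+off
 *   aux_reg -= 1
 *   *(u64 *)(fp - 40) = aux_reg
 */
#include <stdint.h>
#include <stdio.h>

#define BPF_MAX_LOOPS (8 * 1024 * 1024)	/* assumed initial counter value */

int main(void)
{
	uint64_t hidden_cnt = BPF_MAX_LOOPS;	/* ST_MEM in subprog prologue */
	uint64_t i;

	for (i = 0; ; i++) {
		if (hidden_cnt == 0)	/* may_goto: branch out once drained */
			break;
		hidden_cnt--;
		/* loop body would run here */
	}
	printf("terminated after %llu iterations\n", (unsigned long long)i);
	return 0;
}

This is why an otherwise unbounded "for (;;)" with cond_break is guaranteed
to terminate after at most BPF_MAX_LOOPS iterations per program run.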
* Re: [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction
  2024-03-06 3:19 ` [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
@ 2024-03-06 13:12 ` Eduard Zingerman
  2024-03-06 16:40 ` Alexei Starovoitov
  0 siblings, 1 reply; 14+ messages in thread
From: Eduard Zingerman @ 2024-03-06 13:12 UTC (permalink / raw)
  To: Alexei Starovoitov, bpf
  Cc: daniel, andrii, martin.lau, memxor, john.fastabend, kernel-team

On Tue, 2024-03-05 at 19:19 -0800, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>

[...]

> JCOND stands for conditional pseudo jump.
> Since goto_or_nop insn was proposed, it may use the same opcode.
> may_goto vs goto_or_nop can be distinguished by src_reg:
> code = BPF_JMP | BPF_JCOND
> src_reg = 0 - may_goto
> src_reg = 1 - goto_or_nop
>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

> @@ -14871,11 +14882,36 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
>  	int err;
>
>  	/* Only conditional jumps are expected to reach here. */
> -	if (opcode == BPF_JA || opcode > BPF_JSLE) {
> +	if (opcode == BPF_JA || opcode > BPF_JCOND) {
>  		verbose(env, "invalid BPF_JMP/JMP32 opcode %x\n", opcode);
>  		return -EINVAL;
>  	}
>
> +	if (opcode == BPF_JCOND) {
> +		struct bpf_verifier_state *cur_st = env->cur_state, *queued_st, *prev_st;
> +		int idx = *insn_idx;
> +
> +		if (insn->code != (BPF_JMP | BPF_JCOND) ||
> +		    insn->src_reg != BPF_MAY_GOTO ||
> +		    insn->dst_reg || insn->imm || insn->off == 0) {
> +			verbose(env, "invalid may_goto off %d imm %d\n",
> +				insn->off, insn->imm);
> +			return -EINVAL;
> +		}
> +		prev_st = find_prev_entry(env, cur_st->parent, idx);
> +
> +		/* branch out 'fallthrough' insn as a new state to explore */
> +		queued_st = push_stack(env, idx + 1, idx, false);
> +		if (!queued_st)
> +			return -ENOMEM;
> +
> +		queued_st->may_goto_depth++;
> +		if (prev_st)
> +			widen_imprecise_scalars(env, prev_st, queued_st);
> +		*insn_idx += insn->off;
> +		return 0;
> +	}

Nit: for other conditional jumps the fallthrough branch is explored first,
I tried the following and the tests keep passing:

@@ -14901,14 +14901,13 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		prev_st = find_prev_entry(env, cur_st->parent, idx);

 		/* branch out 'fallthrough' insn as a new state to explore */
-		queued_st = push_stack(env, idx + 1, idx, false);
+		queued_st = push_stack(env, idx + insn->off + 1, idx, false);
 		if (!queued_st)
 			return -ENOMEM;

-		queued_st->may_goto_depth++;
+		cur_st->may_goto_depth++;
 		if (prev_st)
-			widen_imprecise_scalars(env, prev_st, queued_st);
-		*insn_idx += insn->off;
+			widen_imprecise_scalars(env, prev_st, cur_st);
 		return 0;
 	}

Maybe this is a property worth preserving, wdyt?

[...]
* Re: [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction
  2024-03-06 13:12 ` Eduard Zingerman
@ 2024-03-06 16:40 ` Alexei Starovoitov
  0 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-06 16:40 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
      Kumar Kartikeya Dwivedi, John Fastabend, Kernel Team

On Wed, Mar 6, 2024 at 5:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
[...]
>
> Nit: for other conditional jumps the fallthrough branch is explored first,
> I tried the following and the tests keep passing:
>
> [...]
>
> Maybe this is a property worth preserving, wdyt?

My gut feel is that may_goto should do what bpf_loop and open coded iters
are doing by exploring the "zero loops" path first.
But I will explore your idea of exploring the looping states first
in a follow up. It feels like a major change.
If it works for may_goto it should work for bpf_loop too.

Normal conditional jumps are different.
The way compilers lay out basic blocks, it's beneficial to explore the
fallthrough, since it has a higher chance of state pruning earlier.
For may_goto the "explore zero loop first" feels more critical for safety.

Thanks for the reviews.
* Re: [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction
  2024-03-06 3:19 ` [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
  2024-03-06 13:12 ` Eduard Zingerman
@ 2024-03-06 18:33 ` Andrii Nakryiko
  2024-03-06 18:38 ` Alexei Starovoitov
  1 sibling, 1 reply; 14+ messages in thread
From: Andrii Nakryiko @ 2024-03-06 18:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
      kernel-team

On Tue, Mar 5, 2024 at 7:19 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
[...]

> @@ -14871,11 +14882,36 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
>  	int err;
>
>  	/* Only conditional jumps are expected to reach here. */
> -	if (opcode == BPF_JA || opcode > BPF_JSLE) {
> +	if (opcode == BPF_JA || opcode > BPF_JCOND) {
>  		verbose(env, "invalid BPF_JMP/JMP32 opcode %x\n", opcode);
>  		return -EINVAL;
>  	}
>
> +	if (opcode == BPF_JCOND) {
> +		struct bpf_verifier_state *cur_st = env->cur_state, *queued_st, *prev_st;
> +		int idx = *insn_idx;
> +
> +		if (insn->code != (BPF_JMP | BPF_JCOND) ||
> +		    insn->src_reg != BPF_MAY_GOTO ||
> +		    insn->dst_reg || insn->imm || insn->off == 0) {

Why rejecting insn->off == 0? It's still a legal instruction, even if
not super useful in practice. insn->off == -1 is also weird, but legal
(and not rejected in this case).

> +			verbose(env, "invalid may_goto off %d imm %d\n",
> +				insn->off, insn->imm);
> +			return -EINVAL;
> +		}

[...]
* Re: [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction
  2024-03-06 18:33 ` Andrii Nakryiko
@ 2024-03-06 18:38 ` Alexei Starovoitov
  0 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-06 18:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
      Kumar Kartikeya Dwivedi, Eddy Z, John Fastabend, Kernel Team

On Wed, Mar 6, 2024 at 10:33 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
[...]
>
> > +	if (opcode == BPF_JCOND) {
> > +		struct bpf_verifier_state *cur_st = env->cur_state, *queued_st, *prev_st;
> > +		int idx = *insn_idx;
> > +
> > +		if (insn->code != (BPF_JMP | BPF_JCOND) ||
> > +		    insn->src_reg != BPF_MAY_GOTO ||
> > +		    insn->dst_reg || insn->imm || insn->off == 0) {
>
> Why rejecting insn->off == 0? It's still a legal instruction, even if
> not super useful in practice. insn->off == -1 is also weird, but legal
> (and not rejected in this case).

Legal, but nonsensical.
s390 hits it due to macro endian issue,
so it's better to reject early instead of wasting 1m verifier steps.
* [PATCH v6 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match
  2024-03-06 3:19 [PATCH v6 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
  2024-03-06 3:19 ` [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
@ 2024-03-06 3:19 ` Alexei Starovoitov
  2024-03-06 13:07 ` Eduard Zingerman
  2024-03-06 3:19 ` [PATCH v6 bpf-next 3/4] bpf: Add cond_break macro Alexei Starovoitov
  ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-06 3:19 UTC (permalink / raw)
  To: bpf
  Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
      kernel-team

From: Alexei Starovoitov <ast@kernel.org>

When open coded iterators, bpf_loop or may_goto are used, the following
two states are equivalent and safe to prune the search:

cur state: fp-8_w=scalar(id=3,smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))
old state: fp-8_rw=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=11,var_off=(0x0; 0xf))

In other words the "exact" state match should ignore liveness and precision
marks, since open coded iterator logic didn't complete their propagation.
reg_old->type == NOT_INIT && reg_cur->type != NOT_INIT is also not safe to
prune while looping, but the range_within logic that applies to scalars,
ptr_to_mem, map_value, pkt_ptr is safe to rely on.

Avoid doing such a comparison when the regular infinite loop detection
logic is used, otherwise the bounded loop logic will declare such an
"infinite loop" as a false positive. Such an example is in
progs/verifier_loops1.c not_an_inifinite_loop().

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/bpf/verifier.c | 51 +++++++++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8030b50d3b45..ee86e4d7d5fc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16260,8 +16260,8 @@ static int check_btf_info(struct bpf_verifier_env *env,
 }

 /* check %cur's range satisfies %old's */
-static bool range_within(struct bpf_reg_state *old,
-			 struct bpf_reg_state *cur)
+static bool range_within(const struct bpf_reg_state *old,
+			 const struct bpf_reg_state *cur)
 {
 	return old->umin_value <= cur->umin_value &&
 	       old->umax_value >= cur->umax_value &&
@@ -16425,21 +16425,28 @@ static bool regs_exact(const struct bpf_reg_state *rold,
 	       check_ids(rold->ref_obj_id, rcur->ref_obj_id, idmap);
 }

+enum exact_level {
+	NOT_EXACT,
+	EXACT,
+	RANGE_WITHIN
+};
+
 /* Returns true if (rold safe implies rcur safe) */
 static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
-		    struct bpf_reg_state *rcur, struct bpf_idmap *idmap, bool exact)
+		    struct bpf_reg_state *rcur, struct bpf_idmap *idmap,
+		    enum exact_level exact)
 {
-	if (exact)
+	if (exact == EXACT)
 		return regs_exact(rold, rcur, idmap);

-	if (!(rold->live & REG_LIVE_READ))
+	if (!(rold->live & REG_LIVE_READ) && exact == NOT_EXACT)
 		/* explored state didn't use this */
 		return true;
-	if (rold->type == NOT_INIT)
-		/* explored state can't have used this */
-		return true;
-	if (rcur->type == NOT_INIT)
-		return false;
+	if (rold->type == NOT_INIT) {
+		if (exact == NOT_EXACT || rcur->type == NOT_INIT)
+			/* explored state can't have used this */
+			return true;
+	}

 	/* Enforce that register types have to match exactly, including their
 	 * modifiers (like PTR_MAYBE_NULL, MEM_RDONLY, etc), as a general
@@ -16474,7 +16481,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 		return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
 		       check_scalar_ids(rold->id, rcur->id, idmap);
 	}
-	if (!rold->precise)
+	if (!rold->precise && exact == NOT_EXACT)
 		return true;
 	/* Why check_ids() for scalar registers?
 	 *
@@ -16585,7 +16592,8 @@ static struct bpf_reg_state *scalar_reg_for_stack(struct bpf_verifier_env *env,
 }

 static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
-		      struct bpf_func_state *cur, struct bpf_idmap *idmap, bool exact)
+		      struct bpf_func_state *cur, struct bpf_idmap *idmap,
+		      enum exact_level exact)
 {
 	int i, spi;

@@ -16598,12 +16606,13 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,

 		spi = i / BPF_REG_SIZE;

-		if (exact &&
+		if (exact != NOT_EXACT &&
 		    old->stack[spi].slot_type[i % BPF_REG_SIZE] !=
 		    cur->stack[spi].slot_type[i % BPF_REG_SIZE])
 			return false;

-		if (!(old->stack[spi].spilled_ptr.live & REG_LIVE_READ) && !exact) {
+		if (!(old->stack[spi].spilled_ptr.live & REG_LIVE_READ)
+		    && exact == NOT_EXACT) {
 			i += BPF_REG_SIZE - 1;
 			/* explored state didn't use this */
 			continue;
@@ -16749,7 +16758,7 @@ static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur,
 * the current state will reach 'bpf_exit' instruction safely
 */
 static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_state *old,
-			      struct bpf_func_state *cur, bool exact)
+			      struct bpf_func_state *cur, enum exact_level exact)
 {
 	int i;

@@ -16776,7 +16785,7 @@ static void reset_idmap_scratch(struct bpf_verifier_env *env)
 static bool states_equal(struct bpf_verifier_env *env,
 			 struct bpf_verifier_state *old,
 			 struct bpf_verifier_state *cur,
-			 bool exact)
+			 enum exact_level exact)
 {
 	int i;

@@ -17150,7 +17159,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			 * => unsafe memory access at 11 would not be caught.
 			 */
 			if (is_iter_next_insn(env, insn_idx)) {
-				if (states_equal(env, &sl->state, cur, true)) {
+				if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
 					struct bpf_func_state *cur_frame;
 					struct bpf_reg_state *iter_state, *iter_reg;
 					int spi;
@@ -17174,20 +17183,20 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 				goto skip_inf_loop_check;
 			}
 			if (is_may_goto_insn_at(env, insn_idx)) {
-				if (states_equal(env, &sl->state, cur, true)) {
+				if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
 					update_loop_entry(cur, &sl->state);
 					goto hit;
 				}
 				goto skip_inf_loop_check;
 			}
 			if (calls_callback(env, insn_idx)) {
-				if (states_equal(env, &sl->state, cur, true))
+				if (states_equal(env, &sl->state, cur, RANGE_WITHIN))
 					goto hit;
 				goto skip_inf_loop_check;
 			}
 			/* attempt to detect infinite loop to avoid unnecessary doomed work */
 			if (states_maybe_looping(&sl->state, cur) &&
-			    states_equal(env, &sl->state, cur, true) &&
+			    states_equal(env, &sl->state, cur, EXACT) &&
 			    !iter_active_depths_differ(&sl->state, cur) &&
 			    sl->state.may_goto_depth == cur->may_goto_depth &&
 			    sl->state.callback_unroll_depth == cur->callback_unroll_depth) {
@@ -17245,7 +17254,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 		 */
 		loop_entry = get_loop_entry(&sl->state);
 		force_exact = loop_entry && loop_entry->branches > 0;
-		if (states_equal(env, &sl->state, cur, force_exact)) {
+		if (states_equal(env, &sl->state, cur, force_exact ? RANGE_WITHIN : NOT_EXACT)) {
 			if (force_exact)
 				update_loop_entry(cur, loop_entry);
 hit:
-- 
2.43.0
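To see what the three levels mean on the commit-message example (old range
[1,11] vs cur range [2,11]), here is a toy user-space model. This is an
editorial illustration only: the names toy_reg and toy_regsafe are invented,
only the unsigned range is modeled, and the real regsafe() also checks the
signed and 32-bit ranges, var_off and ids.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum exact_level { NOT_EXACT, EXACT, RANGE_WITHIN };

struct toy_reg {
	uint64_t umin, umax;
	bool precise;
	bool live_read;
};

/* check %cur's range satisfies %old's */
static bool range_within(const struct toy_reg *old, const struct toy_reg *cur)
{
	return old->umin <= cur->umin && old->umax >= cur->umax;
}

static bool toy_regsafe(const struct toy_reg *old, const struct toy_reg *cur,
			enum exact_level exact)
{
	if (exact == EXACT)	/* used by infinite loop detection */
		return old->umin == cur->umin && old->umax == cur->umax;
	if (!old->live_read && exact == NOT_EXACT)
		return true;	/* explored state didn't use this */
	if (!old->precise && exact == NOT_EXACT)
		return true;	/* imprecise scalar, any value is safe */
	/* RANGE_WITHIN ignores liveness/precision marks, trusts the ranges */
	return range_within(old, cur);
}

int main(void)
{
	struct toy_reg old = { .umin = 1, .umax = 11 };
	struct toy_reg cur = { .umin = 2, .umax = 11 };

	printf("EXACT: %d RANGE_WITHIN: %d\n",
	       toy_regsafe(&old, &cur, EXACT),		/* 0: not identical */
	       toy_regsafe(&old, &cur, RANGE_WITHIN));	/* 1: cur within old */
	return 0;
}

With the old bool flag both looping call sites collapsed to the strictest
check, which is roughly why the two states from the commit message could
not be pruned before this patch.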
* Re: [PATCH v6 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match
  2024-03-06 3:19 ` [PATCH v6 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match Alexei Starovoitov
@ 2024-03-06 13:07 ` Eduard Zingerman
  0 siblings, 0 replies; 14+ messages in thread
From: Eduard Zingerman @ 2024-03-06 13:07 UTC (permalink / raw)
  To: Alexei Starovoitov, bpf
  Cc: daniel, andrii, martin.lau, memxor, john.fastabend, kernel-team

On Tue, 2024-03-05 at 19:19 -0800, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
[...]
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
* [PATCH v6 bpf-next 3/4] bpf: Add cond_break macro
  2024-03-06 3:19 [PATCH v6 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
  2024-03-06 3:19 ` [PATCH v6 bpf-next 1/4] bpf: Introduce may_goto instruction Alexei Starovoitov
  2024-03-06 3:19 ` [PATCH v6 bpf-next 2/4] bpf: Recognize that two registers are safe when their ranges match Alexei Starovoitov
@ 2024-03-06 3:19 ` Alexei Starovoitov
  2024-03-06 13:26 ` Eduard Zingerman
  2024-03-06 3:19 ` [PATCH v6 bpf-next 4/4] selftests/bpf: Test may_goto Alexei Starovoitov
  ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-06 3:19 UTC (permalink / raw)
  To: bpf
  Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
      kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Use may_goto instruction to implement cond_break macro.
Ideally the macro should be written as:

  asm volatile goto(".byte 0xe5;
                     .byte 0;
                     .short %l[l_break] ...
                     .long 0;

but LLVM doesn't recognize fixup of 2 byte PC relative yet.
Hence use:

  asm volatile goto(".byte 0xe5;
                     .byte 0;
                     .long %l[l_break] ...
                     .short 0;

that produces correct asm on little endian.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/bpf_experimental.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index 0d749006d107..bc9a0832ae72 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -326,6 +326,18 @@ l_true:								\
 	})
 #endif

+#define cond_break					\
+	({ __label__ l_break, l_continue;		\
+	 asm volatile goto("1:.byte 0xe5;		\
+		      .byte 0;				\
+		      .long ((%l[l_break] - 1b - 8) / 8) & 0xffff;	\
+		      .short 0"				\
+		      :::: l_break);			\
+	goto l_continue;				\
+	l_break: break;					\
+	l_continue:;					\
+	})
+
 #ifndef bpf_nop_mov
 #define bpf_nop_mov(var)				\
 	asm volatile("%[reg]=%[reg]"::[reg]"r"((short)var))
-- 
2.43.0
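The byte-layout trick can be checked in isolation. The sketch below is an
editorial illustration, assuming a little-endian host and the standard
8-byte struct bpf_insn layout (u8 code; dst_reg:4/src_reg:4; s16 off;
s32 imm); 'delta' stands for the computed insn distance to l_break. It
builds the eight bytes the macro emits and decodes them, showing why
placing the masked .long before the .short lands the 16-bit offset in
the 'off' field and leaves 'imm' zero on little endian.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct insn {		/* struct bpf_insn layout on a little-endian host */
	uint8_t  code;
	uint8_t  regs;	/* dst_reg in the low nibble, src_reg in the high */
	int16_t  off;
	int32_t  imm;
};

int main(void)
{
	int16_t delta = 3;		/* e.g. jump over 3 insns to l_break */
	uint32_t l = (uint32_t)delta & 0xffff;	/* the emitted .long */
	uint16_t s = 0;				/* the emitted .short */
	uint8_t bytes[8] = { 0xe5, 0x00 };	/* BPF_JMP|BPF_JCOND, regs=0 */
	struct insn in;

	memcpy(bytes + 2, &l, 4);	/* bytes 2-3: off, bytes 4-5: zero */
	memcpy(bytes + 6, &s, 2);	/* bytes 6-7: zero */
	memcpy(&in, bytes, 8);

	printf("code=0x%02x src_reg=%u off=%d imm=%d\n",
	       in.code, in.regs >> 4, in.off, in.imm);	/* may_goto pc+3 */
	return 0;
}

On a big-endian host the same byte stream puts the offset in the wrong
half, which is the endianness limitation the commit message mentions.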
* Re: [PATCH v6 bpf-next 3/4] bpf: Add cond_break macro
  2024-03-06 3:19 ` [PATCH v6 bpf-next 3/4] bpf: Add cond_break macro Alexei Starovoitov
@ 2024-03-06 13:26 ` Eduard Zingerman
  2024-03-06 16:51 ` Alexei Starovoitov
  0 siblings, 1 reply; 14+ messages in thread
From: Eduard Zingerman @ 2024-03-06 13:26 UTC (permalink / raw)
  To: Alexei Starovoitov, bpf
  Cc: daniel, andrii, martin.lau, memxor, john.fastabend, kernel-team

On Tue, 2024-03-05 at 19:19 -0800, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> Use may_goto instruction to implement cond_break macro.
>
[...]
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

I tried rewriting with offset +1 and an additional goto:

    ({ __label__ l_break, l_continue;				\
     asm volatile goto("%[jcond]; goto %l[l_break];"		\
	:: __imm_insn(jcond, BPF_RAW_INSN(BPF_JMP | BPF_JCOND, 0, 0, 1, 0)) \
	:: l_break);						\
    goto l_continue;						\
    l_break: break;						\
    l_continue:;						\
    })

But BPF_RAW_INSN needs filter.h, which can't be included because of
vmlinux.h, unfortunate :(
* Re: [PATCH v6 bpf-next 3/4] bpf: Add cond_break macro
  2024-03-06 13:26 ` Eduard Zingerman
@ 2024-03-06 16:51 ` Alexei Starovoitov
  0 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-06 16:51 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
      Kumar Kartikeya Dwivedi, John Fastabend, Kernel Team

On Wed, Mar 6, 2024 at 5:26 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
[...]
>
> I tried rewriting with offset +1 and an additional goto:
>
> [...]
>
> But BPF_RAW_INSN needs filter.h, which can't be included because of
> vmlinux.h, unfortunate :(

It probably works, but the generated code is not pretty.
That's the reason why this macro has two labels l_break and l_continue.
It could have been written with a single label, like:

/* bad asm */
#define cond_break					\
	({ __label__ l_continue;			\
	asm volatile goto("may_goto %l[l_continue]"	\
			:::: l_continue);		\
	break;						\
	l_continue:;					\
	})

but generated code isn't great.
It's similar to bpf_cmp_likely vs bpf_cmp_unlikely.
The C statement that llvm sees right after 'asm volatile goto'
is important for codegen.

The macro in this patch is effectively this:

/* good asm */
#define cond_break					\
	({ __label__ l_break, l_continue;		\
	asm volatile goto("may_goto %l[l_break]"	\
			:::: l_break);			\
	goto l_continue;				\
	l_break: break;					\
	l_continue:;					\
	})

To visualize see: https://godbolt.org/z/98bcbrxaE

The next step is to do several fixes in llvm:
- allow pcrel fixup of 2 bytes
- introduce builtin_may_goto()
- recognize may_goto in inline asm and llvm-objdump disasm

With that the macro will be cleaned up and the big-endian issue
will be addressed.
* [PATCH v6 bpf-next 4/4] selftests/bpf: Test may_goto
  2024-03-06 3:19 [PATCH v6 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
  ` (2 preceding siblings ...)
  2024-03-06 3:19 ` [PATCH v6 bpf-next 3/4] bpf: Add cond_break macro Alexei Starovoitov
@ 2024-03-06 3:19 ` Alexei Starovoitov
  2024-03-06 18:50 ` [PATCH v6 bpf-next 0/4] bpf: Introduce may_goto and cond_break patchwork-bot+netdevbpf
  4 siblings, 0 replies; 14+ messages in thread
From: Alexei Starovoitov @ 2024-03-06 3:19 UTC (permalink / raw)
  To: bpf
  Cc: daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
      kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add tests for may_goto instruction via cond_break macro.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/DENYLIST.s390x    |   1 +
 .../bpf/progs/verifier_iterating_callbacks.c  | 103 +++++++++++++++++-
 2 files changed, 101 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index 1a63996c0304..cb810a98e78f 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -3,3 +3,4 @@ exceptions			# JIT does not support calling kfunc bpf_throw (exceptions)
 get_stack_raw_tp		# user_stack corrupted user stack (no backchain userspace)
 stacktrace_build_id		# compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
+verifier_iterating_callbacks
diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
index 5905e036e0ea..04cdbce4652f 100644
--- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
+++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
@@ -1,8 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
-
-#include <linux/bpf.h>
-#include <bpf/bpf_helpers.h>
 #include "bpf_misc.h"
+#include "bpf_experimental.h"

 struct {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
@@ -239,4 +237,103 @@ int bpf_loop_iter_limit_nested(void *unused)
 	return 1000 * a + b + c;
 }

+#define ARR_SZ 1000000
+int zero;
+char arr[ARR_SZ];
+
+SEC("socket")
+__success __retval(0xd495cdc0)
+int cond_break1(const void *ctx)
+{
+	unsigned long i;
+	unsigned int sum = 0;
+
+	for (i = zero; i < ARR_SZ; cond_break, i++)
+		sum += i;
+	for (i = zero; i < ARR_SZ; i++) {
+		barrier_var(i);
+		sum += i + arr[i];
+		cond_break;
+	}
+
+	return sum;
+}
+
+SEC("socket")
+__success __retval(999000000)
+int cond_break2(const void *ctx)
+{
+	int i, j;
+	int sum = 0;
+
+	for (i = zero; i < 1000; cond_break, i++)
+		for (j = zero; j < 1000; j++) {
+			sum += i + j;
+			cond_break;
+		}
+
+	return sum;
+}
+
+static __noinline int loop(void)
+{
+	int i, sum = 0;
+
+	for (i = zero; i <= 1000000; i++, cond_break)
+		sum += i;
+
+	return sum;
+}
+
+SEC("socket")
+__success __retval(0x6a5a2920)
+int cond_break3(const void *ctx)
+{
+	return loop();
+}
+
+SEC("socket")
+__success __retval(1)
+int cond_break4(const void *ctx)
+{
+	int cnt = zero;
+
+	for (;;) {
+		/* should eventually break out of the loop */
+		cond_break;
+		cnt++;
+	}
+	/* if we looped a bit, it's a success */
+	return cnt > 1 ? 1 : 0;
+}
+
+static __noinline int static_subprog(void)
+{
+	int cnt = zero;
+
+	for (;;) {
+		cond_break;
+		cnt++;
+	}
+
+	return cnt;
+}
+
+SEC("socket")
+__success __retval(1)
+int cond_break5(const void *ctx)
+{
+	int cnt1 = zero, cnt2;
+
+	for (;;) {
+		cond_break;
+		cnt1++;
+	}
+
+	cnt2 = static_subprog();
+
+	/* main and subprog have to loop a bit */
+	return cnt1 > 1 && cnt2 > 1 ? 1 : 0;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.43.0
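As a sanity check on the __retval(0xd495cdc0) annotation of cond_break1
(an editorial aside, not part of the patch): arr[] is zero-initialized and
the hidden may_goto budget (8M, assuming BPF_MAX_LOOPS is unchanged) is far
above the 2M total iterations, so cond_break never actually fires here and
the result is just 2 * sum(0..999999) truncated to u32.

#include <stdio.h>

#define ARR_SZ 1000000

int main(void)
{
	unsigned int sum = 0;	/* u32 wrap-around, as in the BPF prog */
	unsigned long i;

	for (i = 0; i < ARR_SZ; i++)	/* first loop: sum += i */
		sum += i;
	for (i = 0; i < ARR_SZ; i++)	/* second loop: sum += i + arr[i], arr[i] == 0 */
		sum += i;

	printf("0x%x\n", sum);	/* prints 0xd495cdc0 */
	return 0;
}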
* Re: [PATCH v6 bpf-next 0/4] bpf: Introduce may_goto and cond_break
  2024-03-06 3:19 [PATCH v6 bpf-next 0/4] bpf: Introduce may_goto and cond_break Alexei Starovoitov
  ` (3 preceding siblings ...)
  2024-03-06 3:19 ` [PATCH v6 bpf-next 4/4] selftests/bpf: Test may_goto Alexei Starovoitov
@ 2024-03-06 18:50 ` patchwork-bot+netdevbpf
  2024-03-06 19:15 ` John Fastabend
  4 siblings, 1 reply; 14+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-03-06 18:50 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
      kernel-team

Hello:

This series was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:

On Tue, 5 Mar 2024 19:19:25 -0800 you wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> v5 -> v6:
> - Rename BPF_JMA to BPF_JCOND
> - Addressed Andrii's review comments
>
> [...]

Here is the summary with links:
  - [v6,bpf-next,1/4] bpf: Introduce may_goto instruction
    https://git.kernel.org/bpf/bpf-next/c/9c4cab4e6756
  - [v6,bpf-next,2/4] bpf: Recognize that two registers are safe when their ranges match
    https://git.kernel.org/bpf/bpf-next/c/cc570d85e66e
  - [v6,bpf-next,3/4] bpf: Add cond_break macro
    https://git.kernel.org/bpf/bpf-next/c/7825948e135b
  - [v6,bpf-next,4/4] selftests/bpf: Test may_goto
    https://git.kernel.org/bpf/bpf-next/c/8089b99d4649

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
* Re: [PATCH v6 bpf-next 0/4] bpf: Introduce may_goto and cond_break
  2024-03-06 18:50 ` patchwork-bot+netdevbpf
@ 2024-03-06 19:15 ` John Fastabend
  0 siblings, 0 replies; 14+ messages in thread
From: John Fastabend @ 2024-03-06 19:15 UTC (permalink / raw)
  To: patchwork-bot+netdevbpf, Alexei Starovoitov
  Cc: bpf, daniel, andrii, martin.lau, memxor, eddyz87, john.fastabend,
      kernel-team

patchwork-bot+netdevbpf@ wrote:
> Hello:
>
> This series was applied to bpf/bpf-next.git (master)
> by Andrii Nakryiko <andrii@kernel.org>:
>
[...]

Fwiw, I've been running this for a few days converting things to it.

Tested-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>