BPF List
* [PATCH bpf-next 0/6] BPF verifier precision tracking improvements
@ 2022-11-02  6:22 Andrii Nakryiko
  2022-11-02  6:22 ` [PATCH bpf-next 1/6] bpf: propagate precision in ALU/ALU64 operations Andrii Nakryiko
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2022-11-02  6:22 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

This patch set fixes and improves BPF verifier's precision tracking logic for
SCALAR registers.

Patches #1 and #2 are bug fixes discovered while working on these changes.

Patch #3 enables precision tracking for BPF programs that contain subprograms.
This was disabled before and prevented any modern BPF program that uses
subprograms from enjoying the benefits of SCALAR (im)precision logic.

Patch #4 is a few lines of code changes and many lines explaining why
those changes are correct. We establish why ignoring precise markings in
the current state is OK.

Patch #5 builds on the explanation in patch #4 and pushes it to the limit by
forcefully forgetting inherited precise markings. Patch #4 by itself doesn't
prevent the current state from having precise=true SCALARs, so patch #5 is
necessary to prevent such stray precise=true registers from creeping in.

Patch #6 adjusts the test_align selftest to work around the BPF verifier
log's limitations when it comes to interactions between state output and
precision backtracking output.

Overall, the goal of this patch set is to make BPF verifier's state tracking
a bit more efficient by trying to preserve as much generality in checkpointed
states as possible.

Andrii Nakryiko (6):
  bpf: propagate precision in ALU/ALU64 operations
  bpf: propagate precision across all frames, not just the last one
  bpf: allow precision tracking for programs with subprogs
  bpf: stop setting precise in current state
  bpf: aggressively forget precise markings during state checkpointing
  selftests/bpf: make test_align selftest more robust

 kernel/bpf/verifier.c                         | 277 +++++++++++++++---
 .../testing/selftests/bpf/prog_tests/align.c  |  38 ++-
 2 files changed, 256 insertions(+), 59 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 1/6] bpf: propagate precision in ALU/ALU64 operations
  2022-11-02  6:22 [PATCH bpf-next 0/6] BPF verifier precision tracking improvements Andrii Nakryiko
@ 2022-11-02  6:22 ` Andrii Nakryiko
  2022-11-03  1:23   ` Alexei Starovoitov
  2022-11-02  6:22 ` [PATCH bpf-next 2/6] bpf: propagate precision across all frames, not just the last one Andrii Nakryiko
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2022-11-02  6:22 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

When processing ALU/ALU64 operations (apart from BPF_MOV, which is
handled correctly already; and BPF_NEG and BPF_END are special and don't
have a source register), if the destination register is already marked
precise, this causes a problem with potentially missing precision tracking
for the source register. E.g., when we have r1 >>= r5 and r1 is marked
precise, but r5 isn't, this will lead to r5 staying as imprecise. This
is due to the precision backtracking logic stopping early when it sees
r1 is already marked precise. If r1 wasn't precise, we'd keep
backtracking and would add r5 to the set of registers that need to be
marked precise. So there is a discrepancy here which can lead to invalid
and incompatible states matched due to lack of precision marking on r5.
If r1 wasn't precise, precision backtracking would correctly mark both
r1 and r5 as precise.

This is simple to fix, luckily. If destination register is already
precise, we need to propagate precision to source register (if it is
SCALAR). This is similar to `reg += scalar` handling case, so nothing
conceptually new here.
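
For illustration only (this is not a selftest from this series, names are
made up, and the actual register assignment is up to the compiler), a
minimal BPF C program with the kind of scalar >>= scalar dependency
described above could look like this:

/* illustrative sketch, not from the kernel selftests */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char buf[64];

SEC("socket")
int shift_example(struct __sk_buff *skb)
{
	__u64 shift = skb->len & 3;	/* unknown SCALAR, plays the role of r5 */
	__u64 val = 64;			/* SCALAR, plays the role of r1 */

	val >>= shift;			/* scalar >>= scalar, as in r1 >>= r5 */
	if (val < 64)
		return buf[val];	/* later use needing val's precise range */
	return 0;
}

char LICENSE[] SEC("license") = "GPL";

Once the shift result needs a precise range, the shift amount has to be
marked precise as well, which is what the added mark_chain_precision()
call on the source register ensures.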

This does have a (negative) effect on some selftest programs and a few
Cilium programs. ~/baseline-tmp-results.csv contains veristat results
with this patch, while ~/baseline-results.csv is without it. See the
post scriptum for instructions on how to make Cilium programs testable
with veristat. Correctness has a price.

$ ./veristat -C -e file,prog,insns,states ~/baseline-results.csv ~/baseline-tmp-results.csv | grep -v '+0'
File                     Program               Total insns (A)  Total insns (B)  Total insns (DIFF)  Total states (A)  Total states (B)  Total states (DIFF)
-----------------------  --------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
bpf_cubic.bpf.linked1.o  bpf_cubic_cong_avoid              997             1700      +703 (+70.51%)                62                90        +28 (+45.16%)
test_l4lb.bpf.linked1.o  balancer_ingress                 4559             5469      +910 (+19.96%)               118               126          +8 (+6.78%)
-----------------------  --------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------

$ ./veristat -C -e file,prog,verdict,insns,states ~/baseline-results-cilium.csv ~/baseline-tmp-results-cilium.csv | grep -v '+0'
File           Program                         Total insns (A)  Total insns (B)  Total insns (DIFF)  Total states (A)  Total states (B)  Total states (DIFF)
-------------  ------------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
bpf_host.o     tail_nodeport_nat_ingress_ipv6             4448             5261      +813 (+18.28%)               234               247         +13 (+5.56%)
bpf_host.o     tail_nodeport_nat_ipv6_egress              3396             3446        +50 (+1.47%)               201               203          +2 (+1.00%)
bpf_lxc.o      tail_nodeport_nat_ingress_ipv6             4448             5261      +813 (+18.28%)               234               247         +13 (+5.56%)
bpf_overlay.o  tail_nodeport_nat_ingress_ipv6             4448             5261      +813 (+18.28%)               234               247         +13 (+5.56%)
bpf_xdp.o      tail_lb_ipv4                              71736            73442      +1706 (+2.38%)              4295              4370         +75 (+1.75%)
-------------  ------------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------

P.S. To make Cilium ([0]) programs libbpf-compatible and thus
veristat-loadable, apply changes from topmost commit in [1], which does
minimal changes to Cilium source code, mostly around SEC() annotations
and BPF map definitions.

  [0] https://github.com/cilium/cilium/
  [1] https://github.com/anakryiko/cilium/commits/libbpf-friendliness

Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/verifier.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 275c2f1f00ee..3f3c4451674d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9197,6 +9197,11 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env,
 				return err;
 			return adjust_ptr_min_max_vals(env, insn,
 						       dst_reg, src_reg);
+		} else if (dst_reg->precise) {
+			/* if dst_reg is precise, src_reg should be precise as well */
+			err = mark_chain_precision(env, insn->src_reg);
+			if (err)
+				return err;
 		}
 	} else {
 		/* Pretend the src is a reg with a known value, since we only
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 2/6] bpf: propagate precision across all frames, not just the last one
  2022-11-02  6:22 [PATCH bpf-next 0/6] BPF verifier precision tracking improvements Andrii Nakryiko
  2022-11-02  6:22 ` [PATCH bpf-next 1/6] bpf: propagate precision in ALU/ALU64 operations Andrii Nakryiko
@ 2022-11-02  6:22 ` Andrii Nakryiko
  2022-11-02  6:22 ` [PATCH bpf-next 3/6] bpf: allow precision tracking for programs with subprogs Andrii Nakryiko
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2022-11-02  6:22 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

When an equivalent completed state is found and it has additional precision
restrictions, the BPF verifier propagates precision to the
currently-being-verified state chain (i.e., including parent states) so
that if some of the states in the chain are not yet completed, the
necessary precision restrictions are enforced.

Unfortunately, right now this happens only for the last frame (the deepest
active subprogram's frame), not all the frames. This can lead to
incorrect matching of states due to a missing precision marker. Currently
this doesn't seem possible, as the BPF verifier forces everything to
precise when the validated BPF program has any subprograms. But with the
next patch lifting this restriction, this becomes problematic.

In fact, without this fix, we'll start getting a failure in one of the
existing test_verifier test cases:

  #906/p precise: cross frame pruning FAIL
  Unexpected success to load!
  verification time 48 usec
  stack depth 0+0
  processed 26 insns (limit 1000000) max_states_per_insn 3 total_states 17 peak_states 17 mark_read 8

This patch adds precision propagation across all frames.

Fixes: a3ce685dd01a ("bpf: fix precision tracking")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/verifier.c | 71 ++++++++++++++++++++++++-------------------
 1 file changed, 39 insertions(+), 32 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3f3c4451674d..7a71154de32b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2763,7 +2763,7 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env,
 		}
 }
 
-static int __mark_chain_precision(struct bpf_verifier_env *env, int regno,
+static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 				  int spi)
 {
 	struct bpf_verifier_state *st = env->cur_state;
@@ -2780,7 +2780,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno,
 	if (!env->bpf_capable)
 		return 0;
 
-	func = st->frame[st->curframe];
+	func = st->frame[frame];
 	if (regno >= 0) {
 		reg = &func->regs[regno];
 		if (reg->type != SCALAR_VALUE) {
@@ -2861,7 +2861,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno,
 			break;
 
 		new_marks = false;
-		func = st->frame[st->curframe];
+		func = st->frame[frame];
 		bitmap_from_u64(mask, reg_mask);
 		for_each_set_bit(i, mask, 32) {
 			reg = &func->regs[i];
@@ -2927,12 +2927,17 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int regno,
 
 int mark_chain_precision(struct bpf_verifier_env *env, int regno)
 {
-	return __mark_chain_precision(env, regno, -1);
+	return __mark_chain_precision(env, env->cur_state->curframe, regno, -1);
 }
 
-static int mark_chain_precision_stack(struct bpf_verifier_env *env, int spi)
+static int mark_chain_precision_frame(struct bpf_verifier_env *env, int frame, int regno)
 {
-	return __mark_chain_precision(env, -1, spi);
+	return __mark_chain_precision(env, frame, regno, -1);
+}
+
+static int mark_chain_precision_stack_frame(struct bpf_verifier_env *env, int frame, int spi)
+{
+	return __mark_chain_precision(env, frame, -1, spi);
 }
 
 static bool is_spillable_regtype(enum bpf_reg_type type)
@@ -11838,34 +11843,36 @@ static int propagate_precision(struct bpf_verifier_env *env,
 {
 	struct bpf_reg_state *state_reg;
 	struct bpf_func_state *state;
-	int i, err = 0;
+	int i, err = 0, fr;
 
-	state = old->frame[old->curframe];
-	state_reg = state->regs;
-	for (i = 0; i < BPF_REG_FP; i++, state_reg++) {
-		if (state_reg->type != SCALAR_VALUE ||
-		    !state_reg->precise)
-			continue;
-		if (env->log.level & BPF_LOG_LEVEL2)
-			verbose(env, "propagating r%d\n", i);
-		err = mark_chain_precision(env, i);
-		if (err < 0)
-			return err;
-	}
+	for (fr = old->curframe; fr >= 0; fr--) {
+		state = old->frame[fr];
+		state_reg = state->regs;
+		for (i = 0; i < BPF_REG_FP; i++, state_reg++) {
+			if (state_reg->type != SCALAR_VALUE ||
+			    !state_reg->precise)
+				continue;
+			if (env->log.level & BPF_LOG_LEVEL2)
+				verbose(env, "frame %d: propagating r%d\n", fr, i);
+			err = mark_chain_precision_frame(env, fr, i);
+			if (err < 0)
+				return err;
+		}
 
-	for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) {
-		if (!is_spilled_reg(&state->stack[i]))
-			continue;
-		state_reg = &state->stack[i].spilled_ptr;
-		if (state_reg->type != SCALAR_VALUE ||
-		    !state_reg->precise)
-			continue;
-		if (env->log.level & BPF_LOG_LEVEL2)
-			verbose(env, "propagating fp%d\n",
-				(-i - 1) * BPF_REG_SIZE);
-		err = mark_chain_precision_stack(env, i);
-		if (err < 0)
-			return err;
+		for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) {
+			if (!is_spilled_reg(&state->stack[i]))
+				continue;
+			state_reg = &state->stack[i].spilled_ptr;
+			if (state_reg->type != SCALAR_VALUE ||
+			    !state_reg->precise)
+				continue;
+			if (env->log.level & BPF_LOG_LEVEL2)
+				verbose(env, "frame %d: propagating fp%d\n",
+					fr, (-i - 1) * BPF_REG_SIZE);
+			err = mark_chain_precision_stack_frame(env, fr, i);
+			if (err < 0)
+				return err;
+		}
 	}
 	return 0;
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 3/6] bpf: allow precision tracking for programs with subprogs
  2022-11-02  6:22 [PATCH bpf-next 0/6] BPF verifier precision tracking improvements Andrii Nakryiko
  2022-11-02  6:22 ` [PATCH bpf-next 1/6] bpf: propagate precision in ALU/ALU64 operations Andrii Nakryiko
  2022-11-02  6:22 ` [PATCH bpf-next 2/6] bpf: propagate precision across all frames, not just the last one Andrii Nakryiko
@ 2022-11-02  6:22 ` Andrii Nakryiko
  2022-11-03  1:41   ` Alexei Starovoitov
  2022-11-02  6:22 ` [PATCH bpf-next 4/6] bpf: stop setting precise in current state Andrii Nakryiko
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2022-11-02  6:22 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

Stop forcing precise=true for SCALAR registers when BPF program has any
subprograms. The current restriction means that any BPF program, as soon
as it uses subprograms, ends up not getting any of the precision tracking
benefits in reducing the number of verified states.

This patch keeps the fallback mark_all_scalars_precise() behavior if
precise marking has to cross function frames. E.g., if subprogram
requires R1 (first input arg) to be marked precise, ideally we'd need to
backtrack to the parent function and keep marking R1 and its
dependencies as precise. But right now we give up and force all the
SCALARs in any of the current and parent states to be forced to
precise=true. We can lift that restriction in the future.
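
For illustration only (a made-up sketch, not taken from the selftests),
a program shape that still hits this cross-frame fallback is one where the
subprog's scalar argument itself needs a precise bound:

/* illustrative sketch, not from the kernel selftests */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char slots[16];

static __noinline int read_slot(__u64 idx)
{
	/* idx needs a precise bound here, in the subprog's frame, while it
	 * was computed in the caller's frame
	 */
	if (idx >= 16)
		return 0;
	return slots[idx];
}

SEC("tc")
int caller(struct __sk_buff *skb)
{
	__u64 idx = skb->len & 0xf;	/* SCALAR computed in frame 0 */

	return read_slot(idx);
}

char LICENSE[] SEC("license") = "GPL";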

But this patch fixes two issues identified when trying to enable
precision tracking for subprogs.

First, prevent "escaping" from top-most state in a global subprog. While
with entry-level BPF program we never end up requesting precision for
R1-R5 registers, because R2-R5 are not initialized (and so not readable
in correct BPF program), and R1 is PTR_TO_CTX, not SCALAR, and so is
implicitly precise. With global subprogs, though, it's different, as
global subprog a) can have up to 5 SCALAR input arguments, which might
get marked as precise=true and b) it is validated in isolation from its
main entry BPF program. b) means that we can end up exhausting parent
state chain and still not mark all registers in reg_mask as precise,
which would lead to verifier bug warning.

To handle that, we need to consider two cases. First, if the very first
state is not immediately "checkpointed" (i.e., stored in the state lookup
hashtable), it will get correct first_insn_idx and last_insn_idx
instruction indices set during state checkpointing. As such, this case is
already handled: __mark_chain_precision() simply does nothing when we
reach the very first parent state. st->parent will be NULL and we'll just
stop. Perhaps some extra check for reg_mask and stack_mask is due here,
but this patch doesn't address that issue.

The more problematic second case is when the global function's initial
state is immediately checkpointed before we manage to process the very
first instruction. This happens because, when there is a call to a global
subprog from the main program, the subprog's very first instruction is
marked as a pruning point, so before we manage to process the first
instruction we have to check and checkpoint state. This patch adds special
handling for such an "empty" state, which is identified by having
st->last_insn_idx set to -1. In such a case, we check that we are indeed
validating a global subprog, and with some sanity checking we mark its
input args as precise if requested.

Note that we also initialize state->first_insn_idx with the correct start
insn_idx offset. For the main program zero is the correct value, but for
any subprog it's quite confusing to not have first_insn_idx set. This
doesn't have any functional impact, but helps with debugging and state
printing. We also explicitly initialize state->last_insn_idx instead of
relying on is_state_visited() to do this with env->prev_insn_idx, which
will be -1 on the very first instruction. This concludes the changes
necessary to handle global subprog precision tracking specifically.

The second identified problem was missed handling of BPF helper functions
that call into subprogs (e.g., bpf_loop and a few others). From the
precision tracking and backtracking logic's standpoint those are
effectively calls into subprogs and should be treated as BPF_PSEUDO_CALL
calls.

This patch takes the least intrusive approach and just checks against
a short list of current BPF helpers that do call subprogs, encapsulated in
the is_callback_calling_function() helper. But to prevent accidentally
forgetting to add new BPF helpers to this "list", we also do a sanity
check in __check_func_call, which has to be called for each such special
BPF helper, to validate that the BPF helper is indeed recognized as
a callback-calling one. This should catch any missed checks in the future.
Adding special flags to function proto definitions seemed like overkill
in this case.
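
For illustration only (a made-up sketch, not from the selftests), a typical
callback-calling helper use that backtracking now has to treat the same way
as a BPF_PSEUDO_CALL:

/* illustrative sketch, not from the kernel selftests */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

static long count_cb(__u64 index, void *ctx)
{
	*(__u64 *)ctx += index;		/* callback updates caller-provided state */
	return 0;			/* 0 means keep looping */
}

SEC("xdp")
int loop_example(struct xdp_md *xdp)
{
	__u64 sum = 0;

	/* from backtracking's point of view this is a call into a subprog */
	bpf_loop(8, count_cb, &sum, 0);
	return sum ? XDP_PASS : XDP_DROP;
}

char LICENSE[] SEC("license") = "GPL";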

With the above changes, it's possible to remove the forceful setting of
reg->precise to true in __mark_reg_unknown, which turns on precision
tracking both inside subprogs and in entry progs that have subprogs. No
warnings or errors were detected across all the selftests, nor when
validating with veristat against internal Meta BPF objects and Cilium
objects. Further, in some BPF programs there is a noticeable reduction in
the number of states and instructions validated due to more effective
precision tracking, especially benefiting the syncookie test.

$ ./veristat -C -e file,prog,insns,states ~/baseline-results.csv ~/subprog-precise-results.csv  | grep -v '+0'
File                                      Program                     Total insns (A)  Total insns (B)  Total insns (DIFF)  Total states (A)  Total states (B)  Total states (DIFF)
----------------------------------------  --------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
pyperf600_bpf_loop.bpf.linked1.o          on_event                               3966             3678       -288 (-7.26%)               306               276         -30 (-9.80%)
pyperf_global.bpf.linked1.o               on_event                               7563             7530        -33 (-0.44%)               520               517          -3 (-0.58%)
pyperf_subprogs.bpf.linked1.o             on_event                              36358            36934       +576 (+1.58%)              2499              2531         +32 (+1.28%)
setget_sockopt.bpf.linked1.o              skops_sockopt                          3965             4038        +73 (+1.84%)               343               347          +4 (+1.17%)
test_cls_redirect_subprogs.bpf.linked1.o  cls_redirect                          64965            64901        -64 (-0.10%)              4619              4612          -7 (-0.15%)
test_misc_tcp_hdr_options.bpf.linked1.o   misc_estab                             1491             1307      -184 (-12.34%)               110               100         -10 (-9.09%)
test_pkt_access.bpf.linked1.o             test_pkt_access                         354              349         -5 (-1.41%)                25                24          -1 (-4.00%)
test_sock_fields.bpf.linked1.o            egress_read_sock_fields                 435              375       -60 (-13.79%)                22                20          -2 (-9.09%)
test_sysctl_loop2.bpf.linked1.o           sysctl_tcp_mem                         1508             1501         -7 (-0.46%)                29                28          -1 (-3.45%)
test_tc_dtime.bpf.linked1.o               egress_fwdns_prio100                    468              435        -33 (-7.05%)                45                41          -4 (-8.89%)
test_tc_dtime.bpf.linked1.o               ingress_fwdns_prio100                   398              408        +10 (+2.51%)                42                39          -3 (-7.14%)
test_tc_dtime.bpf.linked1.o               ingress_fwdns_prio101                  1096              842      -254 (-23.18%)                97                73        -24 (-24.74%)
test_tcp_hdr_options.bpf.linked1.o        estab                                  2758             2408      -350 (-12.69%)               208               181        -27 (-12.98%)
test_urandom_usdt.bpf.linked1.o           urand_read_with_sema                    466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
test_urandom_usdt.bpf.linked1.o           urand_read_without_sema                 466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
test_urandom_usdt.bpf.linked1.o           urandlib_read_with_sema                 466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
test_urandom_usdt.bpf.linked1.o           urandlib_read_without_sema              466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
test_xdp_noinline.bpf.linked1.o           balancer_ingress_v6                    4302             4294         -8 (-0.19%)               257               256          -1 (-0.39%)
xdp_synproxy_kern.bpf.linked1.o           syncookie_tc                         583722           405757   -177965 (-30.49%)             35846             25735     -10111 (-28.21%)
xdp_synproxy_kern.bpf.linked1.o           syncookie_xdp                        609123           479055   -130068 (-21.35%)             35452             29145      -6307 (-17.79%)
----------------------------------------  --------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/verifier.c | 61 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 60 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 7a71154de32b..cf9e20da8f6b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -511,6 +511,15 @@ static bool is_dynptr_ref_function(enum bpf_func_id func_id)
 	return func_id == BPF_FUNC_dynptr_data;
 }
 
+static bool is_callback_calling_function(enum bpf_func_id func_id)
+{
+	return func_id == BPF_FUNC_for_each_map_elem ||
+	       func_id == BPF_FUNC_timer_set_callback ||
+	       func_id == BPF_FUNC_find_vma ||
+	       func_id == BPF_FUNC_loop ||
+	       func_id == BPF_FUNC_user_ringbuf_drain;
+}
+
 static bool helper_multiple_ref_obj_use(enum bpf_func_id func_id,
 					const struct bpf_map *map)
 {
@@ -1684,7 +1693,7 @@ static void __mark_reg_unknown(const struct bpf_verifier_env *env,
 	reg->type = SCALAR_VALUE;
 	reg->var_off = tnum_unknown;
 	reg->frameno = 0;
-	reg->precise = env->subprog_cnt > 1 || !env->bpf_capable;
+	reg->precise = !env->bpf_capable;
 	__mark_reg_unbounded(reg);
 }
 
@@ -2653,6 +2662,11 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
 		if (opcode == BPF_CALL) {
 			if (insn->src_reg == BPF_PSEUDO_CALL)
 				return -ENOTSUPP;
+			/* BPF helpers that invoke callback subprogs are
+			 * equivalent to BPF_PSEUDO_CALL above
+			 */
+			if (insn->src_reg == 0 && is_callback_calling_function(insn->imm))
+				return -ENOTSUPP;
 			/* regular helper call sets R0 */
 			*reg_mask &= ~1;
 			if (*reg_mask & 0x3f) {
@@ -2816,10 +2830,39 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
 		return 0;
 	if (!reg_mask && !stack_mask)
 		return 0;
+
 	for (;;) {
 		DECLARE_BITMAP(mask, 64);
 		u32 history = st->jmp_history_cnt;
 
+		if (last_idx < 0) {
+			/* we are at the entry into subprog, which
+			 * is expected for global funcs, but only if
+			 * requested precise registers are R1-R5
+			 * (which are global func's input arguments)
+			 */
+			if (st->curframe == 0 &&
+			    st->frame[0]->subprogno > 0 &&
+			    st->frame[0]->callsite == BPF_MAIN_FUNC &&
+			    stack_mask == 0 && (reg_mask & ~0x3e) == 0) {
+				bitmap_from_u64(mask, reg_mask);
+				for_each_set_bit(i, mask, 32) {
+					reg = &st->frame[0]->regs[i];
+					if (reg->type != SCALAR_VALUE) {
+						reg_mask &= ~(1u << i);
+						continue;
+					}
+					reg->precise = true;
+				}
+				return 0;
+			}
+
+			verbose(env, "BUG backtracing func entry subprog %d reg_mask %x stack_mask %llx\n",
+				st->frame[0]->subprogno, reg_mask, stack_mask);
+			WARN_ONCE(1, "verifier backtracking bug");
+			return -EFAULT;
+		}
+
 		if (env->log.level & BPF_LOG_LEVEL2)
 			verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx);
 		for (i = last_idx;;) {
@@ -6647,6 +6690,10 @@ typedef int (*set_callee_state_fn)(struct bpf_verifier_env *env,
 				   struct bpf_func_state *callee,
 				   int insn_idx);
 
+static int set_callee_state(struct bpf_verifier_env *env,
+			    struct bpf_func_state *caller,
+			    struct bpf_func_state *callee, int insn_idx);
+
 static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx, int subprog,
 			     set_callee_state_fn set_callee_state_cb)
@@ -6697,6 +6744,16 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 		}
 	}
 
+	/* set_callee_state is used for direct subprog calls, but we are
+	 * interested in validating only BPF helpers that can call subprogs as
+	 * callbacks
+	 */
+	if (set_callee_state_cb != set_callee_state && !is_callback_calling_function(insn->imm)) {
+		verbose(env, "verifier bug: helper %s#%d is not marked as callback-calling\n",
+			func_id_name(insn->imm), insn->imm);
+		return -EFAULT;
+	}
+
 	if (insn->code == (BPF_JMP | BPF_CALL) &&
 	    insn->src_reg == 0 &&
 	    insn->imm == BPF_FUNC_timer_set_callback) {
@@ -14615,6 +14672,8 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
 			BPF_MAIN_FUNC /* callsite */,
 			0 /* frameno */,
 			subprog);
+	state->first_insn_idx = env->subprog_info[subprog].start;
+	state->last_insn_idx = -1;
 
 	regs = state->frame[state->curframe]->regs;
 	if (subprog || env->prog->type == BPF_PROG_TYPE_EXT) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 4/6] bpf: stop setting precise in current state
  2022-11-02  6:22 [PATCH bpf-next 0/6] BPF verifier precision tracking improvements Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2022-11-02  6:22 ` [PATCH bpf-next 3/6] bpf: allow precision tracking for programs with subprogs Andrii Nakryiko
@ 2022-11-02  6:22 ` Andrii Nakryiko
  2022-11-02  6:22 ` [PATCH bpf-next 5/6] bpf: aggressively forget precise markings during state checkpointing Andrii Nakryiko
  2022-11-02  6:22 ` [PATCH bpf-next 6/6] selftests/bpf: make test_align selftest more robust Andrii Nakryiko
  5 siblings, 0 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2022-11-02  6:22 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

Setting reg->precise to true in the current state is not necessary from
a correctness standpoint, but it does pessimise the whole precision (or
rather "imprecision", because that's what we want to keep as much as
possible) tracking. Why this is so is somewhat subtle; my best attempt to
explain it is recorded in an extensive comment for the
__mark_chain_precision() function. Some more careful thinking and code
reading is probably still required to grok this completely, unfortunately.
Whiteboarding and a bunch of extra handwaving in person would be even more
helpful, but is deemed impractical in a Git commit.
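
The user-visible code change boils down to the SCALAR_VALUE branch of
regsafe(): only the checkpointed ("old") state's precise flag now gates
the exact range check. A simplified restatement of the hunk at the end of
this patch:

	if (rcur->type == SCALAR_VALUE) {
		if (!rold->precise)
			return true;	/* old state didn't need the exact value */
		/* new val must satisfy old val knowledge */
		return range_within(rold, rcur) &&
		       tnum_in(rold->var_off, rcur->var_off);
	}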

The next patch pushes this imprecision property even further, building on
top of the insights described in this patch.

The end results are pretty nice: we get a reduction in the total number of
instructions and states verified due to better state reuse, as some of the
states are now more generic and permissive due to fewer unnecessary
precise=true requirements.

SELFTESTS RESULTS
=================

$ ./veristat -C -e file,prog,insns,states ~/subprog-precise-results.csv ~/imprecise-early-results.csv | grep -v '+0'
File                                     Program                 Total insns (A)  Total insns (B)  Total insns (DIFF)  Total states (A)  Total states (B)  Total states (DIFF)
---------------------------------------  ----------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
bpf_iter_ksym.bpf.linked1.o              dump_ksym                           347              285       -62 (-17.87%)                20                19          -1 (-5.00%)
pyperf600_bpf_loop.bpf.linked1.o         on_event                           3678             3736        +58 (+1.58%)               276               285          +9 (+3.26%)
setget_sockopt.bpf.linked1.o             skops_sockopt                      4038             3947        -91 (-2.25%)               347               343          -4 (-1.15%)
test_l4lb.bpf.linked1.o                  balancer_ingress                   4559             2611     -1948 (-42.73%)               118               105        -13 (-11.02%)
test_l4lb_noinline.bpf.linked1.o         balancer_ingress                   6279             6268        -11 (-0.18%)               237               236          -1 (-0.42%)
test_misc_tcp_hdr_options.bpf.linked1.o  misc_estab                         1307             1303         -4 (-0.31%)               100                99          -1 (-1.00%)
test_sk_lookup.bpf.linked1.o             ctx_narrow_access                   456              447         -9 (-1.97%)                39                38          -1 (-2.56%)
test_sysctl_loop1.bpf.linked1.o          sysctl_tcp_mem                     1389             1384         -5 (-0.36%)                26                25          -1 (-3.85%)
test_tc_dtime.bpf.linked1.o              egress_fwdns_prio101                518              485        -33 (-6.37%)                51                46          -5 (-9.80%)
test_tc_dtime.bpf.linked1.o              egress_host                         519              468        -51 (-9.83%)                50                44         -6 (-12.00%)
test_tc_dtime.bpf.linked1.o              ingress_fwdns_prio101               842             1000      +158 (+18.76%)                73                88        +15 (+20.55%)
xdp_synproxy_kern.bpf.linked1.o          syncookie_tc                     405757           373173     -32584 (-8.03%)             25735             22882      -2853 (-11.09%)
xdp_synproxy_kern.bpf.linked1.o          syncookie_xdp                    479055           371590   -107465 (-22.43%)             29145             22207      -6938 (-23.81%)
---------------------------------------  ----------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------

The slight regression in test_tc_dtime.bpf.linked1.o/ingress_fwdns_prio101
is left for a follow-up; there might be some more precision-related bugs
in the existing BPF verifier logic.

CILIUM RESULTS
==============

$ ./veristat -C -e file,prog,insns,states ~/subprog-precise-results-cilium.csv ~/imprecise-early-results-cilium.csv | grep -v '+0'
File           Program                         Total insns (A)  Total insns (B)  Total insns (DIFF)  Total states (A)  Total states (B)  Total states (DIFF)
-------------  ------------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
bpf_host.o     cil_from_host                               762              556      -206 (-27.03%)                43                37         -6 (-13.95%)
bpf_host.o     tail_handle_nat_fwd_ipv4                  23541            23426       -115 (-0.49%)              1538              1537          -1 (-0.07%)
bpf_host.o     tail_nodeport_nat_egress_ipv4             33592            33566        -26 (-0.08%)              2163              2161          -2 (-0.09%)
bpf_lxc.o      tail_handle_nat_fwd_ipv4                  23541            23426       -115 (-0.49%)              1538              1537          -1 (-0.07%)
bpf_overlay.o  tail_nodeport_nat_egress_ipv4             33581            33543        -38 (-0.11%)              2160              2157          -3 (-0.14%)
bpf_xdp.o      tail_handle_nat_fwd_ipv4                  21659            20920       -739 (-3.41%)              1440              1376         -64 (-4.44%)
bpf_xdp.o      tail_handle_nat_fwd_ipv6                  17084            17039        -45 (-0.26%)               907               905          -2 (-0.22%)
bpf_xdp.o      tail_lb_ipv4                              73442            73430        -12 (-0.02%)              4370              4369          -1 (-0.02%)
bpf_xdp.o      tail_lb_ipv6                             152114           151895       -219 (-0.14%)              6493              6479         -14 (-0.22%)
bpf_xdp.o      tail_nodeport_nat_egress_ipv4             17377            17200       -177 (-1.02%)              1125              1111         -14 (-1.24%)
bpf_xdp.o      tail_nodeport_nat_ingress_ipv6             6405             6397         -8 (-0.12%)               309               308          -1 (-0.32%)
bpf_xdp.o      tail_rev_nodeport_lb4                      7126             6934       -192 (-2.69%)               414               402         -12 (-2.90%)
bpf_xdp.o      tail_rev_nodeport_lb6                     18059            17905       -154 (-0.85%)              1105              1096          -9 (-0.81%)
-------------  ------------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/verifier.c | 103 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 91 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index cf9e20da8f6b..2ec9d1452d4f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2756,8 +2756,11 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env,
 
 	/* big hammer: mark all scalars precise in this path.
 	 * pop_stack may still get !precise scalars.
+	 * We also skip current state and go straight to first parent state,
+	 * because precision markings in current non-checkpointed state are
+	 * not needed. See why in the comment in __mark_chain_precision below.
 	 */
-	for (; st; st = st->parent)
+	for (st = st->parent; st; st = st->parent) {
 		for (i = 0; i <= st->curframe; i++) {
 			func = st->frame[i];
 			for (j = 0; j < BPF_REG_FP; j++) {
@@ -2775,8 +2778,88 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env,
 				reg->precise = true;
 			}
 		}
+	}
 }
 
+/*
+ * __mark_chain_precision() backtracks BPF program instruction sequence and
+ * chain of verifier states making sure that register *regno* (if regno >= 0)
+ * and/or stack slot *spi* (if spi >= 0) are marked as precisely tracked
+ * SCALARS, as well as any other registers and slots that contribute to
+ * a tracked state of given registers/stack slots, depending on specific BPF
+ * assembly instructions (see backtrack_insns() for exact instruction handling
+ * logic). This backtracking relies on recorded jmp_history and is able to
+ * traverse entire chain of parent states. This process ends only when all the
+ * necessary registers/slots and their transitive dependencies are marked as
+ * precise.
+ *
+ * One important and subtle aspect is that precise marks *do not matter* in
+ * the currently verified state (current state). It is important to understand
+ * why this is the case.
+ *
+ * First, note that current state is the state that is not yet "checkpointed",
+ * i.e., it is not yet put into env->explored_states, and it has no children
+ * states as well. It's ephemeral, and can end up either a) being discarded if
+ * compatible explored state is found at some point or BPF_EXIT instruction is
+ * reached or b) checkpointed and put into env->explored_states, branching out
+ * into one or more children states.
+ *
+ * In the former case, precise markings in current state are completely
+ * ignored by state comparison code (see regsafe() for details). Only
+ * checkpointed ("old") state precise markings are important, and if old
+ * state's register/slot is precise, regsafe() assumes current state's
+ * register/slot as precise and checks value ranges exactly and precisely. If
+ * states turn out to be compatible, current state's necessary precise
+ * markings and any required parent states' precise markings are enforced
+ * after the fact with propagate_precision() logic. But it's
+ * important to realize that in this case, even after marking current state
+ * registers/slots as precise, we immediately discard current state. So what
+ * actually matters is any of the precise markings propagated into current
+ * state's parent states, which are always checkpointed (due to b) case above).
+ * As such, for scenario a) it doesn't matter if current state has precise
+ * markings set or not.
+ *
+ * Now, for the scenario b), checkpointing and forking into child(ren)
+ * state(s). Note that before current state gets to checkpointing step, any
+ * processed instruction always assumes precise SCALAR register/slot
+ * knowledge: if precise value or range is useful to prune jump branch, BPF
+ * verifier takes this opportunity enthusiastically. Similarly, when
+ * register's value is used to calculate offset or memory address, exact
+ * knowledge of SCALAR range is assumed, checked, and enforced. So, similar to
+ * what we mentioned above about state comparison ignoring precise markings
+ * during state comparison, BPF verifier ignores and also assumes precise
+ * markings *at will* during instruction verification process. But as verifier
+ * assumes precision, it also propagates any precision dependencies across
+ * parent states, which are not yet finalized, so can be further restricted
+ * based on new knowledge gained from restrictions enforced by their children
+ * states. This is so that once those parent states are finalized, i.e., when
+ * they have no more active children state, state comparison logic in
+ * is_state_visited() would enforce strict and precise SCALAR ranges, if
+ * required for correctness.
+ *
+ * To build a bit more intuition, note also that once a state is checkpointed,
+ * the path we took to get to that state is not important. This is crucial
+ * property for state pruning. When state is checkpointed and finalized at
+ * some instruction index, it can be correctly and safely used to "short
+ * circuit" any *compatible* state that reaches exactly the same instruction
+ * index. I.e., if we jumped to that instruction from a completely different
+ * code path than original finalized state was derived from, it doesn't
+ * matter, current state can be discarded because from that instruction
+ * forward having a compatible state will ensure we will safely reach the
+ * exit. States describe preconditions for further exploration, but completely
+ * forget the history of how we got here.
+ *
+ * This also means that even if we needed precise SCALAR range to get to
+ * finalized state, but from that point forward *that same* SCALAR register is
+ * never used in a precise context (i.e., its precise value is not needed for
+ * correctness), it's correct and safe to mark such register as "imprecise"
+ * (i.e., precise marking set to false). This is what we rely on when we do
+ * not set precise marking in current state. If no child state requires
+ * precision for any given SCALAR register, it's safe to dictate that it can
+ * be imprecise. If any child state does require this register to be precise,
+ * we'll mark it precise later retroactively during precise markings
+ * propagation from child state to parent states.
+ */
 static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 				  int spi)
 {
@@ -2794,6 +2877,10 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
 	if (!env->bpf_capable)
 		return 0;
 
+	/* Do sanity checks against current state of register and/or stack
+	 * slot, but don't set precise flag in current state, as precision
+	 * tracking in the current state is unnecessary.
+	 */
 	func = st->frame[frame];
 	if (regno >= 0) {
 		reg = &func->regs[regno];
@@ -2801,11 +2888,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
 			WARN_ONCE(1, "backtracing misuse");
 			return -EFAULT;
 		}
-		if (!reg->precise)
-			new_marks = true;
-		else
-			reg_mask = 0;
-		reg->precise = true;
+		new_marks = true;
 	}
 
 	while (spi >= 0) {
@@ -2818,11 +2901,7 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
 			stack_mask = 0;
 			break;
 		}
-		if (!reg->precise)
-			new_marks = true;
-		else
-			stack_mask = 0;
-		reg->precise = true;
+		new_marks = true;
 		break;
 	}
 
@@ -11577,7 +11656,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 		if (env->explore_alu_limits)
 			return false;
 		if (rcur->type == SCALAR_VALUE) {
-			if (!rold->precise && !rcur->precise)
+			if (!rold->precise)
 				return true;
 			/* new val must satisfy old val knowledge */
 			return range_within(rold, rcur) &&
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 5/6] bpf: aggressively forget precise markings during state checkpointing
  2022-11-02  6:22 [PATCH bpf-next 0/6] BPF verifier precision tracking improvements Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2022-11-02  6:22 ` [PATCH bpf-next 4/6] bpf: stop setting precise in current state Andrii Nakryiko
@ 2022-11-02  6:22 ` Andrii Nakryiko
  2022-11-02  6:22 ` [PATCH bpf-next 6/6] selftests/bpf: make test_align selftest more robust Andrii Nakryiko
  5 siblings, 0 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2022-11-02  6:22 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

Exploit the property of the about-to-be-checkpointed state to forget all
precise markings up to that point even more aggressively. We now clear
all potentially inherited precise markings right before checkpointing and
branching off into a child state. If any of the children states require
precise knowledge of any SCALAR register, those markings will be
propagated backwards later on, before this state is finalized, preserving
correctness.
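
In code terms this is a single call placed right before the current state
is cloned into a new checkpoint in is_state_visited(); a simplified
restatement of the hunk below:

	/* forget precise markings we inherited, see __mark_chain_precision */
	if (env->bpf_capable)
		mark_all_scalars_imprecise(env, cur);

	/* add new state to the head of linked list */
	new = &new_sl->state;
	err = copy_verifier_state(new, cur);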

There is a single selftests BPF program change, but a tremendous one: a
25x reduction in the number of verified instructions and states in
trace_virtqueue_add_sgs.

Cilium results are more modest, but happen across a wider range of programs.

SELFTESTS RESULTS
=================

$ ./veristat -C -e file,prog,insns,states ~/imprecise-early-results.csv ~/imprecise-aggressive-results.csv | grep -v '+0'
File                 Program                  Total insns (A)  Total insns (B)  Total insns (DIFF)  Total states (A)  Total states (B)  Total states (DIFF)
-------------------  -----------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
loop6.bpf.linked1.o  trace_virtqueue_add_sgs           398057            15114   -382943 (-96.20%)              8717               336      -8381 (-96.15%)
-------------------  -----------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------

CILIUM RESULTS
==============

$ ./veristat -C -e file,prog,insns,states ~/imprecise-early-results-cilium.csv ~/imprecise-aggressive-results-cilium.csv | grep -v '+0'
File           Program                           Total insns (A)  Total insns (B)  Total insns (DIFF)  Total states (A)  Total states (B)  Total states (DIFF)
-------------  --------------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
bpf_host.o     tail_handle_nat_fwd_ipv4                    23426            23221       -205 (-0.88%)              1537              1515         -22 (-1.43%)
bpf_host.o     tail_handle_nat_fwd_ipv6                    13009            12904       -105 (-0.81%)               719               708         -11 (-1.53%)
bpf_host.o     tail_nodeport_nat_ingress_ipv6               5261             5196        -65 (-1.24%)               247               243          -4 (-1.62%)
bpf_host.o     tail_nodeport_nat_ipv6_egress                3446             3406        -40 (-1.16%)               203               198          -5 (-2.46%)
bpf_lxc.o      tail_handle_nat_fwd_ipv4                    23426            23221       -205 (-0.88%)              1537              1515         -22 (-1.43%)
bpf_lxc.o      tail_handle_nat_fwd_ipv6                    13009            12904       -105 (-0.81%)               719               708         -11 (-1.53%)
bpf_lxc.o      tail_ipv4_ct_egress                          5074             4897       -177 (-3.49%)               255               248          -7 (-2.75%)
bpf_lxc.o      tail_ipv4_ct_ingress                         5100             4923       -177 (-3.47%)               255               248          -7 (-2.75%)
bpf_lxc.o      tail_ipv4_ct_ingress_policy_only             5100             4923       -177 (-3.47%)               255               248          -7 (-2.75%)
bpf_lxc.o      tail_ipv6_ct_egress                          4558             4536        -22 (-0.48%)               188               187          -1 (-0.53%)
bpf_lxc.o      tail_ipv6_ct_ingress                         4578             4556        -22 (-0.48%)               188               187          -1 (-0.53%)
bpf_lxc.o      tail_ipv6_ct_ingress_policy_only             4578             4556        -22 (-0.48%)               188               187          -1 (-0.53%)
bpf_lxc.o      tail_nodeport_nat_ingress_ipv6               5261             5196        -65 (-1.24%)               247               243          -4 (-1.62%)
bpf_overlay.o  tail_nodeport_nat_ingress_ipv6               5261             5196        -65 (-1.24%)               247               243          -4 (-1.62%)
bpf_overlay.o  tail_nodeport_nat_ipv6_egress                3482             3442        -40 (-1.15%)               204               201          -3 (-1.47%)
bpf_xdp.o      tail_nodeport_nat_egress_ipv4               17200            15619      -1581 (-9.19%)              1111              1010        -101 (-9.09%)
-------------  --------------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/bpf/verifier.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2ec9d1452d4f..8841a873bcd5 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2781,6 +2781,31 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env,
 	}
 }
 
+static void mark_all_scalars_imprecise(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
+{
+	struct bpf_func_state *func;
+	struct bpf_reg_state *reg;
+	int i, j;
+
+	for (i = 0; i <= st->curframe; i++) {
+		func = st->frame[i];
+		for (j = 0; j < BPF_REG_FP; j++) {
+			reg = &func->regs[j];
+			if (reg->type != SCALAR_VALUE)
+				continue;
+			reg->precise = false;
+		}
+		for (j = 0; j < func->allocated_stack / BPF_REG_SIZE; j++) {
+			if (!is_spilled_reg(&func->stack[j]))
+				continue;
+			reg = &func->stack[j].spilled_ptr;
+			if (reg->type != SCALAR_VALUE)
+				continue;
+			reg->precise = false;
+		}
+	}
+}
+
 /*
  * __mark_chain_precision() backtracks BPF program instruction sequence and
  * chain of verifier states making sure that register *regno* (if regno >= 0)
@@ -2859,6 +2884,14 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env,
  * be imprecise. If any child state does require this register to be precise,
  * we'll mark it precise later retroactively during precise markings
  * propagation from child state to parent states.
+ *
+ * Skipping precise marking setting in current state is a mild version of
+ * relying on the above observation. But we can utilize this property even
+ * more aggressively by proactively forgetting any precise marking in the
+ * current state (which we inherited from the parent state), right before we
+ * checkpoint it and branch off into new child state. This is done by
+ * mark_all_scalars_imprecise() to hopefully get more permissive and generic
+ * finalized states which help in short circuiting more future states.
  */
 static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
 				  int spi)
@@ -12203,6 +12236,10 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 	env->prev_jmps_processed = env->jmps_processed;
 	env->prev_insn_processed = env->insn_processed;
 
+	/* forget precise markings we inherited, see __mark_chain_precision */
+	if (env->bpf_capable)
+		mark_all_scalars_imprecise(env, cur);
+
 	/* add new state to the head of linked list */
 	new = &new_sl->state;
 	err = copy_verifier_state(new, cur);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf-next 6/6] selftests/bpf: make test_align selftest more robust
  2022-11-02  6:22 [PATCH bpf-next 0/6] BPF verifier precision tracking improvements Andrii Nakryiko
                   ` (4 preceding siblings ...)
  2022-11-02  6:22 ` [PATCH bpf-next 5/6] bpf: aggressively forget precise markings during state checkpointing Andrii Nakryiko
@ 2022-11-02  6:22 ` Andrii Nakryiko
  5 siblings, 0 replies; 11+ messages in thread
From: Andrii Nakryiko @ 2022-11-02  6:22 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

The test_align selftest relies on the BPF verifier log emitting register
states for specific instructions in an expected format. Unfortunately, the
BPF verifier's precision backtracking log interferes with such
expectations, and instructions on which precision propagation happens
sometimes don't output the full expected register states. This does indeed
look like something to be improved in the BPF verifier, but it is beyond
the scope of this patch set.

So to make test_align a bit more robust, inject a few dummy R4 = R5
instructions which capture the desired state of R5 and won't have
precision tracking logs on them. This fixes the tests until we can improve
BPF verifier output in the presence of precision tracking.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../testing/selftests/bpf/prog_tests/align.c  | 38 ++++++++++++-------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/align.c b/tools/testing/selftests/bpf/prog_tests/align.c
index 970f09156eb4..4666f88f2bb4 100644
--- a/tools/testing/selftests/bpf/prog_tests/align.c
+++ b/tools/testing/selftests/bpf/prog_tests/align.c
@@ -2,7 +2,7 @@
 #include <test_progs.h>
 
 #define MAX_INSNS	512
-#define MAX_MATCHES	16
+#define MAX_MATCHES	24
 
 struct bpf_reg_match {
 	unsigned int line;
@@ -267,6 +267,7 @@ static struct bpf_align_test tests[] = {
 			 */
 			BPF_MOV64_REG(BPF_REG_5, BPF_REG_2),
 			BPF_ALU64_REG(BPF_ADD, BPF_REG_5, BPF_REG_6),
+			BPF_MOV64_REG(BPF_REG_4, BPF_REG_5),
 			BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 14),
 			BPF_MOV64_REG(BPF_REG_4, BPF_REG_5),
 			BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, 4),
@@ -280,6 +281,7 @@ static struct bpf_align_test tests[] = {
 			BPF_MOV64_REG(BPF_REG_5, BPF_REG_2),
 			BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 14),
 			BPF_ALU64_REG(BPF_ADD, BPF_REG_5, BPF_REG_6),
+			BPF_MOV64_REG(BPF_REG_4, BPF_REG_5),
 			BPF_ALU64_IMM(BPF_ADD, BPF_REG_5, 4),
 			BPF_ALU64_REG(BPF_ADD, BPF_REG_5, BPF_REG_6),
 			BPF_MOV64_REG(BPF_REG_4, BPF_REG_5),
@@ -311,44 +313,52 @@ static struct bpf_align_test tests[] = {
 			{15, "R4=pkt(id=1,off=18,r=18,umax=1020,var_off=(0x0; 0x3fc))"},
 			{15, "R5=pkt(id=1,off=14,r=18,umax=1020,var_off=(0x0; 0x3fc))"},
 			/* Variable offset is added to R5 packet pointer,
-			 * resulting in auxiliary alignment of 4.
+			 * resulting in auxiliary alignment of 4. To avoid BPF
+			 * verifier's precision backtracking logging
+			 * interfering we also have a no-op R4 = R5
+			 * instruction to validate R5 state. We also check
+			 * that R4 is what it should be in such case.
 			 */
-			{17, "R5_w=pkt(id=2,off=0,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
+			{18, "R4_w=pkt(id=2,off=0,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
+			{18, "R5_w=pkt(id=2,off=0,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
 			/* Constant offset is added to R5, resulting in
 			 * reg->off of 14.
 			 */
-			{18, "R5_w=pkt(id=2,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
+			{19, "R5_w=pkt(id=2,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
 			/* At the time the word size load is performed from R5,
 			 * its total fixed offset is NET_IP_ALIGN + reg->off
 			 * (14) which is 16.  Then the variable offset is 4-byte
 			 * aligned, so the total offset is 4-byte aligned and
 			 * meets the load's requirements.
 			 */
-			{23, "R4=pkt(id=2,off=18,r=18,umax=1020,var_off=(0x0; 0x3fc))"},
-			{23, "R5=pkt(id=2,off=14,r=18,umax=1020,var_off=(0x0; 0x3fc))"},
+			{24, "R4=pkt(id=2,off=18,r=18,umax=1020,var_off=(0x0; 0x3fc))"},
+			{24, "R5=pkt(id=2,off=14,r=18,umax=1020,var_off=(0x0; 0x3fc))"},
 			/* Constant offset is added to R5 packet pointer,
 			 * resulting in reg->off value of 14.
 			 */
-			{25, "R5_w=pkt(off=14,r=8"},
+			{26, "R5_w=pkt(off=14,r=8"},
 			/* Variable offset is added to R5, resulting in a
-			 * variable offset of (4n).
+			 * variable offset of (4n). See comment for insn #18
+			 * for R4 = R5 trick.
 			 */
-			{26, "R5_w=pkt(id=3,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
+			{28, "R4_w=pkt(id=3,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
+			{28, "R5_w=pkt(id=3,off=14,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
 			/* Constant is added to R5 again, setting reg->off to 18. */
-			{27, "R5_w=pkt(id=3,off=18,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
+			{29, "R5_w=pkt(id=3,off=18,r=0,umax=1020,var_off=(0x0; 0x3fc))"},
 			/* And once more we add a variable; resulting var_off
 			 * is still (4n), fixed offset is not changed.
 			 * Also, we create a new reg->id.
 			 */
-			{28, "R5_w=pkt(id=4,off=18,r=0,umax=2040,var_off=(0x0; 0x7fc)"},
+			{31, "R4_w=pkt(id=4,off=18,r=0,umax=2040,var_off=(0x0; 0x7fc)"},
+			{31, "R5_w=pkt(id=4,off=18,r=0,umax=2040,var_off=(0x0; 0x7fc)"},
 			/* At the time the word size load is performed from R5,
 			 * its total fixed offset is NET_IP_ALIGN + reg->off (18)
 			 * which is 20.  Then the variable offset is (4n), so
 			 * the total offset is 4-byte aligned and meets the
 			 * load's requirements.
 			 */
-			{33, "R4=pkt(id=4,off=22,r=22,umax=2040,var_off=(0x0; 0x7fc)"},
-			{33, "R5=pkt(id=4,off=18,r=22,umax=2040,var_off=(0x0; 0x7fc)"},
+			{35, "R4=pkt(id=4,off=22,r=22,umax=2040,var_off=(0x0; 0x7fc)"},
+			{35, "R5=pkt(id=4,off=18,r=22,umax=2040,var_off=(0x0; 0x7fc)"},
 		},
 	},
 	{
@@ -681,6 +691,6 @@ void test_align(void)
 		if (!test__start_subtest(test->descr))
 			continue;
 
-		CHECK_FAIL(do_test_single(test));
+		ASSERT_OK(do_test_single(test), test->descr);
 	}
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 1/6] bpf: propagate precision in ALU/ALU64 operations
  2022-11-02  6:22 ` [PATCH bpf-next 1/6] bpf: propagate precision in ALU/ALU64 operations Andrii Nakryiko
@ 2022-11-03  1:23   ` Alexei Starovoitov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2022-11-03  1:23 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, kernel-team

On Tue, Nov 01, 2022 at 11:22:16PM -0700, Andrii Nakryiko wrote:
> When processing ALU/ALU64 operations (apart from BPF_MOV, which is
> handled correctly already; and BPF_NEG and BPF_END are special and don't
> have source register), if destination register is already marked
> precise, this causes problem with potentially missing precision tracking
> for the source register. E.g., when we have r1 >>= r5 and r1 is marked
> precise, but r5 isn't, this will lead to r5 staying as imprecise. This
> is due to the precision backtracking logic stopping early when it sees
> r1 is already marked precise. If r1 wasn't precise, we'd keep
> backtracking and would add r5 to the set of registers that need to be
> marked precise. So there is a discrepancy here which can lead to invalid
> and incompatible states matched due to lack of precision marking on r5.
> If r1 wasn't precise, precision backtracking would correctly mark both
> r1 and r5 as precise.
> 
> This is simple to fix, luckily. If destination register is already
> precise, we need to propagate precision to source register (if it is
> SCALAR). This is similar to `reg += scalar` handling case, so nothing
> conceptually new here.

Could you rephrase the above paragraph to make it clear that the fix
is not in the backtracking logic, but in the normal forward-simulation
part of the verifier? After reading the 1st paragraph it sounds like
backtracking is wrong, but it's the normal pass that is missing precision
propagation.
Also, instead of "similar to `reg += scalar`", could you explicitly mention
that pointer += scalar and scalar += pointer correctly propagate
precision of scalars? It's the scalar += scalar case that was incomplete.
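To make that concrete, here is a rough sketch of the idea (not the actual
diff; the surrounding context and exact placement inside the forward
scalar-by-scalar ALU handling are simplified), assuming dst_reg/src_reg
have already been resolved:

	/* rough sketch: if dst is already precise, the scalar src feeding
	 * into it has to be tracked precisely as well
	 */
	if (dst_reg->precise && BPF_SRC(insn->code) == BPF_X &&
	    src_reg->type == SCALAR_VALUE) {
		err = mark_chain_precision(env, insn->src_reg);
		if (err)
			return err;
	}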

Other than that it's a great fix!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 3/6] bpf: allow precision tracking for programs with subprogs
  2022-11-02  6:22 ` [PATCH bpf-next 3/6] bpf: allow precision tracking for programs with subprogs Andrii Nakryiko
@ 2022-11-03  1:41   ` Alexei Starovoitov
  2022-11-04 16:32     ` Andrii Nakryiko
  0 siblings, 1 reply; 11+ messages in thread
From: Alexei Starovoitov @ 2022-11-03  1:41 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, kernel-team

On Tue, Nov 01, 2022 at 11:22:18PM -0700, Andrii Nakryiko wrote:
> Stop forcing precise=true for SCALAR registers when BPF program has any
> subprograms. Current restriction means that any BPF program, as soon as
> it uses subprograms, will end up not getting any of the precision
> tracking benefits in reduction of number of verified states.
> 
> This patch keeps the fallback mark_all_scalars_precise() behavior if
> precise marking has to cross function frames. E.g., if subprogram
> requires R1 (first input arg) to be marked precise, ideally we'd need to
> backtrack to the parent function and keep marking R1 and its
> dependencies as precise. But right now we give up and force all the
> SCALARs in any of the current and parent states to be forced to
> precise=true. We can lift that restriction in the future.
> 
> But this patch fixes two issues identified when trying to enable
> precision tracking for subprogs.
> 
> First, prevent "escaping" from top-most state in a global subprog. While
> with entry-level BPF program we never end up requesting precision for
> R1-R5 registers, because R2-R5 are not initialized (and so not readable
> in correct BPF program), and R1 is PTR_TO_CTX, not SCALAR, and so is
> implicitly precise. With global subprogs, though, it's different, as
> global subprog a) can have up to 5 SCALAR input arguments, which might
> get marked as precise=true and b) it is validated in isolation from its
> main entry BPF program. b) means that we can end up exhausting parent
> state chain and still not mark all registers in reg_mask as precise,
> which would lead to verifier bug warning.
> 
> To handle that, we need to consider two cases. First, if the very first
> state is not immediately "checkpointed" (i.e., stored in state lookup
> hashtable), it will get correct first_insn_idx and last_insn_idx
> instruction set during state checkpointing. As such, this case is
> already handled and __mark_chain_precision() already handles that by
> just doing nothing when we reach to the very first parent state.
> st->parent will be NULL and we'll just stop. Perhaps some extra check
> for reg_mask and stack_mask is due here, but this patch doesn't address
> that issue.
> 
> More problematic second case is when global function's initial state is
> immediately checkpointed before we manage to process the very first
> instruction. This is happening because when there is a call to global
> subprog from the main program the very first subprog's instruction is
> marked as pruning point, so before we manage to process first
> instruction we have to check and checkpoint state. This patch adds
> a special handling for such "empty" state, which is identified by having
> st->last_insn_idx set to -1. In such case, we check that we are indeed
> validating global subprog, and with some sanity checking we mark input
> args as precise if requested.
> 
> Note that we also initialize state->first_insn_idx with correct start
> insn_idx offset. For main program zero is correct value, but for any
> subprog it's quite confusing to not have first_insn_idx set. This
> doesn't have any functional impact, but helps with debugging and state
> printing. We also explicitly initialize state->last_insns_idx instead of
> relying on is_state_visited() to do this with env->prev_insns_idx, which
> will be -1 on the very first instruction. This concludes necessary
> changes to handle specifically global subprog's precision tracking.
> 
> Second identified problem was missed handling of BPF helper functions
> that call into subprogs (e.g., bpf_loop and few others). From precision
> tracking and backtracking logic's standpoint those are effectively calls
> into subprogs and should be called as BPF_PSEUDO_CALL calls.
> 
> This patch takes the least intrusive way and just checks against a short
> list of current BPF helpers that do call subprogs, encapsulated in
> is_callback_calling_function() function. But to prevent accidentally
> forgetting to add new BPF helpers to this "list", we also do a sanity
> check in __check_func_call, which has to be called for each such special
> BPF helper, to validate that BPF helper is indeed recognized as
> callback-calling one. This should catch any missed checks in the future.
> Adding some special flags to be added in function proto definitions
> seemed like an overkill in this case.
> 
> With the above changes, it's possible to remove forceful setting of
> reg->precise to true in __mark_reg_unknown, which turns on precision
> tracking both inside subprogs and entry progs that have subprogs. No
> warnings or errors were detected across all the selftests, but also when
> validating with veristat against internal Meta BPF objects and Cilium
> objects. Further, in some BPF programs there are noticeable reduction in
> number of states and instructions validated due to more effective
> precision tracking, especially benefiting syncookie test.
> 
> $ ./veristat -C -e file,prog,insns,states ~/baseline-results.csv ~/subprog-precise-results.csv  | grep -v '+0'
> File                                      Program                     Total insns (A)  Total insns (B)  Total insns (DIFF)  Total states (A)  Total states (B)  Total states (DIFF)
> ----------------------------------------  --------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
> pyperf600_bpf_loop.bpf.linked1.o          on_event                               3966             3678       -288 (-7.26%)               306               276         -30 (-9.80%)
> pyperf_global.bpf.linked1.o               on_event                               7563             7530        -33 (-0.44%)               520               517          -3 (-0.58%)
> pyperf_subprogs.bpf.linked1.o             on_event                              36358            36934       +576 (+1.58%)              2499              2531         +32 (+1.28%)
> setget_sockopt.bpf.linked1.o              skops_sockopt                          3965             4038        +73 (+1.84%)               343               347          +4 (+1.17%)
> test_cls_redirect_subprogs.bpf.linked1.o  cls_redirect                          64965            64901        -64 (-0.10%)              4619              4612          -7 (-0.15%)
> test_misc_tcp_hdr_options.bpf.linked1.o   misc_estab                             1491             1307      -184 (-12.34%)               110               100         -10 (-9.09%)
> test_pkt_access.bpf.linked1.o             test_pkt_access                         354              349         -5 (-1.41%)                25                24          -1 (-4.00%)
> test_sock_fields.bpf.linked1.o            egress_read_sock_fields                 435              375       -60 (-13.79%)                22                20          -2 (-9.09%)
> test_sysctl_loop2.bpf.linked1.o           sysctl_tcp_mem                         1508             1501         -7 (-0.46%)                29                28          -1 (-3.45%)
> test_tc_dtime.bpf.linked1.o               egress_fwdns_prio100                    468              435        -33 (-7.05%)                45                41          -4 (-8.89%)
> test_tc_dtime.bpf.linked1.o               ingress_fwdns_prio100                   398              408        +10 (+2.51%)                42                39          -3 (-7.14%)
> test_tc_dtime.bpf.linked1.o               ingress_fwdns_prio101                  1096              842      -254 (-23.18%)                97                73        -24 (-24.74%)
> test_tcp_hdr_options.bpf.linked1.o        estab                                  2758             2408      -350 (-12.69%)               208               181        -27 (-12.98%)
> test_urandom_usdt.bpf.linked1.o           urand_read_with_sema                    466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
> test_urandom_usdt.bpf.linked1.o           urand_read_without_sema                 466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
> test_urandom_usdt.bpf.linked1.o           urandlib_read_with_sema                 466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
> test_urandom_usdt.bpf.linked1.o           urandlib_read_without_sema              466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
> test_xdp_noinline.bpf.linked1.o           balancer_ingress_v6                    4302             4294         -8 (-0.19%)               257               256          -1 (-0.39%)
> xdp_synproxy_kern.bpf.linked1.o           syncookie_tc                         583722           405757   -177965 (-30.49%)             35846             25735     -10111 (-28.21%)
> xdp_synproxy_kern.bpf.linked1.o           syncookie_xdp                        609123           479055   -130068 (-21.35%)             35452             29145      -6307 (-17.79%)
> ----------------------------------------  --------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>  kernel/bpf/verifier.c | 61 ++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 7a71154de32b..cf9e20da8f6b 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -511,6 +511,15 @@ static bool is_dynptr_ref_function(enum bpf_func_id func_id)
>  	return func_id == BPF_FUNC_dynptr_data;
>  }
>  
> +static bool is_callback_calling_function(enum bpf_func_id func_id)
> +{
> +	return func_id == BPF_FUNC_for_each_map_elem ||
> +	       func_id == BPF_FUNC_timer_set_callback ||
> +	       func_id == BPF_FUNC_find_vma ||
> +	       func_id == BPF_FUNC_loop ||
> +	       func_id == BPF_FUNC_user_ringbuf_drain;
> +}
> +
>  static bool helper_multiple_ref_obj_use(enum bpf_func_id func_id,
>  					const struct bpf_map *map)
>  {
> @@ -1684,7 +1693,7 @@ static void __mark_reg_unknown(const struct bpf_verifier_env *env,
>  	reg->type = SCALAR_VALUE;
>  	reg->var_off = tnum_unknown;
>  	reg->frameno = 0;
> -	reg->precise = env->subprog_cnt > 1 || !env->bpf_capable;
> +	reg->precise = !env->bpf_capable;
>  	__mark_reg_unbounded(reg);
>  }
>  
> @@ -2653,6 +2662,11 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
>  		if (opcode == BPF_CALL) {
>  			if (insn->src_reg == BPF_PSEUDO_CALL)
>  				return -ENOTSUPP;
> +			/* BPF helpers that invoke callback subprogs are
> +			 * equivalent to BPF_PSEUDO_CALL above
> +			 */
> +			if (insn->src_reg == 0 && is_callback_calling_function(insn->imm))
> +				return -ENOTSUPP;
>  			/* regular helper call sets R0 */
>  			*reg_mask &= ~1;
>  			if (*reg_mask & 0x3f) {
> @@ -2816,10 +2830,39 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
>  		return 0;
>  	if (!reg_mask && !stack_mask)
>  		return 0;
> +
>  	for (;;) {
>  		DECLARE_BITMAP(mask, 64);
>  		u32 history = st->jmp_history_cnt;
>  
> +		if (last_idx < 0) {
> +			/* we are at the entry into subprog, which
> +			 * is expected for global funcs, but only if
> +			 * requested precise registers are R1-R5
> +			 * (which are global func's input arguments)
> +			 */
> +			if (st->curframe == 0 &&
> +			    st->frame[0]->subprogno > 0 &&
> +			    st->frame[0]->callsite == BPF_MAIN_FUNC &&
> +			    stack_mask == 0 && (reg_mask & ~0x3e) == 0) {
> +				bitmap_from_u64(mask, reg_mask);
> +				for_each_set_bit(i, mask, 32) {
> +					reg = &st->frame[0]->regs[i];
> +					if (reg->type != SCALAR_VALUE) {
> +						reg_mask &= ~(1u << i);
> +						continue;
> +					}
> +					reg->precise = true;
> +				}
> +				return 0;
> +			}
> +
> +			verbose(env, "BUG backtracing func entry subprog %d reg_mask %x stack_mask %llx\n",
> +				st->frame[0]->subprogno, reg_mask, stack_mask);
> +			WARN_ONCE(1, "verifier backtracking bug");
> +			return -EFAULT;
> +		}
> +
>  		if (env->log.level & BPF_LOG_LEVEL2)
>  			verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx);

Minor nit: maybe move the above if (last_idx < 0) block after this verbose() print?

st->parent should be == NULL once we've detected a global func, right?

If so (and considering the next patches), doing reg->precise = true for this state
is unnecessary... and doing the reg_mask &= ~(1u << i) update is also unnecessary,
since there is no st->parent to go to next and no insns to backtrack, right?

Probably worth keeping all this code anyway...
just for clarity and documentation purposes.
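
For reference, a rough sketch of the suggested ordering (bodies elided,
otherwise the same code as in the patch):

		if (env->log.level & BPF_LOG_LEVEL2)
			verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx);

		if (last_idx < 0) {
			/* global func entry handling from the patch goes here,
			 * so the log always shows where backtracking stopped
			 */
		}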

>  		for (i = last_idx;;) {
> @@ -6647,6 +6690,10 @@ typedef int (*set_callee_state_fn)(struct bpf_verifier_env *env,
>  				   struct bpf_func_state *callee,
>  				   int insn_idx);
>  
> +static int set_callee_state(struct bpf_verifier_env *env,
> +			    struct bpf_func_state *caller,
> +			    struct bpf_func_state *callee, int insn_idx);
> +
>  static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>  			     int *insn_idx, int subprog,
>  			     set_callee_state_fn set_callee_state_cb)
> @@ -6697,6 +6744,16 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>  		}
>  	}
>  
> +	/* set_callee_state is used for direct subprog calls, but we are
> +	 * interested in validating only BPF helpers that can call subprogs as
> +	 * callbacks
> +	 */
> +	if (set_callee_state_cb != set_callee_state && !is_callback_calling_function(insn->imm)) {
> +		verbose(env, "verifier bug: helper %s#%d is not marked as callback-calling\n",
> +			func_id_name(insn->imm), insn->imm);
> +		return -EFAULT;
> +	}
> +
>  	if (insn->code == (BPF_JMP | BPF_CALL) &&
>  	    insn->src_reg == 0 &&
>  	    insn->imm == BPF_FUNC_timer_set_callback) {
> @@ -14615,6 +14672,8 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
>  			BPF_MAIN_FUNC /* callsite */,
>  			0 /* frameno */,
>  			subprog);
> +	state->first_insn_idx = env->subprog_info[subprog].start;
> +	state->last_insn_idx = -1;
>  
>  	regs = state->frame[state->curframe]->regs;
>  	if (subprog || env->prog->type == BPF_PROG_TYPE_EXT) {
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 3/6] bpf: allow precision tracking for programs with subprogs
  2022-11-03  1:41   ` Alexei Starovoitov
@ 2022-11-04 16:32     ` Andrii Nakryiko
  2022-11-04 20:01       ` Alexei Starovoitov
  0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2022-11-04 16:32 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Andrii Nakryiko, bpf, ast, daniel, kernel-team

On Wed, Nov 2, 2022 at 6:41 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Nov 01, 2022 at 11:22:18PM -0700, Andrii Nakryiko wrote:
> > Stop forcing precise=true for SCALAR registers when BPF program has any
> > subprograms. Current restriction means that any BPF program, as soon as
> > it uses subprograms, will end up not getting any of the precision
> > tracking benefits in reduction of number of verified states.
> >
> > This patch keeps the fallback mark_all_scalars_precise() behavior if
> > precise marking has to cross function frames. E.g., if subprogram
> > requires R1 (first input arg) to be marked precise, ideally we'd need to
> > backtrack to the parent function and keep marking R1 and its
> > dependencies as precise. But right now we give up and force all the
> > SCALARs in any of the current and parent states to be forced to
> > precise=true. We can lift that restriction in the future.
> >
> > But this patch fixes two issues identified when trying to enable
> > precision tracking for subprogs.
> >
> > First, prevent "escaping" from top-most state in a global subprog. While
> > with entry-level BPF program we never end up requesting precision for
> > R1-R5 registers, because R2-R5 are not initialized (and so not readable
> > in correct BPF program), and R1 is PTR_TO_CTX, not SCALAR, and so is
> > implicitly precise. With global subprogs, though, it's different, as
> > global subprog a) can have up to 5 SCALAR input arguments, which might
> > get marked as precise=true and b) it is validated in isolation from its
> > main entry BPF program. b) means that we can end up exhausting parent
> > state chain and still not mark all registers in reg_mask as precise,
> > which would lead to verifier bug warning.
> >
> > To handle that, we need to consider two cases. First, if the very first
> > state is not immediately "checkpointed" (i.e., stored in state lookup
> > hashtable), it will get correct first_insn_idx and last_insn_idx
> > instruction set during state checkpointing. As such, this case is
> > already handled and __mark_chain_precision() already handles that by
> > just doing nothing when we reach to the very first parent state.
> > st->parent will be NULL and we'll just stop. Perhaps some extra check
> > for reg_mask and stack_mask is due here, but this patch doesn't address
> > that issue.
> >
> > More problematic second case is when global function's initial state is
> > immediately checkpointed before we manage to process the very first
> > instruction. This is happening because when there is a call to global
> > subprog from the main program the very first subprog's instruction is
> > marked as pruning point, so before we manage to process first
> > instruction we have to check and checkpoint state. This patch adds
> > a special handling for such "empty" state, which is identified by having
> > st->last_insn_idx set to -1. In such case, we check that we are indeed
> > validating global subprog, and with some sanity checking we mark input
> > args as precise if requested.
> >
> > Note that we also initialize state->first_insn_idx with correct start
> > insn_idx offset. For main program zero is correct value, but for any
> > subprog it's quite confusing to not have first_insn_idx set. This
> > doesn't have any functional impact, but helps with debugging and state
> > printing. We also explicitly initialize state->last_insns_idx instead of
> > relying on is_state_visited() to do this with env->prev_insns_idx, which
> > will be -1 on the very first instruction. This concludes necessary
> > changes to handle specifically global subprog's precision tracking.
> >
> > Second identified problem was missed handling of BPF helper functions
> > that call into subprogs (e.g., bpf_loop and few others). From precision
> > tracking and backtracking logic's standpoint those are effectively calls
> > into subprogs and should be called as BPF_PSEUDO_CALL calls.
> >
> > This patch takes the least intrusive way and just checks against a short
> > list of current BPF helpers that do call subprogs, encapsulated in
> > is_callback_calling_function() function. But to prevent accidentally
> > forgetting to add new BPF helpers to this "list", we also do a sanity
> > check in __check_func_call, which has to be called for each such special
> > BPF helper, to validate that BPF helper is indeed recognized as
> > callback-calling one. This should catch any missed checks in the future.
> > Adding some special flags to be added in function proto definitions
> > seemed like an overkill in this case.
> >
> > With the above changes, it's possible to remove forceful setting of
> > reg->precise to true in __mark_reg_unknown, which turns on precision
> > tracking both inside subprogs and entry progs that have subprogs. No
> > warnings or errors were detected across all the selftests, but also when
> > validating with veristat against internal Meta BPF objects and Cilium
> > objects. Further, in some BPF programs there are noticeable reduction in
> > number of states and instructions validated due to more effective
> > precision tracking, especially benefiting syncookie test.
> >

[...]

> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> >  kernel/bpf/verifier.c | 61 ++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 60 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 7a71154de32b..cf9e20da8f6b 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -511,6 +511,15 @@ static bool is_dynptr_ref_function(enum bpf_func_id func_id)
> >       return func_id == BPF_FUNC_dynptr_data;
> >  }
> >
> > +static bool is_callback_calling_function(enum bpf_func_id func_id)
> > +{
> > +     return func_id == BPF_FUNC_for_each_map_elem ||
> > +            func_id == BPF_FUNC_timer_set_callback ||
> > +            func_id == BPF_FUNC_find_vma ||
> > +            func_id == BPF_FUNC_loop ||
> > +            func_id == BPF_FUNC_user_ringbuf_drain;
> > +}
> > +
> >  static bool helper_multiple_ref_obj_use(enum bpf_func_id func_id,
> >                                       const struct bpf_map *map)
> >  {
> > @@ -1684,7 +1693,7 @@ static void __mark_reg_unknown(const struct bpf_verifier_env *env,
> >       reg->type = SCALAR_VALUE;
> >       reg->var_off = tnum_unknown;
> >       reg->frameno = 0;
> > -     reg->precise = env->subprog_cnt > 1 || !env->bpf_capable;
> > +     reg->precise = !env->bpf_capable;
> >       __mark_reg_unbounded(reg);
> >  }
> >
> > @@ -2653,6 +2662,11 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
> >               if (opcode == BPF_CALL) {
> >                       if (insn->src_reg == BPF_PSEUDO_CALL)
> >                               return -ENOTSUPP;
> > +                     /* BPF helpers that invoke callback subprogs are
> > +                      * equivalent to BPF_PSEUDO_CALL above
> > +                      */
> > +                     if (insn->src_reg == 0 && is_callback_calling_function(insn->imm))
> > +                             return -ENOTSUPP;
> >                       /* regular helper call sets R0 */
> >                       *reg_mask &= ~1;
> >                       if (*reg_mask & 0x3f) {
> > @@ -2816,10 +2830,39 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
> >               return 0;
> >       if (!reg_mask && !stack_mask)
> >               return 0;
> > +
> >       for (;;) {
> >               DECLARE_BITMAP(mask, 64);
> >               u32 history = st->jmp_history_cnt;
> >
> > +             if (last_idx < 0) {
> > +                     /* we are at the entry into subprog, which
> > +                      * is expected for global funcs, but only if
> > +                      * requested precise registers are R1-R5
> > +                      * (which are global func's input arguments)
> > +                      */
> > +                     if (st->curframe == 0 &&
> > +                         st->frame[0]->subprogno > 0 &&
> > +                         st->frame[0]->callsite == BPF_MAIN_FUNC &&
> > +                         stack_mask == 0 && (reg_mask & ~0x3e) == 0) {
> > +                             bitmap_from_u64(mask, reg_mask);
> > +                             for_each_set_bit(i, mask, 32) {
> > +                                     reg = &st->frame[0]->regs[i];
> > +                                     if (reg->type != SCALAR_VALUE) {
> > +                                             reg_mask &= ~(1u << i);
> > +                                             continue;
> > +                                     }
> > +                                     reg->precise = true;
> > +                             }
> > +                             return 0;
> > +                     }
> > +
> > +                     verbose(env, "BUG backtracing func entry subprog %d reg_mask %x stack_mask %llx\n",
> > +                             st->frame[0]->subprogno, reg_mask, stack_mask);
> > +                     WARN_ONCE(1, "verifier backtracking bug");
> > +                     return -EFAULT;
> > +             }
> > +
> >               if (env->log.level & BPF_LOG_LEVEL2)
> >                       verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx);
>
> Minor nit: maybe move above if (last_idx < 0) after this verbose() print?
>

yep, done

> st->parent should be == NULL once we detected global func, right?

yep, if there are no bugs

>
> If so (and considering next patches) doing reg->precise = true for this state
> is unnecessary...
> and then doing reg_mask &=
> is also unnecessary since there is no st->parent to go next and no insns to backtrack, right?
>
> Probably worth to keep all this code anyway...
> just for clarity and documentation purpose.

You are right, there shouldn't be any more parent states. TBH, even
this "empty" state with last_idx == -1 is quite surprising.

But I thought I'd do correct precision handling just in case there is
some jump back to the first instruction, at which point we'll do state
comparison to detect infinite loops and the like. I haven't analyzed
much whether precision markings are important for such a use case, but
theoretically we can use this empty state in is_state_visited(), so
it's best to make sure we don't shortcut any logic.
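
Just to illustrate the shape of programs this code path is about (all
names below are made up, this is not one of the selftests): a global,
non-static subprog with SCALAR args, verified in isolation, whose first
insn becomes a pruning point because of the call from the main prog:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* global subprog: validated separately from its caller and can have
 * up to five SCALAR input arguments that may need precise markings
 */
__noinline int clamp_len(__u64 len, __u64 max)
{
	return len > max ? max : len;
}

SEC("tc")
int entry_prog(struct __sk_buff *skb)
{
	/* this call makes clamp_len()'s first insn a pruning point */
	return clamp_len(skb->len, 1500) != 0;
}

char LICENSE[] SEC("license") = "GPL";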

I was actually thinking that maybe a better way to handle all this
would be to prevent the global func's first instruction from being
marked as a prune_point. But all this requires some more careful
analysis and testing, so maybe I'll get back to it as I work on the
next patch set, which touches prune points anyway.

For now, if this doesn't look wrong, let's land it as is, and I'll
work on it some more later.

Oh, one more thing. We currently don't have a check at the very end
that reg_mask and stack_mask are zero, which should be the case for the
main program. We probably should add it. But then a global func will be
a special case, as it could have R1-R5 as scalar input args, so those
might end up being set in reg_mask; we could cross-check that using BTF.
But I left that for a follow-up as well, there are enough tricky changes
in one patch set :)
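
Roughly something like this at the point where __mark_chain_precision()
runs out of parent states (just a sketch of that follow-up, not part of
this series; placement and wording are approximate):

	/* sketch: for the main program all requested regs/slots should have
	 * been resolved by now; global funcs would need an exemption for
	 * R1-R5 scalar args, cross-checked against the subprog's BTF
	 */
	if (!st && (reg_mask || stack_mask)) {
		verbose(env, "BUG unresolved precision reg_mask %x stack_mask %llx\n",
			reg_mask, stack_mask);
		WARN_ONCE(1, "verifier backtracking bug");
		return -EFAULT;
	}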


[...]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf-next 3/6] bpf: allow precision tracking for programs with subprogs
  2022-11-04 16:32     ` Andrii Nakryiko
@ 2022-11-04 20:01       ` Alexei Starovoitov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2022-11-04 20:01 UTC (permalink / raw)
  To: Andrii Nakryiko, Kumar Kartikeya Dwivedi
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team

On Fri, Nov 4, 2022 at 9:32 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Nov 2, 2022 at 6:41 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Nov 01, 2022 at 11:22:18PM -0700, Andrii Nakryiko wrote:
> > > Stop forcing precise=true for SCALAR registers when BPF program has any
> > > subprograms. Current restriction means that any BPF program, as soon as
> > > it uses subprograms, will end up not getting any of the precision
> > > tracking benefits in reduction of number of verified states.
> > >
> > > This patch keeps the fallback mark_all_scalars_precise() behavior if
> > > precise marking has to cross function frames. E.g., if subprogram
> > > requires R1 (first input arg) to be marked precise, ideally we'd need to
> > > backtrack to the parent function and keep marking R1 and its
> > > dependencies as precise. But right now we give up and force all the
> > > SCALARs in any of the current and parent states to be forced to
> > > precise=true. We can lift that restriction in the future.
> > >
> > > But this patch fixes two issues identified when trying to enable
> > > precision tracking for subprogs.
> > >
> > > First, prevent "escaping" from top-most state in a global subprog. While
> > > with entry-level BPF program we never end up requesting precision for
> > > R1-R5 registers, because R2-R5 are not initialized (and so not readable
> > > in correct BPF program), and R1 is PTR_TO_CTX, not SCALAR, and so is
> > > implicitly precise. With global subprogs, though, it's different, as
> > > global subprog a) can have up to 5 SCALAR input arguments, which might
> > > get marked as precise=true and b) it is validated in isolation from its
> > > main entry BPF program. b) means that we can end up exhausting parent
> > > state chain and still not mark all registers in reg_mask as precise,
> > > which would lead to verifier bug warning.
> > >
> > > To handle that, we need to consider two cases. First, if the very first
> > > state is not immediately "checkpointed" (i.e., stored in state lookup
> > > hashtable), it will get correct first_insn_idx and last_insn_idx
> > > instruction set during state checkpointing. As such, this case is
> > > already handled and __mark_chain_precision() already handles that by
> > > just doing nothing when we reach to the very first parent state.
> > > st->parent will be NULL and we'll just stop. Perhaps some extra check
> > > for reg_mask and stack_mask is due here, but this patch doesn't address
> > > that issue.
> > >
> > > More problematic second case is when global function's initial state is
> > > immediately checkpointed before we manage to process the very first
> > > instruction. This is happening because when there is a call to global
> > > subprog from the main program the very first subprog's instruction is
> > > marked as pruning point, so before we manage to process first
> > > instruction we have to check and checkpoint state. This patch adds
> > > a special handling for such "empty" state, which is identified by having
> > > st->last_insn_idx set to -1. In such case, we check that we are indeed
> > > validating global subprog, and with some sanity checking we mark input
> > > args as precise if requested.
> > >
> > > Note that we also initialize state->first_insn_idx with correct start
> > > insn_idx offset. For main program zero is correct value, but for any
> > > subprog it's quite confusing to not have first_insn_idx set. This
> > > doesn't have any functional impact, but helps with debugging and state
> > > printing. We also explicitly initialize state->last_insns_idx instead of
> > > relying on is_state_visited() to do this with env->prev_insns_idx, which
> > > will be -1 on the very first instruction. This concludes necessary
> > > changes to handle specifically global subprog's precision tracking.
> > >
> > > Second identified problem was missed handling of BPF helper functions
> > > that call into subprogs (e.g., bpf_loop and few others). From precision
> > > tracking and backtracking logic's standpoint those are effectively calls
> > > into subprogs and should be called as BPF_PSEUDO_CALL calls.
> > >
> > > This patch takes the least intrusive way and just checks against a short
> > > list of current BPF helpers that do call subprogs, encapsulated in
> > > is_callback_calling_function() function. But to prevent accidentally
> > > forgetting to add new BPF helpers to this "list", we also do a sanity
> > > check in __check_func_call, which has to be called for each such special
> > > BPF helper, to validate that BPF helper is indeed recognized as
> > > callback-calling one. This should catch any missed checks in the future.
> > > Adding some special flags to be added in function proto definitions
> > > seemed like an overkill in this case.
> > >
> > > With the above changes, it's possible to remove forceful setting of
> > > reg->precise to true in __mark_reg_unknown, which turns on precision
> > > tracking both inside subprogs and entry progs that have subprogs. No
> > > warnings or errors were detected across all the selftests, but also when
> > > validating with veristat against internal Meta BPF objects and Cilium
> > > objects. Further, in some BPF programs there are noticeable reduction in
> > > number of states and instructions validated due to more effective
> > > precision tracking, especially benefiting syncookie test.
> > >
>
> [...]
>
> > >
> > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > ---
> > >  kernel/bpf/verifier.c | 61 ++++++++++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 60 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 7a71154de32b..cf9e20da8f6b 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -511,6 +511,15 @@ static bool is_dynptr_ref_function(enum bpf_func_id func_id)
> > >       return func_id == BPF_FUNC_dynptr_data;
> > >  }
> > >
> > > +static bool is_callback_calling_function(enum bpf_func_id func_id)
> > > +{
> > > +     return func_id == BPF_FUNC_for_each_map_elem ||
> > > +            func_id == BPF_FUNC_timer_set_callback ||
> > > +            func_id == BPF_FUNC_find_vma ||
> > > +            func_id == BPF_FUNC_loop ||
> > > +            func_id == BPF_FUNC_user_ringbuf_drain;
> > > +}
> > > +
> > >  static bool helper_multiple_ref_obj_use(enum bpf_func_id func_id,
> > >                                       const struct bpf_map *map)
> > >  {
> > > @@ -1684,7 +1693,7 @@ static void __mark_reg_unknown(const struct bpf_verifier_env *env,
> > >       reg->type = SCALAR_VALUE;
> > >       reg->var_off = tnum_unknown;
> > >       reg->frameno = 0;
> > > -     reg->precise = env->subprog_cnt > 1 || !env->bpf_capable;
> > > +     reg->precise = !env->bpf_capable;
> > >       __mark_reg_unbounded(reg);
> > >  }
> > >
> > > @@ -2653,6 +2662,11 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
> > >               if (opcode == BPF_CALL) {
> > >                       if (insn->src_reg == BPF_PSEUDO_CALL)
> > >                               return -ENOTSUPP;
> > > +                     /* BPF helpers that invoke callback subprogs are
> > > +                      * equivalent to BPF_PSEUDO_CALL above
> > > +                      */
> > > +                     if (insn->src_reg == 0 && is_callback_calling_function(insn->imm))
> > > +                             return -ENOTSUPP;
> > >                       /* regular helper call sets R0 */
> > >                       *reg_mask &= ~1;
> > >                       if (*reg_mask & 0x3f) {
> > > @@ -2816,10 +2830,39 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
> > >               return 0;
> > >       if (!reg_mask && !stack_mask)
> > >               return 0;
> > > +
> > >       for (;;) {
> > >               DECLARE_BITMAP(mask, 64);
> > >               u32 history = st->jmp_history_cnt;
> > >
> > > +             if (last_idx < 0) {
> > > +                     /* we are at the entry into subprog, which
> > > +                      * is expected for global funcs, but only if
> > > +                      * requested precise registers are R1-R5
> > > +                      * (which are global func's input arguments)
> > > +                      */
> > > +                     if (st->curframe == 0 &&
> > > +                         st->frame[0]->subprogno > 0 &&
> > > +                         st->frame[0]->callsite == BPF_MAIN_FUNC &&
> > > +                         stack_mask == 0 && (reg_mask & ~0x3e) == 0) {
> > > +                             bitmap_from_u64(mask, reg_mask);
> > > +                             for_each_set_bit(i, mask, 32) {
> > > +                                     reg = &st->frame[0]->regs[i];
> > > +                                     if (reg->type != SCALAR_VALUE) {
> > > +                                             reg_mask &= ~(1u << i);
> > > +                                             continue;
> > > +                                     }
> > > +                                     reg->precise = true;
> > > +                             }
> > > +                             return 0;
> > > +                     }
> > > +
> > > +                     verbose(env, "BUG backtracing func entry subprog %d reg_mask %x stack_mask %llx\n",
> > > +                             st->frame[0]->subprogno, reg_mask, stack_mask);
> > > +                     WARN_ONCE(1, "verifier backtracking bug");
> > > +                     return -EFAULT;
> > > +             }
> > > +
> > >               if (env->log.level & BPF_LOG_LEVEL2)
> > >                       verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx);
> >
> > Minor nit: maybe move above if (last_idx < 0) after this verbose() print?
> >
>
> yep, done
>
> > st->parent should be == NULL once we detected global func, right?
>
> yep, if there are no bugs
>
> >
> > If so (and considering next patches) doing reg->precise = true for this state
> > is unnecessary...
> > and then doing reg_mask &=
> > is also unnecessary since there is no st->parent to go next and no insns to backtrack, right?
> >
> > Probably worth to keep all this code anyway...
> > just for clarity and documentation purpose.
>
> You are right, there shouldn't be any more parent states. TBH, even
> this "empty" state with last_idx == -1 is quite surprising.
>
> But I thought I'd do correct precision handling just in case there is
> some jump back to first instruction, at which point we'll do state
> comparison to detect infinite loops and stuff like that. I haven't
> analyzed much if precision markings are important for such use case,
> but theoretically we can use this empty state in is_state_visited(),
> so best make sure that we don't shortcut any logic.
>
> I was actually thinking that maybe a better way to handle all this
> would be to prevent having a global func's first instruction marked as
> prune_point. But all this requires some more careful analysis and
> testing. So maybe I'll get back to this as I work on next patch set.
> It touches prune points anyways.
>
> For now, if this doesn't look wrong, let's land it as is, and I'll
> work on it some more later.
>
> Oh, one more thing. We currently don't have a check at the very end
> that reg_mask and stack_mask is zero, which should be the case for the
> main program. We probably should add it. But then global func will be
> a special case, as it could have R1-R5 as scalar input args, so those
> might end up being set in reg_mask. And we can check that using BTF.
> But I left it also for follow up, there are enough tricky changes in
> one patch set :)

Yep. Agree on all points.
This set is already tricky and it's a great improvement
to the precision logic.
Though it's been pushed to bpf-next, I hope folks who reviewed
the precision logic in the past (Kumar, Daniel, others?) will
spend time carefully analyzing the patches.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-11-04 20:02 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-02  6:22 [PATCH bpf-next 0/6] BPF verifier precision tracking improvements Andrii Nakryiko
2022-11-02  6:22 ` [PATCH bpf-next 1/6] bpf: propagate precision in ALU/ALU64 operations Andrii Nakryiko
2022-11-03  1:23   ` Alexei Starovoitov
2022-11-02  6:22 ` [PATCH bpf-next 2/6] bpf: propagate precision across all frames, not just the last one Andrii Nakryiko
2022-11-02  6:22 ` [PATCH bpf-next 3/6] bpf: allow precision tracking for programs with subprogs Andrii Nakryiko
2022-11-03  1:41   ` Alexei Starovoitov
2022-11-04 16:32     ` Andrii Nakryiko
2022-11-04 20:01       ` Alexei Starovoitov
2022-11-02  6:22 ` [PATCH bpf-next 4/6] bpf: stop setting precise in current state Andrii Nakryiko
2022-11-02  6:22 ` [PATCH bpf-next 5/6] bpf: aggressively forget precise markings during state checkpointing Andrii Nakryiko
2022-11-02  6:22 ` [PATCH bpf-next 6/6] selftests/bpf: make test_align selftest more robust Andrii Nakryiko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox