public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
To: Paul Chaignon <paul.chaignon@gmail.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>,
	Shung-Hsi Yu <shung-hsi.yu@suse.com>,
	Srinivas Narayana <srinivas.narayana@rutgers.edu>,
	Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
Subject: Re: [PATCH v2 bpf-next 4/6] bpf: Simulate branches to prune based on range violations
Date: Mon, 23 Mar 2026 16:19:08 +0000	[thread overview]
Message-ID: <87ldfijz7n.fsf@gmail.com> (raw)
In-Reply-To: <d4fe45f8bd5c6a48efd2ba3b66932bf7eb5aa020.1774025082.git.paul.chaignon@gmail.com>

Paul Chaignon <paul.chaignon@gmail.com> writes:

> From: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>
>
> This patch fixes the invariant violations that can happen after we
> refine ranges & tnum based on an incorrectly-detected branch condition.
> For example, the branch is always true, but we miss it in
> is_branch_taken; we then refine based on the branch being false and end
> up with incoherent ranges (e.g. umax < umin).
>
> To avoid this, we can simulate the refinement on both branches. More
> specifically, this patch simulates both branches taken using
> regs_refine_cond_op and reg_bounds_sync. If the resulting register
> states are ill-formed on one of the branches, is_branch_taken can mark
> that branch as "never taken".
>
> On a more formal note, we can deduce a branch is not taken when
> regs_refine_cond_op or reg_bounds_sync returns an ill-formed state
> because the branch operators are sound (formally verified). Soundness
> means that the verifier is guaranteed to produce sound outputs on the
> taken branches. On the non-taken branch (explored because of
> imprecision in the bounds), the verifier is free to produce any output.
> We use ill-formedness as a signal that the branch is dead and prune
> that branch.
>
> This patch moves the refinement logic for both branches from
> reg_set_min_max to their own function, simulate_both_branches_taken,
> which is called from is_scalar_branch_taken. As a result,
> reg_set_min_max now only runs sanity checks and has been renamed to
> reg_bounds_sanity_check_branches to reflect that.
>
> We have had five patches fixing specific cases of invariant violations
> in the past, all added with selftests:
> - commit fbc7aef517d8 ("bpf: Fix u32/s32 bounds when ranges cross
>   min/max boundary")
> - commit efc11a667878 ("bpf: Improve bounds when tnum has a single
>   possible value")
> - commit f41345f47fb2 ("bpf: Use tnums for JEQ/JNE is_branch_taken
>   logic")
> - commit 00bf8d0c6c9b ("bpf: Improve bounds when s64 crosses sign
>   boundary")
> - commit 6279846b9b25 ("bpf: Forget ranges when refining tnum after
>   JSET")
>
> To confirm that this patch addresses all invariant violations, we have
> also reverted those five commits and verified that their related
> selftests don't cause any invariant violation warnings anymore. Those
> selftests still fail but only because of mis-detected branches or
> less-precise bounds than expected. This demonstrates that the current
> patch is enough to avoid the invariant violation warning AND that the
> previous five patches are still useful to improve branch detection.
>
> In addition to the selftests, this change was also tested with the
> Cilium complexity test suite: all programs were successfully loaded and
> it didn't change the number of processed instructions.
>
> Reported-by: syzbot+c950cc277150935cc0b5@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=c950cc277150935cc0b5
> Co-developed-by: Paul Chaignon <paul.chaignon@gmail.com>
> Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
> Co-developed-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
> Signed-off-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
> Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
> Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
> Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>
> ---
>  kernel/bpf/verifier.c | 84 +++++++++++++++++++++++++------------------
>  1 file changed, 49 insertions(+), 35 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 8d7ad5f21ed6..0b4f622efec1 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -16717,11 +16717,48 @@ static void find_good_pkt_pointers(struct bpf_verifier_state *vstate,
>  	}));
>  }
>  
> +static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state *reg2,
> +				u8 opcode, bool is_jmp32);
> +static u8 rev_opcode(u8 opcode);
> +
> +/* Learn more information about live branches by simulating both branches being
> + * taken using regs_refine_cond_op. Because regs_refine_cond_op is sound when
> + * the branch is taken, if it produces ill-formed register bounds, it must mean
> + * that the branch is dead.
> + */
Sorry for being nit-picky, could you please use kernel style comment style +
maybe reword it a little bit, instead of:

/*
 * Because regs_refine_cond_op is sound when
 * the branch is taken, if it produces ill-formed register bounds, it must mean
 * that the branch is dead.
 */
Something like:
/*
 * regs_refine_cond_op() is sound, so producing ill-formed register
 * bounds for the branch means that branch is dead.
 */

> +static int simulate_both_branches_taken(struct bpf_verifier_env *env, u8 opcode, bool is_jmp32)
> +{
> +	/* Fallthrough (FALSE) branch */
> +	regs_refine_cond_op(&env->false_reg1, &env->false_reg2, rev_opcode(opcode), is_jmp32);
> +	reg_bounds_sync(&env->false_reg1);
> +	reg_bounds_sync(&env->false_reg2);
This is probably more related to patch 2: is it necessary to have both
true/false_reg1/2 pairs, it looks like we process false branch before
the true branch and they never intersect, don't they? So it worth
removing a pair of bpf_reg_states from env?
> +	/* If there is a range bounds violation in *any* of the abstract values in
> +	 * either reg_states in the FALSE branch (i.e. reg1, reg2), the
> +	 * FALSE branch must be dead. Only TRUE branch will be taken.
> +	 */
> +	if (range_bounds_violation(&env->false_reg1) || range_bounds_violation(&env->false_reg2))
> +		return 1;
> +
> +	/* Jump (TRUE) branch */
> +	regs_refine_cond_op(&env->true_reg1, &env->true_reg2, opcode, is_jmp32);
> +	reg_bounds_sync(&env->true_reg1);
> +	reg_bounds_sync(&env->true_reg2);
> +	/* If there is a range bounds violation in *any* of the abstract values in
> +	 * either reg_states in the TRUE branch (i.e. true_reg1, true_reg2), the
> +	 * TRUE branch must be dead. Only FALSE branch will be taken.
> +	 */
> +	if (range_bounds_violation(&env->true_reg1) || range_bounds_violation(&env->true_reg2))
> +		return 0;
> +
> +	/* Both branches are possible, we can't determine which one will be taken. */
> +	return -1;
> +}
> +
>  /*
>   * <reg1> <op> <reg2>, currently assuming reg2 is a constant
>   */
> -static int is_scalar_branch_taken(struct bpf_reg_state *reg1, struct bpf_reg_state *reg2,
> -				  u8 opcode, bool is_jmp32)
> +static int is_scalar_branch_taken(struct bpf_verifier_env *env, struct bpf_reg_state *reg1,
> +				  struct bpf_reg_state *reg2, u8 opcode, bool is_jmp32)
>  {
>  	struct tnum t1 = is_jmp32 ? tnum_subreg(reg1->var_off) : reg1->var_off;
>  	struct tnum t2 = is_jmp32 ? tnum_subreg(reg2->var_off) : reg2->var_off;
> @@ -16873,7 +16910,7 @@ static int is_scalar_branch_taken(struct bpf_reg_state *reg1, struct bpf_reg_sta
>  		break;
>  	}
>  
> -	return -1;
> +	return simulate_both_branches_taken(env, opcode, is_jmp32);
>  }
>  
>  static int flip_opcode(u32 opcode)
> @@ -16944,8 +16981,8 @@ static int is_pkt_ptr_branch_taken(struct bpf_reg_state *dst_reg,
>   * -1 - unknown. Example: "if (reg1 < 5)" is unknown when register value
>   *      range [0,10]
>   */
> -static int is_branch_taken(struct bpf_reg_state *reg1, struct bpf_reg_state *reg2,
> -			   u8 opcode, bool is_jmp32)
> +static int is_branch_taken(struct bpf_verifier_env *env, struct bpf_reg_state *reg1,
> +			   struct bpf_reg_state *reg2, u8 opcode, bool is_jmp32)
>  {
>  	if (reg_is_pkt_pointer_any(reg1) && reg_is_pkt_pointer_any(reg2) && !is_jmp32)
>  		return is_pkt_ptr_branch_taken(reg1, reg2, opcode);
> @@ -16983,7 +17020,7 @@ static int is_branch_taken(struct bpf_reg_state *reg1, struct bpf_reg_state *reg
>  	}
>  
>  	/* now deal with two scalars, but not necessarily constants */
> -	return is_scalar_branch_taken(reg1, reg2, opcode, is_jmp32);
> +	return is_scalar_branch_taken(env, reg1, reg2, opcode, is_jmp32);
>  }
>  
>  /* Opcode that corresponds to a *false* branch condition.
> @@ -17074,8 +17111,8 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state
>  			/* u32_min_value is not equal to 0xffffffff at this point,
>  			 * because otherwise u32_max_value is 0xffffffff as well,
>  			 * in such a case both reg1 and reg2 would be constants,
> -			 * jump would be predicted and reg_set_min_max() won't
> -			 * be called.
> +			 * jump would be predicted and regs_refine_cond_op()
> +			 * wouldn't be called.
>  			 *
>  			 * Same reasoning works for all {u,s}{min,max}{32,64} cases
>  			 * below.
> @@ -17182,34 +17219,11 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state
>  	}
>  }
>  
> -/* Adjusts the register min/max values in the case that the dst_reg and
> - * src_reg are both SCALAR_VALUE registers (or we are simply doing a BPF_K
> - * check, in which case we have a fake SCALAR_VALUE representing insn->imm).
> - * Technically we can do similar adjustments for pointers to the same object,
> - * but we don't support that right now.
> - */
> -static int reg_set_min_max(struct bpf_verifier_env *env,
> -			   u8 opcode, bool is_jmp32)
> +/* Check for invariant violations on the registers for both branches of a condition */
> +static int regs_bounds_sanity_check_branches(struct bpf_verifier_env *env)
>  {
>  	int err;
>  
> -	/* If either register is a pointer, we can't learn anything about its
> -	 * variable offset from the compare (unless they were a pointer into
> -	 * the same object, but we don't bother with that).
> -	 */
> -	if (env->false_reg1.type != SCALAR_VALUE || env->false_reg2.type != SCALAR_VALUE)
> -		return 0;
> -
> -	/* fallthrough (FALSE) branch */
> -	regs_refine_cond_op(&env->false_reg1, &env->false_reg2, rev_opcode(opcode), is_jmp32);
> -	reg_bounds_sync(&env->false_reg1);
> -	reg_bounds_sync(&env->false_reg2);
> -
> -	/* jump (TRUE) branch */
> -	regs_refine_cond_op(&env->true_reg1, &env->true_reg2, opcode, is_jmp32);
> -	reg_bounds_sync(&env->true_reg1);
> -	reg_bounds_sync(&env->true_reg2);
> -
>  	err = reg_bounds_sanity_check(env, &env->true_reg1, "true_reg1");
>  	err = err ?: reg_bounds_sanity_check(env, &env->true_reg2, "true_reg2");
>  	err = err ?: reg_bounds_sanity_check(env, &env->false_reg1, "false_reg1");
> @@ -17595,7 +17609,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
>  	copy_register_state(&env->false_reg2, src_reg);
>  	copy_register_state(&env->true_reg1, dst_reg);
>  	copy_register_state(&env->true_reg2, src_reg);
> -	pred = is_branch_taken(dst_reg, src_reg, opcode, is_jmp32);
> +	pred = is_branch_taken(env, dst_reg, src_reg, opcode, is_jmp32);
>  	if (pred >= 0) {
>  		/* If we get here with a dst_reg pointer type it is because
>  		 * above is_branch_taken() special cased the 0 comparison.
> @@ -17659,7 +17673,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
>  		return PTR_ERR(other_branch);
>  	other_branch_regs = other_branch->frame[other_branch->curframe]->regs;
>  
> -	err = reg_set_min_max(env, opcode, is_jmp32);
> +	err = regs_bounds_sanity_check_branches(env);
>  	if (err)
>  		return err;
>  
> -- 
> 2.43.0

  parent reply	other threads:[~2026-03-23 16:19 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-20 16:45 [PATCH v2 bpf-next 0/6] Fix invariant violations and improve branch detection Paul Chaignon
2026-03-20 16:47 ` [PATCH v2 bpf-next 1/6] bpf: Refactor reg_bounds_sanity_check Paul Chaignon
2026-03-23  8:01   ` Shung-Hsi Yu
2026-03-23 14:16   ` Mykyta Yatsenko
2026-03-24 16:56     ` Harishankar Vishwanathan
2026-03-24 18:16       ` Mykyta Yatsenko
2026-03-20 16:49 ` [PATCH v2 bpf-next 2/6] bpf: Use bpf_verifier_env buffers for reg_set_min_max Paul Chaignon
2026-03-23  8:15   ` Shung-Hsi Yu
2026-03-23 15:33   ` Mykyta Yatsenko
2026-03-23 18:42   ` Eduard Zingerman
2026-03-20 16:49 ` [PATCH v2 bpf-next 3/6] bpf: Exit early if reg_bounds_sync gets invalid inputs Paul Chaignon
2026-03-23 12:12   ` Shung-Hsi Yu
2026-03-24 17:46     ` Harishankar Vishwanathan
2026-03-23 18:47   ` Eduard Zingerman
2026-03-24 19:28     ` Harishankar Vishwanathan
2026-03-24 19:33       ` Eduard Zingerman
2026-03-20 16:49 ` [PATCH v2 bpf-next 4/6] bpf: Simulate branches to prune based on range violations Paul Chaignon
2026-03-23 12:23   ` Shung-Hsi Yu
2026-03-23 16:19   ` Mykyta Yatsenko [this message]
2026-03-24 20:36     ` Harishankar Vishwanathan
2026-03-25 13:52       ` Mykyta Yatsenko
2026-03-23 19:05   ` Eduard Zingerman
2026-03-24 23:59     ` Harishankar Vishwanathan
2026-03-25  0:08       ` Eduard Zingerman
2026-03-20 16:50 ` [PATCH v2 bpf-next 5/6] selftests/bpf: Cover invariant violation cases from syzbot Paul Chaignon
2026-03-23 17:46   ` Mykyta Yatsenko
2026-03-28 16:20     ` Paul Chaignon
2026-03-28 17:31       ` Alexei Starovoitov
2026-03-20 16:50 ` [PATCH v2 bpf-next 6/6] selftests/bpf: Remove invariant violation flags Paul Chaignon
2026-03-23 18:04   ` Mykyta Yatsenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ldfijz7n.fsf@gmail.com \
    --to=mykyta.yatsenko5@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=harishankar.vishwanathan@gmail.com \
    --cc=paul.chaignon@gmail.com \
    --cc=santosh.nagarakatte@rutgers.edu \
    --cc=shung-hsi.yu@suse.com \
    --cc=srinivas.narayana@rutgers.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox