Re: [PATCH v2 1/2] bpf, verifier: Improve precision for BPF_ADD and BPF_SUB

bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Eduard Zingerman <eddyz87@gmail.com>
To: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>,
	 ast@kernel.org
Cc: m.shachnai@rutgers.edu, srinivas.narayana@rutgers.edu,
	 santosh.nagarakatte@rutgers.edu,
	Daniel Borkmann <daniel@iogearbox.net>,
	John Fastabend <john.fastabend@gmail.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau	 <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>,
	Yonghong Song	 <yonghong.song@linux.dev>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev	 <sdf@fomichev.me>,
	Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>,
		bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/2] bpf, verifier: Improve precision for BPF_ADD and BPF_SUB
Date: Wed, 18 Jun 2025 13:53:56 -0700	[thread overview]
Message-ID: <25d3c6762681b583165b1afe6b1837b22d20b818.camel@gmail.com> (raw)
In-Reply-To: <20250617231733.181797-2-harishankar.vishwanathan@gmail.com>

On Tue, 2025-06-17 at 19:17 -0400, Harishankar Vishwanathan wrote:
> This patch improves the precison of the scalar(32)_min_max_add and
> scalar(32)_min_max_sub functions, which update the u(32)min/u(32)_max
> ranges for the BPF_ADD and BPF_SUB instructions. We discovered this more
> precise operator using a technique we are developing for automatically
> synthesizing functions for updating tnums and ranges.
> 
> According to the BPF ISA [1], "Underflow and overflow are allowed during
> arithmetic operations, meaning the 64-bit or 32-bit value will wrap".
> Our patch leverages the wrap-around semantics of unsigned overflow and
> underflow to improve precision.
> 
> Below is an example of our patch for scalar_min_max_add; the idea is
> analogous for all four functions.
> 
> There are three cases to consider when adding two u64 ranges [dst_umin,
> dst_umax] and [src_umin, src_umax]. Consider a value x in the range
> [dst_umin, dst_umax] and another value y in the range [src_umin,
> src_umax].
> 
> (a) No overflow: No addition x + y overflows. This occurs when even the
> largest possible sum, i.e., dst_umax + src_umax <= U64_MAX.
> 
> (b) Partial overflow: Some additions x + y overflow. This occurs when
> the largest possible sum overflows (dst_umax + src_umax > U64_MAX), but
> the smallest possible sum does not overflow (dst_umin + src_umin <=
> U64_MAX).
> 
> (c) Full overflow: All additions x + y overflow. This occurs when both
> the smallest possible sum and the largest possible sum overflow, i.e.,
> both (dst_umin + src_umin) and (dst_umax + src_umax) are > U64_MAX.
> 
> The current implementation conservatively sets the output bounds to
> unbounded, i.e, [umin=0, umax=U64_MAX], whenever there is *any*
> possibility of overflow, i.e, in cases (b) and (c). Otherwise it
> computes tight bounds as [dst_umin + src_umin, dst_umax + src_umax]:
> 
> if (check_add_overflow(*dst_umin, src_reg->umin_value, dst_umin) ||
>     check_add_overflow(*dst_umax, src_reg->umax_value, dst_umax)) {
> 	*dst_umin = 0;
> 	*dst_umax = U64_MAX;
> }
> 
> Our synthesis-based technique discovered a more precise operator.
> Particularly, in case (c), all possible additions x + y overflow and
> wrap around according to eBPF semantics, and the computation of the
> output range as [dst_umin + src_umin, dst_umax + src_umax] continues to
> work. Only in case (b), do we need to set the output bounds to
> unbounded, i.e., [0, U64_MAX].
> 
> Case (b) can be checked by seeing if the minimum possible sum does *not*
> overflow and the maximum possible sum *does* overflow, and when that
> happens, we set the output to unbounded:
> 
> min_overflow = check_add_overflow(*dst_umin, src_reg->umin_value, dst_umin);
> max_overflow = check_add_overflow(*dst_umax, src_reg->umax_value, dst_umax);
> 
> if (!min_overflow && max_overflow) {
> 	*dst_umin = 0;
> 	*dst_umax = U64_MAX;
> }
> 
> Below is an example eBPF program and the corresponding log from the
> verifier. At instruction 7: (0f) r5 += r3, due to conservative overflow
> handling, the current implementation of scalar_min_max_add() sets r5's
> bounds to [0, U64_MAX], which is then updated by reg_bounds_sync() to
> [0x3d67e960f7d, U64_MAX].
> 
> 0: R1=ctx() R10=fp0
> 0: (85) call bpf_get_prandom_u32#7    ; R0_w=scalar()
> 1: (bf) r3 = r0                       ; R0_w=scalar(id=1) R3_w=scalar(id=1)
> 2: (18) r4 = 0x950a43d67e960f7d       ; R4_w=0x950a43d67e960f7d
> 4: (4f) r3 |= r4                      ; R3_w=scalar(smin=0x950a43d67e960f7d,smax=-1,umin=0x950a43d67e960f7d,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x950a43d67e960f7d; 0x6af5bc298169f082)) R4_w=0x950a43d67e960f7d
> 5: (18) r5 = 0xc014a00000000000       ; R5_w=0xc014a00000000000
> 7: (0f) r5 += r3                      ; R3_w=scalar(smin=0x950a43d67e960f7d,smax=-1,umin=0x950a43d67e960f7d,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x950a43d67e960f7d; 0x6af5bc298169f082)) R5_w=scalar(smin=0x800003d67e960f7d,umin=0x3d67e960f7d,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x3d67e960f7d; 0xfffffc298169f082))
> 8: (b7) r0 = 0                        ; R0_w=0
> 9: (95) exit
> 
> With our patch, r5's bounds after instruction 7 are set to a much more
> precise [0x551ee3d67e960f7d, 0xc0149fffffffffff] by
> scalar_min_max_add().
> 
> ...
> 7: (0f) r5 += r3                      ; R3_w=scalar(smin=0x950a43d67e960f7d,smax=-1,umin=0x950a43d67e960f7d,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x950a43d67e960f7d; 0x6af5bc298169f082)) R5_w=scalar(smin=0x800003d67e960f7d,umin=0x551ee3d67e960f7d,umax=0xc0149fffffffffff,smin32=0xfe960f7d,umin32=0x7e960f7d,var_off=(0x3d67e960f7d; 0xfffffc298169f082))
> 8: (b7) r0 = 0                        ; R0_w=0
> 9: (95) exit
> 
> The logic for scalar32_min_max_add is analogous. For the
> scalar(32)_min_max_sub functions, the reasoning is similar but applied
> to detecting underflow instead of overflow.
> 
> We verified the correctness of the new implementations using Agni [3,4].
> 
> We since also discovered that a similar technique has been used to
> calculate output ranges for unsigned interval addition and subtraction
> in Hacker's Delight [2].
> 
> [1] https://docs.kernel.org/bpf/standardization/instruction-set.html
> [2] Hacker's Delight Ch.4-2, Propagating Bounds through Add’s and Subtract’s
> [3] https://github.com/bpfverif/agni
> [4] https://people.cs.rutgers.edu/~sn349/papers/sas24-preprint.pdf
> 
> Co-developed-by: Matan Shachnai <m.shachnai@rutgers.edu>
> Signed-off-by: Matan Shachnai <m.shachnai@rutgers.edu>
> Co-developed-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
> Signed-off-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
> Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
> Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
> Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

next prev parent reply	other threads:[~2025-06-18 20:53 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-17 23:17 [PATCH v2 0/2] bpf, verifier: Improve precision of BPF_ADD and BPF_SUB Harishankar Vishwanathan
2025-06-17 23:17 ` [PATCH v2 1/2] bpf, verifier: Improve precision for " Harishankar Vishwanathan
2025-06-18 11:24   ` Hao Sun
2025-06-19 22:17     ` Harishankar Vishwanathan
2025-06-18 20:53   ` Eduard Zingerman [this message]
2025-06-17 23:17 ` [PATCH v2 2/2] selftests/bpf: Add testcases " Harishankar Vishwanathan
2025-06-18 21:22   ` Eduard Zingerman
2025-06-19 21:13     ` Harishankar Vishwanathan
2025-06-19 21:55       ` Eduard Zingerman
2025-06-19 22:34         ` Harishankar Vishwanathan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=25d3c6762681b583165b1afe6b1837b22d20b818.camel@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=haoluo@google.com \
    --cc=harishankar.vishwanathan@gmail.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=m.shachnai@rutgers.edu \
    --cc=martin.lau@linux.dev \
    --cc=santosh.nagarakatte@rutgers.edu \
    --cc=sdf@fomichev.me \
    --cc=song@kernel.org \
    --cc=srinivas.narayana@rutgers.edu \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).