From: Eduard Zingerman <eddyz87@gmail.com>
To: Shung-Hsi Yu <shung-hsi.yu@suse.com>,
Xu Kuohai <xukuohai@huaweicloud.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>,
Roberto Sassu <roberto.sassu@huawei.com>,
Edward Cree <ecree.xilinx@gmail.com>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>,
Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>,
Srinivas Narayana <srinivas.narayana@rutgers.edu>,
Matan Shachnai <m.shachnai@rutgers.edu>
Subject: Re: [RFC bpf-next] bpf, verifier: improve signed ranges inference for BPF_AND
Date: Wed, 17 Jul 2024 14:10:35 -0700
Message-ID: <be239a5581e5b7d5c6f310c2a4c11282aa5896b5.camel@gmail.com>
In-Reply-To: <ykuhustu7vt2ilwhl32kj655xfdgdlm2xkl5rff6tw2ycksovp@ss2n4gpjysnw>
On Tue, 2024-07-16 at 22:52 +0800, Shung-Hsi Yu wrote:
[...]
> To allow verification of such instruction patterns, update
> scalar*_min_max_and() to infer signed ranges directly from the signed
> ranges of the operands. With BPF_AND, the result can only gain more
> unset '0' bits, thus its bit pattern only moves towards
> 0x0000000000000000. The difficulty lies in how to deal with signs.
> While a non-negative (positive or zero) value simply grows smaller, a
> negative number can grow smaller, but may also have its sign bit
> cleared and become a larger value.
>
> To better address this situation we split the signed ranges into
> negative range and non-negative range cases, ignoring the mixed sign
> cases for now; and only consider how to calculate smax_value.
>
> Since negative range & negative range preserves the sign bit, we know
> the result is still a negative value, so it only moves towards S64_MIN
> and never underflows. A safe bet is therefore to use the value in the
> ranges that is closest to 0, thus
> "max(dst_reg->smax_value, src_reg->smax_value)". For negative range &
> non-negative range the sign bit is always cleared, so we know the
> result is non-negative and only moves towards 0; a safe bet is to use
> the smax_value of the non-negative range. Last but not least,
> non-negative range & non-negative range is still a non-negative value
> and only moves towards 0; however, same as in the unsigned range case,
> the maximum is actually capped by the lesser of the two, thus
> min(dst_reg->smax_value, src_reg->smax_value).
>
> Listing out the above reasoning as a table (dst_reg abbreviated as dst,
> src_reg abbreviated as src, smax_value abbreviated as smax) we get:
>
>                         |                         src_reg
>        smax = ?         +---------------------------+---------------------------
>                         |        negative           |       non-negative
> ---------+--------------+---------------------------+---------------------------
>          | negative     | max(dst->smax, src->smax) |         src->smax
> dst_reg  +--------------+---------------------------+---------------------------
>          | non-negative |         dst->smax         | min(dst->smax, src->smax)
>
> However, this is quite complicated. Luckily it can be simplified given
> the following observations:
>
> max(dst_reg->smax_value, src_reg->smax_value) >= src_reg->smax_value
> max(dst_reg->smax_value, src_reg->smax_value) >= dst_reg->smax_value
> max(dst_reg->smax_value, src_reg->smax_value) >= min(dst_reg->smax_value, src_reg->smax_value)
>
> So we could substitute the cells in the table above all with max(...),
> and arrive at:
>
>                         |                         src_reg
>        smax' = ?        +---------------------------+---------------------------
>                         |        negative           |       non-negative
> ---------+--------------+---------------------------+---------------------------
>          | negative     | max(dst->smax, src->smax) | max(dst->smax, src->smax)
> dst_reg  +--------------+---------------------------+---------------------------
>          | non-negative | max(dst->smax, src->smax) | max(dst->smax, src->smax)
>
> Meaning that simply using
>
> max(dst_reg->smax_value, src_reg->smax_value)
>
> to calculate the resulting smax_value would work across all sign combinations.
>
>
> For smin_value, we know that non-negative range & non-negative range
> and negative range & non-negative range both result in a non-negative
> value, so an easy guess is to use the minimum non-negative value,
> thus 0.
>
>                         |                         src_reg
>        smin = ?         +----------------------------+---------------------------
>                         |          negative          |       non-negative
> ---------+--------------+----------------------------+---------------------------
>          | negative     |             ?              |             0
> dst_reg  +--------------+----------------------------+---------------------------
>          | non-negative |             0              |             0
>
> This leaves the negative range & negative range case to be considered.
> We know that negative range & negative range always yields a negative
> value, so a preliminary guess would be S64_MIN. However, that guess is
> too imprecise to help with the r0 <<= 62, r0 s>>= 63, r0 &= -13 pattern
> we're trying to deal with here.
>
> This can be further improved with the observation that for negative
> range & negative range, the smallest possible value must be one that
> has the longest _common_ most-significant sequence of set '1' bits.
> Thus we can use min(dst_reg->smin_value, src_reg->smin_value) as the
> starting point, as the smaller value will be the one with the shorter
> most-significant sequence of set '1' bits. But that alone is not
> enough, as we do not know whether the rest of the bits would be set, so
> the safest guess is one that clears all bits after the most-significant
> sequence of set '1' bits, something akin to bit_floor(), but rounding
> to a negative power-of-2 instead.
>
> negative_bit_floor(0xffff000000000003) == 0xffff000000000000
> negative_bit_floor(0xf0ff0000ffff0000) == 0xf000000000000000
> negative_bit_floor(0xfffffb0000000000) == 0xfffff80000000000
>
> With negative range & negative range solved, we now have:
>
>                         |                         src_reg
>        smin = ?         +----------------------------+---------------------------
>                         |          negative          |       non-negative
> ---------+--------------+----------------------------+---------------------------
>          | negative     |negative_bit_floor(         |             0
>          |              |  min(dst->smin, src->smin))|
> dst_reg  +--------------+----------------------------+---------------------------
>          | non-negative |             0              |             0
>
> This can be further simplified since min(dst->smin, src->smin) < 0 when both
> dst_reg and src_reg have a negative range. Which means using
>
> negative_bit_floor(min(dst_reg->smin_value, src_reg->smin_value))
>
> to calculate the resulting smin_value would work across all sign combinations.
>
> Together these allow us to infer the signed range of the result of a
> BPF_AND operation using the signed ranges of its operands.
Hi Shung-Hsi,
This seems quite elegant.
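To make the quoted rule concrete, here is a minimal C sketch of it.
The struct srange type and and_signed_bounds() are hypothetical names
used only for illustration, and negative_bit_floor() is written as a
plain loop based on the description above, not taken from the patch.
It also assumes the helper returns 0 for non-negative input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the signed bounds tracked per register. */
struct srange {
	int64_t smin;
	int64_t smax;
};

/*
 * Keep the leading run of set bits and clear everything after it.
 * Assumption: a non-negative input has no leading set bit, so it maps to 0.
 * The unsigned-to-signed cast relies on two's complement wraparound.
 */
int64_t negative_bit_floor(int64_t v)
{
	uint64_t u = (uint64_t)v;
	uint64_t mask = 0;
	int bit;

	for (bit = 63; bit >= 0 && ((u >> bit) & 1); bit--)
		mask |= (uint64_t)1 << bit;
	return (int64_t)mask;
}

/* Signed bounds of dst & src, computed only from the operands' signed bounds. */
struct srange and_signed_bounds(struct srange dst, struct srange src)
{
	struct srange res;
	int64_t lo = dst.smin < src.smin ? dst.smin : src.smin;

	res.smin = negative_bit_floor(lo);
	res.smax = dst.smax > src.smax ? dst.smax : src.smax;
	return res;
}

/* The three examples listed in the quoted description. */
void check_quoted_examples(void)
{
	assert(negative_bit_floor((int64_t)0xffff000000000003ULL) ==
	       (int64_t)0xffff000000000000ULL);
	assert(negative_bit_floor((int64_t)0xf0ff0000ffff0000ULL) ==
	       (int64_t)0xf000000000000000ULL);
	assert(negative_bit_floor((int64_t)0xfffffb0000000000ULL) ==
	       (int64_t)0xfffff80000000000ULL);
}
```

For the pattern from the description, r0 is in [-1, 0] after
r0 <<= 62; r0 s>>= 63, so calling and_signed_bounds() with dst = [-1, 0]
and src = [-13, -13] gives [-16, 0], which covers both possible results
(-13 and 0).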
As an additional check, I ran a simple brute-force test over all possible
ranges of 6-bit integers, and the bounds are computed safely.
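A check along these lines could look roughly like the sketch below.
This is not the program that was actually run; neg_bit_floor6() and
count_unsound_bounds() are made-up names, and the 6-bit values are
simply stored sign-extended in an int. Since the proposed smin depends
only on the two operand smin values and the proposed smax only on the
two smax values, the sketch enumerates every (lower bound, value) and
(value, upper bound) combination instead of every full pair of ranges,
which covers the same cases with far fewer iterations.

```c
#include <assert.h>

#define S6_MIN (-32)
#define S6_MAX 31

/* 6-bit analogue of negative_bit_floor(): keep the leading run of set
 * bits (starting at the sign bit, bit 5), clear the rest. Assumed to
 * return 0 for non-negative input. */
int neg_bit_floor6(int v)
{
	unsigned int u = (unsigned int)v & 0x3f;
	unsigned int mask = 0;
	int bit;

	if (v >= 0)
		return 0;
	for (bit = 5; bit >= 0 && ((u >> bit) & 1); bit--)
		mask |= 1u << bit;
	return (int)mask - 64;	/* sign-extend the 6-bit pattern */
}

/* Count (bounds, value) combinations where x & y escapes the proposed range. */
long count_unsound_bounds(void)
{
	long bad = 0;
	int x, y, a, b, c, d;

	for (x = S6_MIN; x <= S6_MAX; x++) {
		for (y = S6_MIN; y <= S6_MAX; y++) {
			int res = x & y;

			/* every pair of smin values whose ranges can contain x and y */
			for (a = S6_MIN; a <= x; a++)
				for (c = S6_MIN; c <= y; c++)
					if (res < neg_bit_floor6(a < c ? a : c))
						bad++;

			/* every pair of smax values whose ranges can contain x and y */
			for (b = x; b <= S6_MAX; b++)
				for (d = y; d <= S6_MAX; d++)
					if (res > (b > d ? b : d))
						bad++;
		}
	}
	return bad;
}

void check_6bit_bounds(void)
{
	assert(count_unsound_bounds() == 0);
}
```

A return value of 0 from count_unsound_bounds() means no 6-bit operand
pair produces a result outside the inferred [smin, smax].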
[...]
Thread overview: 18+ messages
2024-07-11 11:38 [PATCH bpf-next v4 13/20] bpf, lsm: Add check for BPF LSM return value Xu Kuohai
2024-07-11 11:38 ` [PATCH bpf-next v4 14/20] bpf: Prevent tail call between progs attached to different hooks Xu Kuohai
2024-07-11 11:38 ` [PATCH bpf-next v4 15/20] bpf: Fix compare error in function retval_range_within Xu Kuohai
2024-07-11 11:38 ` [PATCH bpf-next v4 16/20] bpf: Add a special case for bitwise AND on range [-1, 0] Xu Kuohai
2024-07-15 15:29 ` Shung-Hsi Yu
2024-07-16 7:05 ` Xu Kuohai
2024-07-16 14:52 ` [RFC bpf-next] bpf, verifier: improve signed ranges inference for BPF_AND Shung-Hsi Yu
2024-07-16 15:10 ` Shung-Hsi Yu
2024-07-17 21:10 ` Eduard Zingerman [this message]
2024-07-19 8:32 ` Shung-Hsi Yu
2024-07-28 22:38 ` Harishankar Vishwanathan
2024-07-30 4:25 ` Shung-Hsi Yu
2024-08-02 21:30 ` Harishankar Vishwanathan
2024-07-16 15:19 ` [PATCH bpf-next v4 16/20] bpf: Add a special case for bitwise AND on range [-1, 0] Shung-Hsi Yu
2024-07-11 11:38 ` [PATCH bpf-next v4 17/20] selftests/bpf: Avoid load failure for token_lsm.c Xu Kuohai
2024-07-11 11:38 ` [PATCH bpf-next v4 18/20] selftests/bpf: Add return value checks for failed tests Xu Kuohai
2024-07-11 11:38 ` [PATCH bpf-next v4 19/20] selftests/bpf: Add test for lsm tail call Xu Kuohai
2024-07-11 11:38 ` [PATCH bpf-next v4 20/20] selftests/bpf: Add verifier tests for bpf lsm Xu Kuohai