From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yonghong Song
Subject: Re: [PATCH bpf-next v3 4/9] bpf/verifier: improve register value range tracking with ARSH
Date: Sun, 22 Apr 2018 21:31:19 -0700
Message-ID: <8a76b492-e01a-d79e-3dbe-5a1e6b0e60ce@fb.com>
References: <20180420221842.742330-1-yhs@fb.com> <20180420221842.742330-5-yhs@fb.com> <20180423001615.wlxnlp6xdquzrntt@ast-mbp> <20180423041901.44xlyekpw3kehh7v@ast-mbp>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Cc: , , ,
To: Alexei Starovoitov
Return-path:
Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:44120 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751003AbeDWEbz (ORCPT ); Mon, 23 Apr 2018 00:31:55 -0400
In-Reply-To: <20180423041901.44xlyekpw3kehh7v@ast-mbp>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

On 4/22/18 9:19 PM, Alexei Starovoitov wrote:
> On Sun, Apr 22, 2018 at 07:49:13PM -0700, Yonghong Song wrote:
>>
>>
>> On 4/22/18 5:16 PM, Alexei Starovoitov wrote:
>>> On Fri, Apr 20, 2018 at 03:18:37PM -0700, Yonghong Song wrote:
>>>> When helpers like bpf_get_stack returns an int value
>>>> and later on used for arithmetic computation, the LSH and ARSH
>>>> operations are often required to get proper sign extension into
>>>> 64-bit.
>>>> For example, without this patch:
>>>>     54: R0=inv(id=0,umax_value=800)
>>>>     54: (bf) r8 = r0
>>>>     55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800)
>>>>     55: (67) r8 <<= 32
>>>>     56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000))
>>>>     56: (c7) r8 s>>= 32
>>>>     57: R8=inv(id=0)
>>>> With this patch:
>>>>     54: R0=inv(id=0,umax_value=800)
>>>>     54: (bf) r8 = r0
>>>>     55: R0=inv(id=0,umax_value=800) R8_w=inv(id=0,umax_value=800)
>>>>     55: (67) r8 <<= 32
>>>>     56: R8_w=inv(id=0,umax_value=3435973836800,var_off=(0x0; 0x3ff00000000))
>>>>     56: (c7) r8 s>>= 32
>>>>     57: R8=inv(id=0, umax_value=800,var_off=(0x0; 0x3ff))
>>>> With better range of "R8", later on when "R8" is added to other register,
>>>> e.g., a map pointer or scalar-value register, the better register
>>>> range can be derived and verifier failure may be avoided.
>>>>
>>>> In our later example,
>>>>     ......
>>>>     usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
>>>>     if (usize < 0)
>>>>         return 0;
>>>>     ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
>>>>     ......
>>>> Without improving ARSH value range tracking, the register representing
>>>> "max_len - usize" will have smin_value equal to S64_MIN and will be
>>>> rejected by verifier.
>>>>
>>>> Signed-off-by: Yonghong Song
>>>> ---
>>>>  kernel/bpf/verifier.c | 26 ++++++++++++++++++++++++++
>>>>  1 file changed, 26 insertions(+)
>>>>
>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>>> index 3c8bb92..01c215d 100644
>>>> --- a/kernel/bpf/verifier.c
>>>> +++ b/kernel/bpf/verifier.c
>>>> @@ -2975,6 +2975,32 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
>>>>  		/* We may learn something more from the var_off */
>>>>  		__update_reg_bounds(dst_reg);
>>>>  		break;
>>>> +	case BPF_ARSH:
>>>> +		if (umax_val >= insn_bitness) {
>>>> +			/* Shifts greater than 31 or 63 are undefined.
>>>> +			 * This includes shifts by a negative number.
>>>> +			 */
>>>> +			mark_reg_unknown(env, regs, insn->dst_reg);
>>>> +			break;
>>>> +		}
>>>> +		if (dst_reg->smin_value < 0)
>>>> +			dst_reg->smin_value >>= umin_val;
>>>> +		else
>>>> +			dst_reg->smin_value >>= umax_val;
>>>> +		if (dst_reg->smax_value < 0)
>>>> +			dst_reg->smax_value >>= umax_val;
>>>> +		else
>>>> +			dst_reg->smax_value >>= umin_val;
>>>> +		if (src_known)
>>>> +			dst_reg->var_off = tnum_rshift(dst_reg->var_off,
>>>> +						       umin_val);
>>>> +		else
>>>> +			dst_reg->var_off = tnum_rshift(tnum_unknown, umin_val);
>>>> +		dst_reg->umin_value >>= umax_val;
>>>> +		dst_reg->umax_value >>= umin_val;
>>>> +		/* We may learn something more from the var_off */
>>>> +		__update_reg_bounds(dst_reg);
>>>
>>> I'm struggling to understand how these bounds are computed.
>>> Could you add examples in the comments?
>>
>> Okay, let me try to add some comments for better understanding.
>>
>>> In particular if dst_reg is unknown (tnum.mask == -1)
>>> the above tnum_rshift() will clear upper bits and will make it
>>> 64-bit positive, but that doesn't seem correct.
>>> What am I missing?
>>
>> Considering this is arith shift, we probably should just have
>> dst_reg->var_off = tnum_unknown to be conservative.
>>
>> I could miss something here as well. Let me try to write more
>> detailed explanation, hopefully to cover all corner cases.
>
> Is there a use case for !src_known ?

For typical bpf programs, the shift amount should always be known.
If src_known is false, the program must be dealing with custom packets
or custom data structures in tracing, etc.

> I think test_verifier should have 100% line coverage of verifier.c
> and every 'if' condition in the verifier needs to have real use case
> behind it.
> It's still on my todo list to get rid of [su][min|max]_value tracking
> that was introduced without solid justification.
>
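
To make the question about the bounds concrete, here is a minimal
sketch in plain userspace C. It is not the verifier code: arsh64(),
struct sbounds, arsh_bounds(), arsh_bounds_hold() and
logical_shift_drops_sign() are hypothetical names made up for this
illustration. It mirrors only the signed-bound rule from the quoted
hunk (pick umin_val or umax_val depending on the sign of the bound),
brute-forces it over small ranges, and shows the var_off concern that
a logical right shift clears the sign bit where an arithmetic shift
keeps it. It assumes two's complement conversion between uint64_t and
int64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel function: arithmetic shift right.
 * C leaves ">>" on negative signed values implementation-defined, so
 * it is built from unsigned shifts here.
 */
int64_t arsh64(int64_t x, unsigned int k)
{
	uint64_t u = (uint64_t)x;

	assert(k < 64);	/* larger shift amounts are undefined */
	if (x < 0)
		return (int64_t)~(~u >> k);
	return (int64_t)(u >> k);
}

/* Hypothetical stand-in for the signed half of the register state. */
struct sbounds {
	int64_t smin;
	int64_t smax;
};

/* Signed bounds after "dst s>>= src" with src in [kmin, kmax].
 * A negative bound moves toward -1 as the shift grows and a
 * non-negative bound moves toward 0, hence the sign-dependent choice.
 */
struct sbounds arsh_bounds(struct sbounds b, unsigned int kmin,
			   unsigned int kmax)
{
	struct sbounds r;

	r.smin = arsh64(b.smin, b.smin < 0 ? kmin : kmax);
	r.smax = arsh64(b.smax, b.smax < 0 ? kmax : kmin);
	return r;
}

/* Brute force over a small range: returns 1 if every v in [lo, hi]
 * shifted by every k in [kmin, kmax] stays inside arsh_bounds().
 */
int arsh_bounds_hold(int64_t lo, int64_t hi, unsigned int kmin,
		     unsigned int kmax)
{
	struct sbounds in = { lo, hi };
	struct sbounds out = arsh_bounds(in, kmin, kmax);
	unsigned int k;
	int64_t v;

	for (v = lo; v <= hi; v++) {
		for (k = kmin; k <= kmax; k++) {
			int64_t r = arsh64(v, k);

			if (r < out.smin || r > out.smax)
				return 0;
		}
	}
	return 1;
}

/* The var_off concern: for x = -1 and a shift of 1, a logical shift
 * clears bit 63 while the arithmetic result is still negative.
 * Returns 1 when the two disagree in that way.
 */
int logical_shift_drops_sign(void)
{
	int64_t x = -1;
	uint64_t logical = (uint64_t)x >> 1;
	int64_t arith = arsh64(x, 1);

	return (logical >> 63) == 0 && arith < 0;
}
```

With the range from the log above, arsh_bounds() applied to
[0, 800 << 32] with a known shift of 32 gives [0, 800] again, which is
the range reported for R8 with the patch. The brute-force helper is
only meant for small [lo, hi] ranges; it says nothing about the
unsigned bounds or about var_off.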