All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eduard Zingerman <eddyz87@gmail.com>
To: bpf@vger.kernel.org, Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
	Justin Iurman <justin.iurman@uliege.be>
Subject: Re: [QUESTION] bpf/tc verifier error: invalid access to map value, min value is outside of the allowed memory range
Date: Mon, 28 Aug 2023 16:25:58 +0300	[thread overview]
Message-ID: <ffa8d7aa2e77fb843a4b94c3be45bc9297e7b3a2.camel@gmail.com> (raw)
In-Reply-To: <98af45809e7276431b7d053bfe8b26d98b2f9394.camel@gmail.com>

On Mon, 2023-08-28 at 15:46 +0300, Eduard Zingerman wrote:
> On Thu, 2023-08-24 at 22:32 +0200, Justin Iurman wrote:
> > Hello,
> > 
> > I'm facing a verifier error and don't know how to make it happy (already 
> > tried lots of checks). First, here is my env:
> >   - OS: Ubuntu 22.04.3 LTS
> >   - kernel: 5.15.0-79-generic x86_64 (CONFIG_DEBUG_INFO_BTF=y)
> >   - clang version: 14.0.0-1ubuntu1.1
> >   - iproute2-5.15.0 with libbpf 0.5.0
> > 
> > And here is a simplified example of my program (basically, it will 
> > insert in packets some bytes defined inside a map):
> > 
> > #include "vmlinux.h"
> > #include <bpf/bpf_endian.h>
> > #include <bpf/bpf_helpers.h>
> > 
> > #define MAX_BYTES 2048
> > 
> > struct xxx_t {
> > 	__u32 bytes_len;
> > 	__u8 bytes[MAX_BYTES];
> > };
> > 
> > struct {
> > 	__uint(type, BPF_MAP_TYPE_ARRAY);
> > 	__uint(max_entries, 1);
> > 	__type(key, __u32);
> > 	__type(value, struct xxx_t);
> > 	__uint(pinning, LIBBPF_PIN_BY_NAME);
> > } my_map SEC(".maps");
> > 
> > char _license[] SEC("license") = "GPL";
> > 
> > SEC("egress")
> > int egress_handler(struct __sk_buff *skb)
> > {
> > 	void *data_end = (void *)(long)skb->data_end;
> > 	void *data = (void *)(long)skb->data;
> > 	struct ethhdr *eth = data;
> > 	struct ipv6hdr *ip6;
> > 	struct xxx_t *x;
> > 	__u32 offset;
> > 	__u32 idx = 0;
> > 
> > 	offset = sizeof(*eth) + sizeof(*ip6);
> > 	if (data + offset > data_end)
> > 		return TC_ACT_OK;
> > 
> > 	if (bpf_ntohs(eth->h_proto) != ETH_P_IPV6)
> > 		return TC_ACT_OK;
> > 
> > 	x = bpf_map_lookup_elem(&my_map, &idx);
> > 	if (!x)
> > 		return TC_ACT_OK;
> > 
> > 	if (x->bytes_len == 0 || x->bytes_len > MAX_BYTES)
> > 		return TC_ACT_OK;
> > 
> > 	if (bpf_skb_adjust_room(skb, x->bytes_len, BPF_ADJ_ROOM_NET, 0))
> > 		return TC_ACT_OK;
> > 
> > 	if (bpf_skb_store_bytes(skb, offset, x->bytes, 8/*x->bytes_len*/, 
> > BPF_F_RECOMPUTE_CSUM))
> > 		return TC_ACT_SHOT;
> > 
> > 	/* blah blah blah... */
> > 
> > 	return TC_ACT_OK;
> > }
> > 
> > Let's focus on the line where bpf_skb_store_bytes is called. As is, with 
> > a constant length (i.e., 8 for instance), the verifier is happy. 
> > However, as soon as I try to use a map value as the length, it fails:
> > 
> > [...]
> > ; if (bpf_skb_store_bytes(skb, offset, x->bytes, x->bytes_len,
> > 34: (bf) r1 = r7
> > 35: (b7) r2 = 54
> > 36: (bf) r3 = r8
> > 37: (b7) r5 = 1
> > 38: (85) call bpf_skb_store_bytes#9
> >   R0=inv0 R1_w=ctx(id=0,off=0,imm=0) R2_w=inv54 
> > R3_w=map_value(id=0,off=4,ks=4,vs=2052,imm=0) 
> > R4_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R5_w=inv1 
> > R6_w=inv1 R7=ctx(id=0,off=0,imm=0) 
> > R8_w=map_value(id=0,off=4,ks=4,vs=2052,imm=0) R10=fp0 fp-8=mmmm????
> > invalid access to map value, value_size=2052 off=4 size=0
> > R3 min value is outside of the allowed memory range
> > 
> > I guess "size=0" is the problem here, but don't understand why. What 
> > bothers me is that it looks like it's about R3 (i.e., x->bytes), not R4. 
> > Anyway, I already tried to add a bunch of checks for both, but did not 
> > succeed. Any idea?
> 
> Hi Justin, John,
> 
> The following fragment of C code:
> 
> 	if (x->bytes_len == 0 || x->bytes_len > MAX_BYTES)
> 		return TC_ACT_OK;
> 
> 	if (bpf_skb_adjust_room(skb, x->bytes_len, BPF_ADJ_ROOM_NET, 0))
> 		return TC_ACT_OK;
> 
> 	if (bpf_skb_store_bytes(skb, offset, x->bytes, x->bytes_len,
> 				BPF_F_RECOMPUTE_CSUM))
> 		return TC_ACT_SHOT;
> 
> Gets compiled to the following BPF:
> 
>     ; r8 is `x` at this point
>     ; if (x->bytes_len == 0 || x->bytes_len > MAX_BYTES)
>     17: (61) r2 = *(u32 *)(r8 +0)           ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
>                                             ; R8_w=map_value(off=0,ks=4,vs=2052,imm=0)
>     18: (bc) w1 = w2                        ; R1_w=scalar(id=2,umax=4294967295,var_off=(0x0; 0xffffffff))
>                                             ; R2_w=scalar(id=2,umax=4294967295,var_off=(0x0; 0xffffffff))
>     19: (04) w1 += -2049                    ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
>     20: (a6) if w1 < 0xfffff800 goto pc+16  ; R1_w=scalar(umin=4294965248,umax=4294967295,
>                                             ;             var_off=(0xfffff800; 0x7ff),s32_min=-2048,s32_max=-1)
> 
>     ; if (bpf_skb_adjust_room(skb, x->bytes_len, BPF_ADJ_ROOM_NET, 0))
>     21: (bf) r1 = r6                        ; R1_w=ctx(off=0,imm=0) R6=ctx(off=0,imm=0)
>     22: (b4) w3 = 0                         ; R3_w=0
>     23: (b7) r4 = 0                         ; R4_w=0
>     24: (85) call bpf_skb_adjust_room#50    ; R0=scalar()
>     25: (55) if r0 != 0x0 goto pc+11        ; R0=0
> 
>     ; if (bpf_skb_store_bytes(skb, offset, x->bytes, x->bytes_len,
>     26: (61) r4 = *(u32 *)(r8 +0)           ; R4_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
>                                             ; R8=map_value(off=0,ks=4,vs=2052,imm=0)
>     27: (07) r8 += 4                        ; R8_w=map_value(off=4,ks=4,vs=2052,imm=0)
>     28: (bf) r1 = r6                        ; R1_w=ctx(off=0,imm=0) R6=ctx(off=0,imm=0)
>     29: (b4) w2 = 54                        ; R2_w=54
>     30: (bf) r3 = r8                        ; R3_w=map_value(off=4,ks=4,vs=2052,imm=0) R8_w=map_value(off=4,ks=4,vs=2052,imm=0)
>     31: (b7) r5 = 1                         ; R5_w=1
>     32: (85) call bpf_skb_store_bytes#9
> 
> Note the following instructions:
> - (17): x->bytes_len access;
> - (18,19,20): a curious way in which clang translates the (_ == 0 || _ > MAX_BYTES);
> - (26): x->bytes_len is re-read.
> 
> Several observations:
> - because of (20) verifier can infer range for w1, but this range is
>   not propagated to w2;
> - even if it was propagated verifier does not track range for values
>   stored in "general memory", only for values in registers and values
>   spilled to stack => known range for w2 does not imply known range
>   for x->bytes_len.
> 
> I can make it work with the following modification:
> 
>     int egress_handler(struct __sk_buff *skb)
>     {
>     	void *data_end = (void *)(long)skb->data_end;
>     	void *data = (void *)(long)skb->data;
>     	struct ethhdr *eth = data;
>     	struct ipv6hdr *ip6;
>     	struct xxx_t *x;
>     	__s32 bytes_len;   // <------ signed !
>     	__u32 offset;
>     	__u32 idx = 0;
>     
>     	offset = sizeof(*eth) + sizeof(*ip6);
>     	if (data + offset > data_end)
>     		return TC_ACT_OK;
>     
>     	if (bpf_ntohs(eth->h_proto) != ETH_P_IPV6)
>     		return TC_ACT_OK;
>     
>     	x = bpf_map_lookup_elem(&my_map, &idx);
>     	if (!x)
>     		return TC_ACT_OK;
>     
>     	bytes_len = x->bytes_len; // <----- use bytes_len everywhere below !
>     
>     	if (bytes_len <= 0 || bytes_len > MAX_BYTES)
>     		return TC_ACT_OK;
>     
>     	if (bpf_skb_adjust_room(skb, bytes_len, BPF_ADJ_ROOM_NET, 0))
>     		return TC_ACT_OK;
>     
>     	if (bpf_skb_store_bytes(skb, offset, x->bytes, bytes_len,
>     				BPF_F_RECOMPUTE_CSUM))
>     		return TC_ACT_SHOT;
>     
>     	/* blah blah blah... */
>     
>     	return TC_ACT_OK;
>     }
> 
> After such change the following BPF is generated:
> 
>     ; bytes_len = x->bytes_len;
>     17: (61) r9 = *(u32 *)(r8 +0)         ; R8_w=map_value(off=0,ks=4,vs=2052,imm=0)
>                                           ; R9_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
> 
>     ; if (bytes_len <= 0 || bytes_len > MAX_BYTES)
>     18: (c6) if w9 s< 0x1 goto pc+18      ; R9_w=scalar(umin=1,umax=2147483647,var_off=(0x0; 0x7fffffff))
>     19: (66) if w9 s> 0x800 goto pc+17    ; R9_w=scalar(umin=1,umax=2048,var_off=(0x0; 0xfff))
>     ; if (bpf_skb_adjust_room(skb, bytes_len, BPF_ADJ_ROOM_NET, 0))
>     20: (bf) r1 = r6                      ; R1_w=ctx(off=0,imm=0) R6=ctx(off=0,imm=0)
>     21: (bc) w2 = w9                      ; R2_w=scalar(id=2,umin=1,umax=2048,var_off=(0x0; 0xfff))
>                                           ; R9_w=scalar(id=2,umin=1,umax=2048,var_off=(0x0; 0xfff))
>     22: (b4) w3 = 0                       ; R3_w=0
>     23: (b7) r4 = 0                       ; R4_w=0
>     24: (85) call bpf_skb_adjust_room#50          ; R0=scalar()
>     25: (55) if r0 != 0x0 goto pc+11      ; R0=0
> 
>     ; if (bpf_skb_store_bytes(skb, offset, x->bytes, bytes_len,
>     26: (07) r8 += 4                      ; R8_w=map_value(off=4,ks=4,vs=2052,imm=0)
>     27: (bf) r1 = r6                      ; R1_w=ctx(off=0,imm=0) R6=ctx(off=0,imm=0)
>     28: (b4) w2 = 54                      ; R2_w=54
>     29: (bf) r3 = r8                      ; R3_w=map_value(off=4,ks=4,vs=2052,imm=0)
>                                           ; R8_w=map_value(off=4,ks=4,vs=2052,imm=0)
>     30: (bc) w4 = w9                      ; R4_w=scalar(id=2,umin=1,umax=2048,var_off=(0x0; 0xfff))
>                                           ; R9=scalar(id=2,umin=1,umax=2048,var_off=(0x0; 0xfff))
>     31: (b7) r5 = 1                       ; R5_w=1
>     32: (85) call bpf_skb_store_bytes#9
>     
> Note the following:
> - (17): x->bytes_len is in w9;
> - (18,19): range check is done w/o -= 2049 trick and verifier infers
>   range for w9 as [1,2048];
> - (30): w9 is reused as a parameter to bpf_skb_store_bytes with
>   correct range.
> 
> I think that main pain point here is "clever" (_ == 0 || _ > MAX_BYTES)
> translation, need to think a bit if it is possible to dissuade clang
> from such transformation (via change in clang).
> 
> Alternatively, I think that doing (_ == 0 || _ > MAX_BYTES) check in
> inline assembly as two compare instructions should also work.

In terms of compiler/verifier improvements another option would be to
teach verifier to track +-scalar relations between registers, so that
-2049 trick would be understood by verifier, e.g.:

     ; r8 is `x` at this point
     ; if (x->bytes_len == 0 || x->bytes_len > MAX_BYTES)
     17: (61) r2 = *(u32 *)(r8 +0)
     18: (bc) w1 = w2                         <--- w1.id = w2.id
     19: (04) w1 += -2049                     <--- don't clear w1.id,
                                                   instead track that it is w1.(id - 2049)
     20: (a6) if w1 < 0xfffff800 goto pc+16   <--- propagate range info for w2

In a sense, extend scalar ID to be a pair [ID, scalar offset].
But that might get complicated.

Yonghong,
what do you think, does it make sense to investigate this or am I
talking gibberish?

  reply	other threads:[~2023-08-28 13:26 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-24 20:32 [QUESTION] bpf/tc verifier error: invalid access to map value, min value is outside of the allowed memory range Justin Iurman
2023-08-26  0:49 ` John Fastabend
2023-08-26 12:54   ` Justin Iurman
2023-08-26 23:13     ` Justin Iurman
2023-08-27 16:56   ` Justin Iurman
2023-08-28 12:46 ` Eduard Zingerman
2023-08-28 13:25   ` Eduard Zingerman [this message]
2023-08-29  0:40     ` Yonghong Song
2023-09-01 11:58       ` Justin Iurman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ffa8d7aa2e77fb843a4b94c3be45bc9297e7b3a2.camel@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=justin.iurman@uliege.be \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.