public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Alexander Lobakin <alexandr.lobakin@intel.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Larysa Zaremba <larysa.zaremba@intel.com>,
	bpf <bpf@vger.kernel.org>, netdev <netdev@vger.kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Magnus Karlsson <magnus.karlsson@intel.com>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	Alexander Lobakin <alexandr.lobakin@intel.com>
Subject: Re: Accessing XDP packet memory from the end
Date: Fri, 22 Apr 2022 18:41:37 +0200	[thread overview]
Message-ID: <20220422164137.875143-1-alexandr.lobakin@intel.com> (raw)
In-Reply-To: <87czhagxuw.fsf@toke.dk>

From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 21 Apr 2022 19:17:11 +0200

> Larysa Zaremba <larysa.zaremba@intel.com> writes:
> 
> > Dear all,
> > Our team has encountered a need of accessing data_meta in a following way:
> >
> > int xdp_meta_prog(struct xdp_md *ctx)
> > {
> > 	void *data_meta_ptr = (void *)(long)ctx->data_meta;
> > 	void *data_end = (void *)(long)ctx->data_end;
> > 	void *data = (void *)(long)ctx->data;
> > 	u64 data_size = sizeof(u32);
> > 	u32 magic_meta;
> > 	u8 offset;
> >
> > 	offset = (u8)((s64)data - (s64)data_meta_ptr);
> > 	if (offset < data_size) {
> > 		bpf_printk("invalid offset: %ld\n", offset);
> > 		return XDP_DROP;
> > 	}
> >
> > 	data_meta_ptr += offset;
> > 	data_meta_ptr -= data_size;
> >
> > 	if (data_meta_ptr + data_size > data) {
> > 		return XDP_DROP;
> > 	}
> > 		
> > 	magic_meta = *((u32 *)data);
> > 	bpf_printk("Magic: %d\n", magic_meta);
> > 	return XDP_PASS;
> > }
> >
> > Unfortunately, verifier claims this code attempts to access packet with
> > an offset of -2 (a constant part) and negative offset is generally forbidden.
> >
> > For now we have 2 solutions, one is using bpf_xdp_adjust_meta(),
> > which is pretty good, but not ideal for the hot path.
> > The second one is the patch at the end.
> >
> > Do you see any other way of accessing memory from the end of data_meta/data?
> > What do you think about both suggested solutions?
> 
> The problem is that the compiler is generating code that the verifier
> doesn't understand. It's notoriously hard to get LLVM to produce code
> that preserves the right bounds checks which is why projects like Cilium
> use helpers with inline ASM to produce the right loads, like in [0].
> 
> Adapting that cilium helper to load from the metadata area, your example
> can be rewritten as follows (which works just fine with no verifier
> changes):
> 
> static __always_inline int
> xdp_load_meta_bytes(const struct xdp_md *ctx, __u64 off, void *to, const __u64 len)
> {
> 	void *from;
> 	int ret;
> 	/* LLVM tends to generate code that verifier doesn't understand,
> 	 * so force it the way we want it in order to open up a range
> 	 * on the reg.
> 	 */
> 	asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t"
> 		     "r2 = *(u32 *)(%[ctx] +0)\n\t"
> 		     "%[off] &= %[offmax]\n\t"
> 		     "r1 += %[off]\n\t"
> 		     "%[from] = r1\n\t"
> 		     "r1 += %[len]\n\t"
> 		     "if r1 > r2 goto +2\n\t"
> 		     "%[ret] = 0\n\t"
> 		     "goto +1\n\t"
> 		     "%[ret] = %[errno]\n\t"
> 		     : [ret]"=r"(ret), [from]"=r"(from)
> 		     : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len),
> 		       [offmax]"i"(__CTX_OFF_MAX), [errno]"i"(-EINVAL)
> 		     : "r1", "r2");
> 	if (!ret)
> 		__builtin_memcpy(to, from, len);
> 	return ret;
> }
> 
> 
> SEC("xdp")
> int xdp_meta_prog(struct xdp_md *ctx)
> {
>         void *data_meta_ptr = (void *)(long)ctx->data_meta;
>         void *data = (void *)(long)ctx->data;
>         __u32 magic_meta;
>         __u8 offset;
> 	int ret;
> 
>         offset = (__u8)((__s64)data - (__s64)data_meta_ptr);
> 	ret = xdp_load_meta_bytes(ctx, offset - 4, &magic_meta, sizeof(magic_meta));
> 	if (ret) {
> 		bpf_printk("load bytes failed: %d\n", ret);
>                 return XDP_DROP;
> 	}
> 
>         bpf_printk("Magic: %d\n", magic_meta);
>         return XDP_PASS;
> }

At the moment, we use this (based on Cilium's and your), it works
just like we want C code to work previously:

#define __CTX_OFF_MAX 0xff

static __always_inline void *
can_i_access_meta_please(const struct xdp_md *ctx, __u64 off, const __u64 len)
{
	void *ret;

	/* LLVM tends to generate code that verifier doesn't understand,
	 * so force it the way we want it in order to open up a range
	 * on the reg.
	 */
	asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t"
		     "r2 = *(u32 *)(%[ctx] +0)\n\t"
		     "%[off] &= %[offmax]\n\t"
		     "r1 += %[off]\n\t"
		     "%[ret] = r1\n\t"
		     "r1 += %[len]\n\t"
		     "if r1 > r2 goto +1\n\t"
		     "goto +1\n\t"
		     "%[ret] = %[null]\n\t"
		     : [ret]"=r"(ret)
		     : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len),
		       [offmax]"i"(__CTX_OFF_MAX), [null]"i"(NULL)
		     : "r1", "r2");

	return ret;
}

SEC("xdp")
int xdp_prognum_n0_meta(struct xdp_md *ctx)
{
	void *data_meta = (void *)(__s64)ctx->data_meta;
	void *data = (void *)(__s64)ctx->data;
	struct xdp_meta_generic *md;
	__u64 offset;

	offset = (__u64)((__s64)data - (__s64)data_meta);

	md = can_i_access_meta_please(ctx, offset, sizeof(*md));
	if (__builtin_expect(!md, 0)) {
		bpf_printk("No you can't\n");
		return XDP_DROP;
	}

	bpf_printk("Magic: 0x%04x\n", md->magic_id);
	return XDP_PASS;
}

Thanks for the help! It's a shame LLVM still suck on generating
correct object code from C.
I guess we'll define a helper above in one of the headers to not
copy-paste it back and forth between each program wanting to
access only the generic part of the metadata (which is always being
placed at the end).

> 
> -Toke
> 
> 
> [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/ctx/xdp.h#L35

Thanks,
Al

  reply	other threads:[~2022-04-22 16:43 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-21 15:56 Accessing XDP packet memory from the end Larysa Zaremba
2022-04-21 16:27 ` Jesper Dangaard Brouer
2022-04-22 10:06   ` Alexander Lobakin
2022-04-21 17:17 ` Toke Høiland-Jørgensen
2022-04-22 16:41   ` Alexander Lobakin [this message]
2022-04-23 20:05     ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220422164137.875143-1-alexandr.lobakin@intel.com \
    --to=alexandr.lobakin@intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=larysa.zaremba@intel.com \
    --cc=maciej.fijalkowski@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox