linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf v2 0/2] bpf, xdp: clean adjust_{head,meta} memory when offset < 0
@ 2025-03-31  3:23 Jiayuan Chen
  2025-03-31  3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen
  2025-03-31  3:23 ` [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} Jiayuan Chen
  0 siblings, 2 replies; 11+ messages in thread
From: Jiayuan Chen @ 2025-03-31  3:23 UTC (permalink / raw)
  To: bpf
  Cc: mrpre, Jiayuan Chen, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko,
	Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov,
	Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner,
	linux-kernel, netdev, linux-kselftest

This patchset originates from my attempt to resolve a KMSAN warning that
has existed for over 3 years:
https://syzkaller.appspot.com/bug?extid=0e6ddb1ef80986bdfe64

Previously, we had a brief discussion in this thread about whether we can
simply perform memset in adjust_{head,meta}:
https://lore.kernel.org/netdev/20250328043941.085de23b@kernel.org/T/#t

Unfortunately, I couldn't find a similar topic in the mail list, but I did
find a similar security-related commit:
commit 6dfb970d3dbd ("xdp: avoid leaking info stored in frame data on page reuse")

I just create a new topic here and make subject more clear, we can discuss
this here.

Meanwhile, I also discovered a related issue that led to a CVE,specifically
the Facebook Katran vulnerability (https://vuldb.com/?id.246309).

Currently, even with unprivileged functionality disabled, a user can load
a BPF program using CAP_BPF and CAP_NET_ADMIN, which I believe we should
avoid exposing kernel memory directly to users now.

Regarding performance considerations, I added corresponding results to the
selftest, testing common MAC headers and IP headers of various sizes.

Compared to not using memset, the execution time increased by 2ns, but I
think this is negligible considering the entire net stack.

Jiayuan Chen (2):
  bpf, xdp: clean head/meta when expanding it
  selftests/bpf: add perf test for adjust_{head,meta}

 include/uapi/linux/bpf.h                      |  8 +--
 net/core/filter.c                             |  5 +-
 tools/include/uapi/linux/bpf.h                |  6 ++-
 .../selftests/bpf/prog_tests/xdp_perf.c       | 52 ++++++++++++++++---
 tools/testing/selftests/bpf/progs/xdp_dummy.c | 14 +++++
 5 files changed, 72 insertions(+), 13 deletions(-)

-- 
2.47.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it
  2025-03-31  3:23 [PATCH bpf v2 0/2] bpf, xdp: clean adjust_{head,meta} memory when offset < 0 Jiayuan Chen
@ 2025-03-31  3:23 ` Jiayuan Chen
  2025-04-03  8:17   ` Jesper Dangaard Brouer
  2025-04-03 14:24   ` Alexei Starovoitov
  2025-03-31  3:23 ` [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} Jiayuan Chen
  1 sibling, 2 replies; 11+ messages in thread
From: Jiayuan Chen @ 2025-03-31  3:23 UTC (permalink / raw)
  To: bpf
  Cc: mrpre, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan,
	Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan,
	Jordan Rome, Martin Kelly, David Lechner, linux-kernel, netdev,
	linux-kselftest

The device allocates an skb, it additionally allocates a prepad size
(usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it
uninitialized.

The bpf_xdp_adjust_head function moves skb->data forward, which allows
users to access data belonging to other programs, posing a security risk.

Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 include/uapi/linux/bpf.h       | 8 +++++---
 net/core/filter.c              | 5 ++++-
 tools/include/uapi/linux/bpf.h | 6 ++++--
 3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index defa5bb881f4..be01a848cbbf 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2760,8 +2760,9 @@ union bpf_attr {
  *
  * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
  * 	Description
- * 		Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
- * 		it is possible to use a negative value for *delta*. This helper
+ *		Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
+ *		it is possible to use a negative value for *delta*. If *delta*
+ *		is negative, the new header will be memset to zero. This helper
  * 		can be used to prepare the packet for pushing or popping
  * 		headers.
  *
@@ -2989,7 +2990,8 @@ union bpf_attr {
  * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
  * 	Description
  * 		Adjust the address pointed by *xdp_md*\ **->data_meta** by
- * 		*delta* (which can be positive or negative). Note that this
+ *		*delta* (which can be positive or negative). If *delta* is
+ *		negative, the new meta will be memset to zero. Note that this
  * 		operation modifies the address stored in *xdp_md*\ **->data**,
  * 		so the latter must be loaded only after the helper has been
  * 		called.
diff --git a/net/core/filter.c b/net/core/filter.c
index 46ae8eb7a03c..5f01d373b719 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
 	if (metalen)
 		memmove(xdp->data_meta + offset,
 			xdp->data_meta, metalen);
+	if (offset < 0)
+		memset(data, 0, -offset);
 	xdp->data_meta += offset;
 	xdp->data = data;
 
@@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
 		return -EINVAL;
 	if (unlikely(xdp_metalen_invalid(metalen)))
 		return -EACCES;
-
+	if (offset < 0)
+		memset(meta, 0, -offset);
 	xdp->data_meta = meta;
 
 	return 0;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index defa5bb881f4..7b1871f2eccf 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2761,7 +2761,8 @@ union bpf_attr {
  * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
  * 	Description
  * 		Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
- * 		it is possible to use a negative value for *delta*. This helper
+ *		it is possible to use a negative value for *delta*. If *delta*
+ *		is negative, the new header will be memset to zero. This helper
  * 		can be used to prepare the packet for pushing or popping
  * 		headers.
  *
@@ -2989,7 +2990,8 @@ union bpf_attr {
  * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
  * 	Description
  * 		Adjust the address pointed by *xdp_md*\ **->data_meta** by
- * 		*delta* (which can be positive or negative). Note that this
+ *		*delta* (which can be positive or negative). If *delta* is
+ *		negative, the new meta will be memset to zero. Note that this
  * 		operation modifies the address stored in *xdp_md*\ **->data**,
  * 		so the latter must be loaded only after the helper has been
  * 		called.
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta}
  2025-03-31  3:23 [PATCH bpf v2 0/2] bpf, xdp: clean adjust_{head,meta} memory when offset < 0 Jiayuan Chen
  2025-03-31  3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen
@ 2025-03-31  3:23 ` Jiayuan Chen
  2025-04-03  0:24   ` Jakub Kicinski
  1 sibling, 1 reply; 11+ messages in thread
From: Jiayuan Chen @ 2025-03-31  3:23 UTC (permalink / raw)
  To: bpf
  Cc: mrpre, Jiayuan Chen, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko,
	Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov,
	Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner,
	linux-kernel, netdev, linux-kselftest

We added a memset operation during the adjust operation, which may cause
performance issues.

Therefore, we added perf testing, and testing found that for common header
length operations, memset() operation increased the performance overhead
by 2ns, which is negligible for the net stack.

Before memset
./test_progs -a xdp_adjust_head_perf -v
run adjust head with size 6 cost 56 ns
run adjust head with size 20 cost 56 ns
run adjust head with size 40 cost 56 ns
run adjust head with size 200 cost 56 ns

After memset
./test_progs -a xdp_adjust_head_perf -v
run adjust head with size 6 cost 58 ns
run adjust head with size 20 cost 58 ns
run adjust head with size 40 cost 58 ns
run adjust head with size 200 cost 66 ns

Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 .../selftests/bpf/prog_tests/xdp_perf.c       | 52 ++++++++++++++++---
 tools/testing/selftests/bpf/progs/xdp_dummy.c | 14 +++++
 2 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_perf.c b/tools/testing/selftests/bpf/prog_tests/xdp_perf.c
index ec5369f247cb..1b4260c6e5d7 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_perf.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_perf.c
@@ -1,10 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <test_progs.h>
+#include <network_helpers.h>
+#include "xdp_dummy.skel.h"
 
 void test_xdp_perf(void)
 {
-	const char *file = "./xdp_dummy.bpf.o";
-	struct bpf_object *obj;
+	struct xdp_dummy *skel;
 	char in[128], out[128];
 	int err, prog_fd;
 	LIBBPF_OPTS(bpf_test_run_opts, topts,
@@ -15,14 +16,51 @@ void test_xdp_perf(void)
 		.repeat = 1000000,
 	);
 
-	err = bpf_prog_test_load(file, BPF_PROG_TYPE_XDP, &obj, &prog_fd);
-	if (CHECK_FAIL(err))
-		return;
-
+	skel = xdp_dummy__open_and_load();
+	prog_fd = bpf_program__fd(skel->progs.xdp_dummy_prog);
 	err = bpf_prog_test_run_opts(prog_fd, &topts);
 	ASSERT_OK(err, "test_run");
 	ASSERT_EQ(topts.retval, XDP_PASS, "test_run retval");
 	ASSERT_EQ(topts.data_size_out, 128, "test_run data_size_out");
 
-	bpf_object__close(obj);
+	xdp_dummy__destroy(skel);
+}
+
+void test_xdp_adjust_head_perf(void)
+{
+	struct xdp_dummy *skel;
+	int repeat = 9000000;
+	struct xdp_md ctx_in;
+	char data[100];
+	int err, prog_fd;
+	size_t test_header_size[] = {
+		ETH_ALEN,
+		sizeof(struct iphdr),
+		sizeof(struct ipv6hdr),
+		200,
+	};
+	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, topts,
+			    .data_in = &data,
+			    .data_size_in = sizeof(data),
+			    .repeat = repeat,
+	);
+
+	topts.ctx_in = &ctx_in;
+	topts.ctx_size_in = sizeof(ctx_in);
+	memset(&ctx_in, 0, sizeof(ctx_in));
+	ctx_in.data_meta = 0;
+	ctx_in.data_end = ctx_in.data + sizeof(data);
+
+	skel = xdp_dummy__open_and_load();
+	prog_fd = bpf_program__fd(skel->progs.xdp_dummy_adjust_head);
+
+	for (int i = 0; i < ARRAY_SIZE(test_header_size); i++) {
+		skel->bss->head_size = test_header_size[i];
+		err = bpf_prog_test_run_opts(prog_fd, &topts);
+		ASSERT_OK(err, "test_run");
+		ASSERT_EQ(topts.retval, XDP_PASS, "test_run retval");
+		fprintf(stdout, "run adjust head with size %zd cost %d ns\n",
+			test_header_size[i], topts.duration);
+	}
+	xdp_dummy__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/xdp_dummy.c b/tools/testing/selftests/bpf/progs/xdp_dummy.c
index d988b2e0cee8..7bebedbbc949 100644
--- a/tools/testing/selftests/bpf/progs/xdp_dummy.c
+++ b/tools/testing/selftests/bpf/progs/xdp_dummy.c
@@ -4,10 +4,24 @@
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
 
+int head_size;
+
 SEC("xdp")
 int xdp_dummy_prog(struct xdp_md *ctx)
 {
 	return XDP_PASS;
 }
 
+SEC("xdp")
+int xdp_dummy_adjust_head(struct xdp_md *ctx)
+{
+	if (bpf_xdp_adjust_head(ctx, -head_size))
+		return XDP_DROP;
+
+	if (bpf_xdp_adjust_head(ctx, head_size))
+		return XDP_DROP;
+
+	return XDP_PASS;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta}
  2025-03-31  3:23 ` [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} Jiayuan Chen
@ 2025-04-03  0:24   ` Jakub Kicinski
  2025-04-03  9:37     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2025-04-03  0:24 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: bpf, mrpre, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan,
	Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan,
	Jordan Rome, Martin Kelly, David Lechner, linux-kernel, netdev,
	linux-kselftest

On Mon, 31 Mar 2025 11:23:45 +0800 Jiayuan Chen wrote:
> which is negligible for the net stack.
> 
> Before memset
> ./test_progs -a xdp_adjust_head_perf -v
> run adjust head with size 6 cost 56 ns
> run adjust head with size 20 cost 56 ns
> run adjust head with size 40 cost 56 ns
> run adjust head with size 200 cost 56 ns
> 
> After memset
> ./test_progs -a xdp_adjust_head_perf -v
> run adjust head with size 6 cost 58 ns
> run adjust head with size 20 cost 58 ns
> run adjust head with size 40 cost 58 ns
> run adjust head with size 200 cost 66 ns

FWIW I'm not sure if this is "negligible" for XDP like you say,
but I defer to Jesper :)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it
  2025-03-31  3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen
@ 2025-04-03  8:17   ` Jesper Dangaard Brouer
  2025-04-03 14:24   ` Alexei Starovoitov
  1 sibling, 0 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2025-04-03  8:17 UTC (permalink / raw)
  To: Jiayuan Chen, bpf
  Cc: mrpre, syzbot+0e6ddb1ef80986bdfe64, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing,
	Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly,
	David Lechner, linux-kernel, netdev, linux-kselftest



On 31/03/2025 05.23, Jiayuan Chen wrote:
> The device allocates an skb, it additionally allocates a prepad size
> (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it
> uninitialized.
> 
> The bpf_xdp_adjust_head function moves skb->data forward, which allows
> users to access data belonging to other programs, posing a security risk.

I find your description confusing, and it needs to be improved to avoid 
people interpenetrating this when reading the commit log in the future.

It is part of the UAPI that BPF programs access data belonging to other 
BPF programs.  It is the concept behind data_meta, e.g. that XDP set 
information in this memory and TC-BPF reads it again (chained XDP progs 
also have R/W access).  I hope/assume this is not the desired 
interpretation of your text.

I guess you want to say, that the intention is to avoid BPF programs 
accessing uninitialized data?
(... which is also what the code does at a glance)

> Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>   include/uapi/linux/bpf.h       | 8 +++++---
>   net/core/filter.c              | 5 ++++-
>   tools/include/uapi/linux/bpf.h | 6 ++++--
>   3 files changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index defa5bb881f4..be01a848cbbf 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2760,8 +2760,9 @@ union bpf_attr {
>    *
>    * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
>    * 	Description
> - * 		Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> - * 		it is possible to use a negative value for *delta*. This helper
> + *		Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> + *		it is possible to use a negative value for *delta*. If *delta*
> + *		is negative, the new header will be memset to zero. This helper
>    * 		can be used to prepare the packet for pushing or popping
>    * 		headers.
>    *
> @@ -2989,7 +2990,8 @@ union bpf_attr {
>    * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
>    * 	Description
>    * 		Adjust the address pointed by *xdp_md*\ **->data_meta** by
> - * 		*delta* (which can be positive or negative). Note that this
> + *		*delta* (which can be positive or negative). If *delta* is
> + *		negative, the new meta will be memset to zero. Note that this
>    * 		operation modifies the address stored in *xdp_md*\ **->data**,
>    * 		so the latter must be loaded only after the helper has been
>    * 		called.
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 46ae8eb7a03c..5f01d373b719 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
>   	if (metalen)
>   		memmove(xdp->data_meta + offset,
>   			xdp->data_meta, metalen);
> +	if (offset < 0)
> +		memset(data, 0, -offset);
>   	xdp->data_meta += offset;
>   	xdp->data = data;
>   
> @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
>   		return -EINVAL;
>   	if (unlikely(xdp_metalen_invalid(metalen)))
>   		return -EACCES;
> -
> +	if (offset < 0)
> +		memset(meta, 0, -offset);
>   	xdp->data_meta = meta;
>   
>   	return 0;
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index defa5bb881f4..7b1871f2eccf 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -2761,7 +2761,8 @@ union bpf_attr {
>    * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
>    * 	Description
>    * 		Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> - * 		it is possible to use a negative value for *delta*. This helper
> + *		it is possible to use a negative value for *delta*. If *delta*
> + *		is negative, the new header will be memset to zero. This helper
>    * 		can be used to prepare the packet for pushing or popping
>    * 		headers.
>    *
> @@ -2989,7 +2990,8 @@ union bpf_attr {
>    * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
>    * 	Description
>    * 		Adjust the address pointed by *xdp_md*\ **->data_meta** by
> - * 		*delta* (which can be positive or negative). Note that this
> + *		*delta* (which can be positive or negative). If *delta* is
> + *		negative, the new meta will be memset to zero. Note that this
>    * 		operation modifies the address stored in *xdp_md*\ **->data**,
>    * 		so the latter must be loaded only after the helper has been
>    * 		called.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta}
  2025-04-03  0:24   ` Jakub Kicinski
@ 2025-04-03  9:37     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2025-04-03  9:37 UTC (permalink / raw)
  To: Jakub Kicinski, Jiayuan Chen
  Cc: bpf, mrpre, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing,
	Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly,
	David Lechner, linux-kernel, netdev, linux-kselftest, kernel-team



On 03/04/2025 02.24, Jakub Kicinski wrote:
> On Mon, 31 Mar 2025 11:23:45 +0800 Jiayuan Chen wrote:
>> which is negligible for the net stack.
>>
>> Before memset
>> ./test_progs -a xdp_adjust_head_perf -v
>> run adjust head with size 6 cost 56 ns
>> run adjust head with size 20 cost 56 ns
>> run adjust head with size 40 cost 56 ns
>> run adjust head with size 200 cost 56 ns
>>
>> After memset
>> ./test_progs -a xdp_adjust_head_perf -v
>> run adjust head with size 6 cost 58 ns
>> run adjust head with size 20 cost 58 ns
>> run adjust head with size 40 cost 58 ns
>> run adjust head with size 200 cost 66 ns
> 
> FWIW I'm not sure if this is "negligible" for XDP like you say,
> but I defer to Jesper :)

It would be too much for the XDP_DROP use-case, e.g. DDoS protection and
driver hardware eval. But this is changing a BPF-helper, which means it
is opt-in by the BPF-programmer.  Thus, we can accept larger perf
overhead here.

I suspect your 2 nanosec overhead primarily comes from the function call
overhead.  On my AMD production system with SRSO mitigation enabled I
expect to see around 6 ns overhead (5.699 ns), which is sad.

I've done a lot of benchmarking of memset (see [1]). One take-away is
that memset with small const values will get compiled into very fast
code that avoids the function call (basically QWORD MOVs).  E.g. memset
with const 32 is extremely fast[2], on my system it takes 0.673 ns (and
0.562 ns comes from for-loop overhead).  Thus, it is possible to do
something faster, as we are clearing very small values. I.e. using a
duff's device construct like I did for remainder in [2].

In this case, as this is a BPF-helper, I am uncertain if it is worth the
complexity to add such optimizations... I guess not.
This turned into a long way of saying, I'm okay with this change.

[1] 
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_memset.c

[2] 
https://github.com/netoptimizer/prototype-kernel/blob/35b1716d0c300e7fa2c8b6d8cfed2ec81df8f3a4/kernel/lib/time_bench_memset.c#L520-L521

--Jesper

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it
  2025-03-31  3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen
  2025-04-03  8:17   ` Jesper Dangaard Brouer
@ 2025-04-03 14:24   ` Alexei Starovoitov
  2025-04-03 14:32     ` Willem de Bruijn
  2025-04-04  0:27     ` Jiayuan Chen
  1 sibling, 2 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2025-04-03 14:24 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan,
	Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan,
	Jordan Rome, Martin Kelly, David Lechner, LKML,
	Network Development, open list:KERNEL SELFTEST FRAMEWORK

On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> The device allocates an skb, it additionally allocates a prepad size
> (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it
> uninitialized.
>
> The bpf_xdp_adjust_head function moves skb->data forward, which allows
> users to access data belonging to other programs, posing a security risk.
>
> Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>  include/uapi/linux/bpf.h       | 8 +++++---
>  net/core/filter.c              | 5 ++++-
>  tools/include/uapi/linux/bpf.h | 6 ++++--
>  3 files changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index defa5bb881f4..be01a848cbbf 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2760,8 +2760,9 @@ union bpf_attr {
>   *
>   * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
>   *     Description
> - *             Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> - *             it is possible to use a negative value for *delta*. This helper
> + *             Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> + *             it is possible to use a negative value for *delta*. If *delta*
> + *             is negative, the new header will be memset to zero. This helper
>   *             can be used to prepare the packet for pushing or popping
>   *             headers.
>   *
> @@ -2989,7 +2990,8 @@ union bpf_attr {
>   * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
>   *     Description
>   *             Adjust the address pointed by *xdp_md*\ **->data_meta** by
> - *             *delta* (which can be positive or negative). Note that this
> + *             *delta* (which can be positive or negative). If *delta* is
> + *             negative, the new meta will be memset to zero. Note that this
>   *             operation modifies the address stored in *xdp_md*\ **->data**,
>   *             so the latter must be loaded only after the helper has been
>   *             called.
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 46ae8eb7a03c..5f01d373b719 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
>         if (metalen)
>                 memmove(xdp->data_meta + offset,
>                         xdp->data_meta, metalen);
> +       if (offset < 0)
> +               memset(data, 0, -offset);
>         xdp->data_meta += offset;
>         xdp->data = data;
>
> @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
>                 return -EINVAL;
>         if (unlikely(xdp_metalen_invalid(metalen)))
>                 return -EACCES;
> -
> +       if (offset < 0)
> +               memset(meta, 0, -offset);

Let's make everyone pay a performance penalty to silence
KMSAN warning?

I don't think it's a good trade off.

Soft nack.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it
  2025-04-03 14:24   ` Alexei Starovoitov
@ 2025-04-03 14:32     ` Willem de Bruijn
  2025-04-04  0:28       ` Alexei Starovoitov
  2025-04-04  0:27     ` Jiayuan Chen
  1 sibling, 1 reply; 11+ messages in thread
From: Willem de Bruijn @ 2025-04-03 14:32 UTC (permalink / raw)
  To: Alexei Starovoitov, Jiayuan Chen
  Cc: bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan,
	Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan,
	Jordan Rome, Martin Kelly, David Lechner, LKML,
	Network Development, open list:KERNEL SELFTEST FRAMEWORK

Alexei Starovoitov wrote:
> On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> >
> > The device allocates an skb, it additionally allocates a prepad size
> > (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it
> > uninitialized.
> >
> > The bpf_xdp_adjust_head function moves skb->data forward, which allows
> > users to access data belonging to other programs, posing a security risk.
> >
> > Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > ---
> >  include/uapi/linux/bpf.h       | 8 +++++---
> >  net/core/filter.c              | 5 ++++-
> >  tools/include/uapi/linux/bpf.h | 6 ++++--
> >  3 files changed, 13 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index defa5bb881f4..be01a848cbbf 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2760,8 +2760,9 @@ union bpf_attr {
> >   *
> >   * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
> >   *     Description
> > - *             Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> > - *             it is possible to use a negative value for *delta*. This helper
> > + *             Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> > + *             it is possible to use a negative value for *delta*. If *delta*
> > + *             is negative, the new header will be memset to zero. This helper
> >   *             can be used to prepare the packet for pushing or popping
> >   *             headers.
> >   *
> > @@ -2989,7 +2990,8 @@ union bpf_attr {
> >   * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
> >   *     Description
> >   *             Adjust the address pointed by *xdp_md*\ **->data_meta** by
> > - *             *delta* (which can be positive or negative). Note that this
> > + *             *delta* (which can be positive or negative). If *delta* is
> > + *             negative, the new meta will be memset to zero. Note that this
> >   *             operation modifies the address stored in *xdp_md*\ **->data**,
> >   *             so the latter must be loaded only after the helper has been
> >   *             called.
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 46ae8eb7a03c..5f01d373b719 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
> >         if (metalen)
> >                 memmove(xdp->data_meta + offset,
> >                         xdp->data_meta, metalen);
> > +       if (offset < 0)
> > +               memset(data, 0, -offset);
> >         xdp->data_meta += offset;
> >         xdp->data = data;
> >
> > @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
> >                 return -EINVAL;
> >         if (unlikely(xdp_metalen_invalid(metalen)))
> >                 return -EACCES;
> > -
> > +       if (offset < 0)
> > +               memset(meta, 0, -offset);
> 
> Let's make everyone pay a performance penalty to silence
> KMSAN warning?
> 
> I don't think it's a good trade off.
> 
> Soft nack.

I also assumed that this was known when the feature was originally
introduced and left as is for performance reasons.

Might be good to have that explicit. And that it is deemed safe by
virtue of XDP requiring superuser privileges anyway. Or at least I
guess that was the thought process?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it
  2025-04-03 14:24   ` Alexei Starovoitov
  2025-04-03 14:32     ` Willem de Bruijn
@ 2025-04-04  0:27     ` Jiayuan Chen
  2025-04-04  0:29       ` Alexei Starovoitov
  1 sibling, 1 reply; 11+ messages in thread
From: Jiayuan Chen @ 2025-04-04  0:27 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan,
	Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan,
	Jordan Rome, Martin Kelly, David Lechner, LKML,
	Network Development, open list:KERNEL SELFTEST FRAMEWORK

April 3, 2025 at 22:24, "Alexei Starovoitov" <alexei.starovoitov@gmail.com> wrote:



> 
> On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> 
> > 
> > The device allocates an skb, it additionally allocates a prepad size
> > 
> >  (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it
> > 
> >  uninitialized.
> > 
> >  The bpf_xdp_adjust_head function moves skb->data forward, which allows
> > 
> >  users to access data belonging to other programs, posing a security risk.
> > 
> >  Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com
> > 
> >  Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/
> > 
> >  Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > 
> >  ---
> > 
> >  include/uapi/linux/bpf.h | 8 +++++---
> > 
> >  net/core/filter.c | 5 ++++-
> > 
> >  tools/include/uapi/linux/bpf.h | 6 ++++--
> > 
> >  3 files changed, 13 insertions(+), 6 deletions(-)
> > 
> >  diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > 
> >  index defa5bb881f4..be01a848cbbf 100644
> > 
> >  --- a/include/uapi/linux/bpf.h
> > 
> >  +++ b/include/uapi/linux/bpf.h
> > 
> >  @@ -2760,8 +2760,9 @@ union bpf_attr {
> > 
> >  *
> > 
> >  * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
> > 
> >  * Description
> > 
> >  - * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> > 
> >  - * it is possible to use a negative value for *delta*. This helper
> > 
> >  + * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> > 
> >  + * it is possible to use a negative value for *delta*. If *delta*
> > 
> >  + * is negative, the new header will be memset to zero. This helper
> > 
> >  * can be used to prepare the packet for pushing or popping
> > 
> >  * headers.
> > 
> >  *
> > 
> >  @@ -2989,7 +2990,8 @@ union bpf_attr {
> > 
> >  * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
> > 
> >  * Description
> > 
> >  * Adjust the address pointed by *xdp_md*\ **->data_meta** by
> > 
> >  - * *delta* (which can be positive or negative). Note that this
> > 
> >  + * *delta* (which can be positive or negative). If *delta* is
> > 
> >  + * negative, the new meta will be memset to zero. Note that this
> > 
> >  * operation modifies the address stored in *xdp_md*\ **->data**,
> > 
> >  * so the latter must be loaded only after the helper has been
> > 
> >  * called.
> > 
> >  diff --git a/net/core/filter.c b/net/core/filter.c
> > 
> >  index 46ae8eb7a03c..5f01d373b719 100644
> > 
> >  --- a/net/core/filter.c
> > 
> >  +++ b/net/core/filter.c
> > 
> >  @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
> > 
> >  if (metalen)
> > 
> >  memmove(xdp->data_meta + offset,
> > 
> >  xdp->data_meta, metalen);
> > 
> >  + if (offset < 0)
> > 
> >  + memset(data, 0, -offset);
> > 
> >  xdp->data_meta += offset;
> > 
> >  xdp->data = data;
> > 
> >  @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
> > 
> >  return -EINVAL;
> > 
> >  if (unlikely(xdp_metalen_invalid(metalen)))
> > 
> >  return -EACCES;
> > 
> >  -
> > 
> >  + if (offset < 0)
> > 
> >  + memset(meta, 0, -offset);
> > 
> 
> Let's make everyone pay a performance penalty to silence
> KMSAN warning?
> I don't think it's a good trade off.
> Soft nack.
>

It's not just about simply suppressing KMSAN warnings, but for security
considerations.

So I'd like to confirm: currently, loading an XDP program only requires
CAP_NET_ADMIN and CAP_BPF permissions. If we consider this as a super
privilege, then even if uninitialized memory is exposed, I think it's okay,
as it's the developer's responsibility, for example, like the CVE in meta
https://vuldb.com/?id.246309.

Or I'm thinking, can we rely on the verifier to force the initialization
of the newly added packet boundary behavior, specifically for this special
case (although it won't be easy to implement).

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it
  2025-04-03 14:32     ` Willem de Bruijn
@ 2025-04-04  0:28       ` Alexei Starovoitov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2025-04-04  0:28 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Jiayuan Chen, bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan,
	Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan,
	Jordan Rome, Martin Kelly, David Lechner, LKML,
	Network Development, open list:KERNEL SELFTEST FRAMEWORK

On Thu, Apr 3, 2025 at 7:32 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Alexei Starovoitov wrote:
> > On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> > >
> > > The device allocates an skb, it additionally allocates a prepad size
> > > (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it
> > > uninitialized.
> > >
> > > The bpf_xdp_adjust_head function moves skb->data forward, which allows
> > > users to access data belonging to other programs, posing a security risk.
> > >
> > > Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com
> > > Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/
> > > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > > ---
> > >  include/uapi/linux/bpf.h       | 8 +++++---
> > >  net/core/filter.c              | 5 ++++-
> > >  tools/include/uapi/linux/bpf.h | 6 ++++--
> > >  3 files changed, 13 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index defa5bb881f4..be01a848cbbf 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -2760,8 +2760,9 @@ union bpf_attr {
> > >   *
> > >   * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
> > >   *     Description
> > > - *             Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> > > - *             it is possible to use a negative value for *delta*. This helper
> > > + *             Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> > > + *             it is possible to use a negative value for *delta*. If *delta*
> > > + *             is negative, the new header will be memset to zero. This helper
> > >   *             can be used to prepare the packet for pushing or popping
> > >   *             headers.
> > >   *
> > > @@ -2989,7 +2990,8 @@ union bpf_attr {
> > >   * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
> > >   *     Description
> > >   *             Adjust the address pointed by *xdp_md*\ **->data_meta** by
> > > - *             *delta* (which can be positive or negative). Note that this
> > > + *             *delta* (which can be positive or negative). If *delta* is
> > > + *             negative, the new meta will be memset to zero. Note that this
> > >   *             operation modifies the address stored in *xdp_md*\ **->data**,
> > >   *             so the latter must be loaded only after the helper has been
> > >   *             called.
> > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > index 46ae8eb7a03c..5f01d373b719 100644
> > > --- a/net/core/filter.c
> > > +++ b/net/core/filter.c
> > > @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
> > >         if (metalen)
> > >                 memmove(xdp->data_meta + offset,
> > >                         xdp->data_meta, metalen);
> > > +       if (offset < 0)
> > > +               memset(data, 0, -offset);
> > >         xdp->data_meta += offset;
> > >         xdp->data = data;
> > >
> > > @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
> > >                 return -EINVAL;
> > >         if (unlikely(xdp_metalen_invalid(metalen)))
> > >                 return -EACCES;
> > > -
> > > +       if (offset < 0)
> > > +               memset(meta, 0, -offset);
> >
> > Let's make everyone pay a performance penalty to silence
> > KMSAN warning?
> >
> > I don't think it's a good trade off.
> >
> > Soft nack.
>
> I also assumed that this was known when the feature was originally
> introduced and left as is for performance reasons.
>
> Might be good to have that explicit. And that it is deemed safe by
> virtue of XDP requiring superuser privileges anyway. Or at least I
> guess that was the thought process?

Correct. When prog extends the headroom it is suppose to write
something in there. Extending the packet just to capture
some garbage bytes from the previous packet is dumb,
but doesn't compromise the safety of the kernel.
There were proposals to ask the verifier to track that the headroom
is actually initialized by the program, but it's pointless.
Dumb prog can write garbage in there just as well.
bpf_probe_read_kernel( from_random_addr) and store into the headroom.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it
  2025-04-04  0:27     ` Jiayuan Chen
@ 2025-04-04  0:29       ` Alexei Starovoitov
  0 siblings, 0 replies; 11+ messages in thread
From: Alexei Starovoitov @ 2025-04-04  0:29 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan,
	Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan,
	Jordan Rome, Martin Kelly, David Lechner, LKML,
	Network Development, open list:KERNEL SELFTEST FRAMEWORK

On Thu, Apr 3, 2025 at 5:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> April 3, 2025 at 22:24, "Alexei Starovoitov" <alexei.starovoitov@gmail.com> wrote:
>
>
>
> >
> > On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> >
> > >
> > > The device allocates an skb, it additionally allocates a prepad size
> > >
> > >  (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it
> > >
> > >  uninitialized.
> > >
> > >  The bpf_xdp_adjust_head function moves skb->data forward, which allows
> > >
> > >  users to access data belonging to other programs, posing a security risk.
> > >
> > >  Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com
> > >
> > >  Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/
> > >
> > >  Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> > >
> > >  ---
> > >
> > >  include/uapi/linux/bpf.h | 8 +++++---
> > >
> > >  net/core/filter.c | 5 ++++-
> > >
> > >  tools/include/uapi/linux/bpf.h | 6 ++++--
> > >
> > >  3 files changed, 13 insertions(+), 6 deletions(-)
> > >
> > >  diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > >
> > >  index defa5bb881f4..be01a848cbbf 100644
> > >
> > >  --- a/include/uapi/linux/bpf.h
> > >
> > >  +++ b/include/uapi/linux/bpf.h
> > >
> > >  @@ -2760,8 +2760,9 @@ union bpf_attr {
> > >
> > >  *
> > >
> > >  * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
> > >
> > >  * Description
> > >
> > >  - * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> > >
> > >  - * it is possible to use a negative value for *delta*. This helper
> > >
> > >  + * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that
> > >
> > >  + * it is possible to use a negative value for *delta*. If *delta*
> > >
> > >  + * is negative, the new header will be memset to zero. This helper
> > >
> > >  * can be used to prepare the packet for pushing or popping
> > >
> > >  * headers.
> > >
> > >  *
> > >
> > >  @@ -2989,7 +2990,8 @@ union bpf_attr {
> > >
> > >  * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
> > >
> > >  * Description
> > >
> > >  * Adjust the address pointed by *xdp_md*\ **->data_meta** by
> > >
> > >  - * *delta* (which can be positive or negative). Note that this
> > >
> > >  + * *delta* (which can be positive or negative). If *delta* is
> > >
> > >  + * negative, the new meta will be memset to zero. Note that this
> > >
> > >  * operation modifies the address stored in *xdp_md*\ **->data**,
> > >
> > >  * so the latter must be loaded only after the helper has been
> > >
> > >  * called.
> > >
> > >  diff --git a/net/core/filter.c b/net/core/filter.c
> > >
> > >  index 46ae8eb7a03c..5f01d373b719 100644
> > >
> > >  --- a/net/core/filter.c
> > >
> > >  +++ b/net/core/filter.c
> > >
> > >  @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
> > >
> > >  if (metalen)
> > >
> > >  memmove(xdp->data_meta + offset,
> > >
> > >  xdp->data_meta, metalen);
> > >
> > >  + if (offset < 0)
> > >
> > >  + memset(data, 0, -offset);
> > >
> > >  xdp->data_meta += offset;
> > >
> > >  xdp->data = data;
> > >
> > >  @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
> > >
> > >  return -EINVAL;
> > >
> > >  if (unlikely(xdp_metalen_invalid(metalen)))
> > >
> > >  return -EACCES;
> > >
> > >  -
> > >
> > >  + if (offset < 0)
> > >
> > >  + memset(meta, 0, -offset);
> > >
> >
> > Let's make everyone pay a performance penalty to silence
> > KMSAN warning?
> > I don't think it's a good trade off.
> > Soft nack.
> >
>
> It's not just about simply suppressing KMSAN warnings, but for security
> considerations.
>
> So I'd like to confirm: currently, loading an XDP program only requires
> CAP_NET_ADMIN and CAP_BPF permissions. If we consider this as a super
> privilege, then even if uninitialized memory is exposed, I think it's okay,
> as it's the developer's responsibility, for example, like the CVE in meta
> https://vuldb.com/?id.246309.

And we fixed Katran. not the kernel.

> Or I'm thinking, can we rely on the verifier to force the initialization
> of the newly added packet boundary behavior, specifically for this special
> case (although it won't be easy to implement).

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-04-04  0:29 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-31  3:23 [PATCH bpf v2 0/2] bpf, xdp: clean adjust_{head,meta} memory when offset < 0 Jiayuan Chen
2025-03-31  3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen
2025-04-03  8:17   ` Jesper Dangaard Brouer
2025-04-03 14:24   ` Alexei Starovoitov
2025-04-03 14:32     ` Willem de Bruijn
2025-04-04  0:28       ` Alexei Starovoitov
2025-04-04  0:27     ` Jiayuan Chen
2025-04-04  0:29       ` Alexei Starovoitov
2025-03-31  3:23 ` [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} Jiayuan Chen
2025-04-03  0:24   ` Jakub Kicinski
2025-04-03  9:37     ` Jesper Dangaard Brouer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).