* [PATCH bpf v2 0/2] bpf, xdp: clean adjust_{head,meta} memory when offset < 0 @ 2025-03-31 3:23 Jiayuan Chen 2025-03-31 3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen 2025-03-31 3:23 ` [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} Jiayuan Chen 0 siblings, 2 replies; 11+ messages in thread From: Jiayuan Chen @ 2025-03-31 3:23 UTC (permalink / raw) To: bpf Cc: mrpre, Jiayuan Chen, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, linux-kernel, netdev, linux-kselftest This patchset originates from my attempt to resolve a KMSAN warning that has existed for over 3 years: https://syzkaller.appspot.com/bug?extid=0e6ddb1ef80986bdfe64 Previously, we had a brief discussion in this thread about whether we can simply perform memset in adjust_{head,meta}: https://lore.kernel.org/netdev/20250328043941.085de23b@kernel.org/T/#t Unfortunately, I couldn't find a similar topic in the mail list, but I did find a similar security-related commit: commit 6dfb970d3dbd ("xdp: avoid leaking info stored in frame data on page reuse") I just create a new topic here and make subject more clear, we can discuss this here. Meanwhile, I also discovered a related issue that led to a CVE,specifically the Facebook Katran vulnerability (https://vuldb.com/?id.246309). Currently, even with unprivileged functionality disabled, a user can load a BPF program using CAP_BPF and CAP_NET_ADMIN, which I believe we should avoid exposing kernel memory directly to users now. Regarding performance considerations, I added corresponding results to the selftest, testing common MAC headers and IP headers of various sizes. Compared to not using memset, the execution time increased by 2ns, but I think this is negligible considering the entire net stack. Jiayuan Chen (2): bpf, xdp: clean head/meta when expanding it selftests/bpf: add perf test for adjust_{head,meta} include/uapi/linux/bpf.h | 8 +-- net/core/filter.c | 5 +- tools/include/uapi/linux/bpf.h | 6 ++- .../selftests/bpf/prog_tests/xdp_perf.c | 52 ++++++++++++++++--- tools/testing/selftests/bpf/progs/xdp_dummy.c | 14 +++++ 5 files changed, 72 insertions(+), 13 deletions(-) -- 2.47.1 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it 2025-03-31 3:23 [PATCH bpf v2 0/2] bpf, xdp: clean adjust_{head,meta} memory when offset < 0 Jiayuan Chen @ 2025-03-31 3:23 ` Jiayuan Chen 2025-04-03 8:17 ` Jesper Dangaard Brouer 2025-04-03 14:24 ` Alexei Starovoitov 2025-03-31 3:23 ` [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} Jiayuan Chen 1 sibling, 2 replies; 11+ messages in thread From: Jiayuan Chen @ 2025-03-31 3:23 UTC (permalink / raw) To: bpf Cc: mrpre, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, linux-kernel, netdev, linux-kselftest The device allocates an skb, it additionally allocates a prepad size (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it uninitialized. The bpf_xdp_adjust_head function moves skb->data forward, which allows users to access data belonging to other programs, posing a security risk. Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/ Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> --- include/uapi/linux/bpf.h | 8 +++++--- net/core/filter.c | 5 ++++- tools/include/uapi/linux/bpf.h | 6 ++++-- 3 files changed, 13 insertions(+), 6 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index defa5bb881f4..be01a848cbbf 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -2760,8 +2760,9 @@ union bpf_attr { * * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta) * Description - * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that - * it is possible to use a negative value for *delta*. This helper + * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that + * it is possible to use a negative value for *delta*. If *delta* + * is negative, the new header will be memset to zero. This helper * can be used to prepare the packet for pushing or popping * headers. * @@ -2989,7 +2990,8 @@ union bpf_attr { * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta) * Description * Adjust the address pointed by *xdp_md*\ **->data_meta** by - * *delta* (which can be positive or negative). Note that this + * *delta* (which can be positive or negative). If *delta* is + * negative, the new meta will be memset to zero. Note that this * operation modifies the address stored in *xdp_md*\ **->data**, * so the latter must be loaded only after the helper has been * called. diff --git a/net/core/filter.c b/net/core/filter.c index 46ae8eb7a03c..5f01d373b719 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset) if (metalen) memmove(xdp->data_meta + offset, xdp->data_meta, metalen); + if (offset < 0) + memset(data, 0, -offset); xdp->data_meta += offset; xdp->data = data; @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset) return -EINVAL; if (unlikely(xdp_metalen_invalid(metalen))) return -EACCES; - + if (offset < 0) + memset(meta, 0, -offset); xdp->data_meta = meta; return 0; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index defa5bb881f4..7b1871f2eccf 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -2761,7 +2761,8 @@ union bpf_attr { * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta) * Description * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that - * it is possible to use a negative value for *delta*. This helper + * it is possible to use a negative value for *delta*. If *delta* + * is negative, the new header will be memset to zero. This helper * can be used to prepare the packet for pushing or popping * headers. * @@ -2989,7 +2990,8 @@ union bpf_attr { * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta) * Description * Adjust the address pointed by *xdp_md*\ **->data_meta** by - * *delta* (which can be positive or negative). Note that this + * *delta* (which can be positive or negative). If *delta* is + * negative, the new meta will be memset to zero. Note that this * operation modifies the address stored in *xdp_md*\ **->data**, * so the latter must be loaded only after the helper has been * called. -- 2.47.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it 2025-03-31 3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen @ 2025-04-03 8:17 ` Jesper Dangaard Brouer 2025-04-03 14:24 ` Alexei Starovoitov 1 sibling, 0 replies; 11+ messages in thread From: Jesper Dangaard Brouer @ 2025-04-03 8:17 UTC (permalink / raw) To: Jiayuan Chen, bpf Cc: mrpre, syzbot+0e6ddb1ef80986bdfe64, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, linux-kernel, netdev, linux-kselftest On 31/03/2025 05.23, Jiayuan Chen wrote: > The device allocates an skb, it additionally allocates a prepad size > (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it > uninitialized. > > The bpf_xdp_adjust_head function moves skb->data forward, which allows > users to access data belonging to other programs, posing a security risk. I find your description confusing, and it needs to be improved to avoid people interpenetrating this when reading the commit log in the future. It is part of the UAPI that BPF programs access data belonging to other BPF programs. It is the concept behind data_meta, e.g. that XDP set information in this memory and TC-BPF reads it again (chained XDP progs also have R/W access). I hope/assume this is not the desired interpretation of your text. I guess you want to say, that the intention is to avoid BPF programs accessing uninitialized data? (... which is also what the code does at a glance) > Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com > Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/ > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> > --- > include/uapi/linux/bpf.h | 8 +++++--- > net/core/filter.c | 5 ++++- > tools/include/uapi/linux/bpf.h | 6 ++++-- > 3 files changed, 13 insertions(+), 6 deletions(-) > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index defa5bb881f4..be01a848cbbf 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -2760,8 +2760,9 @@ union bpf_attr { > * > * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta) > * Description > - * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > - * it is possible to use a negative value for *delta*. This helper > + * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > + * it is possible to use a negative value for *delta*. If *delta* > + * is negative, the new header will be memset to zero. This helper > * can be used to prepare the packet for pushing or popping > * headers. > * > @@ -2989,7 +2990,8 @@ union bpf_attr { > * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta) > * Description > * Adjust the address pointed by *xdp_md*\ **->data_meta** by > - * *delta* (which can be positive or negative). Note that this > + * *delta* (which can be positive or negative). If *delta* is > + * negative, the new meta will be memset to zero. Note that this > * operation modifies the address stored in *xdp_md*\ **->data**, > * so the latter must be loaded only after the helper has been > * called. > diff --git a/net/core/filter.c b/net/core/filter.c > index 46ae8eb7a03c..5f01d373b719 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset) > if (metalen) > memmove(xdp->data_meta + offset, > xdp->data_meta, metalen); > + if (offset < 0) > + memset(data, 0, -offset); > xdp->data_meta += offset; > xdp->data = data; > > @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset) > return -EINVAL; > if (unlikely(xdp_metalen_invalid(metalen))) > return -EACCES; > - > + if (offset < 0) > + memset(meta, 0, -offset); > xdp->data_meta = meta; > > return 0; > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h > index defa5bb881f4..7b1871f2eccf 100644 > --- a/tools/include/uapi/linux/bpf.h > +++ b/tools/include/uapi/linux/bpf.h > @@ -2761,7 +2761,8 @@ union bpf_attr { > * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta) > * Description > * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > - * it is possible to use a negative value for *delta*. This helper > + * it is possible to use a negative value for *delta*. If *delta* > + * is negative, the new header will be memset to zero. This helper > * can be used to prepare the packet for pushing or popping > * headers. > * > @@ -2989,7 +2990,8 @@ union bpf_attr { > * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta) > * Description > * Adjust the address pointed by *xdp_md*\ **->data_meta** by > - * *delta* (which can be positive or negative). Note that this > + * *delta* (which can be positive or negative). If *delta* is > + * negative, the new meta will be memset to zero. Note that this > * operation modifies the address stored in *xdp_md*\ **->data**, > * so the latter must be loaded only after the helper has been > * called. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it 2025-03-31 3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen 2025-04-03 8:17 ` Jesper Dangaard Brouer @ 2025-04-03 14:24 ` Alexei Starovoitov 2025-04-03 14:32 ` Willem de Bruijn 2025-04-04 0:27 ` Jiayuan Chen 1 sibling, 2 replies; 11+ messages in thread From: Alexei Starovoitov @ 2025-04-03 14:24 UTC (permalink / raw) To: Jiayuan Chen Cc: bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, LKML, Network Development, open list:KERNEL SELFTEST FRAMEWORK On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote: > > The device allocates an skb, it additionally allocates a prepad size > (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it > uninitialized. > > The bpf_xdp_adjust_head function moves skb->data forward, which allows > users to access data belonging to other programs, posing a security risk. > > Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com > Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/ > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> > --- > include/uapi/linux/bpf.h | 8 +++++--- > net/core/filter.c | 5 ++++- > tools/include/uapi/linux/bpf.h | 6 ++++-- > 3 files changed, 13 insertions(+), 6 deletions(-) > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index defa5bb881f4..be01a848cbbf 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -2760,8 +2760,9 @@ union bpf_attr { > * > * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta) > * Description > - * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > - * it is possible to use a negative value for *delta*. This helper > + * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > + * it is possible to use a negative value for *delta*. If *delta* > + * is negative, the new header will be memset to zero. This helper > * can be used to prepare the packet for pushing or popping > * headers. > * > @@ -2989,7 +2990,8 @@ union bpf_attr { > * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta) > * Description > * Adjust the address pointed by *xdp_md*\ **->data_meta** by > - * *delta* (which can be positive or negative). Note that this > + * *delta* (which can be positive or negative). If *delta* is > + * negative, the new meta will be memset to zero. Note that this > * operation modifies the address stored in *xdp_md*\ **->data**, > * so the latter must be loaded only after the helper has been > * called. > diff --git a/net/core/filter.c b/net/core/filter.c > index 46ae8eb7a03c..5f01d373b719 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset) > if (metalen) > memmove(xdp->data_meta + offset, > xdp->data_meta, metalen); > + if (offset < 0) > + memset(data, 0, -offset); > xdp->data_meta += offset; > xdp->data = data; > > @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset) > return -EINVAL; > if (unlikely(xdp_metalen_invalid(metalen))) > return -EACCES; > - > + if (offset < 0) > + memset(meta, 0, -offset); Let's make everyone pay a performance penalty to silence KMSAN warning? I don't think it's a good trade off. Soft nack. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it 2025-04-03 14:24 ` Alexei Starovoitov @ 2025-04-03 14:32 ` Willem de Bruijn 2025-04-04 0:28 ` Alexei Starovoitov 2025-04-04 0:27 ` Jiayuan Chen 1 sibling, 1 reply; 11+ messages in thread From: Willem de Bruijn @ 2025-04-03 14:32 UTC (permalink / raw) To: Alexei Starovoitov, Jiayuan Chen Cc: bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, LKML, Network Development, open list:KERNEL SELFTEST FRAMEWORK Alexei Starovoitov wrote: > On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote: > > > > The device allocates an skb, it additionally allocates a prepad size > > (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it > > uninitialized. > > > > The bpf_xdp_adjust_head function moves skb->data forward, which allows > > users to access data belonging to other programs, posing a security risk. > > > > Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com > > Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/ > > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> > > --- > > include/uapi/linux/bpf.h | 8 +++++--- > > net/core/filter.c | 5 ++++- > > tools/include/uapi/linux/bpf.h | 6 ++++-- > > 3 files changed, 13 insertions(+), 6 deletions(-) > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > index defa5bb881f4..be01a848cbbf 100644 > > --- a/include/uapi/linux/bpf.h > > +++ b/include/uapi/linux/bpf.h > > @@ -2760,8 +2760,9 @@ union bpf_attr { > > * > > * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta) > > * Description > > - * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > > - * it is possible to use a negative value for *delta*. This helper > > + * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > > + * it is possible to use a negative value for *delta*. If *delta* > > + * is negative, the new header will be memset to zero. This helper > > * can be used to prepare the packet for pushing or popping > > * headers. > > * > > @@ -2989,7 +2990,8 @@ union bpf_attr { > > * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta) > > * Description > > * Adjust the address pointed by *xdp_md*\ **->data_meta** by > > - * *delta* (which can be positive or negative). Note that this > > + * *delta* (which can be positive or negative). If *delta* is > > + * negative, the new meta will be memset to zero. Note that this > > * operation modifies the address stored in *xdp_md*\ **->data**, > > * so the latter must be loaded only after the helper has been > > * called. > > diff --git a/net/core/filter.c b/net/core/filter.c > > index 46ae8eb7a03c..5f01d373b719 100644 > > --- a/net/core/filter.c > > +++ b/net/core/filter.c > > @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset) > > if (metalen) > > memmove(xdp->data_meta + offset, > > xdp->data_meta, metalen); > > + if (offset < 0) > > + memset(data, 0, -offset); > > xdp->data_meta += offset; > > xdp->data = data; > > > > @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset) > > return -EINVAL; > > if (unlikely(xdp_metalen_invalid(metalen))) > > return -EACCES; > > - > > + if (offset < 0) > > + memset(meta, 0, -offset); > > Let's make everyone pay a performance penalty to silence > KMSAN warning? > > I don't think it's a good trade off. > > Soft nack. I also assumed that this was known when the feature was originally introduced and left as is for performance reasons. Might be good to have that explicit. And that it is deemed safe by virtue of XDP requiring superuser privileges anyway. Or at least I guess that was the thought process? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it 2025-04-03 14:32 ` Willem de Bruijn @ 2025-04-04 0:28 ` Alexei Starovoitov 0 siblings, 0 replies; 11+ messages in thread From: Alexei Starovoitov @ 2025-04-04 0:28 UTC (permalink / raw) To: Willem de Bruijn Cc: Jiayuan Chen, bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, LKML, Network Development, open list:KERNEL SELFTEST FRAMEWORK On Thu, Apr 3, 2025 at 7:32 AM Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > Alexei Starovoitov wrote: > > On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote: > > > > > > The device allocates an skb, it additionally allocates a prepad size > > > (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it > > > uninitialized. > > > > > > The bpf_xdp_adjust_head function moves skb->data forward, which allows > > > users to access data belonging to other programs, posing a security risk. > > > > > > Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com > > > Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/ > > > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> > > > --- > > > include/uapi/linux/bpf.h | 8 +++++--- > > > net/core/filter.c | 5 ++++- > > > tools/include/uapi/linux/bpf.h | 6 ++++-- > > > 3 files changed, 13 insertions(+), 6 deletions(-) > > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > > index defa5bb881f4..be01a848cbbf 100644 > > > --- a/include/uapi/linux/bpf.h > > > +++ b/include/uapi/linux/bpf.h > > > @@ -2760,8 +2760,9 @@ union bpf_attr { > > > * > > > * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta) > > > * Description > > > - * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > > > - * it is possible to use a negative value for *delta*. This helper > > > + * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > > > + * it is possible to use a negative value for *delta*. If *delta* > > > + * is negative, the new header will be memset to zero. This helper > > > * can be used to prepare the packet for pushing or popping > > > * headers. > > > * > > > @@ -2989,7 +2990,8 @@ union bpf_attr { > > > * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta) > > > * Description > > > * Adjust the address pointed by *xdp_md*\ **->data_meta** by > > > - * *delta* (which can be positive or negative). Note that this > > > + * *delta* (which can be positive or negative). If *delta* is > > > + * negative, the new meta will be memset to zero. Note that this > > > * operation modifies the address stored in *xdp_md*\ **->data**, > > > * so the latter must be loaded only after the helper has been > > > * called. > > > diff --git a/net/core/filter.c b/net/core/filter.c > > > index 46ae8eb7a03c..5f01d373b719 100644 > > > --- a/net/core/filter.c > > > +++ b/net/core/filter.c > > > @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset) > > > if (metalen) > > > memmove(xdp->data_meta + offset, > > > xdp->data_meta, metalen); > > > + if (offset < 0) > > > + memset(data, 0, -offset); > > > xdp->data_meta += offset; > > > xdp->data = data; > > > > > > @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset) > > > return -EINVAL; > > > if (unlikely(xdp_metalen_invalid(metalen))) > > > return -EACCES; > > > - > > > + if (offset < 0) > > > + memset(meta, 0, -offset); > > > > Let's make everyone pay a performance penalty to silence > > KMSAN warning? > > > > I don't think it's a good trade off. > > > > Soft nack. > > I also assumed that this was known when the feature was originally > introduced and left as is for performance reasons. > > Might be good to have that explicit. And that it is deemed safe by > virtue of XDP requiring superuser privileges anyway. Or at least I > guess that was the thought process? Correct. When prog extends the headroom it is suppose to write something in there. Extending the packet just to capture some garbage bytes from the previous packet is dumb, but doesn't compromise the safety of the kernel. There were proposals to ask the verifier to track that the headroom is actually initialized by the program, but it's pointless. Dumb prog can write garbage in there just as well. bpf_probe_read_kernel( from_random_addr) and store into the headroom. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it 2025-04-03 14:24 ` Alexei Starovoitov 2025-04-03 14:32 ` Willem de Bruijn @ 2025-04-04 0:27 ` Jiayuan Chen 2025-04-04 0:29 ` Alexei Starovoitov 1 sibling, 1 reply; 11+ messages in thread From: Jiayuan Chen @ 2025-04-04 0:27 UTC (permalink / raw) To: Alexei Starovoitov Cc: bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, LKML, Network Development, open list:KERNEL SELFTEST FRAMEWORK April 3, 2025 at 22:24, "Alexei Starovoitov" <alexei.starovoitov@gmail.com> wrote: > > On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote: > > > > > The device allocates an skb, it additionally allocates a prepad size > > > > (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it > > > > uninitialized. > > > > The bpf_xdp_adjust_head function moves skb->data forward, which allows > > > > users to access data belonging to other programs, posing a security risk. > > > > Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com > > > > Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/ > > > > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> > > > > --- > > > > include/uapi/linux/bpf.h | 8 +++++--- > > > > net/core/filter.c | 5 ++++- > > > > tools/include/uapi/linux/bpf.h | 6 ++++-- > > > > 3 files changed, 13 insertions(+), 6 deletions(-) > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > > > index defa5bb881f4..be01a848cbbf 100644 > > > > --- a/include/uapi/linux/bpf.h > > > > +++ b/include/uapi/linux/bpf.h > > > > @@ -2760,8 +2760,9 @@ union bpf_attr { > > > > * > > > > * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta) > > > > * Description > > > > - * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > > > > - * it is possible to use a negative value for *delta*. This helper > > > > + * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > > > > + * it is possible to use a negative value for *delta*. If *delta* > > > > + * is negative, the new header will be memset to zero. This helper > > > > * can be used to prepare the packet for pushing or popping > > > > * headers. > > > > * > > > > @@ -2989,7 +2990,8 @@ union bpf_attr { > > > > * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta) > > > > * Description > > > > * Adjust the address pointed by *xdp_md*\ **->data_meta** by > > > > - * *delta* (which can be positive or negative). Note that this > > > > + * *delta* (which can be positive or negative). If *delta* is > > > > + * negative, the new meta will be memset to zero. Note that this > > > > * operation modifies the address stored in *xdp_md*\ **->data**, > > > > * so the latter must be loaded only after the helper has been > > > > * called. > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > > > index 46ae8eb7a03c..5f01d373b719 100644 > > > > --- a/net/core/filter.c > > > > +++ b/net/core/filter.c > > > > @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset) > > > > if (metalen) > > > > memmove(xdp->data_meta + offset, > > > > xdp->data_meta, metalen); > > > > + if (offset < 0) > > > > + memset(data, 0, -offset); > > > > xdp->data_meta += offset; > > > > xdp->data = data; > > > > @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset) > > > > return -EINVAL; > > > > if (unlikely(xdp_metalen_invalid(metalen))) > > > > return -EACCES; > > > > - > > > > + if (offset < 0) > > > > + memset(meta, 0, -offset); > > > > Let's make everyone pay a performance penalty to silence > KMSAN warning? > I don't think it's a good trade off. > Soft nack. > It's not just about simply suppressing KMSAN warnings, but for security considerations. So I'd like to confirm: currently, loading an XDP program only requires CAP_NET_ADMIN and CAP_BPF permissions. If we consider this as a super privilege, then even if uninitialized memory is exposed, I think it's okay, as it's the developer's responsibility, for example, like the CVE in meta https://vuldb.com/?id.246309. Or I'm thinking, can we rely on the verifier to force the initialization of the newly added packet boundary behavior, specifically for this special case (although it won't be easy to implement). ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it 2025-04-04 0:27 ` Jiayuan Chen @ 2025-04-04 0:29 ` Alexei Starovoitov 0 siblings, 0 replies; 11+ messages in thread From: Alexei Starovoitov @ 2025-04-04 0:29 UTC (permalink / raw) To: Jiayuan Chen Cc: bpf, Jiayuan Chen, syzbot+0e6ddb1ef80986bdfe64, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, LKML, Network Development, open list:KERNEL SELFTEST FRAMEWORK On Thu, Apr 3, 2025 at 5:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote: > > April 3, 2025 at 22:24, "Alexei Starovoitov" <alexei.starovoitov@gmail.com> wrote: > > > > > > > On Sun, Mar 30, 2025 at 8:27 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote: > > > > > > > > The device allocates an skb, it additionally allocates a prepad size > > > > > > (usually equal to NET_SKB_PAD or XDP_PACKET_HEADROOM) but leaves it > > > > > > uninitialized. > > > > > > The bpf_xdp_adjust_head function moves skb->data forward, which allows > > > > > > users to access data belonging to other programs, posing a security risk. > > > > > > Reported-by: syzbot+0e6ddb1ef80986bdfe64@syzkaller.appspotmail.com > > > > > > Closes: https://lore.kernel.org/all/00000000000067f65105edbd295d@google.com/T/ > > > > > > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> > > > > > > --- > > > > > > include/uapi/linux/bpf.h | 8 +++++--- > > > > > > net/core/filter.c | 5 ++++- > > > > > > tools/include/uapi/linux/bpf.h | 6 ++++-- > > > > > > 3 files changed, 13 insertions(+), 6 deletions(-) > > > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > > > > > index defa5bb881f4..be01a848cbbf 100644 > > > > > > --- a/include/uapi/linux/bpf.h > > > > > > +++ b/include/uapi/linux/bpf.h > > > > > > @@ -2760,8 +2760,9 @@ union bpf_attr { > > > > > > * > > > > > > * long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta) > > > > > > * Description > > > > > > - * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > > > > > > - * it is possible to use a negative value for *delta*. This helper > > > > > > + * Adjust (move) *xdp_md*\ **->data** by *delta* bytes. Note that > > > > > > + * it is possible to use a negative value for *delta*. If *delta* > > > > > > + * is negative, the new header will be memset to zero. This helper > > > > > > * can be used to prepare the packet for pushing or popping > > > > > > * headers. > > > > > > * > > > > > > @@ -2989,7 +2990,8 @@ union bpf_attr { > > > > > > * long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta) > > > > > > * Description > > > > > > * Adjust the address pointed by *xdp_md*\ **->data_meta** by > > > > > > - * *delta* (which can be positive or negative). Note that this > > > > > > + * *delta* (which can be positive or negative). If *delta* is > > > > > > + * negative, the new meta will be memset to zero. Note that this > > > > > > * operation modifies the address stored in *xdp_md*\ **->data**, > > > > > > * so the latter must be loaded only after the helper has been > > > > > > * called. > > > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > > > > > index 46ae8eb7a03c..5f01d373b719 100644 > > > > > > --- a/net/core/filter.c > > > > > > +++ b/net/core/filter.c > > > > > > @@ -3947,6 +3947,8 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset) > > > > > > if (metalen) > > > > > > memmove(xdp->data_meta + offset, > > > > > > xdp->data_meta, metalen); > > > > > > + if (offset < 0) > > > > > > + memset(data, 0, -offset); > > > > > > xdp->data_meta += offset; > > > > > > xdp->data = data; > > > > > > @@ -4239,7 +4241,8 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset) > > > > > > return -EINVAL; > > > > > > if (unlikely(xdp_metalen_invalid(metalen))) > > > > > > return -EACCES; > > > > > > - > > > > > > + if (offset < 0) > > > > > > + memset(meta, 0, -offset); > > > > > > > Let's make everyone pay a performance penalty to silence > > KMSAN warning? > > I don't think it's a good trade off. > > Soft nack. > > > > It's not just about simply suppressing KMSAN warnings, but for security > considerations. > > So I'd like to confirm: currently, loading an XDP program only requires > CAP_NET_ADMIN and CAP_BPF permissions. If we consider this as a super > privilege, then even if uninitialized memory is exposed, I think it's okay, > as it's the developer's responsibility, for example, like the CVE in meta > https://vuldb.com/?id.246309. And we fixed Katran. not the kernel. > Or I'm thinking, can we rely on the verifier to force the initialization > of the newly added packet boundary behavior, specifically for this special > case (although it won't be easy to implement). ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} 2025-03-31 3:23 [PATCH bpf v2 0/2] bpf, xdp: clean adjust_{head,meta} memory when offset < 0 Jiayuan Chen 2025-03-31 3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen @ 2025-03-31 3:23 ` Jiayuan Chen 2025-04-03 0:24 ` Jakub Kicinski 1 sibling, 1 reply; 11+ messages in thread From: Jiayuan Chen @ 2025-03-31 3:23 UTC (permalink / raw) To: bpf Cc: mrpre, Jiayuan Chen, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, linux-kernel, netdev, linux-kselftest We added a memset operation during the adjust operation, which may cause performance issues. Therefore, we added perf testing, and testing found that for common header length operations, memset() operation increased the performance overhead by 2ns, which is negligible for the net stack. Before memset ./test_progs -a xdp_adjust_head_perf -v run adjust head with size 6 cost 56 ns run adjust head with size 20 cost 56 ns run adjust head with size 40 cost 56 ns run adjust head with size 200 cost 56 ns After memset ./test_progs -a xdp_adjust_head_perf -v run adjust head with size 6 cost 58 ns run adjust head with size 20 cost 58 ns run adjust head with size 40 cost 58 ns run adjust head with size 200 cost 66 ns Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> --- .../selftests/bpf/prog_tests/xdp_perf.c | 52 ++++++++++++++++--- tools/testing/selftests/bpf/progs/xdp_dummy.c | 14 +++++ 2 files changed, 59 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_perf.c b/tools/testing/selftests/bpf/prog_tests/xdp_perf.c index ec5369f247cb..1b4260c6e5d7 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_perf.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_perf.c @@ -1,10 +1,11 @@ // SPDX-License-Identifier: GPL-2.0 #include <test_progs.h> +#include <network_helpers.h> +#include "xdp_dummy.skel.h" void test_xdp_perf(void) { - const char *file = "./xdp_dummy.bpf.o"; - struct bpf_object *obj; + struct xdp_dummy *skel; char in[128], out[128]; int err, prog_fd; LIBBPF_OPTS(bpf_test_run_opts, topts, @@ -15,14 +16,51 @@ void test_xdp_perf(void) .repeat = 1000000, ); - err = bpf_prog_test_load(file, BPF_PROG_TYPE_XDP, &obj, &prog_fd); - if (CHECK_FAIL(err)) - return; - + skel = xdp_dummy__open_and_load(); + prog_fd = bpf_program__fd(skel->progs.xdp_dummy_prog); err = bpf_prog_test_run_opts(prog_fd, &topts); ASSERT_OK(err, "test_run"); ASSERT_EQ(topts.retval, XDP_PASS, "test_run retval"); ASSERT_EQ(topts.data_size_out, 128, "test_run data_size_out"); - bpf_object__close(obj); + xdp_dummy__destroy(skel); +} + +void test_xdp_adjust_head_perf(void) +{ + struct xdp_dummy *skel; + int repeat = 9000000; + struct xdp_md ctx_in; + char data[100]; + int err, prog_fd; + size_t test_header_size[] = { + ETH_ALEN, + sizeof(struct iphdr), + sizeof(struct ipv6hdr), + 200, + }; + DECLARE_LIBBPF_OPTS(bpf_test_run_opts, topts, + .data_in = &data, + .data_size_in = sizeof(data), + .repeat = repeat, + ); + + topts.ctx_in = &ctx_in; + topts.ctx_size_in = sizeof(ctx_in); + memset(&ctx_in, 0, sizeof(ctx_in)); + ctx_in.data_meta = 0; + ctx_in.data_end = ctx_in.data + sizeof(data); + + skel = xdp_dummy__open_and_load(); + prog_fd = bpf_program__fd(skel->progs.xdp_dummy_adjust_head); + + for (int i = 0; i < ARRAY_SIZE(test_header_size); i++) { + skel->bss->head_size = test_header_size[i]; + err = bpf_prog_test_run_opts(prog_fd, &topts); + ASSERT_OK(err, "test_run"); + ASSERT_EQ(topts.retval, XDP_PASS, "test_run retval"); + fprintf(stdout, "run adjust head with size %zd cost %d ns\n", + test_header_size[i], topts.duration); + } + xdp_dummy__destroy(skel); } diff --git a/tools/testing/selftests/bpf/progs/xdp_dummy.c b/tools/testing/selftests/bpf/progs/xdp_dummy.c index d988b2e0cee8..7bebedbbc949 100644 --- a/tools/testing/selftests/bpf/progs/xdp_dummy.c +++ b/tools/testing/selftests/bpf/progs/xdp_dummy.c @@ -4,10 +4,24 @@ #include <linux/bpf.h> #include <bpf/bpf_helpers.h> +int head_size; + SEC("xdp") int xdp_dummy_prog(struct xdp_md *ctx) { return XDP_PASS; } +SEC("xdp") +int xdp_dummy_adjust_head(struct xdp_md *ctx) +{ + if (bpf_xdp_adjust_head(ctx, -head_size)) + return XDP_DROP; + + if (bpf_xdp_adjust_head(ctx, head_size)) + return XDP_DROP; + + return XDP_PASS; +} + char _license[] SEC("license") = "GPL"; -- 2.47.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} 2025-03-31 3:23 ` [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} Jiayuan Chen @ 2025-04-03 0:24 ` Jakub Kicinski 2025-04-03 9:37 ` Jesper Dangaard Brouer 0 siblings, 1 reply; 11+ messages in thread From: Jakub Kicinski @ 2025-04-03 0:24 UTC (permalink / raw) To: Jiayuan Chen Cc: bpf, mrpre, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman, Jesper Dangaard Brouer, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, linux-kernel, netdev, linux-kselftest On Mon, 31 Mar 2025 11:23:45 +0800 Jiayuan Chen wrote: > which is negligible for the net stack. > > Before memset > ./test_progs -a xdp_adjust_head_perf -v > run adjust head with size 6 cost 56 ns > run adjust head with size 20 cost 56 ns > run adjust head with size 40 cost 56 ns > run adjust head with size 200 cost 56 ns > > After memset > ./test_progs -a xdp_adjust_head_perf -v > run adjust head with size 6 cost 58 ns > run adjust head with size 20 cost 58 ns > run adjust head with size 40 cost 58 ns > run adjust head with size 200 cost 66 ns FWIW I'm not sure if this is "negligible" for XDP like you say, but I defer to Jesper :) ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} 2025-04-03 0:24 ` Jakub Kicinski @ 2025-04-03 9:37 ` Jesper Dangaard Brouer 0 siblings, 0 replies; 11+ messages in thread From: Jesper Dangaard Brouer @ 2025-04-03 9:37 UTC (permalink / raw) To: Jakub Kicinski, Jiayuan Chen Cc: bpf, mrpre, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman, Mykola Lysenko, Shuah Khan, Willem de Bruijn, Jason Xing, Anton Protopopov, Abhishek Chauhan, Jordan Rome, Martin Kelly, David Lechner, linux-kernel, netdev, linux-kselftest, kernel-team On 03/04/2025 02.24, Jakub Kicinski wrote: > On Mon, 31 Mar 2025 11:23:45 +0800 Jiayuan Chen wrote: >> which is negligible for the net stack. >> >> Before memset >> ./test_progs -a xdp_adjust_head_perf -v >> run adjust head with size 6 cost 56 ns >> run adjust head with size 20 cost 56 ns >> run adjust head with size 40 cost 56 ns >> run adjust head with size 200 cost 56 ns >> >> After memset >> ./test_progs -a xdp_adjust_head_perf -v >> run adjust head with size 6 cost 58 ns >> run adjust head with size 20 cost 58 ns >> run adjust head with size 40 cost 58 ns >> run adjust head with size 200 cost 66 ns > > FWIW I'm not sure if this is "negligible" for XDP like you say, > but I defer to Jesper :) It would be too much for the XDP_DROP use-case, e.g. DDoS protection and driver hardware eval. But this is changing a BPF-helper, which means it is opt-in by the BPF-programmer. Thus, we can accept larger perf overhead here. I suspect your 2 nanosec overhead primarily comes from the function call overhead. On my AMD production system with SRSO mitigation enabled I expect to see around 6 ns overhead (5.699 ns), which is sad. I've done a lot of benchmarking of memset (see [1]). One take-away is that memset with small const values will get compiled into very fast code that avoids the function call (basically QWORD MOVs). E.g. memset with const 32 is extremely fast[2], on my system it takes 0.673 ns (and 0.562 ns comes from for-loop overhead). Thus, it is possible to do something faster, as we are clearing very small values. I.e. using a duff's device construct like I did for remainder in [2]. In this case, as this is a BPF-helper, I am uncertain if it is worth the complexity to add such optimizations... I guess not. This turned into a long way of saying, I'm okay with this change. [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_memset.c [2] https://github.com/netoptimizer/prototype-kernel/blob/35b1716d0c300e7fa2c8b6d8cfed2ec81df8f3a4/kernel/lib/time_bench_memset.c#L520-L521 --Jesper ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-04-04 0:29 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-03-31 3:23 [PATCH bpf v2 0/2] bpf, xdp: clean adjust_{head,meta} memory when offset < 0 Jiayuan Chen 2025-03-31 3:23 ` [PATCH bpf v2 1/2] bpf, xdp: clean head/meta when expanding it Jiayuan Chen 2025-04-03 8:17 ` Jesper Dangaard Brouer 2025-04-03 14:24 ` Alexei Starovoitov 2025-04-03 14:32 ` Willem de Bruijn 2025-04-04 0:28 ` Alexei Starovoitov 2025-04-04 0:27 ` Jiayuan Chen 2025-04-04 0:29 ` Alexei Starovoitov 2025-03-31 3:23 ` [PATCH bpf v2 2/2] selftests/bpf: add perf test for adjust_{head,meta} Jiayuan Chen 2025-04-03 0:24 ` Jakub Kicinski 2025-04-03 9:37 ` Jesper Dangaard Brouer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).