* [PATCH bpf-next v3 1/6] bpf: Allow bpf_xdp_shrink_data to shrink a frag from head and tail
2025-09-15 22:47 [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Amery Hung
@ 2025-09-15 22:47 ` Amery Hung
2025-09-15 22:47 ` [PATCH bpf-next v3 2/6] bpf: Support pulling non-linear xdp data Amery Hung
` (5 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Amery Hung @ 2025-09-15 22:47 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, paul.chaignon, kuba,
stfomichev, martin.lau, mohsin.bashr, noren, dtatulea, saeedm,
tariqt, mbloch, maciej.fijalkowski, kernel-team
Move skb_frag_t adjustment into bpf_xdp_shrink_data() and extend its
functionality to be able to shrink an xdp fragment from both head and
tail. In a later patch, bpf_xdp_pull_data() will reuse it to shrink an
xdp fragment from head.
Additionally, in bpf_xdp_frags_shrink_tail(), breaking the loop when
bpf_xdp_shrink_data() returns false (i.e., not releasing the current
fragment) is not necessary as the loop condition, offset > 0, has the
same effect. Remove the else branch to simplify the code.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
include/net/xdp_sock_drv.h | 21 ++++++++++++++++++---
net/core/filter.c | 28 +++++++++++++++++-----------
2 files changed, 35 insertions(+), 14 deletions(-)
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 513c8e9704f6..4f2d3268a676 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -160,13 +160,23 @@ static inline struct xdp_buff *xsk_buff_get_frag(const struct xdp_buff *first)
return ret;
}
-static inline void xsk_buff_del_tail(struct xdp_buff *tail)
+static inline void xsk_buff_del_frag(struct xdp_buff *xdp)
{
- struct xdp_buff_xsk *xskb = container_of(tail, struct xdp_buff_xsk, xdp);
+ struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp);
list_del(&xskb->list_node);
}
+static inline struct xdp_buff *xsk_buff_get_head(struct xdp_buff *first)
+{
+ struct xdp_buff_xsk *xskb = container_of(first, struct xdp_buff_xsk, xdp);
+ struct xdp_buff_xsk *frag;
+
+ frag = list_first_entry(&xskb->pool->xskb_list, struct xdp_buff_xsk,
+ list_node);
+ return &frag->xdp;
+}
+
static inline struct xdp_buff *xsk_buff_get_tail(struct xdp_buff *first)
{
struct xdp_buff_xsk *xskb = container_of(first, struct xdp_buff_xsk, xdp);
@@ -389,8 +399,13 @@ static inline struct xdp_buff *xsk_buff_get_frag(const struct xdp_buff *first)
return NULL;
}
-static inline void xsk_buff_del_tail(struct xdp_buff *tail)
+static inline void xsk_buff_del_frag(struct xdp_buff *xdp)
+{
+}
+
+static inline struct xdp_buff *xsk_buff_get_head(struct xdp_buff *first)
{
+ return NULL;
}
static inline struct xdp_buff *xsk_buff_get_tail(struct xdp_buff *first)
diff --git a/net/core/filter.c b/net/core/filter.c
index 63f3baee2daf..0b82cb348ce0 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4153,27 +4153,31 @@ static int bpf_xdp_frags_increase_tail(struct xdp_buff *xdp, int offset)
return 0;
}
-static void bpf_xdp_shrink_data_zc(struct xdp_buff *xdp, int shrink,
+static void bpf_xdp_shrink_data_zc(struct xdp_buff *xdp, int shrink, bool tail,
enum xdp_mem_type mem_type, bool release)
{
- struct xdp_buff *zc_frag = xsk_buff_get_tail(xdp);
+ struct xdp_buff *zc_frag = tail ? xsk_buff_get_tail(xdp) :
+ xsk_buff_get_head(xdp);
if (release) {
- xsk_buff_del_tail(zc_frag);
+ xsk_buff_del_frag(zc_frag);
__xdp_return(0, mem_type, false, zc_frag);
} else {
- zc_frag->data_end -= shrink;
+ if (tail)
+ zc_frag->data_end -= shrink;
+ else
+ zc_frag->data += shrink;
}
}
static bool bpf_xdp_shrink_data(struct xdp_buff *xdp, skb_frag_t *frag,
- int shrink)
+ int shrink, bool tail)
{
enum xdp_mem_type mem_type = xdp->rxq->mem.type;
bool release = skb_frag_size(frag) == shrink;
if (mem_type == MEM_TYPE_XSK_BUFF_POOL) {
- bpf_xdp_shrink_data_zc(xdp, shrink, mem_type, release);
+ bpf_xdp_shrink_data_zc(xdp, shrink, tail, mem_type, release);
goto out;
}
@@ -4181,6 +4185,12 @@ static bool bpf_xdp_shrink_data(struct xdp_buff *xdp, skb_frag_t *frag,
__xdp_return(skb_frag_netmem(frag), mem_type, false, NULL);
out:
+ if (!release) {
+ if (!tail)
+ skb_frag_off_add(frag, shrink);
+ skb_frag_size_sub(frag, shrink);
+ }
+
return release;
}
@@ -4198,12 +4208,8 @@ static int bpf_xdp_frags_shrink_tail(struct xdp_buff *xdp, int offset)
len_free += shrink;
offset -= shrink;
- if (bpf_xdp_shrink_data(xdp, frag, shrink)) {
+ if (bpf_xdp_shrink_data(xdp, frag, shrink, true))
n_frags_free++;
- } else {
- skb_frag_size_sub(frag, shrink);
- break;
- }
}
sinfo->nr_frags -= n_frags_free;
sinfo->xdp_frags_size -= len_free;
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread

* [PATCH bpf-next v3 2/6] bpf: Support pulling non-linear xdp data
2025-09-15 22:47 [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Amery Hung
2025-09-15 22:47 ` [PATCH bpf-next v3 1/6] bpf: Allow bpf_xdp_shrink_data to shrink a frag from head and tail Amery Hung
@ 2025-09-15 22:47 ` Amery Hung
2025-09-17 0:17 ` Jakub Kicinski
2025-09-15 22:47 ` [PATCH bpf-next v3 3/6] bpf: Clear packet pointers after changing packet data in kfuncs Amery Hung
` (4 subsequent siblings)
6 siblings, 1 reply; 15+ messages in thread
From: Amery Hung @ 2025-09-15 22:47 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, paul.chaignon, kuba,
stfomichev, martin.lau, mohsin.bashr, noren, dtatulea, saeedm,
tariqt, mbloch, maciej.fijalkowski, kernel-team
Add kfunc, bpf_xdp_pull_data(), to support pulling data from xdp
fragments. Similar to bpf_skb_pull_data(), bpf_xdp_pull_data() makes
the first len bytes of data directly readable and writable in bpf
programs. If the "len" argument is larger than the linear data size,
data in fragments will be copied to the linear data area when there
is enough room. Specifically, the kfunc will try to use the tailroom
first. When the tailroom is not enough, metadata and data will be
shifted down to make room for pulling data.
A use case of the kfunc is to decapsulate headers residing in xdp
fragments. It is possible for a NIC driver to place headers in xdp
fragments. To keep using direct packet access for parsing and
decapsulating headers, users can pull headers into the linear data
area by calling bpf_xdp_pull_data() and then pop the header with
bpf_xdp_adjust_head().
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
net/core/filter.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 95 insertions(+)
diff --git a/net/core/filter.c b/net/core/filter.c
index 0b82cb348ce0..3a24c4db46f9 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -12212,6 +12212,100 @@ __bpf_kfunc int bpf_sock_ops_enable_tx_tstamp(struct bpf_sock_ops_kern *skops,
return 0;
}
+/**
+ * bpf_xdp_pull_data() - Pull in non-linear xdp data.
+ * @x: &xdp_md associated with the XDP buffer
+ * @len: length of data to be made directly accessible in the linear part
+ *
+ * Pull in non-linear data in case the XDP buffer associated with @x is
+ * non-linear and not all @len bytes are in the linear data area.
+ *
+ * Direct packet access allows reading and writing linear XDP data through
+ * packet pointers (i.e., &xdp_md->data + offsets). The amount of data which
+ * ends up in the linear part of the xdp_buff depends on the NIC and its
+ * configuration. When an eBPF program wants to directly access headers that
+ * may be in the non-linear area, call this kfunc to make sure the data is
+ * available in the linear area. Alternatively, use dynptr or
+ * bpf_xdp_{load,store}_bytes() to access data without pulling.
+ *
+ * This kfunc can also be used with bpf_xdp_adjust_head() to decapsulate
+ * headers in the non-linear data area.
+ *
+ * A call to this kfunc may reduce headroom. If there is not enough tailroom
+ * in the linear data area, metadata and data will be shifted down.
+ *
+ * A call to this kfunc may change the buffer geometry.
+ * Therefore, at load time, all checks on pointers previously done by the
+ * verifier are invalidated and must be performed again, if the kfunc is used
+ * in combination with direct packet access.
+ *
+ * Return:
+ * * %0 - success
+ * * %-EINVAL - invalid len
+ */
+__bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len)
+{
+ struct xdp_buff *xdp = (struct xdp_buff *)x;
+ int i, delta, shift, headroom, tailroom, n_frags_free = 0, len_free = 0;
+ struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+ void *data_hard_end = xdp_data_hard_end(xdp);
+ int data_len = xdp->data_end - xdp->data;
+ void *start, *new_end = xdp->data + len;
+
+ if (len <= data_len)
+ return 0;
+
+ if (unlikely(len > xdp_get_buff_len(xdp)))
+ return -EINVAL;
+
+ start = xdp_data_meta_unsupported(xdp) ? xdp->data : xdp->data_meta;
+
+ headroom = start - xdp->data_hard_start - sizeof(struct xdp_frame);
+ tailroom = data_hard_end - xdp->data_end;
+
+ delta = len - data_len;
+ if (unlikely(delta > tailroom + headroom))
+ return -EINVAL;
+
+ shift = delta - tailroom;
+ if (shift > 0) {
+ memmove(start - shift, start, xdp->data_end - start);
+
+ xdp->data_meta -= shift;
+ xdp->data -= shift;
+ xdp->data_end -= shift;
+
+ new_end = data_hard_end;
+ }
+
+ for (i = 0; i < sinfo->nr_frags && delta; i++) {
+ skb_frag_t *frag = &sinfo->frags[i];
+ u32 shrink = min_t(u32, delta, skb_frag_size(frag));
+
+ memcpy(xdp->data_end + len_free, skb_frag_address(frag), shrink);
+
+ len_free += shrink;
+ delta -= shrink;
+ if (bpf_xdp_shrink_data(xdp, frag, shrink, false))
+ n_frags_free++;
+ }
+
+ if (unlikely(n_frags_free)) {
+ memmove(sinfo->frags, sinfo->frags + n_frags_free,
+ (sinfo->nr_frags - n_frags_free) * sizeof(skb_frag_t));
+
+ sinfo->nr_frags -= n_frags_free;
+
+ if (!sinfo->nr_frags)
+ xdp_buff_clear_frags_flag(xdp);
+ }
+
+ sinfo->xdp_frags_size -= len_free;
+ xdp->data_end = new_end;
+
+ return 0;
+}
+
__bpf_kfunc_end_defs();
int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags,
@@ -12239,6 +12333,7 @@ BTF_KFUNCS_END(bpf_kfunc_check_set_skb_meta)
BTF_KFUNCS_START(bpf_kfunc_check_set_xdp)
BTF_ID_FLAGS(func, bpf_dynptr_from_xdp)
+BTF_ID_FLAGS(func, bpf_xdp_pull_data)
BTF_KFUNCS_END(bpf_kfunc_check_set_xdp)
BTF_KFUNCS_START(bpf_kfunc_check_set_sock_addr)
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf: Support pulling non-linear xdp data
2025-09-15 22:47 ` [PATCH bpf-next v3 2/6] bpf: Support pulling non-linear xdp data Amery Hung
@ 2025-09-17 0:17 ` Jakub Kicinski
2025-09-17 19:37 ` Amery Hung
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2025-09-17 0:17 UTC (permalink / raw)
To: Amery Hung
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, paul.chaignon,
stfomichev, martin.lau, mohsin.bashr, noren, dtatulea, saeedm,
tariqt, mbloch, maciej.fijalkowski, kernel-team
On Mon, 15 Sep 2025 15:47:57 -0700 Amery Hung wrote:
> +/**
> + * bpf_xdp_pull_data() - Pull in non-linear xdp data.
> + * @x: &xdp_md associated with the XDP buffer
> + * @len: length of data to be made directly accessible in the linear part
> + *
> + * Pull in non-linear data in case the XDP buffer associated with @x is
looks like there will be a v4, so nit, I'd drop the first non-linear:
Pull in data in case the XDP buffer associated with @x is
we say linear too many times, makes the doc hard to read
> + * non-linear and not all @len are in the linear data area.
> + *
> + * Direct packet access allows reading and writing linear XDP data through
> + * packet pointers (i.e., &xdp_md->data + offsets). The amount of data which
> + * ends up in the linear part of the xdp_buff depends on the NIC and its
> + * configuration. When an eBPF program wants to directly access headers that
s/eBPF/frag-capable XDP/ ?
> + * may be in the non-linear area, call this kfunc to make sure the data is
> + * available in the linear area. Alternatively, use dynptr or
> + * bpf_xdp_{load,store}_bytes() to access data without pulling.
> + *
> + * This kfunc can also be used with bpf_xdp_adjust_head() to decapsulate
> + * headers in the non-linear data area.
> + *
> + * A call to this kfunc may reduce headroom. If there is not enough tailroom
> + * in the linear data area, metadata and data will be shifted down.
> + *
> + * A call to this kfunc is susceptible to change the buffer geometry.
> + * Therefore, at load time, all checks on pointers previously done by the
> + * verifier are invalidated and must be performed again, if the kfunc is used
> + * in combination with direct packet access.
> + *
> + * Return:
> + * * %0 - success
> + * * %-EINVAL - invalid len
> + */
> +__bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len)
> +{
> + struct xdp_buff *xdp = (struct xdp_buff *)x;
> + int i, delta, shift, headroom, tailroom, n_frags_free = 0, len_free = 0;
> + struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> + void *data_hard_end = xdp_data_hard_end(xdp);
> + int data_len = xdp->data_end - xdp->data;
> + void *start, *new_end = xdp->data + len;
> +
> + if (len <= data_len)
> + return 0;
> +
> + if (unlikely(len > xdp_get_buff_len(xdp)))
> + return -EINVAL;
> +
> + start = xdp_data_meta_unsupported(xdp) ? xdp->data : xdp->data_meta;
> +
> + headroom = start - xdp->data_hard_start - sizeof(struct xdp_frame);
> + tailroom = data_hard_end - xdp->data_end;
> +
> + delta = len - data_len;
> + if (unlikely(delta > tailroom + headroom))
> + return -EINVAL;
> +
> + shift = delta - tailroom;
> + if (shift > 0) {
> + memmove(start - shift, start, xdp->data_end - start);
> +
> + xdp->data_meta -= shift;
> + xdp->data -= shift;
> + xdp->data_end -= shift;
> +
> + new_end = data_hard_end;
> + }
> +
> + for (i = 0; i < sinfo->nr_frags && delta; i++) {
> + skb_frag_t *frag = &sinfo->frags[i];
> + u32 shrink = min_t(u32, delta, skb_frag_size(frag));
> +
> + memcpy(xdp->data_end + len_free, skb_frag_address(frag), shrink);
> +
> + len_free += shrink;
> + delta -= shrink;
> + if (bpf_xdp_shrink_data(xdp, frag, shrink, false))
> + n_frags_free++;
> + }
> +
> + if (unlikely(n_frags_free)) {
> + memmove(sinfo->frags, sinfo->frags + n_frags_free,
> + (sinfo->nr_frags - n_frags_free) * sizeof(skb_frag_t));
> +
> + sinfo->nr_frags -= n_frags_free;
> +
> + if (!sinfo->nr_frags)
> + xdp_buff_clear_frags_flag(xdp);
> + }
> +
> + sinfo->xdp_frags_size -= len_free;
> + xdp->data_end = new_end;
Not sure I see the benefit of maintaining the new_end, and len_free.
We could directly adjust
xdp->data_end += shrink;
sinfo->xdp_frags_size -= shrink;
as we copy from the frags. But either way:
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
The whole thing actually looks pretty clean, I was worried
the shifting down of the data would add a lot of complexity :)
^ permalink raw reply [flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v3 2/6] bpf: Support pulling non-linear xdp data
2025-09-17 0:17 ` Jakub Kicinski
@ 2025-09-17 19:37 ` Amery Hung
0 siblings, 0 replies; 15+ messages in thread
From: Amery Hung @ 2025-09-17 19:37 UTC (permalink / raw)
To: Jakub Kicinski
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, paul.chaignon,
stfomichev, martin.lau, mohsin.bashr, noren, dtatulea, saeedm,
tariqt, mbloch, maciej.fijalkowski, kernel-team
On Tue, Sep 16, 2025 at 5:17 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 15 Sep 2025 15:47:57 -0700 Amery Hung wrote:
> > +/**
> > + * bpf_xdp_pull_data() - Pull in non-linear xdp data.
> > + * @x: &xdp_md associated with the XDP buffer
> > + * @len: length of data to be made directly accessible in the linear part
> > + *
> > + * Pull in non-linear data in case the XDP buffer associated with @x is
>
> looks like there will be a v4, so nit, I'd drop the first non-linear:
>
> Pull in data in case the XDP buffer associated with @x is
>
> we say linear too many times, makes the doc hard to read
>
> > + * non-linear and not all @len are in the linear data area.
> > + *
> > + * Direct packet access allows reading and writing linear XDP data through
> > + * packet pointers (i.e., &xdp_md->data + offsets). The amount of data which
> > + * ends up in the linear part of the xdp_buff depends on the NIC and its
> > + * configuration. When an eBPF program wants to directly access headers that
>
> s/eBPF/frag-capable XDP/ ?
>
Will change. Thanks for helping improve the comments.
> > + * may be in the non-linear area, call this kfunc to make sure the data is
> > + * available in the linear area. Alternatively, use dynptr or
> > + * bpf_xdp_{load,store}_bytes() to access data without pulling.
> > + *
> > + * This kfunc can also be used with bpf_xdp_adjust_head() to decapsulate
> > + * headers in the non-linear data area.
> > + *
> > + * A call to this kfunc may reduce headroom. If there is not enough tailroom
> > + * in the linear data area, metadata and data will be shifted down.
> > + *
> > + * A call to this kfunc is susceptible to change the buffer geometry.
> > + * Therefore, at load time, all checks on pointers previously done by the
> > + * verifier are invalidated and must be performed again, if the kfunc is used
> > + * in combination with direct packet access.
> > + *
> > + * Return:
> > + * * %0 - success
> > + * * %-EINVAL - invalid len
> > + */
> > +__bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len)
> > +{
> > + struct xdp_buff *xdp = (struct xdp_buff *)x;
> > + int i, delta, shift, headroom, tailroom, n_frags_free = 0, len_free = 0;
> > + struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> > + void *data_hard_end = xdp_data_hard_end(xdp);
> > + int data_len = xdp->data_end - xdp->data;
> > + void *start, *new_end = xdp->data + len;
> > +
> > + if (len <= data_len)
> > + return 0;
> > +
> > + if (unlikely(len > xdp_get_buff_len(xdp)))
> > + return -EINVAL;
> > +
> > + start = xdp_data_meta_unsupported(xdp) ? xdp->data : xdp->data_meta;
> > +
> > + headroom = start - xdp->data_hard_start - sizeof(struct xdp_frame);
> > + tailroom = data_hard_end - xdp->data_end;
> > +
> > + delta = len - data_len;
> > + if (unlikely(delta > tailroom + headroom))
> > + return -EINVAL;
> > +
> > + shift = delta - tailroom;
> > + if (shift > 0) {
> > + memmove(start - shift, start, xdp->data_end - start);
> > +
> > + xdp->data_meta -= shift;
> > + xdp->data -= shift;
> > + xdp->data_end -= shift;
> > +
> > + new_end = data_hard_end;
> > + }
> > +
> > + for (i = 0; i < sinfo->nr_frags && delta; i++) {
> > + skb_frag_t *frag = &sinfo->frags[i];
> > + u32 shrink = min_t(u32, delta, skb_frag_size(frag));
> > +
> > + memcpy(xdp->data_end + len_free, skb_frag_address(frag), shrink);
> > +
> > + len_free += shrink;
> > + delta -= shrink;
> > + if (bpf_xdp_shrink_data(xdp, frag, shrink, false))
> > + n_frags_free++;
> > + }
> > +
> > + if (unlikely(n_frags_free)) {
> > + memmove(sinfo->frags, sinfo->frags + n_frags_free,
> > + (sinfo->nr_frags - n_frags_free) * sizeof(skb_frag_t));
> > +
> > + sinfo->nr_frags -= n_frags_free;
> > +
> > + if (!sinfo->nr_frags)
> > + xdp_buff_clear_frags_flag(xdp);
> > + }
> > +
> > + sinfo->xdp_frags_size -= len_free;
> > + xdp->data_end = new_end;
>
> Not sure I see the benefit of maintaining the new_end, and len_free.
> We could directly adjust
>
> xdp->data_end += shrink;
> sinfo->xdp_frags_size -= shrink;
>
> as we copy from the frags. But either way:
>
Great suggestion! I will drop new_end and len_free.
> Reviewed-by: Jakub Kicinski <kuba@kernel.org>
>
> The whole thing actually looks pretty clean, I was worried
> the shifting down of the data would add a lot of complexity :)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH bpf-next v3 3/6] bpf: Clear packet pointers after changing packet data in kfuncs
2025-09-15 22:47 [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Amery Hung
2025-09-15 22:47 ` [PATCH bpf-next v3 1/6] bpf: Allow bpf_xdp_shrink_data to shrink a frag from head and tail Amery Hung
2025-09-15 22:47 ` [PATCH bpf-next v3 2/6] bpf: Support pulling non-linear xdp data Amery Hung
@ 2025-09-15 22:47 ` Amery Hung
2025-09-15 22:47 ` [PATCH bpf-next v3 4/6] bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN Amery Hung
` (3 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Amery Hung @ 2025-09-15 22:47 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, paul.chaignon, kuba,
stfomichev, martin.lau, mohsin.bashr, noren, dtatulea, saeedm,
tariqt, mbloch, maciej.fijalkowski, kernel-team
bpf_xdp_pull_data() may change packet data and therefore packet pointers
need to be invalidated. Add bpf_xdp_pull_data() to the special kfunc
list instead of introducing a new KF_ flag until there are more kfuncs
changing packet data.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
kernel/bpf/verifier.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1029380f84db..ed493d1dd2e3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -12239,6 +12239,7 @@ enum special_kfunc_type {
KF_bpf_dynptr_from_skb,
KF_bpf_dynptr_from_xdp,
KF_bpf_dynptr_from_skb_meta,
+ KF_bpf_xdp_pull_data,
KF_bpf_dynptr_slice,
KF_bpf_dynptr_slice_rdwr,
KF_bpf_dynptr_clone,
@@ -12289,10 +12290,12 @@ BTF_ID(func, bpf_rbtree_right)
BTF_ID(func, bpf_dynptr_from_skb)
BTF_ID(func, bpf_dynptr_from_xdp)
BTF_ID(func, bpf_dynptr_from_skb_meta)
+BTF_ID(func, bpf_xdp_pull_data)
#else
BTF_ID_UNUSED
BTF_ID_UNUSED
BTF_ID_UNUSED
+BTF_ID_UNUSED
#endif
BTF_ID(func, bpf_dynptr_slice)
BTF_ID(func, bpf_dynptr_slice_rdwr)
@@ -12362,6 +12365,11 @@ static bool is_kfunc_bpf_preempt_enable(struct bpf_kfunc_call_arg_meta *meta)
return meta->func_id == special_kfunc_list[KF_bpf_preempt_enable];
}
+static bool is_kfunc_pkt_changing(struct bpf_kfunc_call_arg_meta *meta)
+{
+ return meta->func_id == special_kfunc_list[KF_bpf_xdp_pull_data];
+}
+
static enum kfunc_ptr_arg_type
get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
struct bpf_kfunc_call_arg_meta *meta,
@@ -14081,6 +14089,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
}
}
+ if (is_kfunc_pkt_changing(&meta))
+ clear_all_pkt_pointers(env);
+
nargs = btf_type_vlen(meta.func_proto);
args = (const struct btf_param *)(meta.func_proto + 1);
for (i = 0; i < nargs; i++) {
@@ -17802,6 +17813,8 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
*/
if (ret == 0 && is_kfunc_sleepable(&meta))
mark_subprog_might_sleep(env, t);
+ if (ret == 0 && is_kfunc_pkt_changing(&meta))
+ mark_subprog_changes_pkt_data(env, t);
}
return visit_func_call_insn(t, insns, env, insn->src_reg == BPF_PSEUDO_CALL);
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread

* [PATCH bpf-next v3 4/6] bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN
2025-09-15 22:47 [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Amery Hung
` (2 preceding siblings ...)
2025-09-15 22:47 ` [PATCH bpf-next v3 3/6] bpf: Clear packet pointers after changing packet data in kfuncs Amery Hung
@ 2025-09-15 22:47 ` Amery Hung
2025-09-16 22:59 ` Martin KaFai Lau
2025-09-15 22:48 ` [PATCH bpf-next v3 5/6] selftests/bpf: Test bpf_xdp_pull_data Amery Hung
` (2 subsequent siblings)
6 siblings, 1 reply; 15+ messages in thread
From: Amery Hung @ 2025-09-15 22:47 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, paul.chaignon, kuba,
stfomichev, martin.lau, mohsin.bashr, noren, dtatulea, saeedm,
tariqt, mbloch, maciej.fijalkowski, kernel-team
To test bpf_xdp_pull_data(), an xdp packet containing fragments as well
as a free linear data area after xdp->data_end needs to be created.
However, bpf_prog_test_run_xdp() always fills the linear area with
data_in before creating fragments, leaving no space to pull data. This
patch will allow users to specify the linear data size through
ctx->data_end.
Currently, ctx_in->data_end must match data_size_in and will not be the
final ctx->data_end seen by xdp programs. This is because ctx->data_end
is populated according to the xdp_buff passed to test_run. The linear
data area available in an xdp_buff, max_data_sz, is always filled up
before copying data_in into fragments.
This patch will allow users to specify the size of data that goes into
the linear area. When ctx_in->data_end is different from data_size_in,
only ctx_in->data_end bytes of data will be put into the linear area when
creating the xdp_buff.
While ctx_in->data_end will be allowed to be different from data_size_in,
it cannot be larger than data_size_in as there will be no data to
copy from user space. If it is larger than the maximum linear data area
size, the layout suggested by the user will not be honored. Data beyond
max_data_sz bytes will still be copied into fragments.
Finally, since it is possible for a NIC to produce an xdp_buff with an empty
linear data area, allow it when calling bpf_test_init() from
bpf_prog_test_run_xdp() so that we can test XDP kfuncs with such
xdp_buff.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
net/bpf/test_run.c | 26 ++++++++++++-------
.../bpf/prog_tests/xdp_context_test_run.c | 4 +--
2 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 4a862d605386..558126bbd180 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -660,12 +660,15 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_memb_release, KF_RELEASE)
BTF_KFUNCS_END(test_sk_check_kfunc_ids)
static void *bpf_test_init(const union bpf_attr *kattr, u32 user_size,
- u32 size, u32 headroom, u32 tailroom)
+ u32 size, u32 headroom, u32 tailroom, bool is_xdp)
{
void __user *data_in = u64_to_user_ptr(kattr->test.data_in);
void *data;
- if (user_size < ETH_HLEN || user_size > PAGE_SIZE - headroom - tailroom)
+ if (!is_xdp && user_size < ETH_HLEN)
+ return ERR_PTR(-EINVAL);
+
+ if (user_size > PAGE_SIZE - headroom - tailroom)
return ERR_PTR(-EINVAL);
size = SKB_DATA_ALIGN(size);
@@ -1003,7 +1006,8 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
data = bpf_test_init(kattr, kattr->test.data_size_in,
size, NET_SKB_PAD + NET_IP_ALIGN,
- SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
+ false);
if (IS_ERR(data))
return PTR_ERR(data);
@@ -1207,8 +1211,8 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
{
bool do_live = (kattr->test.flags & BPF_F_TEST_XDP_LIVE_FRAMES);
u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+ u32 retval = 0, duration, max_data_sz, data_sz;
u32 batch_size = kattr->test.batch_size;
- u32 retval = 0, duration, max_data_sz;
u32 size = kattr->test.data_size_in;
u32 headroom = XDP_PACKET_HEADROOM;
u32 repeat = kattr->test.repeat;
@@ -1246,7 +1250,7 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
if (ctx) {
/* There can't be user provided data before the meta data */
- if (ctx->data_meta || ctx->data_end != size ||
+ if (ctx->data_meta || ctx->data_end > size ||
ctx->data > ctx->data_end ||
unlikely(xdp_metalen_invalid(ctx->data)) ||
(do_live && (kattr->test.data_out || kattr->test.ctx_out)))
@@ -1256,14 +1260,15 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
}
max_data_sz = PAGE_SIZE - headroom - tailroom;
- if (size > max_data_sz) {
+ data_sz = (ctx && ctx->data_end < max_data_sz) ? ctx->data_end : max_data_sz;
+ if (size > data_sz) {
/* disallow live data mode for jumbo frames */
if (do_live)
goto free_ctx;
- size = max_data_sz;
+ size = data_sz;
}
- data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom);
+ data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom, true);
if (IS_ERR(data)) {
ret = PTR_ERR(data);
goto free_ctx;
@@ -1386,7 +1391,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
if (size < ETH_HLEN)
return -EINVAL;
- data = bpf_test_init(kattr, kattr->test.data_size_in, size, 0, 0);
+ data = bpf_test_init(kattr, kattr->test.data_size_in, size, 0, 0, false);
if (IS_ERR(data))
return PTR_ERR(data);
@@ -1659,7 +1664,8 @@ int bpf_prog_test_run_nf(struct bpf_prog *prog,
data = bpf_test_init(kattr, kattr->test.data_size_in, size,
NET_SKB_PAD + NET_IP_ALIGN,
- SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
+ false);
if (IS_ERR(data))
return PTR_ERR(data);
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
index 46e0730174ed..178292d1251a 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
@@ -97,9 +97,7 @@ void test_xdp_context_test_run(void)
/* Meta data must be 255 bytes or smaller */
test_xdp_context_error(prog_fd, opts, 0, 256, sizeof(data), 0, 0, 0);
- /* Total size of data must match data_end - data_meta */
- test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32),
- sizeof(data) - 1, 0, 0, 0);
+ /* Total size of data must be data_end - data_meta or larger */
test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32),
sizeof(data) + 1, 0, 0, 0);
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v3 4/6] bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN
2025-09-15 22:47 ` [PATCH bpf-next v3 4/6] bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN Amery Hung
@ 2025-09-16 22:59 ` Martin KaFai Lau
2025-09-17 17:23 ` Amery Hung
0 siblings, 1 reply; 15+ messages in thread
From: Martin KaFai Lau @ 2025-09-16 22:59 UTC (permalink / raw)
To: Amery Hung
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, paul.chaignon,
kuba, stfomichev, martin.lau, mohsin.bashr, noren, dtatulea,
saeedm, tariqt, mbloch, maciej.fijalkowski, kernel-team
On 9/15/25 3:47 PM, Amery Hung wrote:
> To test bpf_xdp_pull_data(), an xdp packet containing fragments as well
> as free linear data area after xdp->data_end needs to be created.
> However, bpf_prog_test_run_xdp() always fills the linear area with
> data_in before creating fragments, leaving no space to pull data. This
> patch will allow users to specify the linear data size through
> ctx->data_end.
>
> Currently, ctx_in->data_end must match data_size_in and will not be the
> final ctx->data_end seen by xdp programs. This is because ctx->data_end
> is populated according to the xdp_buff passed to test_run. The linear
> data area available in an xdp_buff, max_data_sz, is always filled up
> before copying data_in into fragments.
>
> This patch will allow users to specify the size of data that goes into
> the linear area. When ctx_in->data_end is different from data_size_in,
> only ctx_in->data_end bytes of data will be put into the linear area when
> creating the xdp_buff.
>
> While ctx_in->data_end will be allowed to be different from data_size_in,
> it cannot be larger than the data_size_in as there will be no data to
> copy from user space. If it is larger than the maximum linear data area
> size, the layout suggested by the user will not be honored. Data beyond
> max_data_sz bytes will still be copied into fragments.
>
> Finally, since it is possible for a NIC to produce an xdp_buff with an empty
> linear data area, allow it when calling bpf_test_init() from
> bpf_prog_test_run_xdp() so that we can test XDP kfuncs with such
> xdp_buff.
>
> Signed-off-by: Amery Hung <ameryhung@gmail.com>
> ---
> net/bpf/test_run.c | 26 ++++++++++++-------
> .../bpf/prog_tests/xdp_context_test_run.c | 4 +--
> 2 files changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 4a862d605386..558126bbd180 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -660,12 +660,15 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_memb_release, KF_RELEASE)
> BTF_KFUNCS_END(test_sk_check_kfunc_ids)
>
> static void *bpf_test_init(const union bpf_attr *kattr, u32 user_size,
> - u32 size, u32 headroom, u32 tailroom)
> + u32 size, u32 headroom, u32 tailroom, bool is_xdp)
Understood that the patch inherited this function. I found it hard to read
when it is called for xdp, but this could be just me. For example, the "size"
passed from bpf_prog_test_run_xdp() ends up being "PAGE_SIZE - headroom -
tailroom". I am not sure how to fix it, e.g. can we always allocate a
PAGE_SIZE for non-xdp callers also, or maybe xdp should not reuse this
function. This is probably food for thought for later. Not asking to consider
it in this set.
I think at least the first step is to avoid adding "is_xdp" specific logic here.
> {
> void __user *data_in = u64_to_user_ptr(kattr->test.data_in);
> void *data;
>
> - if (user_size < ETH_HLEN || user_size > PAGE_SIZE - headroom - tailroom)
> + if (!is_xdp && user_size < ETH_HLEN)
Move the lower bound check to its caller. test_run_xdp() does not need this
check. test_run_flow_dissector() and test_run_nf() already have their own
checks; test_run_nf() actually uses a different bound. test_run_skb() is the
only one that needs this check, so it can be done explicitly there like the
other callers.
> + return ERR_PTR(-EINVAL);
> +
> + if (user_size > PAGE_SIZE - headroom - tailroom)
> return ERR_PTR(-EINVAL);
>
> size = SKB_DATA_ALIGN(size);
> @@ -1003,7 +1006,8 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>
> data = bpf_test_init(kattr, kattr->test.data_size_in,
> size, NET_SKB_PAD + NET_IP_ALIGN,
> - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
> + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
> + false);
> if (IS_ERR(data))
> return PTR_ERR(data);
>
> @@ -1207,8 +1211,8 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> {
> bool do_live = (kattr->test.flags & BPF_F_TEST_XDP_LIVE_FRAMES);
> u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> + u32 retval = 0, duration, max_data_sz, data_sz;
> u32 batch_size = kattr->test.batch_size;
> - u32 retval = 0, duration, max_data_sz;
> u32 size = kattr->test.data_size_in;
> u32 headroom = XDP_PACKET_HEADROOM;
> u32 repeat = kattr->test.repeat;
> @@ -1246,7 +1250,7 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>
> if (ctx) {
> /* There can't be user provided data before the meta data */
> - if (ctx->data_meta || ctx->data_end != size ||
> + if (ctx->data_meta || ctx->data_end > size ||
> ctx->data > ctx->data_end ||
> unlikely(xdp_metalen_invalid(ctx->data)) ||
> (do_live && (kattr->test.data_out || kattr->test.ctx_out)))
> @@ -1256,14 +1260,15 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> }
>
> max_data_sz = PAGE_SIZE - headroom - tailroom;
> - if (size > max_data_sz) {
> + data_sz = (ctx && ctx->data_end < max_data_sz) ? ctx->data_end : max_data_sz;
hmm... can the "size" (not data_sz) be directly updated to ctx->data_end in the
above "if (ctx)".
> + if (size > data_sz) {
> /* disallow live data mode for jumbo frames */
> if (do_live)
> goto free_ctx;
> - size = max_data_sz;
> + size = data_sz;
> }
>
> - data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom);
> + data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom, true);
> if (IS_ERR(data)) {
> ret = PTR_ERR(data);
> goto free_ctx;
> @@ -1386,7 +1391,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> if (size < ETH_HLEN)
> return -EINVAL;
>
> - data = bpf_test_init(kattr, kattr->test.data_size_in, size, 0, 0);
> + data = bpf_test_init(kattr, kattr->test.data_size_in, size, 0, 0, false);
> if (IS_ERR(data))
> return PTR_ERR(data);
>
> @@ -1659,7 +1664,8 @@ int bpf_prog_test_run_nf(struct bpf_prog *prog,
>
> data = bpf_test_init(kattr, kattr->test.data_size_in, size,
> NET_SKB_PAD + NET_IP_ALIGN,
> - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
> + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
> + false);
> if (IS_ERR(data))
> return PTR_ERR(data);
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> index 46e0730174ed..178292d1251a 100644
> --- a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> +++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> @@ -97,9 +97,7 @@ void test_xdp_context_test_run(void)
> /* Meta data must be 255 bytes or smaller */
> test_xdp_context_error(prog_fd, opts, 0, 256, sizeof(data), 0, 0, 0);
>
> - /* Total size of data must match data_end - data_meta */
> - test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32),
> - sizeof(data) - 1, 0, 0, 0);
> + /* Total size of data must be data_end - data_meta or larger */
> test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32),
> sizeof(data) + 1, 0, 0, 0);
>
^ permalink raw reply [flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v3 4/6] bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN
2025-09-16 22:59 ` Martin KaFai Lau
@ 2025-09-17 17:23 ` Amery Hung
0 siblings, 0 replies; 15+ messages in thread
From: Amery Hung @ 2025-09-17 17:23 UTC (permalink / raw)
To: Martin KaFai Lau
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, paul.chaignon,
kuba, stfomichev, martin.lau, mohsin.bashr, noren, dtatulea,
saeedm, tariqt, mbloch, maciej.fijalkowski, kernel-team
On Tue, Sep 16, 2025 at 3:59 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 9/15/25 3:47 PM, Amery Hung wrote:
> > To test bpf_xdp_pull_data(), an xdp packet containing fragments as well
> > as free linear data area after xdp->data_end needs to be created.
> > However, bpf_prog_test_run_xdp() always fills the linear area with
> > data_in before creating fragments, leaving no space to pull data. This
> > patch will allow users to specify the linear data size through
> > ctx->data_end.
> >
> > Currently, ctx_in->data_end must match data_size_in and will not be the
> > final ctx->data_end seen by xdp programs. This is because ctx->data_end
> > is populated according to the xdp_buff passed to test_run. The linear
> > data area available in an xdp_buff, max_data_sz, is always filled up
> > before copying data_in into fragments.
> >
> > This patch will allow users to specify the size of data that goes into
> > the linear area. When ctx_in->data_end is different from data_size_in,
> > only ctx_in->data_end bytes of data will be put into the linear area when
> > creating the xdp_buff.
> >
> > While ctx_in->data_end will be allowed to be different from data_size_in,
> > it cannot be larger than data_size_in as there will be no data to
> > copy from user space. If it is larger than the maximum linear data area
> > size, the layout suggested by the user will not be honored. Data beyond
> > max_data_sz bytes will still be copied into fragments.
> >
> > Finally, since it is possible for a NIC to produce an xdp_buff with empty
> > linear data area, allow it when calling bpf_test_init() from
> > bpf_prog_test_run_xdp() so that we can test XDP kfuncs with such
> > xdp_buff.
> >
> > Signed-off-by: Amery Hung <ameryhung@gmail.com>
> > ---
> > net/bpf/test_run.c | 26 ++++++++++++-------
> > .../bpf/prog_tests/xdp_context_test_run.c | 4 +--
> > 2 files changed, 17 insertions(+), 13 deletions(-)
> >
> > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > index 4a862d605386..558126bbd180 100644
> > --- a/net/bpf/test_run.c
> > +++ b/net/bpf/test_run.c
> > @@ -660,12 +660,15 @@ BTF_ID_FLAGS(func, bpf_kfunc_call_memb_release, KF_RELEASE)
> > BTF_KFUNCS_END(test_sk_check_kfunc_ids)
> >
> > static void *bpf_test_init(const union bpf_attr *kattr, u32 user_size,
> > - u32 size, u32 headroom, u32 tailroom)
> > + u32 size, u32 headroom, u32 tailroom, bool is_xdp)
>
> Understood that the patch inherited this function. I found it hard to read
> when it is called for xdp, but this could be just me. For example, the "size"
> passed from bpf_prog_test_run_xdp() ends up being "PAGE_SIZE - headroom -
> tailroom". I am not sure how to fix it, e.g. can we always allocate a
> PAGE_SIZE for non-xdp callers also, or maybe xdp should not reuse this
> function. This is probably food for thought for later. Not asking to consider
> it in this set.
>
> I think at least the first step is to avoid adding "is_xdp" specific logic here.
>
> > {
> > void __user *data_in = u64_to_user_ptr(kattr->test.data_in);
> > void *data;
> >
> > - if (user_size < ETH_HLEN || user_size > PAGE_SIZE - headroom - tailroom)
> > + if (!is_xdp && user_size < ETH_HLEN)
>
> Move the lower bound check to its caller. test_run_xdp() does not need this
> check. test_run_flow_dissector() and test_run_nf() already have their own
> checks; test_run_nf() actually uses a different bound. test_run_skb() is the
> only one that needs this check, so it can be done explicitly there like the
> other callers.
>
Yeah, is_xdp is bad. I will move the lower bound checks to the callers.
Thanks for pointing this out.
> > + return ERR_PTR(-EINVAL);
> > +
> > + if (user_size > PAGE_SIZE - headroom - tailroom)
> > return ERR_PTR(-EINVAL);
> >
> > size = SKB_DATA_ALIGN(size);
> > @@ -1003,7 +1006,8 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> >
> > data = bpf_test_init(kattr, kattr->test.data_size_in,
> > size, NET_SKB_PAD + NET_IP_ALIGN,
> > - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
> > + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
> > + false);
> > if (IS_ERR(data))
> > return PTR_ERR(data);
> >
> > @@ -1207,8 +1211,8 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> > {
> > bool do_live = (kattr->test.flags & BPF_F_TEST_XDP_LIVE_FRAMES);
> > u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > + u32 retval = 0, duration, max_data_sz, data_sz;
> > u32 batch_size = kattr->test.batch_size;
> > - u32 retval = 0, duration, max_data_sz;
> > u32 size = kattr->test.data_size_in;
> > u32 headroom = XDP_PACKET_HEADROOM;
> > u32 repeat = kattr->test.repeat;
> > @@ -1246,7 +1250,7 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> >
> > if (ctx) {
> > /* There can't be user provided data before the meta data */
> > - if (ctx->data_meta || ctx->data_end != size ||
> > + if (ctx->data_meta || ctx->data_end > size ||
> > ctx->data > ctx->data_end ||
> > unlikely(xdp_metalen_invalid(ctx->data)) ||
> > (do_live && (kattr->test.data_out || kattr->test.ctx_out)))
> > @@ -1256,14 +1260,15 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> > }
> >
> > max_data_sz = PAGE_SIZE - headroom - tailroom;
> > - if (size > max_data_sz) {
> > + data_sz = (ctx && ctx->data_end < max_data_sz) ? ctx->data_end : max_data_sz;
>
> hmm... can the "size" (not data_sz) be directly updated to ctx->data_end in the
> above "if (ctx)".
>
That simplifies things a lot. Will change in the next version.
> > + if (size > data_sz) {
> > /* disallow live data mode for jumbo frames */
> > if (do_live)
> > goto free_ctx;
> > - size = max_data_sz;
> > + size = data_sz;
> > }
> >
> > - data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom);
> > + data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom, true);
> > if (IS_ERR(data)) {
> > ret = PTR_ERR(data);
> > goto free_ctx;
> > @@ -1386,7 +1391,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> > if (size < ETH_HLEN)
> > return -EINVAL;
> >
> > - data = bpf_test_init(kattr, kattr->test.data_size_in, size, 0, 0);
> > + data = bpf_test_init(kattr, kattr->test.data_size_in, size, 0, 0, false);
> > if (IS_ERR(data))
> > return PTR_ERR(data);
> >
> > @@ -1659,7 +1664,8 @@ int bpf_prog_test_run_nf(struct bpf_prog *prog,
> >
> > data = bpf_test_init(kattr, kattr->test.data_size_in, size,
> > NET_SKB_PAD + NET_IP_ALIGN,
> > - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
> > + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
> > + false);
> > if (IS_ERR(data))
> > return PTR_ERR(data);
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> > index 46e0730174ed..178292d1251a 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
> > @@ -97,9 +97,7 @@ void test_xdp_context_test_run(void)
> > /* Meta data must be 255 bytes or smaller */
> > test_xdp_context_error(prog_fd, opts, 0, 256, sizeof(data), 0, 0, 0);
> >
> > - /* Total size of data must match data_end - data_meta */
> > - test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32),
> > - sizeof(data) - 1, 0, 0, 0);
> > + /* Total size of data must be data_end - data_meta or larger */
> > test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32),
> > sizeof(data) + 1, 0, 0, 0);
> >
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH bpf-next v3 5/6] selftests/bpf: Test bpf_xdp_pull_data
2025-09-15 22:47 [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Amery Hung
` (3 preceding siblings ...)
2025-09-15 22:47 ` [PATCH bpf-next v3 4/6] bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN Amery Hung
@ 2025-09-15 22:48 ` Amery Hung
2025-09-17 17:54 ` Martin KaFai Lau
2025-09-15 22:48 ` [PATCH bpf-next v3 6/6] selftests: drv-net: Pull data before parsing headers Amery Hung
2025-09-17 18:50 ` [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Martin KaFai Lau
6 siblings, 1 reply; 15+ messages in thread
From: Amery Hung @ 2025-09-15 22:48 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, paul.chaignon, kuba,
stfomichev, martin.lau, mohsin.bashr, noren, dtatulea, saeedm,
tariqt, mbloch, maciej.fijalkowski, kernel-team
Test bpf_xdp_pull_data() with xdp packets with different layouts. The
xdp bpf program first checks if the layout is as expected. Then, it
calls bpf_xdp_pull_data(). Finally, it checks the 0xbb marker at offset
1024 using direct packet access.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../selftests/bpf/prog_tests/xdp_pull_data.c | 174 ++++++++++++++++++
.../selftests/bpf/progs/test_xdp_pull_data.c | 48 +++++
2 files changed, 222 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_pull_data.c
create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_pull_data.c
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_pull_data.c b/tools/testing/selftests/bpf/prog_tests/xdp_pull_data.c
new file mode 100644
index 000000000000..932b33a71b17
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_pull_data.c
@@ -0,0 +1,174 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+#include <network_helpers.h>
+#include "test_xdp_pull_data.skel.h"
+
+#define PULL_MAX (1 << 31)
+#define PULL_PLUS_ONE (1 << 30)
+
+#define XDP_PACKET_HEADROOM 256
+
+/* Find sizes of struct skb_shared_info and struct xdp_frame so that
+ * we can calculate the maximum pull lengths for test cases
+ */
+int find_xdp_sizes(struct test_xdp_pull_data *skel, int frame_sz)
+{
+ LIBBPF_OPTS(bpf_test_run_opts, topts);
+ struct xdp_md ctx = {};
+ int prog_fd, err;
+ __u8 *buf;
+
+ buf = calloc(frame_sz, sizeof(__u8));
+ if (!ASSERT_OK_PTR(buf, "calloc buf"))
+ return -ENOMEM;
+
+ topts.data_in = buf;
+ topts.data_out = buf;
+ topts.data_size_in = frame_sz;
+ topts.data_size_out = frame_sz;
+ /* Pass a data_end larger than the linear space available to make sure
+ * bpf_prog_test_run_xdp() will fill the linear data area so that
+ * xdp_find_data_hard_end can infer the size of struct skb_shared_info
+ */
+ ctx.data_end = frame_sz;
+ topts.ctx_in = &ctx;
+ topts.ctx_out = &ctx;
+ topts.ctx_size_in = sizeof(ctx);
+ topts.ctx_size_out = sizeof(ctx);
+
+ prog_fd = bpf_program__fd(skel->progs.xdp_find_sizes);
+ err = bpf_prog_test_run_opts(prog_fd, &topts);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
+
+ return err;
+}
+
+/* xdp_pull_data_prog will directly read a marker 0xbb stored at buf[1024]
+ * so caller expecting XDP_PASS should always pass pull_len no less than 1024
+ */
+void run_test(struct test_xdp_pull_data *skel, int retval,
+ int frame_sz, int buff_len, int meta_len, int data_len,
+ int pull_len)
+{
+ LIBBPF_OPTS(bpf_test_run_opts, topts);
+ struct xdp_md ctx = {};
+ int prog_fd, err;
+ __u8 *buf;
+
+ buf = calloc(buff_len, sizeof(__u8));
+ if (!ASSERT_OK_PTR(buf, "calloc buf"))
+ return;
+
+ buf[meta_len + 1023] = 0xaa;
+ buf[meta_len + 1024] = 0xbb;
+ buf[meta_len + 1025] = 0xcc;
+
+ topts.data_in = buf;
+ topts.data_out = buf;
+ topts.data_size_in = buff_len;
+ topts.data_size_out = buff_len;
+ ctx.data = meta_len;
+ ctx.data_end = meta_len + data_len;
+ topts.ctx_in = &ctx;
+ topts.ctx_out = &ctx;
+ topts.ctx_size_in = sizeof(ctx);
+ topts.ctx_size_out = sizeof(ctx);
+
+ skel->bss->data_len = data_len;
+ if (pull_len & PULL_MAX) {
+ int headroom = XDP_PACKET_HEADROOM - meta_len - skel->bss->xdpf_sz;
+ int tailroom = frame_sz - XDP_PACKET_HEADROOM -
+ data_len - skel->bss->sinfo_sz;
+
+ pull_len = pull_len & PULL_PLUS_ONE ? 1 : 0;
+ pull_len += headroom + tailroom + data_len;
+ }
+ skel->bss->pull_len = pull_len;
+
+ prog_fd = bpf_program__fd(skel->progs.xdp_pull_data_prog);
+ err = bpf_prog_test_run_opts(prog_fd, &topts);
+ ASSERT_OK(err, "bpf_prog_test_run_opts");
+ ASSERT_EQ(topts.retval, retval, "xdp_pull_data_prog retval");
+
+ if (retval == XDP_DROP)
+ goto out;
+
+ ASSERT_EQ(ctx.data_end, meta_len + pull_len, "linear data size");
+ ASSERT_EQ(topts.data_size_out, buff_len, "linear + non-linear data size");
+ /* Make sure data around xdp->data_end was not messed up by
+ * bpf_xdp_pull_data()
+ */
+ ASSERT_EQ(buf[meta_len + 1023], 0xaa, "data[1023]");
+ ASSERT_EQ(buf[meta_len + 1024], 0xbb, "data[1024]");
+ ASSERT_EQ(buf[meta_len + 1025], 0xcc, "data[1025]");
+out:
+ free(buf);
+}
+
+static void test_xdp_pull_data_basic(void)
+{
+ u32 pg_sz, max_meta_len, max_data_len;
+ struct test_xdp_pull_data *skel;
+
+ skel = test_xdp_pull_data__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "test_xdp_pull_data__open_and_load"))
+ return;
+
+ pg_sz = sysconf(_SC_PAGE_SIZE);
+
+ if (find_xdp_sizes(skel, pg_sz))
+ goto out;
+
+ max_meta_len = XDP_PACKET_HEADROOM - skel->bss->xdpf_sz;
+ max_data_len = pg_sz - XDP_PACKET_HEADROOM - skel->bss->sinfo_sz;
+
+ /* linear xdp pkt, pull 0 byte */
+ run_test(skel, XDP_PASS, pg_sz, 2048, 0, 2048, 2048);
+
+ /* multi-buf pkt, pull results in linear xdp pkt */
+ run_test(skel, XDP_PASS, pg_sz, 2048, 0, 1024, 2048);
+
+ /* multi-buf pkt, pull 1 byte to linear data area */
+ run_test(skel, XDP_PASS, pg_sz, 9000, 0, 1024, 1025);
+
+ /* multi-buf pkt, pull 0 byte to linear data area */
+ run_test(skel, XDP_PASS, pg_sz, 9000, 0, 1025, 1025);
+
+ /* multi-buf pkt, empty linear data area, pull requires memmove */
+ run_test(skel, XDP_PASS, pg_sz, 9000, 0, 0, PULL_MAX);
+
+ /* multi-buf pkt, no headroom */
+ run_test(skel, XDP_PASS, pg_sz, 9000, max_meta_len, 1024, PULL_MAX);
+
+ /* multi-buf pkt, no tailroom, pull requires memmove */
+ run_test(skel, XDP_PASS, pg_sz, 9000, 0, max_data_len, PULL_MAX);
+
+
+ /* linear xdp pkt, pull more than total data len */
+ run_test(skel, XDP_DROP, pg_sz, 2048, 0, 2048, 2049);
+
+ /* multi-buf pkt with no space left in linear data area */
+ run_test(skel, XDP_DROP, pg_sz, 9000, max_meta_len, max_data_len,
+ PULL_MAX | PULL_PLUS_ONE);
+
+ /* multi-buf pkt, empty linear data area */
+ run_test(skel, XDP_DROP, pg_sz, 9000, 0, 0, PULL_MAX | PULL_PLUS_ONE);
+
+ /* multi-buf pkt, no headroom */
+ run_test(skel, XDP_DROP, pg_sz, 9000, max_meta_len, 1024,
+ PULL_MAX | PULL_PLUS_ONE);
+
+ /* multi-buf pkt, no tailroom */
+ run_test(skel, XDP_DROP, pg_sz, 9000, 0, max_data_len,
+ PULL_MAX | PULL_PLUS_ONE);
+
+out:
+ test_xdp_pull_data__destroy(skel);
+}
+
+void test_xdp_pull_data(void)
+{
+ if (test__start_subtest("xdp_pull_data"))
+ test_xdp_pull_data_basic();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_pull_data.c b/tools/testing/selftests/bpf/progs/test_xdp_pull_data.c
new file mode 100644
index 000000000000..dd901bb109b6
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_pull_data.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+
+int xdpf_sz;
+int sinfo_sz;
+int data_len;
+int pull_len;
+
+#define XDP_PACKET_HEADROOM 256
+
+SEC("xdp.frags")
+int xdp_find_sizes(struct xdp_md *ctx)
+{
+ xdpf_sz = sizeof(struct xdp_frame);
+ sinfo_sz = __PAGE_SIZE - XDP_PACKET_HEADROOM -
+ (ctx->data_end - ctx->data);
+
+ return XDP_PASS;
+}
+
+SEC("xdp.frags")
+int xdp_pull_data_prog(struct xdp_md *ctx)
+{
+ __u8 *data_end = (void *)(long)ctx->data_end;
+ __u8 *data = (void *)(long)ctx->data;
+ __u8 *val_p;
+ int err;
+
+ if (data_len != data_end - data)
+ return XDP_DROP;
+
+ err = bpf_xdp_pull_data(ctx, pull_len);
+ if (err)
+ return XDP_DROP;
+
+ val_p = (void *)(long)ctx->data + 1024;
+ if (val_p + 1 > (void *)(long)ctx->data_end)
+ return XDP_DROP;
+
+ if (*val_p != 0xbb)
+ return XDP_DROP;
+
+ return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v3 5/6] selftests/bpf: Test bpf_xdp_pull_data
2025-09-15 22:48 ` [PATCH bpf-next v3 5/6] selftests/bpf: Test bpf_xdp_pull_data Amery Hung
@ 2025-09-17 17:54 ` Martin KaFai Lau
0 siblings, 0 replies; 15+ messages in thread
From: Martin KaFai Lau @ 2025-09-17 17:54 UTC (permalink / raw)
To: Amery Hung
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, paul.chaignon,
kuba, stfomichev, martin.lau, mohsin.bashr, noren, dtatulea,
saeedm, tariqt, mbloch, maciej.fijalkowski, kernel-team
On 9/15/25 3:48 PM, Amery Hung wrote:
> +/* Find sizes of struct skb_shared_info and struct xdp_frame so that
> + * we can calculate the maximum pull lengths for test cases
> + */
> +int find_xdp_sizes(struct test_xdp_pull_data *skel, int frame_sz)
static
> +{
> + LIBBPF_OPTS(bpf_test_run_opts, topts);
> + struct xdp_md ctx = {};
> + int prog_fd, err;
> + __u8 *buf;
> +
> + buf = calloc(frame_sz, sizeof(__u8));
buf is leaked.
> + if (!ASSERT_OK_PTR(buf, "calloc buf"))
> + return -ENOMEM;
> +
> + topts.data_in = buf;
> + topts.data_out = buf;
> + topts.data_size_in = frame_sz;
> + topts.data_size_out = frame_sz;
> + /* Pass a data_end larger than the linear space available to make sure
> + * bpf_prog_test_run_xdp() will fill the linear data area so that
> + * xdp_find_data_hard_end can infer the size of struct skb_shared_info
> + */
> + ctx.data_end = frame_sz;
> + topts.ctx_in = &ctx;
> + topts.ctx_out = &ctx;
> + topts.ctx_size_in = sizeof(ctx);
> + topts.ctx_size_out = sizeof(ctx);
> +
> + prog_fd = bpf_program__fd(skel->progs.xdp_find_sizes);
> + err = bpf_prog_test_run_opts(prog_fd, &topts);
> + ASSERT_OK(err, "bpf_prog_test_run_opts");
> +
> + return err;
> +}
> +
> +/* xdp_pull_data_prog will directly read a marker 0xbb stored at buf[1024]
> + * so caller expecting XDP_PASS should always pass pull_len no less than 1024
> + */
> +void run_test(struct test_xdp_pull_data *skel, int retval,
static
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH bpf-next v3 6/6] selftests: drv-net: Pull data before parsing headers
2025-09-15 22:47 [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Amery Hung
` (4 preceding siblings ...)
2025-09-15 22:48 ` [PATCH bpf-next v3 5/6] selftests/bpf: Test bpf_xdp_pull_data Amery Hung
@ 2025-09-15 22:48 ` Amery Hung
2025-09-17 18:50 ` [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Martin KaFai Lau
6 siblings, 0 replies; 15+ messages in thread
From: Amery Hung @ 2025-09-15 22:48 UTC (permalink / raw)
To: bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, paul.chaignon, kuba,
stfomichev, martin.lau, mohsin.bashr, noren, dtatulea, saeedm,
tariqt, mbloch, maciej.fijalkowski, kernel-team
It is possible for drivers to generate xdp packets with data residing
entirely in fragments. To keep parsing headers using direct packet
access, call bpf_xdp_pull_data() to pull headers into the linear data
area.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
---
.../selftests/net/lib/xdp_native.bpf.c | 89 +++++++++++++++----
1 file changed, 74 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/net/lib/xdp_native.bpf.c b/tools/testing/selftests/net/lib/xdp_native.bpf.c
index 521ba38f2ddd..df4eea5c192b 100644
--- a/tools/testing/selftests/net/lib/xdp_native.bpf.c
+++ b/tools/testing/selftests/net/lib/xdp_native.bpf.c
@@ -14,6 +14,8 @@
#define MAX_PAYLOAD_LEN 5000
#define MAX_HDR_LEN 64
+extern int bpf_xdp_pull_data(struct xdp_md *xdp, __u32 len) __ksym __weak;
+
enum {
XDP_MODE = 0,
XDP_PORT = 1,
@@ -68,30 +70,57 @@ static void record_stats(struct xdp_md *ctx, __u32 stat_type)
static struct udphdr *filter_udphdr(struct xdp_md *ctx, __u16 port)
{
- void *data_end = (void *)(long)ctx->data_end;
- void *data = (void *)(long)ctx->data;
struct udphdr *udph = NULL;
- struct ethhdr *eth = data;
+ void *data, *data_end;
+ struct ethhdr *eth;
+ int err;
+
+ err = bpf_xdp_pull_data(ctx, sizeof(*eth));
+ if (err)
+ return NULL;
+
+ data_end = (void *)(long)ctx->data_end;
+ data = eth = (void *)(long)ctx->data;
if (data + sizeof(*eth) > data_end)
return NULL;
if (eth->h_proto == bpf_htons(ETH_P_IP)) {
- struct iphdr *iph = data + sizeof(*eth);
+ struct iphdr *iph;
+
+ err = bpf_xdp_pull_data(ctx, sizeof(*eth) + sizeof(*iph) +
+ sizeof(*udph));
+ if (err)
+ return NULL;
+
+ data_end = (void *)(long)ctx->data_end;
+ data = (void *)(long)ctx->data;
+
+ iph = data + sizeof(*eth);
if (iph + 1 > (struct iphdr *)data_end ||
iph->protocol != IPPROTO_UDP)
return NULL;
- udph = (void *)eth + sizeof(*iph) + sizeof(*eth);
- } else if (eth->h_proto == bpf_htons(ETH_P_IPV6)) {
- struct ipv6hdr *ipv6h = data + sizeof(*eth);
+ udph = data + sizeof(*iph) + sizeof(*eth);
+ } else if (eth->h_proto == bpf_htons(ETH_P_IPV6)) {
+ struct ipv6hdr *ipv6h;
+
+ err = bpf_xdp_pull_data(ctx, sizeof(*eth) + sizeof(*ipv6h) +
+ sizeof(*udph));
+ if (err)
+ return NULL;
+
+ data_end = (void *)(long)ctx->data_end;
+ data = (void *)(long)ctx->data;
+
+ ipv6h = data + sizeof(*eth);
if (ipv6h + 1 > (struct ipv6hdr *)data_end ||
ipv6h->nexthdr != IPPROTO_UDP)
return NULL;
- udph = (void *)eth + sizeof(*ipv6h) + sizeof(*eth);
+ udph = data + sizeof(*ipv6h) + sizeof(*eth);
} else {
return NULL;
}
@@ -145,17 +174,34 @@ static void swap_machdr(void *data)
static int xdp_mode_tx_handler(struct xdp_md *ctx, __u16 port)
{
- void *data_end = (void *)(long)ctx->data_end;
- void *data = (void *)(long)ctx->data;
struct udphdr *udph = NULL;
- struct ethhdr *eth = data;
+ void *data, *data_end;
+ struct ethhdr *eth;
+ int err;
+
+ err = bpf_xdp_pull_data(ctx, sizeof(*eth));
+ if (err)
+ return XDP_PASS;
+
+ data_end = (void *)(long)ctx->data_end;
+ data = eth = (void *)(long)ctx->data;
if (data + sizeof(*eth) > data_end)
return XDP_PASS;
if (eth->h_proto == bpf_htons(ETH_P_IP)) {
- struct iphdr *iph = data + sizeof(*eth);
- __be32 tmp_ip = iph->saddr;
+ struct iphdr *iph;
+ __be32 tmp_ip;
+
+ err = bpf_xdp_pull_data(ctx, sizeof(*eth) + sizeof(*iph) +
+ sizeof(*udph));
+ if (err)
+ return XDP_PASS;
+
+ data_end = (void *)(long)ctx->data_end;
+ data = (void *)(long)ctx->data;
+
+ iph = data + sizeof(*eth);
if (iph + 1 > (struct iphdr *)data_end ||
iph->protocol != IPPROTO_UDP)
@@ -169,8 +215,10 @@ static int xdp_mode_tx_handler(struct xdp_md *ctx, __u16 port)
return XDP_PASS;
record_stats(ctx, STATS_RX);
+ eth = data;
swap_machdr((void *)eth);
+ tmp_ip = iph->saddr;
iph->saddr = iph->daddr;
iph->daddr = tmp_ip;
@@ -178,9 +226,19 @@ static int xdp_mode_tx_handler(struct xdp_md *ctx, __u16 port)
return XDP_TX;
- } else if (eth->h_proto == bpf_htons(ETH_P_IPV6)) {
- struct ipv6hdr *ipv6h = data + sizeof(*eth);
+ } else if (eth->h_proto == bpf_htons(ETH_P_IPV6)) {
struct in6_addr tmp_ipv6;
+ struct ipv6hdr *ipv6h;
+
+ err = bpf_xdp_pull_data(ctx, sizeof(*eth) + sizeof(*ipv6h) +
+ sizeof(*udph));
+ if (err)
+ return XDP_PASS;
+
+ data_end = (void *)(long)ctx->data_end;
+ data = (void *)(long)ctx->data;
+
+ ipv6h = data + sizeof(*eth);
if (ipv6h + 1 > (struct ipv6hdr *)data_end ||
ipv6h->nexthdr != IPPROTO_UDP)
@@ -194,6 +252,7 @@ static int xdp_mode_tx_handler(struct xdp_md *ctx, __u16 port)
return XDP_PASS;
record_stats(ctx, STATS_RX);
+ eth = data;
swap_machdr((void *)eth);
__builtin_memcpy(&tmp_ipv6, &ipv6h->saddr, sizeof(tmp_ipv6));
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data
2025-09-15 22:47 [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Amery Hung
` (5 preceding siblings ...)
2025-09-15 22:48 ` [PATCH bpf-next v3 6/6] selftests: drv-net: Pull data before parsing headers Amery Hung
@ 2025-09-17 18:50 ` Martin KaFai Lau
2025-09-17 21:22 ` Jakub Kicinski
6 siblings, 1 reply; 15+ messages in thread
From: Martin KaFai Lau @ 2025-09-17 18:50 UTC (permalink / raw)
To: Amery Hung, Jakub Kicinski
Cc: bpf, netdev, alexei.starovoitov, andrii, daniel, paul.chaignon,
kuba, stfomichev, martin.lau, mohsin.bashr, noren, dtatulea,
saeedm, tariqt, mbloch, maciej.fijalkowski, kernel-team
On 9/15/25 3:47 PM, Amery Hung wrote:
> include/net/xdp_sock_drv.h | 21 ++-
> kernel/bpf/verifier.c | 13 ++
> net/bpf/test_run.c | 26 ++-
> net/core/filter.c | 123 +++++++++++--
> .../bpf/prog_tests/xdp_context_test_run.c | 4 +-
> .../selftests/bpf/prog_tests/xdp_pull_data.c | 174 ++++++++++++++++++
> .../selftests/bpf/progs/test_xdp_pull_data.c | 48 +++++
> .../selftests/net/lib/xdp_native.bpf.c | 89 +++++++--
I think the next re-spin should be ready. Jakub, can this land in
bpf-next/master alone and become available in net-next after the upcoming
merge window, considering it is almost rc7?
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data
2025-09-17 18:50 ` [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data Martin KaFai Lau
@ 2025-09-17 21:22 ` Jakub Kicinski
2025-09-18 6:43 ` Nimrod Oren
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2025-09-17 21:22 UTC (permalink / raw)
To: Martin KaFai Lau, noren
Cc: Amery Hung, bpf, netdev, alexei.starovoitov, andrii, daniel,
paul.chaignon, stfomichev, martin.lau, mohsin.bashr, dtatulea,
saeedm, tariqt, mbloch, maciej.fijalkowski, kernel-team
On Wed, 17 Sep 2025 11:50:01 -0700 Martin KaFai Lau wrote:
> On 9/15/25 3:47 PM, Amery Hung wrote:
> > include/net/xdp_sock_drv.h | 21 ++-
> > kernel/bpf/verifier.c | 13 ++
> > net/bpf/test_run.c | 26 ++-
> > net/core/filter.c | 123 +++++++++++--
> > .../bpf/prog_tests/xdp_context_test_run.c | 4 +-
> > .../selftests/bpf/prog_tests/xdp_pull_data.c | 174 ++++++++++++++++++
> > .../selftests/bpf/progs/test_xdp_pull_data.c | 48 +++++
> > .../selftests/net/lib/xdp_native.bpf.c | 89 +++++++--
>
> I think the next re-spin should be ready. Jakub, can this be landed to
> bpf-next/master alone and will be available in net-next after the upcoming merge
> window, considering it is almost rc7?
Nimrod, are you waiting for these before you send your dyn_ptr flavor of
XDP tests? Other than Nimrod's work I don't see a problem.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next v3 0/6] Add kfunc bpf_xdp_pull_data
2025-09-17 21:22 ` Jakub Kicinski
@ 2025-09-18 6:43 ` Nimrod Oren
0 siblings, 0 replies; 15+ messages in thread
From: Nimrod Oren @ 2025-09-18 6:43 UTC (permalink / raw)
To: Jakub Kicinski, Martin KaFai Lau
Cc: Amery Hung, bpf, netdev, alexei.starovoitov, andrii, daniel,
paul.chaignon, stfomichev, martin.lau, mohsin.bashr, dtatulea,
saeedm, tariqt, mbloch, maciej.fijalkowski, kernel-team
On 18/09/2025 0:22, Jakub Kicinski wrote:
> On Wed, 17 Sep 2025 11:50:01 -0700 Martin KaFai Lau wrote:
>> On 9/15/25 3:47 PM, Amery Hung wrote:
>>> include/net/xdp_sock_drv.h | 21 ++-
>>> kernel/bpf/verifier.c | 13 ++
>>> net/bpf/test_run.c | 26 ++-
>>> net/core/filter.c | 123 +++++++++++--
>>> .../bpf/prog_tests/xdp_context_test_run.c | 4 +-
>>> .../selftests/bpf/prog_tests/xdp_pull_data.c | 174 ++++++++++++++++++
>>> .../selftests/bpf/progs/test_xdp_pull_data.c | 48 +++++
>>> .../selftests/net/lib/xdp_native.bpf.c | 89 +++++++--
>>
>> I think the next re-spin should be ready. Jakub, can this be landed to
>> bpf-next/master alone and will be available in net-next after the upcoming merge
>> window, considering it is almost rc7?
>
> Nimrod, are you waiting for these before you send your dyn_ptr flavor of
> XDP tests? Other than Nimrod's work I don't see a problem.
Yes, I am waiting for these to be merged so I can rebase my work on top.
Thanks
^ permalink raw reply [flat|nested] 15+ messages in thread