* [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation
@ 2026-04-22 3:36 Jason Xing
2026-04-22 3:36 ` [PATCH net v3 1/8] xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices Jason Xing
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Jason Xing @ 2026-04-22 3:36 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
There are rare issues around xsk_build_skb(). Some of them
were founded by Sashiko[1][2].
[1]: https://lore.kernel.org/all/20260415082654.21026-1-kerneljasonxing@gmail.com/
[2]: https://lore.kernel.org/all/20260418045644.28612-1-kerneljasonxing@gmail.com/
---
v3
Link: https://lore.kernel.org/all/20260420082805.14844-1-kerneljasonxing@gmail.com/
1. use !xs->skb as the indicator of first frag in patch 3 (Stan)
2. move init skb after it's allocated successfully, so patch 4,5,6 have
been adjusted accordingly (Stan)
3. do not support 32-bit arch (Stan)
4. add acked-by tags (Stan)
v2
Link: https://lore.kernel.org/all/20260418045644.28612-1-kerneljasonxing@gmail.com/#t
1. add four patches spotted by sashiko to fix buggy pre-existing
behavior
2. adjust the order of 8 patches.
Jason Xing (8):
xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices
xsk: handle NULL dereference of the skb without frags issue
xsk: fix use-after-free of xs->skb in xsk_build_skb() free_err path
xsk: prevent CQ desync when freeing half-built skbs in xsk_build_skb()
xsk: avoid skb leak in XDP_TX_METADATA case
xsk: free the skb when hitting the upper bound MAX_SKB_FRAGS
xsk: fix xsk_addrs slab leak on multi-buffer error path
xsk: don't support AF_XDP on 32-bit architectures
net/xdp/Kconfig | 2 +-
net/xdp/xsk.c | 31 ++++++++++++++++++++-----------
net/xdp/xsk_buff_pool.c | 3 +++
3 files changed, 24 insertions(+), 12 deletions(-)
--
2.41.3
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH net v3 1/8] xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices
2026-04-22 3:36 [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
@ 2026-04-22 3:36 ` Jason Xing
2026-04-22 3:36 ` [PATCH net v3 2/8] xsk: handle NULL dereference of the skb without frags issue Jason Xing
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Jason Xing @ 2026-04-22 3:36 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
skb_checksum_help() is a common helper that writes the folded
16-bit checksum back via skb->data + csum_start + csum_offset,
i.e. it relies on the skb's linear head and fails (with WARN_ONCE
and -EINVAL) when skb_headlen() is 0.
AF_XDP generic xmit takes two very different paths depending on the
netdev. Drivers that advertise IFF_TX_SKB_NO_LINEAR (e.g. virtio_net)
skip the "copy payload into a linear head" step on purpose as a
performance optimisation: xsk_build_skb_zerocopy() only attaches UMEM
pages as frags and never calls skb_put(), so skb_headlen() stays 0
for the whole skb. For these skbs there is simply no linear area for
skb_checksum_help() to write the csum into - the sw-csum fallback is
structurally inapplicable.
The patch tries to catch this and reject the combination with error at
setup time. Rejecting at bind() converts this silent per-packet failure
into a synchronous, actionable -EOPNOTSUPP at setup time. HW csum and
launch_time metadata on IFF_TX_SKB_NO_LINEAR drivers are unaffected
because they do not call skb_checksum_help().
Without the patch, every descriptor carrying 'XDP_TX_METADATA |
XDP_TXMD_FLAGS_CHECKSUM' produces:
1) a WARN_ONCE "offset (N) >= skb_headlen() (0)" from skb_checksum_help(),
2) sendmsg() returning -EINVAL without consuming the descriptor
(invalid_descs is not incremented),
3) a wedged TX ring: __xsk_generic_xmit() does not advance the
consumer on non-EOVERFLOW errors, so the next sendmsg() re-reads
the same descriptor and re-hits the same WARN until the socket
is closed.
Closes: https://lore.kernel.org/all/20260419045822.843BFC2BCAF@smtp.kernel.org/#t
Fixes: 30c3055f9c0d ("xsk: wrap generic metadata handling onto separate function")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
net/xdp/xsk_buff_pool.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 37b7a68b89b3..c2521b6547e3 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -169,6 +169,9 @@ int xp_assign_dev(struct xsk_buff_pool *pool,
if (force_zc && force_copy)
return -EINVAL;
+ if (pool->tx_sw_csum && (netdev->priv_flags & IFF_TX_SKB_NO_LINEAR))
+ return -EOPNOTSUPP;
+
if (xsk_get_pool_from_qid(netdev, queue_id))
return -EBUSY;
--
2.41.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net v3 2/8] xsk: handle NULL dereference of the skb without frags issue
2026-04-22 3:36 [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
2026-04-22 3:36 ` [PATCH net v3 1/8] xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices Jason Xing
@ 2026-04-22 3:36 ` Jason Xing
2026-04-22 3:36 ` [PATCH net v3 3/8] xsk: fix use-after-free of xs->skb in xsk_build_skb() free_err path Jason Xing
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Jason Xing @ 2026-04-22 3:36 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
When a first descriptor (xs->skb == NULL) triggers -EOVERFLOW in
xsk_build_skb_zerocopy (e.g., MAX_SKB_FRAGS exceeded), the free_err
EOVERFLOW handler unconditionally dereferences xs->skb via
xsk_inc_num_desc(xs->skb) and xsk_drop_skb(xs->skb), causing a NULL
pointer dereference.
In this series, the skb is already freed by kfree_skb() inside
xsk_build_skb_zerocopy for the first-descriptor case, so we only need
to do the bookkeeping: cancel the one reserved CQ slot and account for
the single invalid descriptor.
Guard the existing xsk_inc_num_desc/xsk_drop_skb calls with an
xs->skb check (for the continuation case), and add an else branch
for the first-descriptor case that manually cancels the CQ slot and
increments invalid_descs by one.
Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
net/xdp/xsk.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 6149f6a79897..6521604f8d42 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -893,9 +893,14 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
kfree_skb(skb);
if (err == -EOVERFLOW) {
- /* Drop the packet */
- xsk_inc_num_desc(xs->skb);
- xsk_drop_skb(xs->skb);
+ if (xs->skb) {
+ /* Drop the packet */
+ xsk_inc_num_desc(xs->skb);
+ xsk_drop_skb(xs->skb);
+ } else {
+ xsk_cq_cancel_locked(xs->pool, 1);
+ xs->tx->invalid_descs++;
+ }
xskq_cons_release(xs->tx);
} else {
/* Let application retry */
--
2.41.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net v3 3/8] xsk: fix use-after-free of xs->skb in xsk_build_skb() free_err path
2026-04-22 3:36 [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
2026-04-22 3:36 ` [PATCH net v3 1/8] xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices Jason Xing
2026-04-22 3:36 ` [PATCH net v3 2/8] xsk: handle NULL dereference of the skb without frags issue Jason Xing
@ 2026-04-22 3:36 ` Jason Xing
2026-04-22 3:36 ` [PATCH net v3 4/8] xsk: prevent CQ desync when freeing half-built skbs in xsk_build_skb() Jason Xing
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Jason Xing @ 2026-04-22 3:36 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
When xsk_build_skb() processes multi-buffer packets in copy mode, the
first descriptor stores data into the skb linear area without adding
any frags, so nr_frags stays at 0. The caller then sets xs->skb = skb
to accumulate subsequent descriptors.
If a continuation descriptor fails (e.g. alloc_page returns NULL with
-EAGAIN), we jump to free_err where the condition:
if (skb && !skb_shinfo(skb)->nr_frags)
kfree_skb(skb);
evaluates to true because nr_frags is still 0 (the first descriptor
used the linear area, not frags). This frees the skb while xs->skb
still points to it, creating a dangling pointer. On the next transmit
attempt or socket close, xs->skb is dereferenced, causing a
use-after-free or double-free.
Fix by using a !xs->skb check to handle first frag situation, ensuring
we only free skbs that were freshly allocated in this call
(xs->skb is NULL) and never free an in-progress multi-buffer skb that
the caller still references.
Closes: https://lore.kernel.org/all/20260415082654.21026-4-kerneljasonxing@gmail.com/
Fixes: 6b9c129c2f93 ("xsk: remove @first_frag from xsk_build_skb()")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
net/xdp/xsk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 6521604f8d42..d23d1b14b8b4 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -889,7 +889,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
return skb;
free_err:
- if (skb && !skb_shinfo(skb)->nr_frags)
+ if (skb && !xs->skb)
kfree_skb(skb);
if (err == -EOVERFLOW) {
--
2.41.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net v3 4/8] xsk: prevent CQ desync when freeing half-built skbs in xsk_build_skb()
2026-04-22 3:36 [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
` (2 preceding siblings ...)
2026-04-22 3:36 ` [PATCH net v3 3/8] xsk: fix use-after-free of xs->skb in xsk_build_skb() free_err path Jason Xing
@ 2026-04-22 3:36 ` Jason Xing
2026-04-22 3:36 ` [PATCH net v3 5/8] xsk: avoid skb leak in XDP_TX_METADATA case Jason Xing
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Jason Xing @ 2026-04-22 3:36 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
Once xsk_skb_init_misc() has been called on an skb, its destructor is
set to xsk_destruct_skb(), which submits the descriptor address(es) to
the completion queue and advances the CQ producer. If such an skb is
subsequently freed via kfree_skb() along an error path - before the
skb has ever been handed to the driver - the destructor still runs and
submits a bogus, half-initialized address to the CQ.
Postpone the init phase when we believe the allocation of first frag is
successfully completed. Before this init, skb can be safely freed by
kfree_skb().
Closes: https://lore.kernel.org/all/20260419045822.843BFC2BCAF@smtp.kernel.org/
Fixes: c30d084960cf ("xsk: avoid overwriting skb fields for multi-buffer traffic")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
net/xdp/xsk.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index d23d1b14b8b4..88ec6d2cbbcf 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -739,8 +739,6 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
return ERR_PTR(err);
skb_reserve(skb, hr);
-
- xsk_skb_init_misc(skb, xs, desc->addr);
if (desc->options & XDP_TX_METADATA) {
err = xsk_skb_metadata(skb, buffer, desc, pool, hr);
if (unlikely(err))
@@ -834,7 +832,6 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
if (unlikely(err))
goto free_err;
- xsk_skb_init_misc(skb, xs, desc->addr);
if (desc->options & XDP_TX_METADATA) {
err = xsk_skb_metadata(skb, buffer, desc,
xs->pool, hr);
@@ -884,6 +881,8 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
}
}
+ if (!xs->skb)
+ xsk_skb_init_misc(skb, xs, desc->addr);
xsk_inc_num_desc(skb);
return skb;
--
2.41.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net v3 5/8] xsk: avoid skb leak in XDP_TX_METADATA case
2026-04-22 3:36 [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
` (3 preceding siblings ...)
2026-04-22 3:36 ` [PATCH net v3 4/8] xsk: prevent CQ desync when freeing half-built skbs in xsk_build_skb() Jason Xing
@ 2026-04-22 3:36 ` Jason Xing
2026-04-22 3:36 ` [PATCH net v3 6/8] xsk: free the skb when hitting the upper bound MAX_SKB_FRAGS Jason Xing
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Jason Xing @ 2026-04-22 3:36 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
Fix it by explicitly adding kfree_skb() before returning back to its
caller.
How to reproduce it in virtio_net:
1. the current skb is the first one (which means no frag and xs->skb is
NULL) and users enable metadata feature.
2. xsk_skb_metadata() returns a error code.
3. the caller xsk_build_skb() clears skb by using 'skb = NULL;'.
4. there is no chance to free this skb anymore.
Closes: https://lore.kernel.org/all/20260415085204.3F87AC19424@smtp.kernel.org/
Fixes: 30c3055f9c0d ("xsk: wrap generic metadata handling onto separate function")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
net/xdp/xsk.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 88ec6d2cbbcf..c49b58199d2f 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -741,8 +741,10 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
skb_reserve(skb, hr);
if (desc->options & XDP_TX_METADATA) {
err = xsk_skb_metadata(skb, buffer, desc, pool, hr);
- if (unlikely(err))
+ if (unlikely(err)) {
+ kfree_skb(skb);
return ERR_PTR(err);
+ }
}
} else {
struct xsk_addrs *xsk_addr;
--
2.41.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net v3 6/8] xsk: free the skb when hitting the upper bound MAX_SKB_FRAGS
2026-04-22 3:36 [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
` (4 preceding siblings ...)
2026-04-22 3:36 ` [PATCH net v3 5/8] xsk: avoid skb leak in XDP_TX_METADATA case Jason Xing
@ 2026-04-22 3:36 ` Jason Xing
2026-04-22 3:36 ` [PATCH net v3 7/8] xsk: fix xsk_addrs slab leak on multi-buffer error path Jason Xing
2026-04-22 3:36 ` [PATCH net v3 8/8] xsk: don't support AF_XDP on 32-bit architectures Jason Xing
7 siblings, 0 replies; 9+ messages in thread
From: Jason Xing @ 2026-04-22 3:36 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
Fix it by explicitly adding kfree_skb() before returning back to its
caller.
How to reproduce it in virtio_net:
1. the current skb is the first one (which means xs->skb is NULL) and
hit the limit MAX_SKB_FRAGS.
2. xsk_build_skb_zerocopy() returns -EOVERFLOW.
3. the caller xsk_build_skb() clears skb by using 'skb = NULL;'. This
is why bug can be triggered.
4. there is no chance to free this skb anymore.
Note that if in this case the xs->skb is not NULL, xsk_build_skb() will
call xsk_drop_skb(xs->skb) to do the right thing.
Fixes: cf24f5a5feea ("xsk: add support for AF_XDP multi-buffer on Tx path")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
net/xdp/xsk.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index c49b58199d2f..5e6326e076ab 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -776,8 +776,11 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
addr = buffer - pool->addrs;
for (copied = 0, i = skb_shinfo(skb)->nr_frags; copied < len; i++) {
- if (unlikely(i >= MAX_SKB_FRAGS))
+ if (unlikely(i >= MAX_SKB_FRAGS)) {
+ if (!xs->skb)
+ kfree_skb(skb);
return ERR_PTR(-EOVERFLOW);
+ }
page = pool->umem->pgs[addr >> PAGE_SHIFT];
get_page(page);
--
2.41.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net v3 7/8] xsk: fix xsk_addrs slab leak on multi-buffer error path
2026-04-22 3:36 [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
` (5 preceding siblings ...)
2026-04-22 3:36 ` [PATCH net v3 6/8] xsk: free the skb when hitting the upper bound MAX_SKB_FRAGS Jason Xing
@ 2026-04-22 3:36 ` Jason Xing
2026-04-22 3:36 ` [PATCH net v3 8/8] xsk: don't support AF_XDP on 32-bit architectures Jason Xing
7 siblings, 0 replies; 9+ messages in thread
From: Jason Xing @ 2026-04-22 3:36 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
When xsk_build_skb() / xsk_build_skb_zerocopy() sees the first
continuation descriptor, it promotes destructor_arg from an inlined
address to a freshly allocated xsk_addrs (num_descs = 1). The counter
is bumped to >= 2 only at the very end of a successful build (by calling
xsk_inc_num_desc()).
If the build fails in between (e.g. alloc_page() returns NULL with
-EAGAIN, or the MAX_SKB_FRAGS overflow hits), we jump to free_err, skip
calling xsk_inc_num_desc() to increment num_descs and leave the half-built
skb attached to xs->skb for the app to retry. The skb now has
1) destructor_arg = a real xsk_addrs pointer,
2) num_descs = 1
If the app never retries and just close()s the socket, xsk_release()
calls xsk_drop_skb() -> xsk_consume_skb(), which decides whether to
free xsk_addrs by testing num_descs > 1:
if (unlikely(num_descs > 1))
kmem_cache_free(xsk_tx_generic_cache, destructor_arg);
Because num_descs is exactly 1 the branch is skipped and the
xsk_addrs object is leaked to the xsk_tx_generic_cache slab.
Fix it by directly testing if destructor_arg is still addr. Or else it
is modified and used to store the newly allocated memory from
xsk_tx_generic_cache regardless of increment of num_desc, which we
need to handle.
Closes: https://lore.kernel.org/all/20260419045824.D9E5EC2BCAF@smtp.kernel.org/
Fixes: 0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
net/xdp/xsk.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 5e6326e076ab..ed96f6ec8ff2 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -605,7 +605,7 @@ static void xsk_cq_submit_addr_locked(struct xsk_buff_pool *pool,
spin_lock_irqsave(&pool->cq_prod_lock, flags);
idx = xskq_get_prod(pool->cq);
- if (unlikely(num_descs > 1)) {
+ if (unlikely(!xsk_skb_destructor_is_addr(skb))) {
xsk_addr = (struct xsk_addrs *)skb_shinfo(skb)->destructor_arg;
for (i = 0; i < num_descs; i++) {
@@ -660,7 +660,7 @@ static void xsk_consume_skb(struct sk_buff *skb)
u32 num_descs = xsk_get_num_desc(skb);
struct xsk_addrs *xsk_addr;
- if (unlikely(num_descs > 1)) {
+ if (unlikely(!xsk_skb_destructor_is_addr(skb))) {
xsk_addr = (struct xsk_addrs *)skb_shinfo(skb)->destructor_arg;
kmem_cache_free(xsk_tx_generic_cache, xsk_addr);
}
--
2.41.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net v3 8/8] xsk: don't support AF_XDP on 32-bit architectures
2026-04-22 3:36 [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
` (6 preceding siblings ...)
2026-04-22 3:36 ` [PATCH net v3 7/8] xsk: fix xsk_addrs slab leak on multi-buffer error path Jason Xing
@ 2026-04-22 3:36 ` Jason Xing
7 siblings, 0 replies; 9+ messages in thread
From: Jason Xing @ 2026-04-22 3:36 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, bjorn, magnus.karlsson,
maciej.fijalkowski, jonathan.lemon, sdf, ast, daniel, hawk,
john.fastabend
Cc: bpf, netdev, Jason Xing
From: Jason Xing <kernelxing@tencent.com>
In copy mode TX, xsk_skb_destructor_set_addr() stores the 64-bit
descriptor address into skb_shinfo(skb)->destructor_arg (void *) via a
uintptr_t cast:
skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t)addr | 0x1UL);
On 32-bit architectures uintptr_t is 32 bits, so the upper 32 bits of
the descriptor address are silently dropped. In XDP_ZEROCOPY unaligned
mode the chunk offset is encoded in bits 48-63 of the descriptor
address (XSK_UNALIGNED_BUF_OFFSET_SHIFT = 48), meaning the offset is
lost entirely. The completion queue then returns a truncated address to
userspace, making buffer recycling impossible.
Since we hear no one is using AF_XDP on 32-bit arch, we decided to
strictly stop supporting it at compile time.
Closes: https://lore.kernel.org/all/20260419045824.D9E5EC2BCAF@smtp.kernel.org/
Fixes: 0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number")
Suggested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
net/xdp/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/xdp/Kconfig b/net/xdp/Kconfig
index 71af2febe72a..819aa5795f50 100644
--- a/net/xdp/Kconfig
+++ b/net/xdp/Kconfig
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
config XDP_SOCKETS
bool "XDP sockets"
- depends on BPF_SYSCALL
+ depends on BPF_SYSCALL && 64BIT
default n
help
XDP sockets allows a channel between XDP programs and
--
2.41.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-04-22 3:37 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-22 3:36 [PATCH net v3 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
2026-04-22 3:36 ` [PATCH net v3 1/8] xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices Jason Xing
2026-04-22 3:36 ` [PATCH net v3 2/8] xsk: handle NULL dereference of the skb without frags issue Jason Xing
2026-04-22 3:36 ` [PATCH net v3 3/8] xsk: fix use-after-free of xs->skb in xsk_build_skb() free_err path Jason Xing
2026-04-22 3:36 ` [PATCH net v3 4/8] xsk: prevent CQ desync when freeing half-built skbs in xsk_build_skb() Jason Xing
2026-04-22 3:36 ` [PATCH net v3 5/8] xsk: avoid skb leak in XDP_TX_METADATA case Jason Xing
2026-04-22 3:36 ` [PATCH net v3 6/8] xsk: free the skb when hitting the upper bound MAX_SKB_FRAGS Jason Xing
2026-04-22 3:36 ` [PATCH net v3 7/8] xsk: fix xsk_addrs slab leak on multi-buffer error path Jason Xing
2026-04-22 3:36 ` [PATCH net v3 8/8] xsk: don't support AF_XDP on 32-bit architectures Jason Xing
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox