From: Jason Xing <kerneljasonxing@gmail.com>
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, bjorn@kernel.org, magnus.karlsson@intel.com,
	maciej.fijalkowski@intel.com, jonathan.lemon@gmail.com,
	sdf@fomichev.me, ast@kernel.org, daniel@iogearbox.net,
	hawk@kernel.org, john.fastabend@gmail.com,
	aleksander.lobakin@intel.com
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Jason Xing <kernelxing@tencent.com>
Subject: [PATCH net v4 8/8] xsk: fix u64 descriptor address truncation on 32-bit architectures
Date: Fri, 24 Apr 2026 13:38:16 +0800
Message-ID: <20260424053816.27965-9-kerneljasonxing@gmail.com>
In-Reply-To: <20260424053816.27965-1-kerneljasonxing@gmail.com>
From: Jason Xing <kernelxing@tencent.com>

In copy mode TX, xsk_skb_destructor_set_addr() stores the 64-bit
descriptor address into skb_shinfo(skb)->destructor_arg (void *) via a
uintptr_t cast:

    skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t)addr | 0x1UL);

On 32-bit architectures uintptr_t is 32 bits, so the upper 32 bits of
the descriptor address are silently dropped. In unaligned mode the chunk
offset is encoded in bits 48-63 of the descriptor address
(XSK_UNALIGNED_BUF_OFFSET_SHIFT = 48), meaning the offset is lost
entirely. The completion queue then returns a truncated address to
userspace, making buffer recycling impossible.
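
The truncation described above can be reproduced in userspace. The
following is a minimal sketch, not kernel code: the helper names
(pack_unaligned_addr, unaligned_offset, roundtrip_via_32bit_pointer)
are hypothetical, and uint32_t stands in for a 32-bit uintptr_t so the
effect is visible on any host. Only the shift value of 48 is taken
from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors XSK_UNALIGNED_BUF_OFFSET_SHIFT as quoted in the commit message. */
#define UNALIGNED_OFFSET_SHIFT 48
#define UNALIGNED_ADDR_MASK ((1ULL << UNALIGNED_OFFSET_SHIFT) - 1)

/* Hypothetical helper: pack a base address and a chunk offset into one u64. */
static uint64_t pack_unaligned_addr(uint64_t base, uint16_t offset)
{
	assert(base <= UNALIGNED_ADDR_MASK);
	return ((uint64_t)offset << UNALIGNED_OFFSET_SHIFT) | base;
}

/* Hypothetical helper: recover the chunk offset from bits 48-63. */
static uint16_t unaligned_offset(uint64_t addr)
{
	return (uint16_t)(addr >> UNALIGNED_OFFSET_SHIFT);
}

/*
 * Models storing the address in a 32-bit pointer with the 0x1 tag bit
 * and reading it back. uint32_t is an assumption standing in for
 * uintptr_t on a 32-bit build.
 */
static uint64_t roundtrip_via_32bit_pointer(uint64_t addr)
{
	uint32_t stored = (uint32_t)addr | 0x1U;	/* upper 32 bits dropped here */

	return (uint64_t)(stored & ~0x1U);
}
```

With a base of 0x1000 and an offset of 0x40, the packed descriptor
address is 0x0040000000001000; after the simulated round trip only
0x1000 survives and the offset reads back as 0.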

Fix this by handling the 32-bit case in the destructor_arg helpers:

- xsk_skb_destructor_set_addr(): on !CONFIG_64BIT, allocate an
  xsk_addrs struct via kmem_cache_zalloc() to store the full u64
  address. Leave num_descs as 0 (zalloc) so that the subsequent
  xsk_inc_num_desc() brings it to the correct count of 1.

- xsk_skb_destructor_is_addr(): on !CONFIG_64BIT, return true only
  when destructor_arg is NULL (not yet set), false when it points to
  an xsk_addrs struct.

- xsk_skb_init_misc(): call xsk_skb_destructor_set_addr() first
  before touching any other skb fields; on failure return early so
  the skb destructor is never changed from sock_wfree.

The existing xsk_consume_skb() needs no change on 32-bit:
xsk_skb_destructor_is_addr() returns false for any allocated
xsk_addrs, so the kmem_cache_free() path is always taken.
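
The 32-bit behaviour of the helpers can be modelled in userspace as
follows. This is a sketch under stated assumptions: struct model_addrs
and struct model_skb are simplified stand-ins (the real struct
xsk_addrs layout is assumed, not quoted), calloc()/free() stand in for
kmem_cache_zalloc()/kmem_cache_free(), and all model_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's struct xsk_addrs (layout assumed). */
struct model_addrs {
	uint32_t num_descs;
	uint64_t addrs[4];
};

/* Simplified stand-in for skb_shinfo(skb)->destructor_arg. */
struct model_skb {
	void *destructor_arg;
};

/* 32-bit path: always keep the full u64 in a zeroed allocation. */
static int model_set_addr(struct model_skb *skb, uint64_t addr)
{
	struct model_addrs *a = calloc(1, sizeof(*a));	/* models kmem_cache_zalloc() */

	if (!a)
		return -1;	/* the patch returns -ENOMEM here */
	a->addrs[0] = addr;	/* num_descs stays 0 until the first increment */
	skb->destructor_arg = a;
	return 0;
}

/* 32-bit path: "inline address" is only reported while nothing is stored. */
static bool model_is_addr(const struct model_skb *skb)
{
	return skb->destructor_arg == NULL;
}

/* Models xsk_inc_num_desc() for the allocated case. */
static void model_inc_num_desc(struct model_skb *skb)
{
	struct model_addrs *a = skb->destructor_arg;

	assert(a != NULL);
	a->num_descs++;
}

/* Models the free path: release the allocation whenever one exists. */
static void model_consume(struct model_skb *skb)
{
	if (!model_is_addr(skb))
		free(skb->destructor_arg);	/* models kmem_cache_free() */
	skb->destructor_arg = NULL;
}
```

In this model the first descriptor ends with num_descs == 1 after one
set/increment pair, and model_consume() always frees because
model_is_addr() is false once an allocation is attached.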

The overhead is one extra kmem_cache_zalloc per first descriptor on
32-bit only; 64-bit builds are completely unchanged.

Closes: https://lore.kernel.org/all/20260419045824.D9E5EC2BCAF@smtp.kernel.org/
Fixes: 0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 net/xdp/xsk.c | 38 +++++++++++++++++++++++++++++++-------
 1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index ed96f6ec8ff2..fe88f47741b5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -558,7 +558,10 @@ static int xsk_cq_reserve_locked(struct xsk_buff_pool *pool)
 
 static bool xsk_skb_destructor_is_addr(struct sk_buff *skb)
 {
-	return (uintptr_t)skb_shinfo(skb)->destructor_arg & 0x1UL;
+	if (IS_ENABLED(CONFIG_64BIT))
+		return (uintptr_t)skb_shinfo(skb)->destructor_arg & 0x1UL;
+	else
+		return !skb_shinfo(skb)->destructor_arg;
 }
 
 static u64 xsk_skb_destructor_get_addr(struct sk_buff *skb)
@@ -566,9 +569,21 @@ static u64 xsk_skb_destructor_get_addr(struct sk_buff *skb)
 	return (u64)((uintptr_t)skb_shinfo(skb)->destructor_arg & ~0x1UL);
 }
 
-static void xsk_skb_destructor_set_addr(struct sk_buff *skb, u64 addr)
+static int xsk_skb_destructor_set_addr(struct sk_buff *skb, u64 addr)
 {
+	if (!IS_ENABLED(CONFIG_64BIT)) {
+		struct xsk_addrs *xsk_addr;
+
+		xsk_addr = kmem_cache_zalloc(xsk_tx_generic_cache, GFP_KERNEL);
+		if (!xsk_addr)
+			return -ENOMEM;
+		xsk_addr->addrs[0] = addr;
+		skb_shinfo(skb)->destructor_arg = (void *)xsk_addr;
+		return 0;
+	}
+
 	skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t)addr | 0x1UL);
+	return 0;
 }
 
 static void xsk_inc_num_desc(struct sk_buff *skb)
@@ -644,14 +659,20 @@ void xsk_destruct_skb(struct sk_buff *skb)
 	sock_wfree(skb);
 }
 
-static void xsk_skb_init_misc(struct sk_buff *skb, struct xdp_sock *xs,
-			      u64 addr)
+static int xsk_skb_init_misc(struct sk_buff *skb, struct xdp_sock *xs,
+			     u64 addr)
 {
+	int err;
+
+	err = xsk_skb_destructor_set_addr(skb, addr);
+	if (err)
+		return err;
+
 	skb->dev = xs->dev;
 	skb->priority = READ_ONCE(xs->sk.sk_priority);
 	skb->mark = READ_ONCE(xs->sk.sk_mark);
 	skb->destructor = xsk_destruct_skb;
-	xsk_skb_destructor_set_addr(skb, addr);
+	return 0;
 }
 
 static void xsk_consume_skb(struct sk_buff *skb)
@@ -886,8 +907,11 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 		}
 	}
 
-	if (!xs->skb)
-		xsk_skb_init_misc(skb, xs, desc->addr);
+	if (!xs->skb) {
+		err = xsk_skb_init_misc(skb, xs, desc->addr);
+		if (unlikely(err))
+			goto free_err;
+	}
 	xsk_inc_num_desc(skb);
 
 	return skb;
--
2.41.3

Thread overview: 22+ messages
2026-04-24 5:38 [PATCH net v4 0/8] xsk: fix bugs around xsk skb allocation Jason Xing
2026-04-24 5:38 ` [PATCH net v4 1/8] xsk: reject sw-csum UMEM binding to IFF_TX_SKB_NO_LINEAR devices Jason Xing
2026-04-25 5:40 ` sashiko-bot
2026-04-25 13:53 ` Jason Xing
2026-04-28 13:13 ` Paolo Abeni
2026-05-02 19:04 ` Jason Xing
2026-04-24 5:38 ` [PATCH net v4 2/8] xsk: handle NULL dereference of the skb without frags issue Jason Xing
2026-04-28 11:33 ` Simon Horman
2026-04-29 3:53 ` Jason Xing
2026-04-24 5:38 ` [PATCH net v4 3/8] xsk: fix use-after-free of xs->skb in xsk_build_skb() free_err path Jason Xing
2026-04-24 5:38 ` [PATCH net v4 4/8] xsk: prevent CQ desync when freeing half-built skbs in xsk_build_skb() Jason Xing
2026-04-24 5:38 ` [PATCH net v4 5/8] xsk: avoid skb leak in XDP_TX_METADATA case Jason Xing
2026-04-24 5:38 ` [PATCH net v4 6/8] xsk: free the skb when hitting the upper bound MAX_SKB_FRAGS Jason Xing
2026-04-24 5:38 ` [PATCH net v4 7/8] xsk: fix xsk_addrs slab leak on multi-buffer error path Jason Xing
2026-04-24 5:38 ` Jason Xing [this message]
2026-04-28 13:18 ` [PATCH net v4 8/8] xsk: fix u64 descriptor address truncation on 32-bit architectures Paolo Abeni
2026-04-28 23:11 ` Stanislav Fomichev
2026-04-29 3:41 ` Jason Xing
2026-04-29 15:14 ` Stanislav Fomichev
2026-04-29 19:02 ` Jason Xing
2026-05-01 3:29 ` Stanislav Fomichev
2026-05-02 20:10 ` Jason Xing