From: Jakub Kicinski <kuba@kernel.org>
To: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
magnus.karlsson@intel.com, maciej.fijalkowski@intel.com,
sdf@fomichev.me, kerneljasonxing@gmail.com, fw@strlen.de
Subject: Re: [PATCH 2/2 bpf v2] xsk: avoid data corruption on cq descriptor number
Date: Wed, 29 Oct 2025 16:22:45 -0700
Message-ID: <20251029162245.5ea2ee3e@kernel.org>
In-Reply-To: <b21cf80c-5d69-4914-aa45-00f9527f3436@suse.de>
On Wed, 29 Oct 2025 08:51:58 +0100 Fernando Fernandez Mancera wrote:
> On 10/29/25 12:01 AM, Jakub Kicinski wrote:
> > On Tue, 28 Oct 2025 19:30:32 +0100 Fernando Fernandez Mancera wrote:
> >> Since commit 30f241fcf52a ("xsk: Fix immature cq descriptor
> >> production"), the descriptor number is stored in skb control block and
> >> xsk_cq_submit_addr_locked() relies on it to put the umem addrs onto
> >> pool's completion queue.
> >
> > Looking at the past discussion it sounds like you want to optimize
> > the single descriptor case? Can you not use a magic pointer for that?
> >
> > #define XSK_DESTRUCT_SINGLE_BUF (void *)1
> > destructor_arg = XSK_DESTRUCT_SINGLE_BUF
> >
> > Let's target this fix at net, please, I think the complexity here is
> > all in skbs paths.
>
> I might be missing something here but if the destructor_arg pointer is
> used to do this, where should we store the umem address associated with
> it? In the proposed approach the skb extension should not be increased
> for non-fragmented traffic as there is only a single descriptor and
> therefore we can store the umem address in destructor_arg directly.
I see. Pointers are always aligned to 8B, so you can stash the "pointer
type" in the low bits. If the bottom bit is 1 it's a umem address and the
skb was single-chunk. If it's 0 then it's a pointer to a full kmalloc'ed
struct.
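A minimal userspace sketch of the low-bit tagging idea described above, not the kernel implementation. Every name here (`sketch_pack_addr`, `sketch_addrs`, and so on) is hypothetical, and the sketch assumes the umem address fits in a `uintptr_t` with one bit to spare and that integer/pointer casts round-trip, as they do on the usual Linux targets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag: bottom bit set means "this is a umem address". */
#define SKETCH_ADDR_TAG ((uintptr_t)1)

/* Stand-in for the separately allocated multi-descriptor struct. */
struct sketch_addrs {
	unsigned int num_descs;
	uint64_t addrs[4];
};

/* Single-descriptor case: encode the address itself in the pointer slot. */
static inline void *sketch_pack_addr(uint64_t umem_addr)
{
	/* Assumption: the top bit is free, so the shift loses nothing. */
	assert((umem_addr >> (sizeof(uintptr_t) * 8 - 1)) == 0);
	return (void *)(((uintptr_t)umem_addr << 1) | SKETCH_ADDR_TAG);
}

static inline bool sketch_is_addr(const void *arg)
{
	return ((uintptr_t)arg & SKETCH_ADDR_TAG) != 0;
}

static inline uint64_t sketch_unpack_addr(const void *arg)
{
	return (uint64_t)((uintptr_t)arg >> 1);
}

/* Copy out every address the slot refers to; returns how many were written. */
static inline unsigned int sketch_collect(const void *arg, uint64_t *out)
{
	if (sketch_is_addr(arg)) {
		out[0] = sketch_unpack_addr(arg);
		return 1;
	}

	/* Bottom bit clear: a real, aligned pointer to the full struct. */
	const struct sketch_addrs *a = arg;

	for (unsigned int i = 0; i < a->num_descs; i++)
		out[i] = a->addrs[i];
	return a->num_descs;
}
```

The point of the design is that the single-descriptor path needs no allocation at all, while the multi-descriptor path keeps an ordinary pointer whose alignment guarantees the tag bit is clear.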
> The size of the skb extension will only increase for fragmented traffic
> (multiple descriptors), but sure, if there is a fallback to the
> slowpath, it will burden the performance a bit. Although, for that to
> happen the user must have tried to use the AF_XDP family initially. AFAICS, the
> size of the skb extension is only increased when skb_ext_add() is called.
To be clear, by adding an skb extension you are de facto allocating
a bit in the skb struct: just one of the bits of the active_extensions
field instead of a separate bitfield. If you can depend on the socket
association instead, this is quite wasteful.
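To illustrate the cost being described, here is a small userspace model of an extension bitmask. It is a sketch only: the enum values and helper names are hypothetical, and the 8-bit width of `active_extensions` is an assumption of this model rather than something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extension IDs; each new type takes the next bit position. */
enum sketch_ext_id {
	SKETCH_EXT_A,
	SKETCH_EXT_B,
	SKETCH_EXT_XDP,	/* the extension proposed in this series */
	SKETCH_EXT_NUM,
};

/* Model of the relevant skb field (assumed 8 bits wide). */
struct sketch_skb {
	uint8_t active_extensions;
};

/* Every extension type must fit in the mask, used or not. */
static_assert(SKETCH_EXT_NUM <= 8, "extension IDs must fit in the mask");

static inline void sketch_ext_add(struct sketch_skb *skb, enum sketch_ext_id id)
{
	skb->active_extensions |= (uint8_t)(1u << id);
}

static inline bool sketch_ext_exist(const struct sketch_skb *skb,
				    enum sketch_ext_id id)
{
	return (skb->active_extensions & (1u << id)) != 0;
}

/* Bit positions still unclaimed by any extension type. */
static inline unsigned int sketch_ext_bits_left(void)
{
	return 8 - SKETCH_EXT_NUM;
}
```

Even when no skb ever carries the new extension, its ID permanently reserves one position in the mask, which is why state that can be derived from the socket association is cheaper.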
Thread overview: 8+ messages
2025-10-28 18:30 [PATCH 1/2 bpf v2] xdp: add XDP extension to skb Fernando Fernandez Mancera
2025-10-28 18:30 ` [PATCH 2/2 bpf v2] xsk: avoid data corruption on cq descriptor number Fernando Fernandez Mancera
2025-10-28 23:01 ` Jakub Kicinski
2025-10-29 7:51 ` Fernando Fernandez Mancera
2025-10-29 23:22 ` Jakub Kicinski [this message]
2025-10-30 8:38 ` Fernando Fernandez Mancera
2025-10-30 1:05 ` Jason Xing
2025-10-28 22:55 ` [PATCH 1/2 bpf v2] xdp: add XDP extension to skb Jakub Kicinski