From: Jakub Kicinski <kuba@kernel.org>
To: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
magnus.karlsson@intel.com, maciej.fijalkowski@intel.com,
sdf@fomichev.me, kerneljasonxing@gmail.com, fw@strlen.de
Subject: Re: [PATCH 2/2 bpf v2] xsk: avoid data corruption on cq descriptor number
Date: Wed, 29 Oct 2025 16:22:45 -0700
Message-ID: <20251029162245.5ea2ee3e@kernel.org>
In-Reply-To: <b21cf80c-5d69-4914-aa45-00f9527f3436@suse.de>
On Wed, 29 Oct 2025 08:51:58 +0100 Fernando Fernandez Mancera wrote:
> On 10/29/25 12:01 AM, Jakub Kicinski wrote:
> > On Tue, 28 Oct 2025 19:30:32 +0100 Fernando Fernandez Mancera wrote:
> >> Since commit 30f241fcf52a ("xsk: Fix immature cq descriptor
> >> production"), the descriptor number is stored in skb control block and
> >> xsk_cq_submit_addr_locked() relies on it to put the umem addrs onto
> >> pool's completion queue.
> >
> > Looking at the past discussion it sounds like you want to optimize
> > the single descriptor case? Can you not use a magic pointer for that?
> >
> > #define XSK_DESTRUCT_SINGLE_BUF (void *)1
> > destructor_arg = XSK_DESTRUCT_SINGLE_BUF
> >
> > Let's target this fix at net, please, I think the complexity here is
> > all in the skb paths.
>
> I might be missing something here, but if the destructor_arg pointer is
> used for this, where should we store the umem address associated with
> it? In the proposed approach the skb extension does not grow for
> non-fragmented traffic: there is only a single descriptor, so we can
> store the umem address in destructor_arg directly.
I see. Pointers are always aligned to 8B, so you can stash a "pointer
type" in the low bits. If the bottom bit is 1, it's a umem address and
the skb was single-chunk. If it's non-NULL with the bottom bit clear,
it's a full kmalloc'ed struct.
> The size of the skb extension will only increase for fragmented
> traffic (multiple descriptors)... but sure, if there is a fallback to
> the slow path, it will hurt performance a bit. Although, for that to
> happen, they must have tried to use the AF_XDP family initially.
> AFAICS, the size of the skb extension is only increased when
> skb_ext_add() is called.
To be clear, by adding an skb extension you are de facto allocating a
bit in the skb struct; it is just one of the bits of the
active_extensions field instead of a separate bitfield. If you can
depend on the socket association instead, this is quite wasteful.
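
For illustration, a tiny userspace model of that point (the enum mirrors
the shape of enum skb_ext_id from include/linux/skbuff.h, but the exact
IDs and the SKB_EXT_XDP entry are invented for this sketch):

	#include <stdint.h>
	#include <stdio.h>

	/* Each extension type claims one bit of the 8-bit
	 * active_extensions field for good, whether or not a given skb
	 * actually carries that extension. */
	enum skb_ext_id {
		SKB_EXT_BRIDGE_NF,
		SKB_EXT_SEC_PATH,
		SKB_EXT_MPTCP,
		SKB_EXT_XDP,		/* hypothetical new extension */
		SKB_EXT_NUM,
	};

	struct skb_model {
		uint8_t active_extensions;
	};

	static void ext_add(struct skb_model *skb, enum skb_ext_id id)
	{
		skb->active_extensions |= 1u << id;	/* bit claimed per ID */
	}

	int main(void)
	{
		struct skb_model skb = { 0 };

		ext_add(&skb, SKB_EXT_XDP);
		printf("active_extensions = 0x%02x (bit %d claimed)\n",
		       skb.active_extensions, SKB_EXT_XDP);
		return 0;
	}

Calling skb_ext_add() is what sets the bit on a given skb, but the bit
position itself is reserved by merely defining the extension type.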