From: Stanislav Fomichev <stfomichev@gmail.com>
To: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, netdev@vger.kernel.org,
magnus.karlsson@intel.com, kerneljasonxing@gmail.com
Subject: Re: [PATCH bpf-next 2/3] xsk: remove @first_frag from xsk_build_skb()
Date: Mon, 22 Sep 2025 10:48:22 -0700 [thread overview]
Message-ID: <aNGL5qS8aIfcSDnD@mini-arch> (raw)
In-Reply-To: <20250922152600.2455136-3-maciej.fijalkowski@intel.com>
On 09/22, Maciej Fijalkowski wrote:
> Devices that set IFF_TX_SKB_NO_LINEAR will not execute branch that
> handles metadata, as we set @first_frag only for !IFF_TX_SKB_NO_LINEAR
> code in xsk_build_skb().
>
> Same functionality can be achieved with checking if xsk_get_num_desc()
> returns 0. To replace current usage of @first_frag with
> XSKCB(skb)->num_descs check, pull out the code from
> xsk_set_destructor_arg() that initializes sk_buff::cb and call it before
> skb_store_bits() in branch that creates skb against first processed
> frag. This so error path has the XSKCB(skb)->num_descs initialized and
> can free skb in case skb_store_bits() failed.
>
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
> net/xdp/xsk.c | 20 +++++++++++---------
> 1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index 72194f0a3fc0..064238400036 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -605,6 +605,13 @@ static u32 xsk_get_num_desc(struct sk_buff *skb)
> return XSKCB(skb)->num_descs;
> }
>
> +static void xsk_init_cb(struct sk_buff *skb)
> +{
> + BUILD_BUG_ON(sizeof(struct xsk_addr_head) > sizeof(skb->cb));
> + INIT_LIST_HEAD(&XSKCB(skb)->addrs_list);
> + XSKCB(skb)->num_descs = 0;
> +}
> +
> static void xsk_destruct_skb(struct sk_buff *skb)
> {
> struct xsk_tx_metadata_compl *compl = &skb_shinfo(skb)->xsk_meta;
> @@ -620,9 +627,6 @@ static void xsk_destruct_skb(struct sk_buff *skb)
>
> static void xsk_set_destructor_arg(struct sk_buff *skb, u64 addr)
> {
> - BUILD_BUG_ON(sizeof(struct xsk_addr_head) > sizeof(skb->cb));
> - INIT_LIST_HEAD(&XSKCB(skb)->addrs_list);
> - XSKCB(skb)->num_descs = 0;
> skb_shinfo(skb)->destructor_arg = (void *)(uintptr_t)addr;
> }
>
> @@ -672,7 +676,7 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
> return ERR_PTR(err);
>
> skb_reserve(skb, hr);
> -
> + xsk_init_cb(skb);
> xsk_set_destructor_arg(skb, desc->addr);
> } else {
> xsk_addr = kmem_cache_zalloc(xsk_tx_generic_cache, GFP_KERNEL);
> @@ -725,7 +729,6 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> struct xsk_tx_metadata *meta = NULL;
> struct net_device *dev = xs->dev;
> struct sk_buff *skb = xs->skb;
> - bool first_frag = false;
> int err;
>
> if (dev->priv_flags & IFF_TX_SKB_NO_LINEAR) {
> @@ -742,8 +745,6 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> len = desc->len;
>
> if (!skb) {
> - first_frag = true;
> -
> hr = max(NET_SKB_PAD, L1_CACHE_ALIGN(dev->needed_headroom));
> tr = dev->needed_tailroom;
> skb = sock_alloc_send_skb(&xs->sk, hr + len + tr, 1, &err);
> @@ -752,6 +753,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
>
> skb_reserve(skb, hr);
> skb_put(skb, len);
> + xsk_init_cb(skb);
>
> err = skb_store_bits(skb, 0, buffer, len);
> if (unlikely(err))
> @@ -797,7 +799,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> list_add_tail(&xsk_addr->addr_node, &XSKCB(skb)->addrs_list);
> }
>
> - if (first_frag && desc->options & XDP_TX_METADATA) {
> + if (!xsk_get_num_desc(skb) && desc->options & XDP_TX_METADATA) {
> if (unlikely(xs->pool->tx_metadata_len == 0)) {
> err = -EINVAL;
> goto free_err;
> @@ -839,7 +841,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
> return skb;
>
> free_err:
> - if (first_frag && skb)
[..]
> + if (skb && !xsk_get_num_desc(skb))
> kfree_skb(skb);
>
> if (err == -EOVERFLOW) {
For IFF_TX_SKB_NO_LINEAR case, the 'goto free_err' is super confusing.
xsk_build_skb_zerocopy either returns skb or an IS_ERR. Can we
add a separate label to jump directly to 'if err == -EOVERFLOW' for
the IFF_TX_SKB_NO_LINEAR case?
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 72e34bd2d925..f56182c61c99 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -732,7 +732,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
skb = xsk_build_skb_zerocopy(xs, desc);
if (IS_ERR(skb)) {
err = PTR_ERR(skb);
- goto free_err;
+ goto out;
}
} else {
u32 hr, tr, len;
@@ -842,6 +842,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
if (first_frag && skb)
kfree_skb(skb);
+out:
if (err == -EOVERFLOW) {
/* Drop the packet */
xsk_inc_num_desc(xs->skb);
After that, it seems we can look at skb_shinfo(skb)->nr_frags? Instead
of adding new xsk_init_cb, seems more robust?
next prev parent reply other threads:[~2025-09-22 17:48 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-22 15:25 [PATCH bpf-next 0/3] xsk: refactors around generic xmit side Maciej Fijalkowski
2025-09-22 15:25 ` [PATCH bpf-next 1/3] xsk: avoid overwriting skb fields for multi-buffer traffic Maciej Fijalkowski
2025-09-22 17:25 ` Stanislav Fomichev
2025-09-23 8:39 ` Jason Xing
2025-09-23 15:43 ` Maciej Fijalkowski
2025-09-22 15:25 ` [PATCH bpf-next 2/3] xsk: remove @first_frag from xsk_build_skb() Maciej Fijalkowski
2025-09-22 17:48 ` Stanislav Fomichev [this message]
2025-09-23 9:25 ` Jason Xing
2025-09-23 9:36 ` Jason Xing
2025-09-24 14:35 ` Maciej Fijalkowski
2025-09-25 0:17 ` Jason Xing
2025-09-25 8:17 ` Jason Xing
2025-09-22 15:26 ` [PATCH bpf-next 3/3] xsk: wrap generic metadata handling onto separate function Maciej Fijalkowski
2025-09-22 17:51 ` Stanislav Fomichev
2025-09-23 9:43 ` Jason Xing
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aNGL5qS8aIfcSDnD@mini-arch \
--to=stfomichev@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=kerneljasonxing@gmail.com \
--cc=maciej.fijalkowski@intel.com \
--cc=magnus.karlsson@intel.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.