From: Jason Xing <kerneljasonxing@gmail.com>
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, bjorn@kernel.org, magnus.karlsson@intel.com,
maciej.fijalkowski@intel.com, jonathan.lemon@gmail.com,
sdf@fomichev.me, ast@kernel.org, daniel@iogearbox.net,
hawk@kernel.org, john.fastabend@gmail.com
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
Jason Xing <kernelxing@tencent.com>
Subject: [PATCH RFC net-next v4 14/14] xsk: optimize xsk_build_skb for batch copy-mode fast path
Date: Wed, 15 Apr 2026 16:26:54 +0800
Message-ID: <20260415082654.21026-15-kerneljasonxing@gmail.com>
In-Reply-To: <20260415082654.21026-1-kerneljasonxing@gmail.com>
From: Jason Xing <kernelxing@tencent.com>
Three targeted optimizations for the batch copy-mode TX hot path:
1) Replace skb_store_bits() with memcpy() on the single-buffer
first-descriptor path. After skb_reserve() + skb_put(), the skb is
freshly allocated with all of its data in the linear area and no frags,
so skb_store_bits() degenerates to memcpy(skb->data, buffer, len) while
still paying for the function call, the offset validation, and the
frag-iteration logic.
2) Inline the UMEM address computation in Phase 3 and pass the
precomputed buffer pointer to xsk_build_skb(), avoiding the per-packet
call chain through the non-inlined (EXPORT_SYMBOL) xp_raw_get_data():
xsk_buff_raw_get_data() -> xp_raw_get_data() -> __xp_raw_get_addr() +
__xp_raw_get_data(). Within the batch loop, pool->addrs and
pool->unaligned are invariant, so cache them once and compute each
buffer address inline.
3) Prefetch the *next* descriptor's UMEM data buffer at the top of the
Phase 3 loop, hiding the memory latency of the upcoming memcpy().
Together, these changes stably improve performance by 3-4%.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 include/net/xdp_sock.h |  3 ++-
 net/core/skbuff.c      | 18 ++++++++++++++++--
 net/xdp/xsk.c          | 15 ++++++---------
 3 files changed, 24 insertions(+), 12 deletions(-)
diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index 0609e3b04279..5e05236c7fba 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -139,7 +139,8 @@ void __xsk_map_flush(struct list_head *flush_list);
 
 INDIRECT_CALLABLE_DECLARE(void xsk_destruct_skb(struct sk_buff *));
 struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
			      struct sk_buff *allocated_skb,
-			      struct xdp_desc *desc);
+			      struct xdp_desc *desc,
+			      void *buffer);
 int xsk_alloc_batch_skb(struct xdp_sock *xs, u32 nb_pkts, u32 nb_descs, int *err);
 int xsk_direct_xmit_batch(struct xdp_sock *xs, struct net_device *dev);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 5726b1566b2b..bef5270e6332 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -752,14 +752,28 @@ int xsk_alloc_batch_skb(struct xdp_sock *xs, u32 nb_pkts, u32 nb_descs, int *err
 	if (total_truesize)
 		refcount_add(total_truesize, &xs->sk.sk_wmem_alloc);
 
-	/* Phase 3: Build SKBs with packet data */
+	/* Phase 3: Build SKBs with packet data. */
+	struct xsk_buff_pool *pool = xs->pool;
+	void *pool_addrs = pool->addrs;
+	bool unaligned = pool->unaligned;
+
 	for (j = 0; j < alloc_descs; j++) {
+		u64 addr = descs[j].addr;
+		void *buffer;
+
+		if (unaligned)
+			addr = xp_unaligned_add_offset_to_addr(addr);
+		buffer = pool_addrs + addr;
+
+		if (j + 1 < alloc_descs)
+			prefetch(pool_addrs + descs[j + 1].addr);
+
 		if (!xs->skb) {
 			skb = skbs[skb_count - 1 - k];
 			k++;
 		}
 
-		skb = xsk_build_skb(xs, skb, &descs[j]);
+		skb = xsk_build_skb(xs, skb, &descs[j], buffer);
 		if (IS_ERR(skb)) {
 			*err = PTR_ERR(skb);
 			break;
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index be341290e42c..3bf81b838075 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -811,7 +811,8 @@ static struct sk_buff *xsk_build_skb_zerocopy(struct xdp_sock *xs,
 
 struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
			      struct sk_buff *allocated_skb,
-			      struct xdp_desc *desc)
+			      struct xdp_desc *desc,
+			      void *buffer)
 {
 	struct net_device *dev = xs->dev;
 	struct sk_buff *skb = xs->skb;
@@ -825,11 +826,10 @@ struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 			goto free_err;
 		}
 	} else {
-		u32 hr, tr, len;
-		void *buffer;
+		u32 hr, tr, len = desc->len;
 
-		buffer = xsk_buff_raw_get_data(xs->pool, desc->addr);
-		len = desc->len;
+		if (!buffer)
+			buffer = xsk_buff_raw_get_data(xs->pool, desc->addr);
 
 		if (!skb) {
 			hr = max(NET_SKB_PAD, L1_CACHE_ALIGN(dev->needed_headroom));
@@ -844,10 +844,7 @@ struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
 		skb_reserve(skb, hr);
 		skb_put(skb, len);
-
-		err = skb_store_bits(skb, 0, buffer, len);
-		if (unlikely(err))
-			goto free_err;
+		memcpy(skb->data, buffer, len);
 
 		xsk_skb_init_misc(skb, xs, desc->addr);
 		if (desc->options & XDP_TX_METADATA) {
--
2.41.3