From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, magnus.karlsson@intel.com,
stfomichev@gmail.com, kuba@kernel.org, pabeni@redhat.com,
horms@kernel.org, larysa.zaremba@intel.com,
aleksander.lobakin@intel.com, bjorn@kernel.org,
Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
Stanislav Fomichev <sdf@fomichev.me>
Subject: [PATCH v5 net 02/11] xsk: respect tailroom for ZC setups
Date: Tue, 31 Mar 2026 17:02:04 +0200
Message-ID: <20260331150213.550797-3-maciej.fijalkowski@intel.com>
In-Reply-To: <20260331150213.550797-1-maciej.fijalkowski@intel.com>

Multi-buffer XDP stores information about frags in skb_shared_info, which
sits in the tailroom of a packet. The storage space is reserved via
xdp_data_hard_end():

	#define xdp_data_hard_end(xdp)				\
		((xdp)->data_hard_start + (xdp)->frame_sz -	\
		 SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))

and then we refer to it via the helper below:

	static inline struct skb_shared_info *
	xdp_get_shared_info_from_buff(const struct xdp_buff *xdp)
	{
		return (struct skb_shared_info *)xdp_data_hard_end(xdp);
	}

Currently we do not respect this tailroom space in the multi-buffer AF_XDP
ZC scenario. To address this, introduce xsk_pool_get_tailroom() and use
it within xsk_pool_get_rx_frame_size(), which ZC drivers use to configure
the length of the HW Rx buffer.

xsk_pool_get_tailroom() reserves the necessary space only when the pool is
zc and the underlying netdev supports zc multi-buffer. Rely on umem->zc
state when configuring tailroom. xsk_pool_get_rx_frame_size() is going
to be used in further MTU validation, so move the setting of umem->zc
before the ndo_bpf() call and, on the error path, clear it only when there
are no other users of the umem, as we want to preserve the setting for
other active sockets already bound to this entity.

Drivers typically configure HW Rx buffers with 128-byte granularity, so
align the value returned by xsk_pool_get_rx_frame_size() down to a
multiple of 128 to avoid addressing this on the driver's side.

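As a rough illustration of the frame size arithmetic described above, here
is a minimal userspace sketch. It is not the kernel code: the names
rx_frame_size_sketch() and ASSUMED_SHINFO_TAILROOM are hypothetical, and
the 320-byte tailroom is only an assumed stand-in for
SKB_DATA_ALIGN(sizeof(struct skb_shared_info)), whose real value depends
on architecture and kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round x down to a multiple of a; a must be a power of two. */
#define ALIGN_DOWN_POW2(x, a)	((x) & ~((uint32_t)(a) - 1))

#define XDP_PACKET_HEADROOM	256u
/* Assumption: placeholder for SKB_DATA_ALIGN(sizeof(struct skb_shared_info)). */
#define ASSUMED_SHINFO_TAILROOM	320u

/*
 * Hypothetical model of the new xsk_pool_get_rx_frame_size():
 * chunk size minus headroom minus (optional) tailroom, rounded
 * down to a multiple of 128 bytes.
 */
static uint32_t rx_frame_size_sketch(uint32_t chunk_size,
				     uint32_t umem_headroom,
				     bool zc_multi_buffer)
{
	uint32_t headroom = XDP_PACKET_HEADROOM + umem_headroom;
	uint32_t tailroom = zc_multi_buffer ? ASSUMED_SHINFO_TAILROOM : 0;

	/* Guard against unsigned underflow in this sketch. */
	assert(chunk_size > headroom + tailroom);

	return ALIGN_DOWN_POW2(chunk_size - headroom - tailroom, 128u);
}
```

Under these assumptions, a 4096-byte chunk with no extra umem headroom
yields 3840 bytes without tailroom and 3456 bytes once the assumed
tailroom is reserved (4096 - 256 - 320 = 3520, rounded down to 27 * 128).
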
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
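The umem->zc handling described in the commit message can be pictured
with the simplified sketch below. It is only a model under stated
assumptions: struct umem_sketch and assign_dev_sketch() are hypothetical
names, a plain int stands in for the refcount, and drv_setup_ok models
whether the driver-side setup (the ndo_bpf() call) succeeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, stripped-down stand-in for struct xdp_umem. */
struct umem_sketch {
	int users;	/* number of sockets sharing this umem */
	bool zc;	/* zero-copy state, visible to all users */
};

/* Returns 0 on success, -1 if the modelled driver setup failed. */
static int assign_dev_sketch(struct umem_sketch *umem, bool drv_setup_ok)
{
	assert(umem->users >= 1);

	/* Set early so frame size helpers used during setup see ZC state. */
	umem->zc = true;

	if (drv_setup_ok)
		return 0;

	/*
	 * Error path: clear the flag only when we are the sole user, so
	 * other sockets already bound to the umem keep their ZC setting.
	 */
	if (umem->users == 1)
		umem->zc = false;

	return -1;
}
```

In this model, a failed setup on a umem with a single user leaves zc
cleared, while a failed setup on a umem shared by two users leaves the
previously set zc flag untouched.
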
include/net/xdp_sock_drv.h | 17 ++++++++++++++++-
net/xdp/xsk_buff_pool.c | 4 +++-
2 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 6b9ebae2dc95..cd9eeff536a6 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -41,6 +41,19 @@ static inline u32 xsk_pool_get_headroom(struct xsk_buff_pool *pool)
 	return XDP_PACKET_HEADROOM + pool->headroom;
 }
+static inline u32 xsk_pool_get_tailroom(struct xsk_buff_pool *pool)
+{
+	struct xdp_umem *umem = pool->umem;
+
+	/* Reserve tailroom only for zero-copy pools that opted into
+	 * multi-buffer. The reserved area is used for skb_shared_info,
+	 * matching the XDP core's xdp_data_hard_end() layout.
+	 */
+	if (umem->zc && (umem->flags & XDP_UMEM_SG_FLAG))
+		return SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	return 0;
+}
+
 static inline u32 xsk_pool_get_chunk_size(struct xsk_buff_pool *pool)
 {
 	return pool->chunk_size;
@@ -48,7 +61,9 @@ static inline u32 xsk_pool_get_chunk_size(struct xsk_buff_pool *pool)
 static inline u32 xsk_pool_get_rx_frame_size(struct xsk_buff_pool *pool)
 {
-	return xsk_pool_get_chunk_size(pool) - xsk_pool_get_headroom(pool);
+	return ALIGN_DOWN(xsk_pool_get_chunk_size(pool) -
+			  xsk_pool_get_headroom(pool) -
+			  xsk_pool_get_tailroom(pool), 128);
 }
 static inline u32 xsk_pool_get_rx_frag_step(struct xsk_buff_pool *pool)
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 37b7a68b89b3..0f40bee606d3 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -200,6 +200,7 @@ int xp_assign_dev(struct xsk_buff_pool *pool,
 		goto err_unreg_pool;
 	}
+	pool->umem->zc = true;
 	if (netdev->xdp_zc_max_segs == 1 && (flags & XDP_USE_SG)) {
 		err = -EOPNOTSUPP;
 		goto err_unreg_pool;
@@ -224,13 +225,14 @@ int xp_assign_dev(struct xsk_buff_pool *pool,
 		err = -EINVAL;
 		goto err_unreg_xsk;
 	}
-	pool->umem->zc = true;
 	pool->xdp_zc_max_segs = netdev->xdp_zc_max_segs;
 	return 0;
 err_unreg_xsk:
 	xp_disable_drv_zc(pool);
 err_unreg_pool:
+	if (refcount_read(&pool->umem->users) == 1)
+		pool->umem->zc = false;
 	if (!force_zc)
 		err = 0; /* fallback to copy mode */
 	if (err) {
--
2.43.0

Thread overview: 13+ messages
2026-03-31 15:02 [PATCH v5 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
2026-03-31 15:02 ` [PATCH v5 net 01/11] xsk: tighten UMEM headroom validation to account for tailroom and min frame Maciej Fijalkowski
2026-03-31 15:02 ` Maciej Fijalkowski [this message]
2026-03-31 15:02 ` [PATCH v5 net 03/11] xsk: fix XDP_UMEM_SG_FLAG issues Maciej Fijalkowski
2026-03-31 15:02 ` [PATCH v5 net 04/11] xsk: validate MTU against usable frame size on bind Maciej Fijalkowski
2026-03-31 15:02 ` [PATCH v5 net 05/11] selftests: bpf: introduce a common routine for reading procfs Maciej Fijalkowski
2026-03-31 15:02 ` [PATCH v5 net 06/11] selftests: bpf: fix pkt grow tests Maciej Fijalkowski
2026-03-31 15:02 ` [PATCH v5 net 07/11] selftests: bpf: have a separate variable for drop test Maciej Fijalkowski
2026-03-31 15:02 ` [PATCH v5 net 08/11] selftests: bpf: adjust rx_dropped xskxceiver's test to respect tailroom Maciej Fijalkowski
2026-03-31 15:02 ` [PATCH v5 net 09/11] idpf: remove xsk frame size check against alignment Maciej Fijalkowski
2026-03-31 15:02 ` [PATCH v5 net 10/11] igc: remove home-grown xsk's frame size validation Maciej Fijalkowski
2026-03-31 15:02 ` [PATCH v5 net 11/11] gve: " Maciej Fijalkowski
2026-04-02 11:12 ` [PATCH v5 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski