* [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation
@ 2026-03-26 11:49 Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 01/11] xsk: tighten UMEM headroom validation to account for tailroom and min frame Maciej Fijalkowski
` (10 more replies)
0 siblings, 11 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
v3->v4:
- allow exact 128 bytes of space when user defined headroom is deducted
from total frame size
- provide a routine for reading procfs entries within xskxceiver
* use it to fetch cache line size and calculate skb_shared_info size
on our own
- clean up gve and igc xsk pool enablement routines
- include mtu vs frame size * max zc segments validation in
xp_assign_dev()
v2->v3:
- add tags from Bjorn/Stan
- provide at least 128 bytes instead of ETH_ZLEN when validating frame
headroom
* this way we can get rid of i40e/ice changes
* make sure xsk_pool_get_rx_frame_size() returns value 128b-aligned
* and remove pre-check from idpf
- separate XDP_UMEM_SG_FLAG fixes from MTU validation in xsk_bind()
- make drop_idx a local variable in xsk's xdp drop prog
- adjust rx_dropped to new 128b related values
- move awkwardly placed define (Bjorn)
- remove READ_ONCE when fetching netdev->mtu (Bjorn)
v1->v2:
- remove xsk_pool_get_tailroom() definition for !CONFIG_XDP_SOCKETS
(Stan)
- do not rely on pool->umem->zc when configuring tailroom (Stan, Bjorn)
- simplify dbuff setting in ZC drivers (Bjorn)
- use defines for {head,tail}room in tests (Bjorn)
- return EINVAL instead of EOPNOTSUPP when mtu setting is wrong (Bjorn)
- include vlan headers and fcs length when validating mtu (Olek)
- tighten umem headroom validation when registering umem (Sashiko AI)
- set XDP_USE_SG in xp_assign_dev_shared() (Sashiko AI)
- adjust rx dropped xskxceiver test
Hi,
this series fixes a long-standing issue with the multi-buffer scenario in
ZC mode: we have not been reserving space at the end of the buffer, where
multi-buffer XDP keeps skb_shared_info. This was brought to our
attention via [0].
Unaligned mode does not get any specific treatment; it is the user's
responsibility to properly handle XSK addresses in queues.
With the xskxceiver adjustments included in this set I have been able to
pass the full test suite on ice.
Thanks,
Maciej
[0]: https://community.intel.com/t5/Ethernet-Products/X710-XDP-Packet-Corruption-Issue-DRV-MODE-Zero-Copy-Multi-Buffer/m-p/1724208
Maciej Fijalkowski (11):
xsk: tighten UMEM headroom validation to account for tailroom and min
frame
xsk: respect tailroom for ZC setups
xsk: fix XDP_UMEM_SG_FLAG issues
xsk: validate MTU against usable frame size on bind
selftests: bpf: introduce a common routine for reading procfs
selftests: bpf: fix pkt grow tests
selftests: bpf: have a separate variable for drop test
selftests: bpf: adjust rx_dropped xskxceiver's test to respect
tailroom
idpf: remove xsk frame size check against alignment
igc: remove home-grown xsk's frame size validation
gve: remove home-grown xsk's frame size validation
drivers/net/ethernet/google/gve/gve_main.c | 5 --
drivers/net/ethernet/intel/idpf/xsk.c | 10 ----
drivers/net/ethernet/intel/igc/igc_xdp.c | 11 ----
include/net/xdp_sock.h | 2 +-
include/net/xdp_sock_drv.h | 17 +++++-
net/xdp/xdp_umem.c | 3 +-
net/xdp/xsk_buff_pool.c | 11 +++-
.../selftests/bpf/prog_tests/test_xsk.c | 55 +++++++++----------
.../selftests/bpf/prog_tests/test_xsk.h | 2 +
.../selftests/bpf/progs/xsk_xdp_progs.c | 4 +-
tools/testing/selftests/bpf/xskxceiver.c | 44 +++++++++++++++
11 files changed, 104 insertions(+), 60 deletions(-)
--
2.43.0
* [PATCH v4 net 01/11] xsk: tighten UMEM headroom validation to account for tailroom and min frame
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 02/11] xsk: respect tailroom for ZC setups Maciej Fijalkowski
` (9 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski,
Stanislav Fomichev
The current headroom validation in xdp_umem_reg() could leave us with
insufficient space to receive even a minimum-sized Ethernet frame.
Furthermore, if multi-buffer comes into play, skb_shared_info stored at
the end of the XSK frame would be corrupted.
HW typically works with 128-byte aligned sizes, so let us provide this
value as the bare minimum.
Whether multi-buffer is used is only known later in the configuration
process, so besides accounting for 128 bytes, let us also reserve the
tailroom space upfront.
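For illustration only, the tightened bound can be sketched in userspace
as below. umem_headroom_ok() is a hypothetical helper, not a kernel
function, and the 320-byte skb_shared_info size is an assumption (it
depends on MAX_SKB_FRAGS and the cache line size of the build); the
2048-byte minimum chunk size is assumed as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; the kernel derives these at build time. */
#define XDP_PACKET_HEADROOM 256u
#define SHINFO_ALIGNED_SIZE 320u /* SKB_DATA_ALIGN(sizeof(struct skb_shared_info)), assumed */
#define MIN_RX_SPACE 128u

/* Hypothetical helper mirroring the tightened check in xdp_umem_reg(). */
static bool umem_headroom_ok(uint32_t chunk_size, uint32_t headroom)
{
	/* Chunks are assumed to be at least 2048 bytes, so this cannot underflow. */
	assert(chunk_size >= 2048);

	uint32_t max_headroom = chunk_size - XDP_PACKET_HEADROOM -
				SHINFO_ALIGNED_SIZE - MIN_RX_SPACE;

	/* Leaving exactly 128 bytes is allowed, hence <= rather than <. */
	return headroom <= max_headroom;
}
```

Under these assumptions a 2048-byte chunk accepts at most 1344 bytes of
user headroom, whereas the old check accepted anything below 1792.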
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Fixes: 99e3a236dd43 ("xsk: Add missing check on user supplied headroom size")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
net/xdp/xdp_umem.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
index 066ce07c506d..58da2f4f4397 100644
--- a/net/xdp/xdp_umem.c
+++ b/net/xdp/xdp_umem.c
@@ -203,7 +203,8 @@ static int xdp_umem_reg(struct xdp_umem *umem, struct xdp_umem_reg *mr)
if (!unaligned_chunks && chunks_rem)
return -EINVAL;
- if (headroom >= chunk_size - XDP_PACKET_HEADROOM)
+ if (headroom > chunk_size - XDP_PACKET_HEADROOM -
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) - 128)
return -EINVAL;
if (mr->flags & XDP_UMEM_TX_METADATA_LEN) {
--
2.43.0
* [PATCH v4 net 02/11] xsk: respect tailroom for ZC setups
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 01/11] xsk: tighten UMEM headroom validation to account for tailroom and min frame Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 03/11] xsk: fix XDP_UMEM_SG_FLAG issues Maciej Fijalkowski
` (8 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski,
Stanislav Fomichev
Multi-buffer XDP stores information about frags in skb_shared_info, which
sits in the tailroom of a packet. The storage space is reserved via
xdp_data_hard_end():
	((xdp)->data_hard_start + (xdp)->frame_sz - \
	 SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
and then we refer to it via the helper below:
	static inline struct skb_shared_info *
	xdp_get_shared_info_from_buff(const struct xdp_buff *xdp)
	{
		return (struct skb_shared_info *)xdp_data_hard_end(xdp);
	}
Currently we do not respect this tailroom space in the multi-buffer
AF_XDP ZC scenario. To address this, introduce xsk_pool_get_tailroom()
and use it within xsk_pool_get_rx_frame_size(), which ZC drivers use to
configure the length of the HW Rx buffer.
xsk_pool_get_tailroom() reserves the necessary space only when the pool
is zc and the underlying netdev supports zc multi-buffer. Rely on
pool->dev state when configuring tailroom. xsk_pool_get_rx_frame_size()
inside ndo_bpf is usually called when bringing up queues and before xsk's
DMA mappings have been configured, which makes it valid to rely on
pool->dev.
Drivers typically work with 128-byte alignment on the HW Rx buffer side,
so let us align the value returned by xsk_pool_get_rx_frame_size() in
order to avoid addressing this on the driver's side. This also covers
idpf, which calls the mentioned function *before* pool->dev is set, so we
were at risk of not providing a 128-byte aligned value to HW after
subtracting tailroom.
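As a rough userspace sketch of the resulting arithmetic (not the kernel
code itself): rx_frame_size() below is a hypothetical stand-in for
xsk_pool_get_rx_frame_size(), and the 320-byte tailroom is an assumed
value of SKB_DATA_ALIGN(sizeof(struct skb_shared_info)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XDP_PACKET_HEADROOM 256u
#define SHINFO_ALIGNED_SIZE 320u /* assumed size of the aligned skb_shared_info */
#define RX_BUF_ALIGN 128u

/* Round x down to a multiple of a; a must be a power of two. */
static uint32_t align_down(uint32_t x, uint32_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/* Hypothetical stand-in for xsk_pool_get_rx_frame_size(). */
static uint32_t rx_frame_size(uint32_t chunk_size, uint32_t user_headroom,
			      bool reserve_tailroom)
{
	uint32_t headroom = XDP_PACKET_HEADROOM + user_headroom;
	uint32_t tailroom = reserve_tailroom ? SHINFO_ALIGNED_SIZE : 0;

	return align_down(chunk_size - headroom - tailroom, RX_BUF_ALIGN);
}
```

With these assumed sizes, a 4096-byte chunk yields 3456 bytes once the
tailroom is reserved (3840 without it), and a 2048-byte chunk yields
1408 (1792 without it).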
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
include/net/xdp_sock_drv.h | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 6b9ebae2dc95..dcf811c45b22 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -41,6 +41,19 @@ static inline u32 xsk_pool_get_headroom(struct xsk_buff_pool *pool)
return XDP_PACKET_HEADROOM + pool->headroom;
}
+static inline u32 xsk_pool_get_tailroom(struct xsk_buff_pool *pool)
+{
+ struct xdp_umem *umem = pool->umem;
+
+ /* Reserve tailroom only for zero-copy pools that opted into
+ * multi-buffer. The reserved area is used for skb_shared_info,
+ * matching the XDP core's xdp_data_hard_end() layout.
+ */
+ if (pool->dev && (umem->flags & XDP_UMEM_SG_FLAG))
+ return SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+ return 0;
+}
+
static inline u32 xsk_pool_get_chunk_size(struct xsk_buff_pool *pool)
{
return pool->chunk_size;
@@ -48,7 +61,9 @@ static inline u32 xsk_pool_get_chunk_size(struct xsk_buff_pool *pool)
static inline u32 xsk_pool_get_rx_frame_size(struct xsk_buff_pool *pool)
{
- return xsk_pool_get_chunk_size(pool) - xsk_pool_get_headroom(pool);
+ return ALIGN_DOWN(xsk_pool_get_chunk_size(pool) -
+ xsk_pool_get_headroom(pool) -
+ xsk_pool_get_tailroom(pool), 128);
}
static inline u32 xsk_pool_get_rx_frag_step(struct xsk_buff_pool *pool)
--
2.43.0
* [PATCH v4 net 03/11] xsk: fix XDP_UMEM_SG_FLAG issues
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 01/11] xsk: tighten UMEM headroom validation to account for tailroom and min frame Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 02/11] xsk: respect tailroom for ZC setups Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 04/11] xsk: validate MTU against usable frame size on bind Maciej Fijalkowski
` (7 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
Currently xp_assign_dev_shared() does not propagate XDP_USE_SG to flags.
Set it there in order to preserve the MTU check that is supposed to be
done only when no multi-buffer setup is in the picture.
Also, XDP_UMEM_SG_FLAG has the same value as XDP_UMEM_TX_SW_CSUM, so we
could get unexpected SG setups when software Tx checksums are requested.
Since the csum flag is UAPI, modify the value of XDP_UMEM_SG_FLAG instead.
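The collision can be reproduced in a few lines of userspace C. This is
only a sketch: the UAPI flag values are copied locally rather than
taken from the header, the OLD_/NEW_ names are made up for the
comparison, and looks_like_sg() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* UAPI UMEM flags, copied locally for illustration. */
#define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1u << 0)
#define XDP_UMEM_TX_SW_CSUM (1u << 1)
#define XDP_UMEM_TX_METADATA_LEN (1u << 2)

/* Kernel-internal SG flag before and after this patch (names are made up). */
#define OLD_XDP_UMEM_SG_FLAG (1u << 1)
#define NEW_XDP_UMEM_SG_FLAG (1u << 3)

/* The old value aliases the csum flag; the new one avoids all three. */
_Static_assert(OLD_XDP_UMEM_SG_FLAG == XDP_UMEM_TX_SW_CSUM,
	       "old SG flag collides with TX_SW_CSUM");
_Static_assert((NEW_XDP_UMEM_SG_FLAG &
		(XDP_UMEM_UNALIGNED_CHUNK_FLAG | XDP_UMEM_TX_SW_CSUM |
		 XDP_UMEM_TX_METADATA_LEN)) == 0,
	       "new SG flag must not overlap the UAPI flags listed here");

/* Hypothetical helper: does this UMEM look like a multi-buffer one? */
static bool looks_like_sg(uint32_t umem_flags, uint32_t sg_flag)
{
	return (umem_flags & sg_flag) != 0;
}
```

A UMEM registered with only XDP_UMEM_TX_SW_CSUM is treated as SG under
the old value and is no longer treated as SG under the new one.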
Fixes: d609f3d228a8 ("xsk: add multi-buffer support for sockets sharing umem")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
include/net/xdp_sock.h | 2 +-
net/xdp/xsk_buff_pool.c | 4 ++++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index 23e8861e8b25..ebac60a3d8a1 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -14,7 +14,7 @@
#include <linux/mm.h>
#include <net/sock.h>
-#define XDP_UMEM_SG_FLAG (1 << 1)
+#define XDP_UMEM_SG_FLAG BIT(3)
struct net_device;
struct xsk_queue;
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 37b7a68b89b3..729602a3cec0 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -247,6 +247,10 @@ int xp_assign_dev_shared(struct xsk_buff_pool *pool, struct xdp_sock *umem_xs,
struct xdp_umem *umem = umem_xs->umem;
flags = umem->zc ? XDP_ZEROCOPY : XDP_COPY;
+
+ if (umem->flags & XDP_UMEM_SG_FLAG)
+ flags |= XDP_USE_SG;
+
if (umem_xs->pool->uses_need_wakeup)
flags |= XDP_USE_NEED_WAKEUP;
--
2.43.0
* [PATCH v4 net 04/11] xsk: validate MTU against usable frame size on bind
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
` (2 preceding siblings ...)
2026-03-26 11:49 ` [PATCH v4 net 03/11] xsk: fix XDP_UMEM_SG_FLAG issues Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 05/11] selftests: bpf: introduce a common routine for reading procfs Maciej Fijalkowski
` (6 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
AF_XDP bind currently accepts zero-copy pool configurations without
verifying that the device MTU fits into the usable frame space provided
by the UMEM chunk.
This becomes a problem now that we respect tailroom, which is subtracted
from chunk_size (along with headroom). A 2k chunk size might not provide
enough space for the standard 1500 MTU, so let us catch such settings at
bind time. Furthermore, validate whether the underlying HW will be able
to satisfy the configured MTU given the XSK frame size multiplied by the
supported Rx buffer chain length (which is exposed via
net_device::xdp_zc_max_segs).
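A minimal sketch of this check, written as a standalone function for
illustration: mtu_fits() is a hypothetical name, the Ethernet constants
are copied locally, and the frame sizes in the usage note are examples
computed from an assumed 320-byte tailroom.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_HLEN 14u
#define VLAN_HLEN 4u
#define ETH_FCS_LEN 4u
/* Ethernet header, two VLAN tags and FCS on top of the MTU. */
#define ETH_PAD_LEN (ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN)

/* Hypothetical helper: can max_segs buffers of frame_size hold one frame? */
static bool mtu_fits(uint32_t mtu, uint32_t frame_size, uint32_t max_segs)
{
	/* Assumed: a device always reports at least one segment. */
	assert(max_segs >= 1);

	uint32_t needed = mtu + ETH_PAD_LEN;
	/* Widen before multiplying so large inputs cannot wrap. */
	uint64_t capacity = (uint64_t)frame_size * max_segs;

	return needed <= capacity;
}
```

For a 1500-byte MTU the requirement is 1526 bytes, so a single 1792-byte
buffer passes while a single 1408-byte buffer does not; a 9000-byte MTU
needs three 3456-byte segments.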
Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
net/xdp/xsk_buff_pool.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 729602a3cec0..a539e292de6c 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -10,6 +10,8 @@
#include "xdp_umem.h"
#include "xsk.h"
+#define ETH_PAD_LEN (ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN)
+
void xp_add_xsk(struct xsk_buff_pool *pool, struct xdp_sock *xs)
{
if (!xs->tx)
@@ -157,6 +159,9 @@ static void xp_disable_drv_zc(struct xsk_buff_pool *pool)
int xp_assign_dev(struct xsk_buff_pool *pool,
struct net_device *netdev, u16 queue_id, u16 flags)
{
+ u32 frame_size = xsk_pool_get_rx_frame_size(pool);
+ u32 needed = netdev->mtu + ETH_PAD_LEN;
+ u32 segs = netdev->xdp_zc_max_segs;
bool force_zc, force_copy;
struct netdev_bpf bpf;
int err = 0;
@@ -200,7 +205,7 @@ int xp_assign_dev(struct xsk_buff_pool *pool,
goto err_unreg_pool;
}
- if (netdev->xdp_zc_max_segs == 1 && (flags & XDP_USE_SG)) {
+ if (needed > frame_size * segs) {
err = -EOPNOTSUPP;
goto err_unreg_pool;
}
--
2.43.0
* [PATCH v4 net 05/11] selftests: bpf: introduce a common routine for reading procfs
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
` (3 preceding siblings ...)
2026-03-26 11:49 ` [PATCH v4 net 04/11] xsk: validate MTU against usable frame size on bind Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 06/11] selftests: bpf: fix pkt grow tests Maciej Fijalkowski
` (5 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
Parametrize the current way of getting the MAX_SKB_FRAGS value from
procfs so that it can be re-used to get the cache line size of the
system's CPU. All of that is just to mimic and compute the size of the
kernel's struct skb_shared_info, which xsk and the test suite interpret
as tailroom.
Introduce two variables in the ifobject struct that will carry the count
of skb frags and the tailroom size. Do the reading and computing once,
at the beginning of test suite execution.
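The computation can be tried out in isolation with the sketch below.
It is not the selftest code: read_uint_or() and estimate_tailroom() are
hypothetical names, the fallback handling is simplified, and the 48 and
16 byte figures are assumptions about the current struct layout.

```c
#include <assert.h>
#include <stdio.h>

/* Read one unsigned integer from path; return fallback on any failure. */
static unsigned int read_uint_or(const char *path, unsigned int fallback)
{
	unsigned int val;
	FILE *file = fopen(path, "r");

	if (!file)
		return fallback;
	if (fscanf(file, "%u", &val) != 1)
		val = fallback;
	fclose(file);
	return val;
}

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned int align_up(unsigned int x, unsigned int a)
{
	assert(a && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Assumed layout: 48 bytes of fixed fields plus 16 bytes per skb_frag_t. */
static unsigned int estimate_tailroom(unsigned int max_frags,
				      unsigned int cache_line)
{
	return align_up(48 + max_frags * 16, cache_line);
}
```

With the defaults of 17 frags and a 64-byte cache line this gives 320
bytes; on a 128-byte cache line system the same frag count gives 384.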
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
.../selftests/bpf/prog_tests/test_xsk.c | 25 +----------
.../selftests/bpf/prog_tests/test_xsk.h | 2 +
tools/testing/selftests/bpf/xskxceiver.c | 44 +++++++++++++++++++
3 files changed, 47 insertions(+), 24 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_xsk.c b/tools/testing/selftests/bpf/prog_tests/test_xsk.c
index 7e38ec6e656b..62118ffba661 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_xsk.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_xsk.c
@@ -179,25 +179,6 @@ int xsk_configure_socket(struct xsk_socket_info *xsk, struct xsk_umem_info *umem
return xsk_socket__create(&xsk->xsk, ifobject->ifindex, 0, umem->umem, rxr, txr, &cfg);
}
-#define MAX_SKB_FRAGS_PATH "/proc/sys/net/core/max_skb_frags"
-static unsigned int get_max_skb_frags(void)
-{
- unsigned int max_skb_frags = 0;
- FILE *file;
-
- file = fopen(MAX_SKB_FRAGS_PATH, "r");
- if (!file) {
- ksft_print_msg("Error opening %s\n", MAX_SKB_FRAGS_PATH);
- return 0;
- }
-
- if (fscanf(file, "%u", &max_skb_frags) != 1)
- ksft_print_msg("Error reading %s\n", MAX_SKB_FRAGS_PATH);
-
- fclose(file);
- return max_skb_frags;
-}
-
static int set_ring_size(struct ifobject *ifobj)
{
int ret;
@@ -2242,11 +2223,7 @@ int testapp_too_many_frags(struct test_spec *test)
if (test->mode == TEST_MODE_ZC) {
max_frags = test->ifobj_tx->xdp_zc_max_segs;
} else {
- max_frags = get_max_skb_frags();
- if (!max_frags) {
- ksft_print_msg("Can't get MAX_SKB_FRAGS from system, using default (17)\n");
- max_frags = 17;
- }
+ max_frags = test->ifobj_tx->max_skb_frags;
max_frags += 1;
}
diff --git a/tools/testing/selftests/bpf/prog_tests/test_xsk.h b/tools/testing/selftests/bpf/prog_tests/test_xsk.h
index 8fc78a057de0..55d808eeabc5 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_xsk.h
+++ b/tools/testing/selftests/bpf/prog_tests/test_xsk.h
@@ -115,6 +115,8 @@ struct ifobject {
int mtu;
u32 bind_flags;
u32 xdp_zc_max_segs;
+ u32 umem_tailroom;
+ u32 max_skb_frags;
bool tx_on;
bool rx_on;
bool use_poll;
diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index 05b3cebc5ca9..2623cf4dd2c5 100644
--- a/tools/testing/selftests/bpf/xskxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -80,6 +80,7 @@
#include <linux/mman.h>
#include <linux/netdev.h>
#include <linux/ethtool.h>
+#include <linux/align.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <locale.h>
@@ -101,6 +102,9 @@
#include <network_helpers.h>
+#define MAX_SKB_FRAGS_PATH "/proc/sys/net/core/max_skb_frags"
+#define SMP_CACHE_BYTES_PATH "/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size"
+
static bool opt_print_tests;
static enum test_mode opt_mode = TEST_MODE_ALL;
static u32 opt_run_test = RUN_ALL_TESTS;
@@ -330,9 +334,28 @@ static void print_tests(void)
printf("%u: %s\n", i, ci_skip_tests[i - ARRAY_SIZE(tests)].name);
}
+static unsigned int read_procfs_val(const char *path)
+{
+ unsigned int read_val = 0;
+ FILE *file;
+
+ file = fopen(path, "r");
+ if (!file) {
+ ksft_print_msg("Error opening %s\n", path);
+ return 0;
+ }
+
+ if (fscanf(file, "%u", &read_val) != 1)
+ ksft_print_msg("Error reading %s\n", path);
+
+ fclose(file);
+ return read_val;
+}
+
int main(int argc, char **argv)
{
const size_t total_tests = ARRAY_SIZE(tests) + ARRAY_SIZE(ci_skip_tests);
+ u32 cache_line_size, max_frags, umem_tailroom;
struct pkt_stream *rx_pkt_stream_default;
struct pkt_stream *tx_pkt_stream_default;
struct ifobject *ifobj_tx, *ifobj_rx;
@@ -354,6 +377,27 @@ int main(int argc, char **argv)
setlocale(LC_ALL, "");
+ cache_line_size = read_procfs_val(SMP_CACHE_BYTES_PATH);
+ if (!cache_line_size) {
+ ksft_print_msg("Can't get SMP_CACHE_BYTES from system, using default (64)\n");
+ cache_line_size = 64;
+ }
+
+ max_frags = read_procfs_val(MAX_SKB_FRAGS_PATH);
+ if (!max_frags) {
+ ksft_print_msg("Can't get MAX_SKB_FRAGS from system, using default (17)\n");
+ max_frags = 17;
+ }
+ ifobj_tx->max_skb_frags = max_frags;
+ ifobj_rx->max_skb_frags = max_frags;
+
+ /* 48 bytes is a part of skb_shared_info w/o frags array;
+ * 16 bytes is sizeof(skb_frag_t)
+ */
+ umem_tailroom = ALIGN(48 + (max_frags * 16), cache_line_size);
+ ifobj_tx->umem_tailroom = umem_tailroom;
+ ifobj_rx->umem_tailroom = umem_tailroom;
+
parse_command_line(ifobj_tx, ifobj_rx, argc, argv);
if (opt_print_tests) {
--
2.43.0
* [PATCH v4 net 06/11] selftests: bpf: fix pkt grow tests
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
` (4 preceding siblings ...)
2026-03-26 11:49 ` [PATCH v4 net 05/11] selftests: bpf: introduce a common routine for reading procfs Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 07/11] selftests: bpf: have a separate variable for drop test Maciej Fijalkowski
` (4 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
Skip the tail adjust tests in xskxceiver for SKB mode, as that mode does
not handle them well. The multi-buffer case does not work because the
xdp_rxq_info that is registered for generic XDP does not report
::frag_size. The non-mbuf path copies the packet via skb_pp_cow_data(),
which only accounts for headroom, leaving us with no tailroom and
therefore causing the underlying XDP prog to drop packets.
For the multi-buffer test in other modes, change the number of bytes we
use for growth: assume the worst-case scenario and take care of headroom
and tailroom.
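The growth budget is plain arithmetic and can be sketched as follows.
The 4k and 3k sizes are taken from the comment in the diff below and
are hard-coded here as assumptions; tail_grow_budget() is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define XSK_UMEM__MAX_FRAME_SIZE 4096u   /* assumed: 4k truesize */
#define XSK_UMEM__LARGE_FRAME_SIZE 3072u /* assumed: 3k payload per desc */
#define XDP_PACKET_HEADROOM 256u

/* Bytes left in the last fragment after payload, headroom and tailroom. */
static uint32_t tail_grow_budget(uint32_t umem_tailroom)
{
	uint32_t used = XSK_UMEM__LARGE_FRAME_SIZE + XDP_PACKET_HEADROOM;

	/* The tailroom must fit into what is left of the frame. */
	assert(umem_tailroom <= XSK_UMEM__MAX_FRAME_SIZE - used);
	return XSK_UMEM__MAX_FRAME_SIZE - used - umem_tailroom;
}
```

For a 320-byte tailroom the test would grow the packet by 448 bytes;
for a 384-byte tailroom the budget shrinks to 384 bytes.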
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
.../selftests/bpf/prog_tests/test_xsk.c | 24 ++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_xsk.c b/tools/testing/selftests/bpf/prog_tests/test_xsk.c
index 62118ffba661..ee60bcc22ee4 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_xsk.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_xsk.c
@@ -2528,16 +2528,34 @@ int testapp_adjust_tail_shrink_mb(struct test_spec *test)
int testapp_adjust_tail_grow(struct test_spec *test)
{
+ if (test->mode == TEST_MODE_SKB)
+ return TEST_SKIP;
+
/* Grow by 4 bytes for testing purpose */
return testapp_adjust_tail(test, 4, MIN_PKT_SIZE * 2);
}
int testapp_adjust_tail_grow_mb(struct test_spec *test)
{
+ u32 grow_size;
+
+ if (test->mode == TEST_MODE_SKB)
+ return TEST_SKIP;
+
+ /* worst case scenario is when underlying setup will work on 3k
+ * buffers, let us account for it; given that we will use 6k as
+ * pkt_len, expect that it will be broken down to 2 descs each
+ * with 3k payload;
+ *
+ * 4k is truesize, 3k payload, 256 HR, 320 TR;
+ */
+ grow_size = XSK_UMEM__MAX_FRAME_SIZE -
+ XSK_UMEM__LARGE_FRAME_SIZE -
+ XDP_PACKET_HEADROOM -
+ test->ifobj_tx->umem_tailroom;
test->mtu = MAX_ETH_JUMBO_SIZE;
- /* Grow by (frag_size - last_frag_Size) - 1 to stay inside the last fragment */
- return testapp_adjust_tail(test, (XSK_UMEM__MAX_FRAME_SIZE / 2) - 1,
- XSK_UMEM__LARGE_FRAME_SIZE * 2);
+
+ return testapp_adjust_tail(test, grow_size, XSK_UMEM__LARGE_FRAME_SIZE * 2);
}
int testapp_tx_queue_consumer(struct test_spec *test)
--
2.43.0
* [PATCH v4 net 07/11] selftests: bpf: have a separate variable for drop test
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
` (5 preceding siblings ...)
2026-03-26 11:49 ` [PATCH v4 net 06/11] selftests: bpf: fix pkt grow tests Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 08/11] selftests: bpf: adjust rx_dropped xskxceiver's test to respect tailroom Maciej Fijalkowski
` (3 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
Currently two different XDP programs share a static variable for
different purposes (picking where to redirect in the shared umem test and
deciding whether to drop a packet). This can be a problem when running
the full test suite: idx can be written by the shared umem test, and the
leftover value can cause wrong behavior in the XDP drop half test.
Introduce a dedicated variable for the drop half test so that these two
don't step on each other's toes. There is no real need for using
__sync_fetch_and_add here, as XSK tests are executed on a single CPU.
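The interference can be modelled in ordinary userspace C. This is a
simplified model, not the BPF programs: all function names are
hypothetical, and shared_umem_select() merely stands in for the shared
umem program writing a socket index into idx.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int idx;      /* shared: also written by the shared umem prog */
static unsigned int drop_idx; /* dedicated to the drop half prog */

/* Model of the shared umem prog storing the socket it redirects to. */
static void shared_umem_select(unsigned int sock)
{
	idx = sock;
}

/* Old drop logic: the outcome depends on whatever idx was left at. */
static bool drop_shared(void)
{
	return idx++ % 2;
}

/* New drop logic: always starts from its own counter. */
static bool drop_private(void)
{
	return drop_idx++ % 2;
}
```

If the shared umem test leaves idx at an odd value, the old logic drops
the very first packet of the next test, while the dedicated counter
still passes the first packet and drops the second.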
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
tools/testing/selftests/bpf/progs/xsk_xdp_progs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/xsk_xdp_progs.c b/tools/testing/selftests/bpf/progs/xsk_xdp_progs.c
index 683306db8594..023d8befd4ca 100644
--- a/tools/testing/selftests/bpf/progs/xsk_xdp_progs.c
+++ b/tools/testing/selftests/bpf/progs/xsk_xdp_progs.c
@@ -26,8 +26,10 @@ SEC("xdp.frags") int xsk_def_prog(struct xdp_md *xdp)
SEC("xdp.frags") int xsk_xdp_drop(struct xdp_md *xdp)
{
+ static unsigned int drop_idx;
+
/* Drop every other packet */
- if (idx++ % 2)
+ if (drop_idx++ % 2)
return XDP_DROP;
return bpf_redirect_map(&xsk, 0, XDP_DROP);
--
2.43.0
* [PATCH v4 net 08/11] selftests: bpf: adjust rx_dropped xskxceiver's test to respect tailroom
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
` (6 preceding siblings ...)
2026-03-26 11:49 ` [PATCH v4 net 07/11] selftests: bpf: have a separate variable for drop test Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 09/11] idpf: remove xsk frame size check against alignment Maciej Fijalkowski
` (2 subsequent siblings)
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
Since we have changed how big the user defined headroom in umem can be,
change the logic in testapp_stats_rx_dropped() so that we pass the
updated headroom validation in xdp_umem_reg() and still drop half of the
frames.
The test works on a non-mbuf setup, so xsk_pool_get_rx_frame_size(),
which is called from xsk_rcv_check(), will not account for the
skb_shared_info size. Taking the tailroom size into account in the test
being fixed is still needed, as xdp_umem_reg() always reserves it.
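A worked example of the new numbers, as a userspace sketch. Several
things here are assumptions rather than facts from this patch:
MIN_PKT_SIZE is taken to be 64, the frame size 4096, the tailroom 320,
the untouched half of the stream is assumed to keep MIN_PKT_SIZE, and
the drop rule is modelled as "packet longer than the Rx frame size".
All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XDP_PACKET_HEADROOM 256u
#define MIN_PKT_SIZE 64u /* assumed selftest value */
#define RX_BUF_ALIGN 128u

/* User headroom chosen by the test for a given frame size and tailroom. */
static uint32_t test_frame_headroom(uint32_t frame_size, uint32_t umem_tr)
{
	assert(frame_size > XDP_PACKET_HEADROOM + MIN_PKT_SIZE * 2 + umem_tr);
	return frame_size - XDP_PACKET_HEADROOM - MIN_PKT_SIZE * 2 - umem_tr - 1;
}

/* Non-mbuf Rx frame size: no tailroom deducted, rounded down to 128 bytes. */
static uint32_t rx_frame_size(uint32_t frame_size, uint32_t user_headroom)
{
	uint32_t room = frame_size - XDP_PACKET_HEADROOM - user_headroom;

	return room & ~(RX_BUF_ALIGN - 1);
}

/* Assumed drop rule: a packet longer than the Rx frame size is dropped. */
static bool is_dropped(uint32_t pkt_len, uint32_t frame_sz)
{
	return pkt_len > frame_sz;
}
```

Under these assumptions the headroom becomes 3391 bytes (one below the
3392-byte maximum allowed at registration), the Rx frame size becomes
384 bytes, the enlarged 448-byte packets are dropped and the 64-byte
ones are received.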
Reviewed-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
tools/testing/selftests/bpf/prog_tests/test_xsk.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/test_xsk.c b/tools/testing/selftests/bpf/prog_tests/test_xsk.c
index ee60bcc22ee4..a96ca4b39d1e 100644
--- a/tools/testing/selftests/bpf/prog_tests/test_xsk.c
+++ b/tools/testing/selftests/bpf/prog_tests/test_xsk.c
@@ -1959,15 +1959,17 @@ int testapp_headroom(struct test_spec *test)
int testapp_stats_rx_dropped(struct test_spec *test)
{
+ u32 umem_tr = test->ifobj_tx->umem_tailroom;
+
if (test->mode == TEST_MODE_ZC) {
ksft_print_msg("Can not run RX_DROPPED test for ZC mode\n");
return TEST_SKIP;
}
- if (pkt_stream_replace_half(test, MIN_PKT_SIZE * 4, 0))
+ if (pkt_stream_replace_half(test, (MIN_PKT_SIZE * 2) + umem_tr, 0))
return TEST_FAILURE;
test->ifobj_rx->umem->frame_headroom = test->ifobj_rx->umem->frame_size -
- XDP_PACKET_HEADROOM - MIN_PKT_SIZE * 3;
+ XDP_PACKET_HEADROOM - (MIN_PKT_SIZE * 2) - umem_tr - 1;
if (pkt_stream_receive_half(test))
return TEST_FAILURE;
test->ifobj_rx->validation_func = validate_rx_dropped;
--
2.43.0
* [PATCH v4 net 09/11] idpf: remove xsk frame size check against alignment
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
` (7 preceding siblings ...)
2026-03-26 11:49 ` [PATCH v4 net 08/11] selftests: bpf: adjust rx_dropped xskxceiver's test to respect tailroom Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 10/11] igc: remove home-grown xsk's frame size validation Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 11/11] gve: " Maciej Fijalkowski
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
We provide alignment within xsk_pool_get_rx_frame_size() now, so this
validation is redundant.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/idpf/xsk.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/drivers/net/ethernet/intel/idpf/xsk.c b/drivers/net/ethernet/intel/idpf/xsk.c
index d95d3efdfd36..a5b6177fd44c 100644
--- a/drivers/net/ethernet/intel/idpf/xsk.c
+++ b/drivers/net/ethernet/intel/idpf/xsk.c
@@ -558,16 +558,6 @@ int idpf_xsk_pool_setup(struct idpf_vport *vport, struct netdev_bpf *bpf)
bool restart;
int ret;
- if (pool && !IS_ALIGNED(xsk_pool_get_rx_frame_size(pool),
- LIBETH_RX_BUF_STRIDE)) {
- NL_SET_ERR_MSG_FMT_MOD(bpf->extack,
- "%s: HW doesn't support frames sizes not aligned to %u (qid %u: %u)",
- netdev_name(vport->netdev),
- LIBETH_RX_BUF_STRIDE, qid,
- xsk_pool_get_rx_frame_size(pool));
- return -EINVAL;
- }
-
restart = idpf_xdp_enabled(vport) && netif_running(vport->netdev);
if (!restart)
goto pool;
--
2.43.0
* [PATCH v4 net 10/11] igc: remove home-grown xsk's frame size validation
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
` (8 preceding siblings ...)
2026-03-26 11:49 ` [PATCH v4 net 09/11] idpf: remove xsk frame size check against alignment Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
2026-03-26 11:49 ` [PATCH v4 net 11/11] gve: " Maciej Fijalkowski
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
Since this check is now present in the core, in xp_assign_dev(), remove
the redundant check from igc_xdp_enable_pool().
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/igc/igc_xdp.c | 11 -----------
1 file changed, 11 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
index 9eb47b4beb06..4126173a0226 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.c
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
@@ -61,23 +61,12 @@ static int igc_xdp_enable_pool(struct igc_adapter *adapter,
struct igc_ring *rx_ring, *tx_ring;
struct napi_struct *napi;
bool needs_reset;
- u32 frame_size;
int err;
if (queue_id >= adapter->num_rx_queues ||
queue_id >= adapter->num_tx_queues)
return -EINVAL;
- frame_size = xsk_pool_get_rx_frame_size(pool);
- if (frame_size < ETH_FRAME_LEN + VLAN_HLEN * 2) {
- /* When XDP is enabled, the driver doesn't support frames that
- * span over multiple buffers. To avoid that, we check if xsk
- * frame size is big enough to fit the max ethernet frame size
- * + vlan double tagging.
- */
- return -EOPNOTSUPP;
- }
-
err = xsk_pool_dma_map(pool, dev, IGC_RX_DMA_ATTR);
if (err) {
netdev_err(ndev, "Failed to map xsk pool\n");
--
2.43.0
* [PATCH v4 net 11/11] gve: remove home-grown xsk's frame size validation
2026-03-26 11:49 [PATCH v4 net 00/11] xsk: tailroom reservation and MTU validation Maciej Fijalkowski
` (9 preceding siblings ...)
2026-03-26 11:49 ` [PATCH v4 net 10/11] igc: remove home-grown xsk's frame size validation Maciej Fijalkowski
@ 2026-03-26 11:49 ` Maciej Fijalkowski
10 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-26 11:49 UTC (permalink / raw)
To: netdev
Cc: bpf, magnus.karlsson, stfomichev, kuba, pabeni, horms,
larysa.zaremba, aleksander.lobakin, bjorn, Maciej Fijalkowski
Since this check is now present in the core, in xp_assign_dev(), remove
the redundant check from gve_xsk_pool_enable().
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/google/gve/gve_main.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index 9eb4b3614c4f..b63e0a0459fb 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -1596,11 +1596,6 @@ static int gve_xsk_pool_enable(struct net_device *dev,
dev_err(&priv->pdev->dev, "xsk pool invalid qid %d", qid);
return -EINVAL;
}
- if (xsk_pool_get_rx_frame_size(pool) <
- priv->dev->max_mtu + sizeof(struct ethhdr)) {
- dev_err(&priv->pdev->dev, "xsk pool frame_len too small");
- return -EINVAL;
- }
err = xsk_pool_dma_map(pool, &priv->pdev->dev,
DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING);
--
2.43.0