From: Tariq Toukan <tariqt@nvidia.com>
To: Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
Tariq Toukan <tariqt@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
"Alexei Starovoitov" <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
<netdev@vger.kernel.org>, <linux-rdma@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <bpf@vger.kernel.org>,
Gal Pressman <gal@nvidia.com>, Moshe Shemesh <moshe@nvidia.com>,
Dragos Tatulea <dtatulea@nvidia.com>,
Carolina Jubran <cjubran@nvidia.com>
Subject: [PATCH net-next 3/5] net/mlx5e: XDP, Remove stride size limitation
Date: Thu, 19 Mar 2026 09:50:34 +0200 [thread overview]
Message-ID: <20260319075036.24734-4-tariqt@nvidia.com> (raw)
In-Reply-To: <20260319075036.24734-1-tariqt@nvidia.com>

From: Dragos Tatulea <dtatulea@nvidia.com>

Currently XDP mode always uses PAGE_SIZE strides. This limitation
existed because page fragment counting was not implemented when XDP
support was added. Furthermore, this limitation caused other issues
on systems with larger pages (e.g. 64K):
- XDP for Striding RQ was effectively disabled on such systems.
- Legacy RQ allows the configuration, but uses a fixed scheme of one
  XDP buffer per page, which is inefficient.

As fragment counting was added during the driver conversion to
page_pool and the support for XDP multi-buffer, it is now possible
to remove this stride size limitation. This patch does just that.

Now it is possible to use XDP on systems with larger page sizes (e.g.
64K):
- For Striding RQ, loading the program is no longer blocked.
  Although a 64K page can fit any packet, MTUs that result in a
  stride > 8K will still put the RQ in non-linear mode, because the
  HW doesn't support strides larger than 8K.
- For Legacy RQ, the stride size was PAGE_SIZE, which was very
  inefficient. Now the stride size is calculated relative to the MTU.
  Legacy RQ will always be in linear mode for larger system pages.

This can be observed with an XDP_DROP test [1] when running
in Legacy RQ mode on an ARM Neoverse-N1 system with a 64K
page size:
+------+------------+-------------+-------------+
| MTU  | baseline   | this change | improvement |
+------+------------+-------------+-------------+
| 1500 | 15.55 Mpps | 18.99 Mpps  | 22.0 %      |
| 9000 | 15.53 Mpps | 18.24 Mpps  | 17.5 %      |
+------+------------+-------------+-------------+

There are performance benefits for Striding RQ mode as well:
- Striding RQ non-linear mode now uses 256B strides, just like
  non-XDP mode.
- Striding RQ linear mode can now fit a number of XDP buffers per page
  that depends on the MTU. That means that on 4K page systems with a
  small enough MTU, 2 XDP buffers can fit in one page.

The above benefits for Striding RQ can be observed with an
XDP_DROP test [1] when running on a 4K page x86_64 system
(Intel Xeon Platinum 8580):
+-----------------------------------------------+
| MTU | baseline | this change | improvement |
|------+------------+-------------+-------------|
| 1000 | 28.36 Mpps | 33.98 Mpps | 19.82 % |
| 9000 | 20.76 Mpps | 26.30 Mpps | 26.70 % |
+-----------------------------------------------+
[1] Test description:
- xdp-bench with XDP_DROP
- RX: single queue
- TX: sends 64B packets to saturate the CPU on the RX side

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 26bb31c56e45..1f4a547917ba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -298,12 +298,9 @@ static u32 mlx5e_rx_get_linear_stride_sz(struct mlx5_core_dev *mdev,
* no_head_tail_room should be set in the case of XDP with Striding RQ
* when SKB is not linear. This is because another page is allocated for the linear part.
*/
- sz = roundup_pow_of_two(mlx5e_rx_get_linear_sz_skb(params, no_head_tail_room));
+ sz = mlx5e_rx_get_linear_sz_skb(params, no_head_tail_room);
- /* XDP in mlx5e doesn't support multiple packets per page.
- * Do not assume sz <= PAGE_SIZE if params->xdp_prog is set.
- */
- return params->xdp_prog && sz < PAGE_SIZE ? PAGE_SIZE : sz;
+ return roundup_pow_of_two(sz);
}
static u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5_core_dev *mdev,
@@ -453,10 +450,6 @@ u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
return order_base_2(mlx5e_rx_get_linear_stride_sz(mdev, params,
rqo, true));
- /* XDP in mlx5e doesn't support multiple packets per page. */
- if (params->xdp_prog)
- return PAGE_SHIFT;
-
return MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev);
}
--
2.44.0
2026-03-19 7:50 [PATCH net-next 0/5] net/mlx5e: XDP, Add support for multi-packet per page Tariq Toukan
2026-03-19 7:50 ` [PATCH net-next 1/5] net/mlx5e: XSK, Increase size for chunk_size param Tariq Toukan
2026-03-19 7:50 ` [PATCH net-next 2/5] net/mlx5e: XDP, Improve dma address calculation of linear part for XDP_TX Tariq Toukan
2026-03-19 7:50 ` Tariq Toukan [this message]
2026-03-19 7:50 ` [PATCH net-next 4/5] net/mlx5e: XDP, Use a single linear page per rq Tariq Toukan
2026-03-19 7:50 ` [PATCH net-next 5/5] net/mlx5e: XDP, Use page fragments for linear data in multibuf-mode Tariq Toukan
2026-03-24 2:42 ` Jakub Kicinski
2026-03-24 8:50 ` Dragos Tatulea