public inbox for linux-kernel@vger.kernel.org
* [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page
@ 2026-04-03  9:09 Tariq Toukan
  2026-04-03  9:09 ` [PATCH net-next V2 1/5] net/mlx5e: XSK, Increase size for chunk_size param Tariq Toukan
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Tariq Toukan @ 2026-04-03  9:09 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Dragos Tatulea, Cosmin Ratiu,
	Simon Horman, Jacob Keller, Lama Kayal, Michal Swiatkowski,
	Carolina Jubran, Nathan Chancellor, Daniel Zahka,
	Rahul Rameshbabu, Raed Salem, netdev, linux-rdma, linux-kernel,
	bpf, Gal Pressman

Hi,

This series removes the limitation of having one packet per page in XDP
mode. This has the following implications:

- XDP in Striding RQ mode can now be used on 64K page systems.

- XDP in Legacy RQ mode was using a single packet per page, which is
  quite inefficient on 64K page systems. The improvement can be
  observed with an XDP_DROP test when running in Legacy RQ mode on an
  ARM Neoverse-N1 system with a 64K page size:
  +-----------------------------------------------+
  | MTU  | baseline   | this change | improvement |
  |------+------------+-------------+-------------|
  | 1500 | 15.55 Mpps | 18.99 Mpps  | 22.0 %      |
  | 9000 | 15.53 Mpps | 18.24 Mpps  | 17.5 %      |
  +-----------------------------------------------+

After lifting this limitation, the series switches to using fragments
for the linear side page in non-linear mode. This small improvement is
most visible in XDP_DROP tests with small 64B packets and an MTU large
enough for Striding RQ to be in non-linear mode:
+----------------------------------------------------------------------+
| System               | MTU  | baseline   | this change | improvement |
|----------------------+------+------------+-------------+-------------|
| 4K page x86_64 [1]   | 9000 | 26.30 Mpps | 30.45 Mpps  | 15.80 %     |
| 64K page aarch64 [2] | 9000 | 15.27 Mpps | 20.10 Mpps  | 31.62 %     |
+----------------------------------------------------------------------+

This series does not cover the xsk (AF_XDP) paths for 64K page systems.

[1] https://lore.kernel.org/all/20260324024235.929875-1-kuba@kernel.org/

V2:
- Link to V1:
  https://lore.kernel.org/all/20260319075036.24734-1-tariqt@nvidia.com/
- Fixed issue found by AI review [1].


Dragos Tatulea (5):
  net/mlx5e: XSK, Increase size for chunk_size param
  net/mlx5e: XDP, Improve dma address calculation of linear part for
    XDP_TX
  net/mlx5e: XDP, Remove stride size limitation
  net/mlx5e: XDP, Use a single linear page per rq
  net/mlx5e: XDP, Use page fragments for linear data in multibuf-mode

 drivers/net/ethernet/mellanox/mlx5/core/en.h  | 12 +++-
 .../ethernet/mellanox/mlx5/core/en/params.c   | 11 +---
 .../ethernet/mellanox/mlx5/core/en/params.h   |  2 +-
 .../net/ethernet/mellanox/mlx5/core/en/xdp.c  |  2 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 50 ++++++++++++--
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 65 +++++++++++++++----
 6 files changed, 113 insertions(+), 29 deletions(-)


base-commit: 8b0e64d6c9e7feec5ba5643b4fa8b7fd54464778
-- 
2.44.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net-next V2 1/5] net/mlx5e: XSK, Increase size for chunk_size param
  2026-04-03  9:09 [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page Tariq Toukan
@ 2026-04-03  9:09 ` Tariq Toukan
  2026-04-05  6:30   ` Dragos Tatulea
  2026-04-03  9:09 ` [PATCH net-next V2 2/5] net/mlx5e: XDP, Improve dma address calculation of linear part for XDP_TX Tariq Toukan
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Tariq Toukan @ 2026-04-03  9:09 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Dragos Tatulea, Cosmin Ratiu,
	Simon Horman, Jacob Keller, Lama Kayal, Michal Swiatkowski,
	Carolina Jubran, Nathan Chancellor, Daniel Zahka,
	Rahul Rameshbabu, Raed Salem, netdev, linux-rdma, linux-kernel,
	bpf, Gal Pressman

From: Dragos Tatulea <dtatulea@nvidia.com>

When 64K pages are used, chunk_size can take the value 64K, which
doesn't fit in a u16. This results in overflows that are detected
in mlx5e_mpwrq_log_wqe_sz().

Increase the type to u32 to fix this.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/params.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
index 9b1a2aed17c3..275f9be53a34 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
@@ -8,7 +8,7 @@
 
 struct mlx5e_xsk_param {
 	u16 headroom;
-	u16 chunk_size;
+	u32 chunk_size;
 	bool unaligned;
 };
 
-- 
2.44.0



* [PATCH net-next V2 2/5] net/mlx5e: XDP, Improve dma address calculation of linear part for XDP_TX
  2026-04-03  9:09 [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page Tariq Toukan
  2026-04-03  9:09 ` [PATCH net-next V2 1/5] net/mlx5e: XSK, Increase size for chunk_size param Tariq Toukan
@ 2026-04-03  9:09 ` Tariq Toukan
  2026-04-03  9:09 ` [PATCH net-next V2 3/5] net/mlx5e: XDP, Remove stride size limitation Tariq Toukan
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Tariq Toukan @ 2026-04-03  9:09 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Dragos Tatulea, Cosmin Ratiu,
	Simon Horman, Jacob Keller, Lama Kayal, Michal Swiatkowski,
	Carolina Jubran, Nathan Chancellor, Daniel Zahka,
	Rahul Rameshbabu, Raed Salem, netdev, linux-rdma, linux-kernel,
	bpf, Gal Pressman

From: Dragos Tatulea <dtatulea@nvidia.com>

When calculating the dma address of the linear part of an XDP frame, the
formula assumes that there is a single XDP buffer per page. Extend the
formula to allow multiple XDP buffers per page by calculating the data
offset within the page.

This is a preparation for the upcoming removal of the single-XDP-buffer-
per-page limitation, after which the old formula would no longer be
correct.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 04e1b5fa4825..d3bab198c99c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -123,7 +123,7 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq,
 	 * mode.
 	 */
 
-	dma_addr = page_pool_get_dma_addr(page) + (xdpf->data - (void *)xdpf);
+	dma_addr = page_pool_get_dma_addr(page) + offset_in_page(xdpf->data);
 	dma_sync_single_for_device(sq->pdev, dma_addr, xdptxd->len, DMA_BIDIRECTIONAL);
 
 	if (xdptxd->has_frags) {
-- 
2.44.0



* [PATCH net-next V2 3/5] net/mlx5e: XDP, Remove stride size limitation
  2026-04-03  9:09 [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page Tariq Toukan
  2026-04-03  9:09 ` [PATCH net-next V2 1/5] net/mlx5e: XSK, Increase size for chunk_size param Tariq Toukan
  2026-04-03  9:09 ` [PATCH net-next V2 2/5] net/mlx5e: XDP, Improve dma address calculation of linear part for XDP_TX Tariq Toukan
@ 2026-04-03  9:09 ` Tariq Toukan
  2026-04-03  9:09 ` [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq Tariq Toukan
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Tariq Toukan @ 2026-04-03  9:09 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Dragos Tatulea, Cosmin Ratiu,
	Simon Horman, Jacob Keller, Lama Kayal, Michal Swiatkowski,
	Carolina Jubran, Nathan Chancellor, Daniel Zahka,
	Rahul Rameshbabu, Raed Salem, netdev, linux-rdma, linux-kernel,
	bpf, Gal Pressman

From: Dragos Tatulea <dtatulea@nvidia.com>

Currently XDP mode always uses PAGE_SIZE strides. This limitation
existed because page fragment counting was not implemented when XDP
support was added. Furthermore, this limitation caused other issues
on systems with larger pages (e.g. 64K):

- XDP for Striding RQ was effectively disabled on such systems.

- Legacy RQ allows the configuration but uses a fixed scheme of one XDP
  buffer per page, which is inefficient.

As fragment counting was added during the driver conversion to
page_pool and the support for XDP multi-buffer, it is now possible
to remove this stride size limitation. This patch does just that.

Now it is possible to use XDP on systems with higher page sizes (e.g.
64K):

- For Striding RQ, loading the program is no longer blocked.
  Although a 64K page can fit any packet, MTUs that result in a
  stride > 8K will still put the RQ in non-linear mode, because
  the HW doesn't support strides larger than 8K.

- For Legacy RQ, the stride size was PAGE_SIZE, which was very
  inefficient. Now the stride size will be calculated relative to the
  MTU. Legacy RQ will always be in linear mode for larger system pages.

  This can be observed with an XDP_DROP test [1] when running
  in Legacy RQ mode on an ARM Neoverse-N1 system with a 64K
  page size:
  +-----------------------------------------------+
  | MTU  | baseline   | this change | improvement |
  |------+------------+-------------+-------------|
  | 1500 | 15.55 Mpps | 18.99 Mpps  | 22.0 %      |
  | 9000 | 15.53 Mpps | 18.24 Mpps  | 17.5 %      |
  +-----------------------------------------------+

There are performance benefits for Striding RQ mode as well:

- Striding RQ non-linear mode now uses 256B strides, just like
  non-XDP mode.

- Striding RQ linear mode can now fit a number of XDP buffers per page
  that depends on the MTU. That means that on 4K page systems with a
  small enough MTU, 2 XDP buffers can fit in one page.

The above benefits for Striding RQ can be observed with an
XDP_DROP test [1] when running on a 4K page x86_64 system
(Intel Xeon Platinum 8580):
  +-----------------------------------------------+
  | MTU  | baseline   | this change | improvement |
  |------+------------+-------------+-------------|
  | 1000 | 28.36 Mpps | 33.98 Mpps  | 19.82 %     |
  | 9000 | 20.76 Mpps | 26.30 Mpps  | 26.70 %     |
  +-----------------------------------------------+

[1] Test description:
- xdp-bench with XDP_DROP
- RX: single queue
- TX: sends 64B packets to saturate CPU on RX side

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 26bb31c56e45..1f4a547917ba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -298,12 +298,9 @@ static u32 mlx5e_rx_get_linear_stride_sz(struct mlx5_core_dev *mdev,
 	 * no_head_tail_room should be set in the case of XDP with Striding RQ
 	 * when SKB is not linear. This is because another page is allocated for the linear part.
 	 */
-	sz = roundup_pow_of_two(mlx5e_rx_get_linear_sz_skb(params, no_head_tail_room));
+	sz = mlx5e_rx_get_linear_sz_skb(params, no_head_tail_room);
 
-	/* XDP in mlx5e doesn't support multiple packets per page.
-	 * Do not assume sz <= PAGE_SIZE if params->xdp_prog is set.
-	 */
-	return params->xdp_prog && sz < PAGE_SIZE ? PAGE_SIZE : sz;
+	return roundup_pow_of_two(sz);
 }
 
 static u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5_core_dev *mdev,
@@ -453,10 +450,6 @@ u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
 		return order_base_2(mlx5e_rx_get_linear_stride_sz(mdev, params,
 								  rqo, true));
 
-	/* XDP in mlx5e doesn't support multiple packets per page. */
-	if (params->xdp_prog)
-		return PAGE_SHIFT;
-
 	return MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev);
 }
 
-- 
2.44.0



* [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq
  2026-04-03  9:09 [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page Tariq Toukan
                   ` (2 preceding siblings ...)
  2026-04-03  9:09 ` [PATCH net-next V2 3/5] net/mlx5e: XDP, Remove stride size limitation Tariq Toukan
@ 2026-04-03  9:09 ` Tariq Toukan
  2026-04-05  6:08   ` Dragos Tatulea
  2026-04-03  9:09 ` [PATCH net-next V2 5/5] net/mlx5e: XDP, Use page fragments for linear data in multibuf-mode Tariq Toukan
  2026-04-07 11:50 ` [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page patchwork-bot+netdevbpf
  5 siblings, 1 reply; 16+ messages in thread
From: Tariq Toukan @ 2026-04-03  9:09 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Dragos Tatulea, Cosmin Ratiu,
	Simon Horman, Jacob Keller, Lama Kayal, Michal Swiatkowski,
	Carolina Jubran, Nathan Chancellor, Daniel Zahka,
	Rahul Rameshbabu, Raed Salem, netdev, linux-rdma, linux-kernel,
	bpf, Gal Pressman

From: Dragos Tatulea <dtatulea@nvidia.com>

Currently in striding rq there is one mlx5e_frag_page member per WQE for
the linear page. This linear page is used only in XDP multi-buffer mode.
This is wasteful because only one linear page is needed per rq: the page
gets refreshed on every packet, regardless of WQE. Furthermore, it is
not needed in other modes (non-XDP, XDP single-buffer).

This change moves the linear page into its own structure (struct
mlx5_mpw_linear_info) and allocates it only when necessary.

A special structure is created because an upcoming patch will extend
this structure to support fragmentation of the linear page.

This patch has no functional changes.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  6 ++-
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 37 ++++++++++++++++---
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 17 +++++----
 3 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index c7ac6ebe8290..592234780f2b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -591,10 +591,13 @@ union mlx5e_alloc_units {
 struct mlx5e_mpw_info {
 	u16 consumed_strides;
 	DECLARE_BITMAP(skip_release_bitmap, MLX5_MPWRQ_MAX_PAGES_PER_WQE);
-	struct mlx5e_frag_page linear_page;
 	union mlx5e_alloc_units alloc_units;
 };
 
+struct mlx5e_mpw_linear_info {
+	struct mlx5e_frag_page frag_page;
+};
+
 #define MLX5E_MAX_RX_FRAGS 4
 
 struct mlx5e_rq;
@@ -689,6 +692,7 @@ struct mlx5e_rq {
 			u8                     umr_wqebbs;
 			u8                     mtts_per_wqe;
 			u8                     umr_mode;
+			struct mlx5e_mpw_linear_info *linear_info;
 			struct mlx5e_shampo_hd *shampo;
 		} mpwqe;
 	};
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 1238e5356012..aa8359a48b12 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -369,6 +369,29 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq, int node)
 	return 0;
 }
 
+static int mlx5e_rq_alloc_mpwqe_linear_info(struct mlx5e_rq *rq, int node,
+					    struct mlx5e_params *params,
+					    struct mlx5e_rq_opt_param *rqo,
+					    u32 *pool_size)
+{
+	struct mlx5_core_dev *mdev = rq->mdev;
+	struct mlx5e_mpw_linear_info *li;
+
+	if (mlx5e_rx_mpwqe_is_linear_skb(mdev, params, rqo) ||
+	    !params->xdp_prog)
+		return 0;
+
+	li = kvzalloc_node(sizeof(*li), GFP_KERNEL, node);
+	if (!li)
+		return -ENOMEM;
+
+	rq->mpwqe.linear_info = li;
+
+	/* additional page per packet for the linear part */
+	*pool_size *= 2;
+
+	return 0;
+}
 
 static u8 mlx5e_mpwrq_access_mode(enum mlx5e_mpwrq_umr_mode umr_mode)
 {
@@ -915,10 +938,6 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 			mlx5e_mpwqe_get_log_rq_size(mdev, params, rqo);
 		pool_order = rq->mpwqe.page_shift - PAGE_SHIFT;
 
-		if (!mlx5e_rx_mpwqe_is_linear_skb(mdev, params, rqo) &&
-		    params->xdp_prog)
-			pool_size *= 2; /* additional page per packet for the linear part */
-
 		rq->mpwqe.log_stride_sz =
 				mlx5e_mpwqe_get_log_stride_size(mdev, params,
 								rqo);
@@ -936,10 +955,15 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 		if (err)
 			goto err_rq_mkey;
 
-		err = mlx5_rq_shampo_alloc(mdev, params, rq_param, rq, node);
+		err = mlx5e_rq_alloc_mpwqe_linear_info(rq, node, params, rqo,
+						       &pool_size);
 		if (err)
 			goto err_free_mpwqe_info;
 
+		err = mlx5_rq_shampo_alloc(mdev, params, rq_param, rq, node);
+		if (err)
+			goto err_free_mpwqe_linear_info;
+
 		break;
 	default: /* MLX5_WQ_TYPE_CYCLIC */
 		err = mlx5_wq_cyc_create(mdev, &rq_param->wq, rqc_wq,
@@ -1054,6 +1078,8 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 	switch (rq->wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
 		mlx5e_rq_free_shampo(rq);
+err_free_mpwqe_linear_info:
+		kvfree(rq->mpwqe.linear_info);
 err_free_mpwqe_info:
 		kvfree(rq->mpwqe.info);
 err_rq_mkey:
@@ -1081,6 +1107,7 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
 	switch (rq->wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
 		mlx5e_rq_free_shampo(rq);
+		kvfree(rq->mpwqe.linear_info);
 		kvfree(rq->mpwqe.info);
 		mlx5_core_destroy_mkey(rq->mdev, be32_to_cpu(rq->mpwqe.umr_mkey_be));
 		mlx5e_free_mpwqe_rq_drop_page(rq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index f5c0e2a0ada9..feb042d84b8e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1869,6 +1869,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 	struct mlx5e_frag_page *frag_page = &wi->alloc_units.frag_pages[page_idx];
 	u16 headlen = min_t(u16, MLX5E_RX_MAX_HEAD, cqe_bcnt);
 	struct mlx5e_frag_page *head_page = frag_page;
+	struct mlx5e_frag_page *linear_page = NULL;
 	struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf;
 	u32 page_size = BIT(rq->mpwqe.page_shift);
 	u32 frag_offset    = head_offset;
@@ -1897,13 +1898,15 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 	if (prog) {
 		/* area for bpf_xdp_[store|load]_bytes */
 		net_prefetchw(netmem_address(frag_page->netmem) + frag_offset);
+
+		linear_page = &rq->mpwqe.linear_info->frag_page;
 		if (unlikely(mlx5e_page_alloc_fragmented(rq->page_pool,
-							 &wi->linear_page))) {
+							 linear_page))) {
 			rq->stats->buff_alloc_err++;
 			return NULL;
 		}
 
-		va = netmem_address(wi->linear_page.netmem);
+		va = netmem_address(linear_page->netmem);
 		net_prefetchw(va); /* xdp_frame data area */
 		linear_hr = XDP_PACKET_HEADROOM;
 		linear_data_len = 0;
@@ -1966,10 +1969,10 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 				for (pfp = head_page; pfp < frag_page; pfp++)
 					pfp->frags++;
 
-				wi->linear_page.frags++;
+				linear_page->frags++;
 			}
 			mlx5e_page_release_fragmented(rq->page_pool,
-						      &wi->linear_page);
+						      linear_page);
 			return NULL; /* page/packet was consumed by XDP */
 		}
 
@@ -1988,13 +1991,13 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 			mxbuf->xdp.data - mxbuf->xdp.data_meta);
 		if (unlikely(!skb)) {
 			mlx5e_page_release_fragmented(rq->page_pool,
-						      &wi->linear_page);
+						      linear_page);
 			return NULL;
 		}
 
 		skb_mark_for_recycle(skb);
-		wi->linear_page.frags++;
-		mlx5e_page_release_fragmented(rq->page_pool, &wi->linear_page);
+		linear_page->frags++;
+		mlx5e_page_release_fragmented(rq->page_pool, linear_page);
 
 		if (xdp_buff_has_frags(&mxbuf->xdp)) {
 			struct mlx5e_frag_page *pagep;
-- 
2.44.0



* [PATCH net-next V2 5/5] net/mlx5e: XDP, Use page fragments for linear data in multibuf-mode
  2026-04-03  9:09 [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page Tariq Toukan
                   ` (3 preceding siblings ...)
  2026-04-03  9:09 ` [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq Tariq Toukan
@ 2026-04-03  9:09 ` Tariq Toukan
  2026-04-07 11:50 ` [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page patchwork-bot+netdevbpf
  5 siblings, 0 replies; 16+ messages in thread
From: Tariq Toukan @ 2026-04-03  9:09 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Dragos Tatulea, Cosmin Ratiu,
	Simon Horman, Jacob Keller, Lama Kayal, Michal Swiatkowski,
	Carolina Jubran, Nathan Chancellor, Daniel Zahka,
	Rahul Rameshbabu, Raed Salem, netdev, linux-rdma, linux-kernel,
	bpf, Gal Pressman

From: Dragos Tatulea <dtatulea@nvidia.com>

Currently in XDP multi-buffer mode for striding rq a whole page is
allocated for the linear part of the XDP buffer. This is wasteful,
especially on systems with larger page sizes.

This change splits the page into fixed-size fragments. The page is
replenished when the maximum number of allowed fragments is reached.
When a fragment is not used, it is simply recycled for the next packet.
This is great for XDP_DROP: in the most extreme case (XDP_DROP
everything), no fragments are ever consumed, so a single linear page
allocation suffices for the lifetime of the XDP program.

The previous page_pool size increase (doubling the size) was too
conservative, and now there are far fewer allocations (1/8 for a 4K
page). So drop the page_pool size extension altogether when the linear
side page is used.

This small improvement is most visible in XDP_DROP tests with small
64B packets and an MTU large enough for Striding RQ to be in
non-linear mode:
+----------------------------------------------------------------------+
| System               | MTU  | baseline   | this change | improvement |
|----------------------+------+------------+-------------+-------------|
| 4K page x86_64 [1]   | 9000 | 26.30 Mpps | 30.45 Mpps  | 15.80 %     |
| 64K page aarch64 [2] | 9000 | 15.27 Mpps | 20.10 Mpps  | 31.62 %     |
+----------------------------------------------------------------------+

[1] Intel Xeon Platinum 8580
[2] ARM Neoverse-N1

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  6 ++
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 25 ++++++--
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 60 +++++++++++++++----
 3 files changed, 74 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 592234780f2b..2270e2e550dd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -82,6 +82,9 @@ struct page_pool;
 
 #define MLX5E_PAGECNT_BIAS_MAX U16_MAX
 #define MLX5E_RX_MAX_HEAD (256)
+#define MLX5E_XDP_LOG_MAX_LINEAR_SZ \
+	order_base_2(MLX5_SKB_FRAG_SZ(XDP_PACKET_HEADROOM + MLX5E_RX_MAX_HEAD))
+
 #define MLX5E_SHAMPO_LOG_HEADER_ENTRY_SIZE (8)
 #define MLX5E_SHAMPO_WQ_HEADER_PER_PAGE \
 	(PAGE_SIZE >> MLX5E_SHAMPO_LOG_HEADER_ENTRY_SIZE)
@@ -596,6 +599,7 @@ struct mlx5e_mpw_info {
 
 struct mlx5e_mpw_linear_info {
 	struct mlx5e_frag_page frag_page;
+	u16 max_frags;
 };
 
 #define MLX5E_MAX_RX_FRAGS 4
@@ -1081,6 +1085,8 @@ bool mlx5e_reset_rx_moderation(struct dim_cq_moder *cq_moder, u8 cq_period_mode,
 bool mlx5e_reset_rx_channels_moderation(struct mlx5e_channels *chs, u8 cq_period_mode,
 					bool dim_enabled, bool keep_dim_state);
 
+void mlx5e_mpwqe_dealloc_linear_page(struct mlx5e_rq *rq);
+
 struct mlx5e_sq_param;
 int mlx5e_open_xdpsq(struct mlx5e_channel *c, struct mlx5e_params *params,
 		     struct mlx5e_sq_param *param, struct xsk_buff_pool *xsk_pool,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index aa8359a48b12..4ba198fb9d6c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -371,11 +371,11 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq, int node)
 
 static int mlx5e_rq_alloc_mpwqe_linear_info(struct mlx5e_rq *rq, int node,
 					    struct mlx5e_params *params,
-					    struct mlx5e_rq_opt_param *rqo,
-					    u32 *pool_size)
+					    struct mlx5e_rq_opt_param *rqo)
 {
 	struct mlx5_core_dev *mdev = rq->mdev;
 	struct mlx5e_mpw_linear_info *li;
+	u32 linear_frag_count;
 
 	if (mlx5e_rx_mpwqe_is_linear_skb(mdev, params, rqo) ||
 	    !params->xdp_prog)
@@ -385,10 +385,22 @@ static int mlx5e_rq_alloc_mpwqe_linear_info(struct mlx5e_rq *rq, int node,
 	if (!li)
 		return -ENOMEM;
 
+	linear_frag_count =
+		BIT(rq->mpwqe.page_shift - MLX5E_XDP_LOG_MAX_LINEAR_SZ);
+	if (linear_frag_count > U16_MAX) {
+		netdev_warn(rq->netdev,
+			    "rq %d: linear_frag_count (%u) larger than expected (%u), page_shift: %u, log_max_linear_sz: %u\n",
+			    rq->ix, linear_frag_count, U16_MAX,
+			    rq->mpwqe.page_shift, MLX5E_XDP_LOG_MAX_LINEAR_SZ);
+		kvfree(li);
+		return -EINVAL;
+	}
+
+	li->max_frags = linear_frag_count;
 	rq->mpwqe.linear_info = li;
 
-	/* additional page per packet for the linear part */
-	*pool_size *= 2;
+	/* Set to max to force allocation on first run. */
+	li->frag_page.frags = li->max_frags;
 
 	return 0;
 }
@@ -955,8 +967,7 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 		if (err)
 			goto err_rq_mkey;
 
-		err = mlx5e_rq_alloc_mpwqe_linear_info(rq, node, params, rqo,
-						       &pool_size);
+		err = mlx5e_rq_alloc_mpwqe_linear_info(rq, node, params, rqo);
 		if (err)
 			goto err_free_mpwqe_info;
 
@@ -1347,6 +1358,8 @@ void mlx5e_free_rx_descs(struct mlx5e_rq *rq)
 			mlx5_wq_ll_pop(wq, wqe_ix_be,
 				       &wqe->next.next_wqe_index);
 		}
+
+		mlx5e_mpwqe_dealloc_linear_page(rq);
 	} else {
 		struct mlx5_wq_cyc *wq = &rq->wqe.wq;
 		u16 missing = mlx5_wq_cyc_missing(wq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index feb042d84b8e..5b60aa47c75b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -300,6 +300,35 @@ static void mlx5e_page_release_fragmented(struct page_pool *pp,
 		page_pool_put_unrefed_netmem(pp, netmem, -1, true);
 }
 
+static int mlx5e_mpwqe_linear_page_refill(struct mlx5e_rq *rq)
+{
+	struct mlx5e_mpw_linear_info *li = rq->mpwqe.linear_info;
+
+	if (likely(li->frag_page.frags < li->max_frags))
+		return 0;
+
+	if (likely(li->frag_page.netmem)) {
+		mlx5e_page_release_fragmented(rq->page_pool, &li->frag_page);
+		li->frag_page.netmem = 0;
+	}
+
+	return mlx5e_page_alloc_fragmented(rq->page_pool, &li->frag_page);
+}
+
+static void *mlx5e_mpwqe_get_linear_page_frag(struct mlx5e_rq *rq)
+{
+	struct mlx5e_mpw_linear_info *li = rq->mpwqe.linear_info;
+	u32 frag_offset;
+
+	if (unlikely(mlx5e_mpwqe_linear_page_refill(rq)))
+		return NULL;
+
+	frag_offset = li->frag_page.frags << MLX5E_XDP_LOG_MAX_LINEAR_SZ;
+	WARN_ON(frag_offset >= BIT(rq->mpwqe.page_shift));
+
+	return netmem_address(li->frag_page.netmem) + frag_offset;
+}
+
 static inline int mlx5e_get_rx_frag(struct mlx5e_rq *rq,
 				    struct mlx5e_wqe_frag_info *frag)
 {
@@ -702,6 +731,22 @@ static void mlx5e_dealloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 	bitmap_fill(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe);
 }
 
+void mlx5e_mpwqe_dealloc_linear_page(struct mlx5e_rq *rq)
+{
+	struct mlx5e_mpw_linear_info *li = rq->mpwqe.linear_info;
+
+	if (!li || !li->frag_page.netmem)
+		return;
+
+	mlx5e_page_release_fragmented(rq->page_pool, &li->frag_page);
+
+	/* Recovery flow can call this function and then alloc again, so leave
+	 * things in a good state for re-allocation.
+	 */
+	li->frag_page.netmem = 0;
+	li->frag_page.frags = li->max_frags;
+}
+
 INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
 {
 	struct mlx5_wq_cyc *wq = &rq->wqe.wq;
@@ -1899,18 +1944,17 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 		/* area for bpf_xdp_[store|load]_bytes */
 		net_prefetchw(netmem_address(frag_page->netmem) + frag_offset);
 
-		linear_page = &rq->mpwqe.linear_info->frag_page;
-		if (unlikely(mlx5e_page_alloc_fragmented(rq->page_pool,
-							 linear_page))) {
+		va = mlx5e_mpwqe_get_linear_page_frag(rq);
+		if (!va) {
 			rq->stats->buff_alloc_err++;
 			return NULL;
 		}
 
-		va = netmem_address(linear_page->netmem);
 		net_prefetchw(va); /* xdp_frame data area */
 		linear_hr = XDP_PACKET_HEADROOM;
 		linear_data_len = 0;
 		linear_frame_sz = MLX5_SKB_FRAG_SZ(linear_hr + MLX5E_RX_MAX_HEAD);
+		linear_page = &rq->mpwqe.linear_info->frag_page;
 	} else {
 		skb = napi_alloc_skb(rq->cq.napi,
 				     ALIGN(MLX5E_RX_MAX_HEAD, sizeof(long)));
@@ -1971,8 +2015,6 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 
 				linear_page->frags++;
 			}
-			mlx5e_page_release_fragmented(rq->page_pool,
-						      linear_page);
 			return NULL; /* page/packet was consumed by XDP */
 		}
 
@@ -1989,15 +2031,11 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 			rq, mxbuf->xdp.data_hard_start, linear_frame_sz,
 			mxbuf->xdp.data - mxbuf->xdp.data_hard_start, len,
 			mxbuf->xdp.data - mxbuf->xdp.data_meta);
-		if (unlikely(!skb)) {
-			mlx5e_page_release_fragmented(rq->page_pool,
-						      linear_page);
+		if (unlikely(!skb))
 			return NULL;
-		}
 
 		skb_mark_for_recycle(skb);
 		linear_page->frags++;
-		mlx5e_page_release_fragmented(rq->page_pool, linear_page);
 
 		if (xdp_buff_has_frags(&mxbuf->xdp)) {
 			struct mlx5e_frag_page *pagep;
-- 
2.44.0



* Re: [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq
  2026-04-03  9:09 ` [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq Tariq Toukan
@ 2026-04-05  6:08   ` Dragos Tatulea
  2026-04-06 15:43     ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Dragos Tatulea @ 2026-04-05  6:08 UTC (permalink / raw)
  To: Tariq Toukan, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Cosmin Ratiu, Simon Horman, Jacob Keller,
	Lama Kayal, Michal Swiatkowski, Carolina Jubran,
	Nathan Chancellor, Daniel Zahka, Rahul Rameshbabu, Raed Salem,
	netdev, linux-rdma, linux-kernel, bpf, Gal Pressman

On Fri, Apr 03, 2026 at 12:09:26PM +0300, Tariq Toukan wrote:
> From: Dragos Tatulea <dtatulea@nvidia.com>
> 
> Currently in striding rq there is one mlx5e_frag_page member per WQE for
> the linear page. This linear page is used only in XDP multi-buffer mode.
> This is wasteful because only one linear page is needed per rq: the page
> gets refreshed on every packet, regardless of WQE. Furthermore, it is
> not needed in other modes (non-XDP, XDP single-buffer).
> 
> This change moves the linear page into its own structure (struct
> mlx5_mpw_linear_info) and allocates it only when necessary.
> 
> A special structure is created because an upcoming patch will extend
> this structure to support fragmentation of the linear page.
> 
> This patch has no functional changes.
> 
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en.h  |  6 ++-
>
> [...]
> +static int mlx5e_rq_alloc_mpwqe_linear_info(struct mlx5e_rq *rq, int node,
> +					    struct mlx5e_params *params,
> +					    struct mlx5e_rq_opt_param *rqo,
> +					    u32 *pool_size)
> +{
> +	struct mlx5_core_dev *mdev = rq->mdev;
> +	struct mlx5e_mpw_linear_info *li;
> +
> +	if (mlx5e_rx_mpwqe_is_linear_skb(mdev, params, rqo) ||
> +	    !params->xdp_prog)
> +		return 0;
> +
Sashiko says [1]:
"""
Could mlx5e_rx_mpwqe_is_linear_skb() return true here (meaning li is not
allocated), but later return false when the rx handlers are being set up?

When mlx5e_open_rq() sets up handlers it passes NULL for rqo to
mlx5e_rx_mpwqe_is_linear_skb(). Inside that function, without rqo, the
page_shift calculation falls back to the system PAGE_SHIFT rather than
rqo->qcfg->rx_page_size. A smaller page_shift could cause the stride check
to fail, returning false.
If the allocation evaluates to true (skipping allocation) but the handler setup
evaluates to false, the nonlinear handler mlx5e_skb_from_cqe_mpwrq_nonlinear
will be used for the queue.
"""

This is by design. HW-GRO is the mode that Sashiko is describing; in
that mode linear_info is not used, hence not allocated.

> [...]
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -1869,6 +1869,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
>  	struct mlx5e_frag_page *frag_page = &wi->alloc_units.frag_pages[page_idx];
>  	u16 headlen = min_t(u16, MLX5E_RX_MAX_HEAD, cqe_bcnt);
>  	struct mlx5e_frag_page *head_page = frag_page;
> +	struct mlx5e_frag_page *linear_page = NULL;
>  	struct mlx5e_xdp_buff *mxbuf = &rq->mxbuf;
>  	u32 page_size = BIT(rq->mpwqe.page_shift);
>  	u32 frag_offset    = head_offset;
> @@ -1897,13 +1898,15 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
>  	if (prog) {
>  		/* area for bpf_xdp_[store|load]_bytes */
>  		net_prefetchw(netmem_address(frag_page->netmem) + frag_offset);
> +
> +		linear_page = &rq->mpwqe.linear_info->frag_page;
"""
If mlx5e_skb_from_cqe_mpwrq_nonlinear() is invoked but linear_info was skipped
during allocation, does this result in a NULL pointer dereference when accessing
linear_info->frag_page?
"""

This is connected to the statement above: linear_page will always be
allocated for this handler.

[1] https://sashiko.dev/#/patchset/20260403090927.139042-1-tariqt%40nvidia.com?part=4

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next V2 1/5] net/mlx5e: XSK, Increase size for chunk_size param
  2026-04-03  9:09 ` [PATCH net-next V2 1/5] net/mlx5e: XSK, Increase size for chunk_size param Tariq Toukan
@ 2026-04-05  6:30   ` Dragos Tatulea
  0 siblings, 0 replies; 16+ messages in thread
From: Dragos Tatulea @ 2026-04-05  6:30 UTC (permalink / raw)
  To: Tariq Toukan, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Mark Bloch, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Cosmin Ratiu, Simon Horman, Jacob Keller,
	Lama Kayal, Michal Swiatkowski, Carolina Jubran,
	Nathan Chancellor, Daniel Zahka, Rahul Rameshbabu, Raed Salem,
	netdev, linux-rdma, linux-kernel, bpf, Gal Pressman

On Fri, Apr 03, 2026 at 12:09:23PM +0300, Tariq Toukan wrote:
> [...]
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h
> @@ -8,7 +8,7 @@
>  
>  struct mlx5e_xsk_param {
>  	u16 headroom;
> -	u16 chunk_size;
> +	u32 chunk_size;
>  	bool unaligned;

Sashiko says [1]:
"""
Is it possible that users will still fail to create XSK pools with 64K
chunk sizes because of an existing limit in mlx5e_xsk_is_pool_sane()?
[...]
"""
Yes, it is possible. XSK is not yet supported for 64K pages. This series
adds 64K page support only for plain XDP.

[1] https://sashiko.dev/#/patchset/20260403090927.139042-1-tariqt%40nvidia.com?part=1 

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq
  2026-04-05  6:08   ` Dragos Tatulea
@ 2026-04-06 15:43     ` Jakub Kicinski
  2026-04-06 16:31       ` Mark Bloch
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2026-04-06 15:43 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: Tariq Toukan, Eric Dumazet, Paolo Abeni, Andrew Lunn,
	David S. Miller, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Cosmin Ratiu, Simon Horman,
	Jacob Keller, Lama Kayal, Michal Swiatkowski, Carolina Jubran,
	Nathan Chancellor, Daniel Zahka, Rahul Rameshbabu, Raed Salem,
	netdev, linux-rdma, linux-kernel, bpf, Gal Pressman

On Sun, 5 Apr 2026 08:08:06 +0200 Dragos Tatulea wrote:
> sashiko says:

Thanks a lot for reviewing the review! It takes a lot of maintainer time

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq
  2026-04-06 15:43     ` Jakub Kicinski
@ 2026-04-06 16:31       ` Mark Bloch
  2026-04-06 18:30         ` Jakub Kicinski
  2026-04-06 19:13         ` Nicolai Buchwitz
  0 siblings, 2 replies; 16+ messages in thread
From: Mark Bloch @ 2026-04-06 16:31 UTC (permalink / raw)
  To: Jakub Kicinski, Dragos Tatulea
  Cc: Tariq Toukan, Eric Dumazet, Paolo Abeni, Andrew Lunn,
	David S. Miller, Saeed Mahameed, Leon Romanovsky,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Cosmin Ratiu, Simon Horman,
	Jacob Keller, Lama Kayal, Michal Swiatkowski, Carolina Jubran,
	Nathan Chancellor, Daniel Zahka, Rahul Rameshbabu, Raed Salem,
	netdev, linux-rdma, linux-kernel, bpf, Gal Pressman



On 06/04/2026 18:43, Jakub Kicinski wrote:
> On Sun, 5 Apr 2026 08:08:06 +0200 Dragos Tatulea wrote:
>> sashiko says:
> 
> Thanks a lot for reviewing the review! It takes a lot of maintainer time

Just to add some context: we started running Sashiko internally,
so hopefully trivial issues won’t be missed. I don’t know if
you remember our on-list discussion from a few weeks ago; following that
discussion, we now have three different internal AI tools reviewing each
commit.

At the moment this is still manageable, and I think developers should
look over all comments from all tools. In our case that currently
means three review outputs per commit. It would also be useful to have
some official guidance on what authors are recommended to run before
posting, so obvious issues can be caught earlier and less reviewer/maintainer
time is spent on them.

For example:

“Before posting, authors could run a recommended baseline of review tools,
where available, to catch obvious issues early. During review, tools such
as review-prompts and Sashiko may be used to assist the reviewer.”

Mark


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq
  2026-04-06 16:31       ` Mark Bloch
@ 2026-04-06 18:30         ` Jakub Kicinski
  2026-04-06 19:50           ` Mark Bloch
  2026-04-06 19:13         ` Nicolai Buchwitz
  1 sibling, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2026-04-06 18:30 UTC (permalink / raw)
  To: Mark Bloch
  Cc: Dragos Tatulea, Tariq Toukan, Eric Dumazet, Paolo Abeni,
	Andrew Lunn, David S. Miller, Saeed Mahameed, Leon Romanovsky,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Cosmin Ratiu, Simon Horman,
	Jacob Keller, Lama Kayal, Michal Swiatkowski, Carolina Jubran,
	Nathan Chancellor, Daniel Zahka, Rahul Rameshbabu, Raed Salem,
	netdev, linux-rdma, linux-kernel, bpf, Gal Pressman

On Mon, 6 Apr 2026 19:31:03 +0300 Mark Bloch wrote:
> On 06/04/2026 18:43, Jakub Kicinski wrote:
> > On Sun, 5 Apr 2026 08:08:06 +0200 Dragos Tatulea wrote:  
> >> sashiko says:  
> > 
> > Thanks a lot for reviewing the review! It takes a lot of maintainer time  
> 
> Just to add some context: we started running Sashiko internally,
> so hopefully trivial issues won’t be missed. I don’t know if
> you remember our on-list discussion from a few weeks ago; following that
> discussion, we now have three different internal AI tools reviewing each
> commit.
> 
> At the moment this is still manageable, and I think developers should
> look over all comments from all tools. In our case that currently
> means three review outputs per commit. It would also be useful to have
> some official guidance on what authors are recommended to run before
> posting, so obvious issues can be caught earlier and less reviewer/maintainer
> time is spent on them.
> 
> For example:
> 
> “Before posting, authors could run a recommended baseline of review tools,
> where available, to catch obvious issues early. During review, tools such
> as review-prompts and Sashiko may be used to assist the reviewer.”

Please send patches if you think something should be mentioned
somewhere. It'd be awesome if y'all participated more in upstream
reviews so that your recommendation could be rooted in what happens
on the list, not just what happens within nVidia.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq
  2026-04-06 16:31       ` Mark Bloch
  2026-04-06 18:30         ` Jakub Kicinski
@ 2026-04-06 19:13         ` Nicolai Buchwitz
  2026-04-06 19:52           ` Mark Bloch
  2026-04-07  0:43           ` Jakub Kicinski
  1 sibling, 2 replies; 16+ messages in thread
From: Nicolai Buchwitz @ 2026-04-06 19:13 UTC (permalink / raw)
  To: Mark Bloch
  Cc: Jakub Kicinski, Dragos Tatulea, Tariq Toukan, Eric Dumazet,
	Paolo Abeni, Andrew Lunn, David S. Miller, Saeed Mahameed,
	Leon Romanovsky, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Cosmin Ratiu, Simon Horman, Jacob Keller, Lama Kayal,
	Michal Swiatkowski, Carolina Jubran, Nathan Chancellor,
	Daniel Zahka, Rahul Rameshbabu, Raed Salem, netdev, linux-rdma,
	linux-kernel, bpf, Gal Pressman

On 6.4.2026 18:31, Mark Bloch wrote:
> On 06/04/2026 18:43, Jakub Kicinski wrote:
>> On Sun, 5 Apr 2026 08:08:06 +0200 Dragos Tatulea wrote:
>>> sashiko says:
>> 
>> Thanks a lot for reviewing the review! It takes a lot of maintainer 
>> time

> [...]

> 
> For example:
> 
> “Before posting, authors could run a recommended baseline of review 
> tools,
> where available, to catch obvious issues early. During review, tools 
> such
> as review-prompts and Sashiko may be used to assist the reviewer.”
> 

There is already https://netdev-ai.bots.linux.dev/ai-local.html which I 
found really helpful.
If this is still the preferred approach, I could draft a patch to add it 
to Documentation/process/maintainer-netdev.rst

> Mark

Thanks
Nicolai

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq
  2026-04-06 18:30         ` Jakub Kicinski
@ 2026-04-06 19:50           ` Mark Bloch
  0 siblings, 0 replies; 16+ messages in thread
From: Mark Bloch @ 2026-04-06 19:50 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Dragos Tatulea, Tariq Toukan, Eric Dumazet, Paolo Abeni,
	Andrew Lunn, David S. Miller, Saeed Mahameed, Leon Romanovsky,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, Cosmin Ratiu, Simon Horman,
	Jacob Keller, Lama Kayal, Michal Swiatkowski, Carolina Jubran,
	Nathan Chancellor, Daniel Zahka, Rahul Rameshbabu, Raed Salem,
	netdev, linux-rdma, linux-kernel, bpf, Gal Pressman



On 06/04/2026 21:30, Jakub Kicinski wrote:
> On Mon, 6 Apr 2026 19:31:03 +0300 Mark Bloch wrote:
>> On 06/04/2026 18:43, Jakub Kicinski wrote:
>>> On Sun, 5 Apr 2026 08:08:06 +0200 Dragos Tatulea wrote:  
>>>> sashiko says:  
>>>
>>> Thanks a lot for reviewing the review! It takes a lot of maintainer time  
>>
>> Just to add some context: we started running Sashiko internally,
>> so hopefully trivial issues won’t be missed. I don’t know if
>> you remember our on-list discussion from a few weeks ago; following that
>> discussion, we now have three different internal AI tools reviewing each
>> commit.
>>
>> At the moment this is still manageable, and I think developers should
>> look over all comments from all tools. In our case that currently
>> means three review outputs per commit. It would also be useful to have
>> some official guidance on what authors are recommended to run before
>> posting, so obvious issues can be caught earlier and less reviewer/maintainer
>> time is spent on them.
>>
>> For example:
>>
>> “Before posting, authors could run a recommended baseline of review tools,
>> where available, to catch obvious issues early. During review, tools such
>> as review-prompts and Sashiko may be used to assist the reviewer.”
> 
> Please send patches if you think something should be mentioned
> somewhere. It'd be awesome if y'all participated more in upstream
> reviews so that your recommendation could be rooted in what happens
> on the list, not just what happens within nVidia.

Fair point.

I can’t speak for the others, but I’ll try to do more upstream reviews
myself. In fact, I already blocked regular time slots each week for that.

Unfortunately, the region I live in has been affected by a certain war
situation recently, and that has derailed quite a few plans on my side.
Hopefully things will settle down sooner rather than later.

Mark

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq
  2026-04-06 19:13         ` Nicolai Buchwitz
@ 2026-04-06 19:52           ` Mark Bloch
  2026-04-07  0:43           ` Jakub Kicinski
  1 sibling, 0 replies; 16+ messages in thread
From: Mark Bloch @ 2026-04-06 19:52 UTC (permalink / raw)
  To: Nicolai Buchwitz
  Cc: Jakub Kicinski, Dragos Tatulea, Tariq Toukan, Eric Dumazet,
	Paolo Abeni, Andrew Lunn, David S. Miller, Saeed Mahameed,
	Leon Romanovsky, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Cosmin Ratiu, Simon Horman, Jacob Keller, Lama Kayal,
	Michal Swiatkowski, Carolina Jubran, Nathan Chancellor,
	Daniel Zahka, Rahul Rameshbabu, Raed Salem, netdev, linux-rdma,
	linux-kernel, bpf, Gal Pressman



On 06/04/2026 22:13, Nicolai Buchwitz wrote:
> On 6.4.2026 18:31, Mark Bloch wrote:
>> On 06/04/2026 18:43, Jakub Kicinski wrote:
>>> On Sun, 5 Apr 2026 08:08:06 +0200 Dragos Tatulea wrote:
>>>> sashiko says:
>>>
>>> Thanks a lot for reviewing the review! It takes a lot of maintainer time
> 
>> [...]
> 
>>
>> For example:
>>
>> “Before posting, authors could run a recommended baseline of review tools,
>> where available, to catch obvious issues early. During review, tools such
>> as review-prompts and Sashiko may be used to assist the reviewer.”
>>
> 
> There is already https://netdev-ai.bots.linux.dev/ai-local.html which I found really helpful.

Yes, I’m aware of that page, and it’s definitely useful.

The reason I brought this up is that recently, with the rise of Sashiko usage, Jakub has
also started pointing out comments coming from it during review, while Sashiko itself is
not mentioned anywhere in the official netdev documentation.

Mark

> If this is still the preferred approach, I could draft a patch to add it to Documentation/process/maintainer-netdev.rst
> 
>> Mark
> 
> Thanks
> Nicolai


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq
  2026-04-06 19:13         ` Nicolai Buchwitz
  2026-04-06 19:52           ` Mark Bloch
@ 2026-04-07  0:43           ` Jakub Kicinski
  1 sibling, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2026-04-07  0:43 UTC (permalink / raw)
  To: Nicolai Buchwitz
  Cc: Mark Bloch, Dragos Tatulea, Tariq Toukan, Eric Dumazet,
	Paolo Abeni, Andrew Lunn, David S. Miller, Saeed Mahameed,
	Leon Romanovsky, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
	Cosmin Ratiu, Simon Horman, Jacob Keller, Lama Kayal,
	Michal Swiatkowski, Carolina Jubran, Nathan Chancellor,
	Daniel Zahka, Rahul Rameshbabu, Raed Salem, netdev, linux-rdma,
	linux-kernel, bpf, Gal Pressman

On Mon, 06 Apr 2026 21:13:43 +0200 Nicolai Buchwitz wrote:
> On 6.4.2026 18:31, Mark Bloch wrote:
> > On 06/04/2026 18:43, Jakub Kicinski wrote:  
> >> Thanks a lot for reviewing the review! It takes a lot of maintainer 
> >> time  
> > “Before posting, authors could run a recommended baseline of review 
> > tools,
> > where available, to catch obvious issues early. During review, tools 
> > such
> > as review-prompts and Sashiko may be used to assist the reviewer.”
> 
> There is already https://netdev-ai.bots.linux.dev/ai-local.html which I 
> found really helpful.
> If this is still the preferred approach, I could draft a patch to add it 
> to Documentation/process/maintainer-netdev.rst

Right, I'd like to have something similar, and a bit of time to drive
down the number of false positives from Sashiko. Compared to
NIPA's ai-reviews, the Sashiko comments take a lot of time to validate
and are often alarmist. The merge window starts next week, so I'll have
more spare cycles.

Please share if you had success running Sashiko (https://sashiko.dev/)
"locally"!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page
  2026-04-03  9:09 [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page Tariq Toukan
                   ` (4 preceding siblings ...)
  2026-04-03  9:09 ` [PATCH net-next V2 5/5] net/mlx5e: XDP, Use page fragments for linear data in multibuf-mode Tariq Toukan
@ 2026-04-07 11:50 ` patchwork-bot+netdevbpf
  5 siblings, 0 replies; 16+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-04-07 11:50 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: edumazet, kuba, pabeni, andrew+netdev, davem, saeedm, leon,
	mbloch, ast, daniel, hawk, john.fastabend, sdf, dtatulea, cratiu,
	horms, jacob.e.keller, lkayal, michal.swiatkowski, cjubran,
	nathan, daniel.zahka, rrameshbabu, raeds, netdev, linux-rdma,
	linux-kernel, bpf, gal

Hello:

This series was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Fri, 3 Apr 2026 12:09:22 +0300 you wrote:
> Hi,
> 
> This series removes the limitation of having one packet per page in XDP
> mode. This has the following implications:
> 
> - XDP in Striding RQ mode can now be used on 64K page systems.
> 
> [...]

Here is the summary with links:
  - [net-next,V2,1/5] net/mlx5e: XSK, Increase size for chunk_size param
    https://git.kernel.org/netdev/net-next/c/1047e14b44ed
  - [net-next,V2,2/5] net/mlx5e: XDP, Improve dma address calculation of linear part for XDP_TX
    https://git.kernel.org/netdev/net-next/c/833e72645aac
  - [net-next,V2,3/5] net/mlx5e: XDP, Remove stride size limitation
    https://git.kernel.org/netdev/net-next/c/2dfaa0238774
  - [net-next,V2,4/5] net/mlx5e: XDP, Use a single linear page per rq
    https://git.kernel.org/netdev/net-next/c/ebd4ad29cc82
  - [net-next,V2,5/5] net/mlx5e: XDP, Use page fragments for linear data in multibuf-mode
    https://git.kernel.org/netdev/net-next/c/25b8c9b6d731

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2026-04-07 11:50 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-03  9:09 [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page Tariq Toukan
2026-04-03  9:09 ` [PATCH net-next V2 1/5] net/mlx5e: XSK, Increase size for chunk_size param Tariq Toukan
2026-04-05  6:30   ` Dragos Tatulea
2026-04-03  9:09 ` [PATCH net-next V2 2/5] net/mlx5e: XDP, Improve dma address calculation of linear part for XDP_TX Tariq Toukan
2026-04-03  9:09 ` [PATCH net-next V2 3/5] net/mlx5e: XDP, Remove stride size limitation Tariq Toukan
2026-04-03  9:09 ` [PATCH net-next V2 4/5] net/mlx5e: XDP, Use a single linear page per rq Tariq Toukan
2026-04-05  6:08   ` Dragos Tatulea
2026-04-06 15:43     ` Jakub Kicinski
2026-04-06 16:31       ` Mark Bloch
2026-04-06 18:30         ` Jakub Kicinski
2026-04-06 19:50           ` Mark Bloch
2026-04-06 19:13         ` Nicolai Buchwitz
2026-04-06 19:52           ` Mark Bloch
2026-04-07  0:43           ` Jakub Kicinski
2026-04-03  9:09 ` [PATCH net-next V2 5/5] net/mlx5e: XDP, Use page fragments for linear data in multibuf-mode Tariq Toukan
2026-04-07 11:50 ` [PATCH net-next V2 0/5] net/mlx5e: XDP, Add support for multi-packet per page patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox