* [PATCH] net/mlx5e: Make DEFAULT_FRAG_SIZE relative to page size
@ 2025-09-02 13:00 Mingrui Cui
2025-09-02 18:56 ` Saeed Mahameed
0 siblings, 1 reply; 3+ messages in thread
From: Mingrui Cui @ 2025-09-02 13:00 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Mingrui Cui, netdev, linux-rdma, linux-kernel
When page size is 4K, DEFAULT_FRAG_SIZE of 2048 ensures that with 3
fragments per WQE, odd-indexed WQEs always share the same page with
their subsequent WQE. However, this relationship does not hold for page
sizes larger than 8K. In this case, wqe_index_mask cannot guarantee that
newly allocated WQEs won't share the same page with old WQEs.
If the last WQE in a bulk processed by mlx5e_post_rx_wqes() shares a
page with its subsequent WQE, allocating a page for that WQE will
overwrite mlx5e_frag_page, preventing the original page from being
recycled. When the next WQE is processed, the newly allocated page will
be immediately recycled.
In the next round, if these two WQEs are handled in the same bulk,
page_pool_defrag_page() will be called again on the page, causing
pp_frag_count to become negative.
Fix this by making DEFAULT_FRAG_SIZE always equal to half of the page
size.
Signed-off-by: Mingrui Cui <mingruic@outlook.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 3cca06a74cf9..d96a3cbea23c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -666,7 +666,7 @@ static void mlx5e_rx_compute_wqe_bulk_params(struct mlx5e_params *params,
info->refill_unit = DIV_ROUND_UP(info->wqe_bulk, split_factor);
}
-#define DEFAULT_FRAG_SIZE (2048)
+#define DEFAULT_FRAG_SIZE (PAGE_SIZE / 2)
static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
struct mlx5e_params *params,
--
2.43.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] net/mlx5e: Make DEFAULT_FRAG_SIZE relative to page size
2025-09-02 13:00 [PATCH] net/mlx5e: Make DEFAULT_FRAG_SIZE relative to page size Mingrui Cui
@ 2025-09-02 18:56 ` Saeed Mahameed
2025-09-03 14:59 ` Mingrui Cui
0 siblings, 1 reply; 3+ messages in thread
From: Saeed Mahameed @ 2025-09-02 18:56 UTC (permalink / raw)
To: Mingrui Cui
Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, netdev, linux-rdma, linux-kernel
On 02 Sep 21:00, Mingrui Cui wrote:
>When page size is 4K, DEFAULT_FRAG_SIZE of 2048 ensures that with 3
>fragments per WQE, odd-indexed WQEs always share the same page with
>their subsequent WQE. However, this relationship does not hold for page
>sizes larger than 8K. In this case, wqe_index_mask cannot guarantee that
>newly allocated WQEs won't share the same page with old WQEs.
>
>If the last WQE in a bulk processed by mlx5e_post_rx_wqes() shares a
>page with its subsequent WQE, allocating a page for that WQE will
>overwrite mlx5e_frag_page, preventing the original page from being
>recycled. When the next WQE is processed, the newly allocated page will
>be immediately recycled.
>
>In the next round, if these two WQEs are handled in the same bulk,
>page_pool_defrag_page() will be called again on the page, causing
>pp_frag_count to become negative.
>
>Fix this by making DEFAULT_FRAG_SIZE always equal to half of the page
>size.
>
>Signed-off-by: Mingrui Cui <mingruic@outlook.com>
CC: Dragos Tatulea <dtatulea@nvidia.com>
Dragos is on a mission to improve page_size support in mlx5.
Dragos, please look into this, I am not sure making DEFAULT_FRAG_SIZE
dependant on PAGE_SIZE is the correct way to go,
see mlx5e_build_rq_frags_info()
I believe we don't do page flipping for > 4k pages, but I might be wrong,
anyway the code also should throw a warn_on:
/* The last fragment of WQE with index 2*N may share the page with the
* first fragment of WQE with index 2*N+1 in certain cases. If WQE
* 2*N+1
* is not completed yet, WQE 2*N must not be allocated, as it's
* responsible for allocating a new page.
*/
if (frag_size_max == PAGE_SIZE) {
/* No WQE can start in the middle of a page. */
info->wqe_index_mask = 0;
} else {
/* PAGE_SIZEs starting from 8192 don't use 2K-sized fragments,
* because there would be more than MLX5E_MAX_RX_FRAGS of
* them.
*/
WARN_ON(PAGE_SIZE != 2 * DEFAULT_FRAG_SIZE);
/* Odd number of fragments allows to pack the last fragment of
* the previous WQE and the first fragment of the next WQE
* into
* the same page.
* As long as DEFAULT_FRAG_SIZE is 2048, and
* MLX5E_MAX_RX_FRAGS
* is 4, the last fragment can be bigger than the rest only
* if
* it's the fourth one, so WQEs consisting of 3 fragments
* will
* always share a page.
* When a page is shared, WQE bulk size is 2, otherwise
* just 1.
*/
info->wqe_index_mask = info->num_frags % 2;
}
Looking at the above makes me think that this patch is correct, but a more
careful look is needed to be taken, a Fixes tag is also required and target
'net' branch.
Thanks,
Saeed.
>---
> drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
>index 3cca06a74cf9..d96a3cbea23c 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
>@@ -666,7 +666,7 @@ static void mlx5e_rx_compute_wqe_bulk_params(struct mlx5e_params *params,
> info->refill_unit = DIV_ROUND_UP(info->wqe_bulk, split_factor);
> }
>
>-#define DEFAULT_FRAG_SIZE (2048)
>+#define DEFAULT_FRAG_SIZE (PAGE_SIZE / 2)
>
> static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
> struct mlx5e_params *params,
>--
>2.43.0
>
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] net/mlx5e: Make DEFAULT_FRAG_SIZE relative to page size
2025-09-02 18:56 ` Saeed Mahameed
@ 2025-09-03 14:59 ` Mingrui Cui
0 siblings, 0 replies; 3+ messages in thread
From: Mingrui Cui @ 2025-09-03 14:59 UTC (permalink / raw)
To: Saeed Mahameed, Dragos Tatulea
Cc: Mingrui Cui, Leon Romanovsky, Tariq Toukan, Mark Bloch,
Saeed Mahameed, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, netdev, linux-rdma, linux-kernel
> On 02 Sep 21:00, Mingrui Cui wrote:
> >When page size is 4K, DEFAULT_FRAG_SIZE of 2048 ensures that with 3
> >fragments per WQE, odd-indexed WQEs always share the same page with
> >their subsequent WQE. However, this relationship does not hold for page
> >sizes larger than 8K. In this case, wqe_index_mask cannot guarantee that
> >newly allocated WQEs won't share the same page with old WQEs.
> >
> >If the last WQE in a bulk processed by mlx5e_post_rx_wqes() shares a
> >page with its subsequent WQE, allocating a page for that WQE will
> >overwrite mlx5e_frag_page, preventing the original page from being
> >recycled. When the next WQE is processed, the newly allocated page will
> >be immediately recycled.
> >
> >In the next round, if these two WQEs are handled in the same bulk,
> >page_pool_defrag_page() will be called again on the page, causing
> >pp_frag_count to become negative.
> >
> >Fix this by making DEFAULT_FRAG_SIZE always equal to half of the page
> >size.
> >
> >Signed-off-by: Mingrui Cui <mingruic@outlook.com>
> CC: Dragos Tatulea <dtatulea@nvidia.com>
>
> Dragos is on a mission to improve page_size support in mlx5.
>
> Dragos, please look into this, I am not sure making DEFAULT_FRAG_SIZE
> dependant on PAGE_SIZE is the correct way to go,
> see mlx5e_build_rq_frags_info()
>
> I believe we don't do page flipping for > 4k pages, but I might be wrong,
> anyway the code also should throw a warn_on:
From what I saw, the handling for > 4K pages should be no different
from 4K pages.
>
> /* The last fragment of WQE with index 2*N may share the page with the
> * first fragment of WQE with index 2*N+1 in certain cases. If WQE
> * 2*N+1
> * is not completed yet, WQE 2*N must not be allocated, as it's
> * responsible for allocating a new page.
> */
> if (frag_size_max == PAGE_SIZE) {
> /* No WQE can start in the middle of a page. */
> info->wqe_index_mask = 0;
> } else {
> /* PAGE_SIZEs starting from 8192 don't use 2K-sized fragments,
> * because there would be more than MLX5E_MAX_RX_FRAGS of
> * them.
> */
> WARN_ON(PAGE_SIZE != 2 * DEFAULT_FRAG_SIZE);
> /* Odd number of fragments allows to pack the last fragment of
> * the previous WQE and the first fragment of the next WQE
> * into
> * the same page.
> * As long as DEFAULT_FRAG_SIZE is 2048, and
> * MLX5E_MAX_RX_FRAGS
> * is 4, the last fragment can be bigger than the rest only
> * if
> * it's the fourth one, so WQEs consisting of 3 fragments
> * will
> * always share a page.
> * When a page is shared, WQE bulk size is 2, otherwise
> * just 1.
> */
> info->wqe_index_mask = info->num_frags % 2;
> }
>
> Looking at the above makes me think that this patch is correct, but a more
> careful look is needed to be taken, a Fixes tag is also required and target
> 'net' branch.
>
> Thanks,
> Saeed.
Okay, I will prepare and send a v2 patch next. I think WARN_ON() should
also be removed and surrounding comments updated accordingly. Thanks.
>
> >---
> > drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
> >index 3cca06a74cf9..d96a3cbea23c 100644
> >--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
> >+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
> >@@ -666,7 +666,7 @@ static void mlx5e_rx_compute_wqe_bulk_params(struct mlx5e_params *params,
> > info->refill_unit = DIV_ROUND_UP(info->wqe_bulk, split_factor);
> > }
> >
> >-#define DEFAULT_FRAG_SIZE (2048)
> >+#define DEFAULT_FRAG_SIZE (PAGE_SIZE / 2)
> >
> > static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
> > struct mlx5e_params *params,
> >--
> >2.43.0
> >
> >
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-09-03 15:01 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-02 13:00 [PATCH] net/mlx5e: Make DEFAULT_FRAG_SIZE relative to page size Mingrui Cui
2025-09-02 18:56 ` Saeed Mahameed
2025-09-03 14:59 ` Mingrui Cui
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).