public inbox for netdev@vger.kernel.org
From: Saeed Mahameed <saeed@kernel.org>
To: Mingrui Cui <mingruic@outlook.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
	Leon Romanovsky <leon@kernel.org>,
	Tariq Toukan <tariqt@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] net/mlx5e: Make DEFAULT_FRAG_SIZE relative to page size
Date: Tue, 2 Sep 2025 11:56:06 -0700	[thread overview]
Message-ID: <aLc9xknpad29kSnH@x130> (raw)
In-Reply-To: <MN6PR16MB5450CAF432AE40B2AFA58F61B706A@MN6PR16MB5450.namprd16.prod.outlook.com>

On 02 Sep 21:00, Mingrui Cui wrote:
>When page size is 4K, DEFAULT_FRAG_SIZE of 2048 ensures that with 3
>fragments per WQE, odd-indexed WQEs always share the same page with
>their subsequent WQE. However, this relationship does not hold for page
>sizes larger than 8K. In this case, wqe_index_mask cannot guarantee that
>newly allocated WQEs won't share the same page with old WQEs.
>
>If the last WQE in a bulk processed by mlx5e_post_rx_wqes() shares a
>page with its subsequent WQE, allocating a page for that WQE will
>overwrite mlx5e_frag_page, preventing the original page from being
>recycled. When the next WQE is processed, the newly allocated page will
>be immediately recycled.
>
>In the next round, if these two WQEs are handled in the same bulk,
>page_pool_defrag_page() will be called again on the page, causing
>pp_frag_count to become negative.
>
>Fix this by making DEFAULT_FRAG_SIZE always equal to half of the page
>size.
>
>Signed-off-by: Mingrui Cui <mingruic@outlook.com>
CC: Dragos Tatulea <dtatulea@nvidia.com>

Dragos is on a mission to improve page_size support in mlx5.

Dragos, please look into this. I am not sure making DEFAULT_FRAG_SIZE
dependent on PAGE_SIZE is the correct way to go;
see mlx5e_build_rq_frags_info().

I believe we don't do page flipping for > 4K pages, but I might be wrong.
In any case, the code should also throw a WARN_ON:

/* The last fragment of WQE with index 2*N may share the page with the
 * first fragment of WQE with index 2*N+1 in certain cases. If WQE 2*N+1
 * is not completed yet, WQE 2*N must not be allocated, as it's
 * responsible for allocating a new page.
 */
if (frag_size_max == PAGE_SIZE) {
	/* No WQE can start in the middle of a page. */
	info->wqe_index_mask = 0;
} else {
	/* PAGE_SIZEs starting from 8192 don't use 2K-sized fragments,
	 * because there would be more than MLX5E_MAX_RX_FRAGS of them.
	 */
	WARN_ON(PAGE_SIZE != 2 * DEFAULT_FRAG_SIZE);

	/* Odd number of fragments allows to pack the last fragment of
	 * the previous WQE and the first fragment of the next WQE into
	 * the same page.
	 * As long as DEFAULT_FRAG_SIZE is 2048, and MLX5E_MAX_RX_FRAGS
	 * is 4, the last fragment can be bigger than the rest only if
	 * it's the fourth one, so WQEs consisting of 3 fragments will
	 * always share a page.
	 * When a page is shared, WQE bulk size is 2, otherwise just 1.
	 */
	info->wqe_index_mask = info->num_frags % 2;
}

Looking at the above makes me think this patch is correct, but it needs a
more careful look. A Fixes tag is also required, and the patch should
target the 'net' branch.

Thanks,
Saeed.

>---
> drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
>index 3cca06a74cf9..d96a3cbea23c 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
>@@ -666,7 +666,7 @@ static void mlx5e_rx_compute_wqe_bulk_params(struct mlx5e_params *params,
> 	info->refill_unit = DIV_ROUND_UP(info->wqe_bulk, split_factor);
> }
>
>-#define DEFAULT_FRAG_SIZE (2048)
>+#define DEFAULT_FRAG_SIZE (PAGE_SIZE / 2)
>
> static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
> 				     struct mlx5e_params *params,
>-- 
>2.43.0
>
>

Thread overview: 8+ messages
2025-09-02 13:00 [PATCH] net/mlx5e: Make DEFAULT_FRAG_SIZE relative to page size Mingrui Cui
2025-09-02 18:56 ` Saeed Mahameed [this message]
2025-09-03 14:59   ` Mingrui Cui
2025-09-04 16:31 ` Dragos Tatulea
2025-09-08 13:35   ` Mingrui Cui
2025-09-08 14:25     ` Dragos Tatulea
2025-09-15 13:28       ` Dragos Tatulea
2025-09-16  9:01         ` Mingrui Cui
