public inbox for netdev@vger.kernel.org
From: Shay Drori <shayd@nvidia.com>
To: Tariq Toukan <tariqt@nvidia.com>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	"Andrew Lunn" <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
	Mark Bloch <mbloch@nvidia.com>,
	"Leon Romanovsky" <leon@kernel.org>,
	Simon Horman <horms@kernel.org>, Kees Cook <kees@kernel.org>,
	Patrisious Haddad <phaddad@nvidia.com>,
	Parav Pandit <parav@nvidia.com>, Gal Pressman <gal@nvidia.com>,
	<netdev@vger.kernel.org>, <linux-rdma@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	Dragos Tatulea <dtatulea@nvidia.com>
Subject: Re: [PATCH net V3 1/4] net/mlx5: SD: Serialize init/cleanup
Date: Sun, 26 Apr 2026 13:46:07 +0300
Message-ID: <60dbc1e0-97b8-497c-86bf-90a0f75d6d18@nvidia.com>
In-Reply-To: <20260423123104.201552-2-tariqt@nvidia.com>



On 23/04/2026 15:31, Tariq Toukan wrote:
> From: Shay Drory <shayd@nvidia.com>
> 
> mlx5_sd_init() / mlx5_sd_cleanup() may run from multiple PFs in the same
> Socket-Direct group. This can cause the SD bring-up/tear-down sequence
> to be executed more than once or interleaved across PFs.
> 
> Protect SD init/cleanup with mlx5_devcom_comp_lock() and track the SD
> group state on the primary device. Skip init if the primary is already
> UP, and skip cleanup unless the primary is UP.

Sashiko:
"The commit message mentions skipping cleanup unless the primary is UP.
However, it appears this state check is missing from mlx5_sd_cleanup()
in the diff below."

That commit-message sentence is a leftover and should be removed.
I will drop it in the next version.

> 
> In addition, move mlx5_devcom_comp_set_ready(false) from sd_unregister()
> into the cleanup's locked section. A concurrent init acquiring the
> devcom lock will now observe devcom is no longer ready and bail out
> immediately.
> 
> Fixes: 381978d28317 ("net/mlx5e: Create single netdev per SD group")
> Signed-off-by: Shay Drory <shayd@nvidia.com>
> Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
> ---
>   .../net/ethernet/mellanox/mlx5/core/lib/sd.c  | 32 +++++++++++++++----
>   1 file changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
> index 762c783156b4..96b4316f570e 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
> @@ -18,6 +18,7 @@ struct mlx5_sd {
>   	u8 host_buses;
>   	struct mlx5_devcom_comp_dev *devcom;
>   	struct dentry *dfs;
> +	u8 state;
>   	bool primary;
>   	union {
>   		struct { /* primary */
> @@ -31,6 +32,11 @@ struct mlx5_sd {
>   	};
>   };
>   
> +enum mlx5_sd_state {
> +	MLX5_SD_STATE_DOWN = 0,
> +	MLX5_SD_STATE_UP,
> +};
> +
>   static int mlx5_sd_get_host_buses(struct mlx5_core_dev *dev)
>   {
>   	struct mlx5_sd *sd = mlx5_get_sd(dev);
> @@ -270,9 +276,6 @@ static void sd_unregister(struct mlx5_core_dev *dev)
>   {
>   	struct mlx5_sd *sd = mlx5_get_sd(dev);
>   
> -	mlx5_devcom_comp_lock(sd->devcom);
> -	mlx5_devcom_comp_set_ready(sd->devcom, false);
> -	mlx5_devcom_comp_unlock(sd->devcom);
>   	mlx5_devcom_unregister_component(sd->devcom);
>   }
>   
> @@ -426,6 +429,7 @@ int mlx5_sd_init(struct mlx5_core_dev *dev)
>   	struct mlx5_core_dev *primary, *pos, *to;
>   	struct mlx5_sd *sd = mlx5_get_sd(dev);
>   	u8 alias_key[ACCESS_KEY_LEN];
> +	struct mlx5_sd *primary_sd;
>   	int err, i;
>   
>   	err = sd_init(dev);
> @@ -440,10 +444,15 @@ int mlx5_sd_init(struct mlx5_core_dev *dev)
>   	if (err)
>   		goto err_sd_cleanup;
>   
> +	mlx5_devcom_comp_lock(sd->devcom);
>   	if (!mlx5_devcom_comp_is_ready(sd->devcom))
> -		return 0;
> +		goto out;

Sashiko:
"Can primary be NULL here?
In sd_register(), the devcom ready state is published under the devcom
lock, but the lock is then released before the peer_sd->primary_dev
pointers are initialized.
If a concurrent thread executing mlx5_sd_init() or mlx5_sd_cleanup()
acquires the lock and observes the ready state, could it read an
uninitialized primary_dev before the loop in sd_register() completes?"

No, this is impossible. A concurrent init always sets primary before
accessing it, and cleanup always runs after a successful init, so
primary is set there as well.
The next comment also describes an impossible sequence.

>   
>   	primary = mlx5_sd_get_primary(dev);
> +	primary_sd = mlx5_get_sd(primary);
> +
> +	if (primary_sd->state != MLX5_SD_STATE_DOWN)
> +		goto out;
>   
>   	for (i = 0; i < ACCESS_KEY_LEN; i++)
>   		alias_key[i] = get_random_u8();
> @@ -472,6 +481,9 @@ int mlx5_sd_init(struct mlx5_core_dev *dev)
>   		sd->group_id, mlx5_devcom_comp_get_size(sd->devcom));
>   	sd_print_group(primary);
>   
> +	primary_sd->state = MLX5_SD_STATE_UP;
> +out:
> +	mlx5_devcom_comp_unlock(sd->devcom);
>   	return 0;
>   
>   err_unset_secondaries:
> @@ -481,6 +493,8 @@ int mlx5_sd_init(struct mlx5_core_dev *dev)
>   	sd_cmd_unset_primary(primary);
>   	debugfs_remove_recursive(sd->dfs);
>   err_sd_unregister:
> +	mlx5_devcom_comp_set_ready(sd->devcom, false);
> +	mlx5_devcom_comp_unlock(sd->devcom);
>   	sd_unregister(dev);
>   err_sd_cleanup:
>   	sd_cleanup(dev);
> @@ -491,22 +505,28 @@ void mlx5_sd_cleanup(struct mlx5_core_dev *dev)
>   {
>   	struct mlx5_sd *sd = mlx5_get_sd(dev);
>   	struct mlx5_core_dev *primary, *pos;
> +	struct mlx5_sd *primary_sd;
>   	int i;
>   
>   	if (!sd)
>   		return;
>   
> +	mlx5_devcom_comp_lock(sd->devcom);
>   	if (!mlx5_devcom_comp_is_ready(sd->devcom))
> -		goto out;
> +		goto out_unlock;
>   
>   	primary = mlx5_sd_get_primary(dev);
> +	primary_sd = mlx5_get_sd(primary);
>   	mlx5_sd_for_each_secondary(i, primary, pos)
>   		sd_cmd_unset_secondary(pos);
>   	sd_cmd_unset_primary(primary);
>   	debugfs_remove_recursive(sd->dfs);
>   
>   	sd_info(primary, "group id %#x, uncombined\n", sd->group_id);
> -out:
> +	primary_sd->state = MLX5_SD_STATE_DOWN;
> +	mlx5_devcom_comp_set_ready(sd->devcom, false);
> +out_unlock:
> +	mlx5_devcom_comp_unlock(sd->devcom);
>   	sd_unregister(dev);
>   	sd_cleanup(dev);
>   }
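
To make the intended serialization concrete, here is a minimal userspace
sketch of the pattern the patch follows: one lock, a ready flag, and a
group state that is only read or written under that lock. It is not
driver code. struct sd_group, sd_group_setup(), member_init() and
member_cleanup() are hypothetical names, and a pthread mutex stands in
for the devcom component lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names exist in mlx5. */
enum group_state { GROUP_DOWN = 0, GROUP_UP };

struct sd_group {
	pthread_mutex_t lock;   /* stands in for the devcom component lock */
	bool ready;             /* stands in for the devcom "ready" flag */
	enum group_state state; /* the patch keeps this on the primary */
	int bringups;           /* how many times bring-up really ran */
	int teardowns;          /* how many times tear-down really ran */
};

/* Put the group in the "all members registered, not yet combined" state. */
static void sd_group_setup(struct sd_group *g)
{
	assert(g != NULL);
	pthread_mutex_init(&g->lock, NULL);
	g->ready = true;
	g->state = GROUP_DOWN;
	g->bringups = 0;
	g->teardowns = 0;
}

/* Called once per member; returns true only for the caller that did bring-up. */
static bool member_init(struct sd_group *g)
{
	bool did_bringup = false;

	pthread_mutex_lock(&g->lock);
	/* Bail out if the group is not ready or another member already ran. */
	if (g->ready && g->state == GROUP_DOWN) {
		g->bringups++;
		g->state = GROUP_UP;
		did_bringup = true;
	}
	pthread_mutex_unlock(&g->lock);
	return did_bringup;
}

/* Called once per member; returns true only for the caller that did tear-down. */
static bool member_cleanup(struct sd_group *g)
{
	bool did_teardown = false;

	pthread_mutex_lock(&g->lock);
	if (g->ready) {
		g->teardowns++;
		g->state = GROUP_DOWN;
		/* Cleared under the lock so a late init sees it and bails out. */
		g->ready = false;
		did_teardown = true;
	}
	pthread_mutex_unlock(&g->lock);
	return did_teardown;
}
```

With this shape, a second member_init() on a group that is already UP
returns without repeating bring-up, and any member_init() that takes the
lock after member_cleanup() observes ready == false and does nothing,
which is the behavior the commit message describes.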

