From: Jakub Kicinski <kuba@kernel.org>
To: tariqt@nvidia.com
Cc: Jakub Kicinski <kuba@kernel.org>,
edumazet@google.com, pabeni@redhat.com, andrew+netdev@lunn.ch,
davem@davemloft.net, saeedm@nvidia.com, mbloch@nvidia.com,
leon@kernel.org, shayd@nvidia.com, horms@kernel.org,
kees@kernel.org, parav@nvidia.com, phaddad@nvidia.com,
gal@nvidia.com, netdev@vger.kernel.org,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net 2/3] net/mlx5: SD, Keep multi-pf debugfs entries on primary
Date: Wed, 1 Apr 2026 20:09:11 -0700
Message-ID: <20260402030911.878500-1-kuba@kernel.org>
In-Reply-To: <20260330193412.53408-3-tariqt@nvidia.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

---
net/mlx5: SD, Keep multi-pf debugfs entries on primary

This commit fixes an issue where the "multi-pf" debugfs directory was stored
in the calling device's sd struct but created under the primary device's root.
It explicitly stores the debugfs dentry in the primary device's sd struct.

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
> index 060649645012..4c80b9d25283 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/sd.c
[ ... ]
> @@ -491,6 +499,7 @@ void mlx5_sd_cleanup(struct mlx5_core_dev *dev)
>  {
>  	struct mlx5_sd *sd = mlx5_get_sd(dev);
>  	struct mlx5_core_dev *primary, *pos;
> +	struct mlx5_sd *primary_sd;
>  	int i;
>
>  	if (!sd)
> @@ -500,10 +509,12 @@ void mlx5_sd_cleanup(struct mlx5_core_dev *dev)
>  		goto out;
>
>  	primary = mlx5_sd_get_primary(dev);
> +	primary_sd = mlx5_get_sd(primary);
>  	mlx5_sd_for_each_secondary(i, primary, pos)
>  		sd_cmd_unset_secondary(pos);
>  	sd_cmd_unset_primary(primary);
> -	debugfs_remove_recursive(sd->dfs);
> +	debugfs_remove_recursive(primary_sd->dfs);
> +	primary_sd->dfs = NULL;

Can concurrent unbind operations cause a use-after-free and a double-free
regression here?

In mlx5_sd_cleanup(), the !mlx5_devcom_comp_is_ready(sd->devcom) check is
performed without holding a lock. If multiple physical functions in the
Socket-Direct group are unbound concurrently, multiple threads can observe
ready == true and enter this teardown block simultaneously.

Since the cleanup code now operates on the shared primary_sd pointer instead
of the local device's sd, the thread for the primary physical function might
complete its unbind and execute sd_cleanup(dev), freeing its sd object
(which is primary_sd).

Concurrently, the thread for a secondary physical function would then
dereference the freed primary_sd to access primary_sd->dfs, causing a
use-after-free. Both threads could also pass the same pointer to
debugfs_remove_recursive(), resulting in a double-free.

Note: This was fixed later in the patch series by commit 92528cec4e0b91deb
("net/mlx5: SD: Serialize init/cleanup").
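
For illustration only, the serialization idea can be modeled in userspace C.
This is a minimal sketch under stated assumptions, not the code from the later
commit: struct fake_group, fake_group_init() and fake_pf_cleanup() are
hypothetical names, a pthread mutex stands in for whatever lock the driver
uses, and a counter stands in for debugfs_remove_recursive(). The point it
shows is that the ready check and the teardown happen under one lock, so only
the first caller touches the shared dentry pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are mlx5 APIs. */
struct fake_group {
	pthread_mutex_t lock;	/* serializes cleanup across all PFs */
	bool ready;		/* models the devcom "ready" state */
	int dir;		/* placeholder object for the debugfs directory */
	int *dfs;		/* models the shared primary_sd->dfs pointer */
	int remove_calls;	/* how many times the directory was removed */
};

static void fake_group_init(struct fake_group *g)
{
	pthread_mutex_init(&g->lock, NULL);
	g->ready = true;
	g->dir = 0;
	g->dfs = &g->dir;
	g->remove_calls = 0;
}

/*
 * Called once per PF on unbind. Returns true only for the caller that
 * actually performed the group teardown.
 */
static bool fake_pf_cleanup(struct fake_group *g)
{
	bool did_teardown = false;

	pthread_mutex_lock(&g->lock);
	if (g->ready) {
		/* Invariant in this model: ready implies the directory exists. */
		assert(g->dfs != NULL);
		g->remove_calls++;	/* stands in for debugfs_remove_recursive() */
		g->dfs = NULL;
		g->ready = false;
		did_teardown = true;
	}
	pthread_mutex_unlock(&g->lock);

	return did_teardown;
}
```

With this shape, a second caller (for example a secondary PF unbinding at the
same time) finds ready already cleared and returns without dereferencing the
shared pointer. The sketch assumes the lock lives in an object that outlives
every caller; whether that holds in the driver depends on the actual fix.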
--
pw-bot: cr
Thread overview: 12+ messages
2026-03-30 19:34 [PATCH net 0/3] net/mlx5: Fixes for Socket-Direct Tariq Toukan
2026-03-30 19:34 ` [PATCH net 1/3] net/mlx5e: SD, Fix race condition in secondary device probe/remove Tariq Toukan
2026-04-02 3:08 ` Jakub Kicinski
2026-04-02 20:03 ` Shay Drori
2026-04-03 0:45 ` Jakub Kicinski
2026-04-05 19:05 ` Shay Drori
2026-03-30 19:34 ` [PATCH net 2/3] net/mlx5: SD, Keep multi-pf debugfs entries on primary Tariq Toukan
2026-04-02 3:09 ` Jakub Kicinski [this message]
2026-04-02 19:50 ` Shay Drori
2026-03-30 19:34 ` [PATCH net 3/3] net/mlx5: SD: Serialize init/cleanup Tariq Toukan
2026-04-02 3:09 ` Jakub Kicinski
2026-04-02 19:49 ` Shay Drori