public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Mark Bloch <mbloch@nvidia.com>
To: Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>
Cc: Tariq Toukan <tariqt@nvidia.com>,
	Leon Romanovsky <leon@kernel.org>,
	"Saeed Mahameed" <saeedm@nvidia.com>, <netdev@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, Gal Pressman <gal@nvidia.com>,
	<linux-rdma@vger.kernel.org>, Moshe Shemesh <moshe@nvidia.com>,
	Parav Pandit <parav@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
	Shay Drory <shayd@nvidia.com>
Subject: [PATCH net V2 07/11] net/mlx5: Nack sync reset when SFs are present
Date: Mon, 25 Aug 2025 17:34:30 +0300	[thread overview]
Message-ID: <20250825143435.598584-8-mbloch@nvidia.com> (raw)
In-Reply-To: <20250825143435.598584-1-mbloch@nvidia.com>

From: Moshe Shemesh <moshe@nvidia.com>

If PF (Physical Function) has SFs (Sub-Functions), since the SFs are not
taking part in the synchronization flow, sync reset can lead to fatal
error on the SFs, as the function will be closed unexpectedly from the
SF point of view.

Add a check to prevent sync reset when there are SFs on a PF device
which is not ECPF, as ECPF is teardowned gracefully before reset.

Fixes: 92501fa6e421 ("net/mlx5: Ack on sync_reset_request only if PF can do reset_now")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c   |  6 ++++++
 drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c | 10 ++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h      |  6 ++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 38b9b184ae01..22995131824a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -6,6 +6,7 @@
 #include "fw_reset.h"
 #include "diag/fw_tracer.h"
 #include "lib/tout.h"
+#include "sf/sf.h"
 
 enum {
 	MLX5_FW_RESET_FLAGS_RESET_REQUESTED,
@@ -428,6 +429,11 @@ static bool mlx5_is_reset_now_capable(struct mlx5_core_dev *dev,
 		return false;
 	}
 
+	if (!mlx5_core_is_ecpf(dev) && !mlx5_sf_table_empty(dev)) {
+		mlx5_core_warn(dev, "SFs should be removed before reset\n");
+		return false;
+	}
+
 #if IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE)
 	if (reset_method != MLX5_MFRL_REG_PCI_RESET_METHOD_HOT_RESET) {
 		err = mlx5_check_hotplug_interrupt(dev, bridge);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c
index 0864ba625c07..3304f25cc805 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c
@@ -518,3 +518,13 @@ void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev)
 	WARN_ON(!xa_empty(&table->function_ids));
 	kfree(table);
 }
+
+bool mlx5_sf_table_empty(const struct mlx5_core_dev *dev)
+{
+	struct mlx5_sf_table *table = dev->priv.sf_table;
+
+	if (!table)
+		return true;
+
+	return xa_empty(&table->function_ids);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h
index 860f9ddb7107..89559a37997a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h
@@ -17,6 +17,7 @@ void mlx5_sf_hw_table_destroy(struct mlx5_core_dev *dev);
 
 int mlx5_sf_table_init(struct mlx5_core_dev *dev);
 void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev);
+bool mlx5_sf_table_empty(const struct mlx5_core_dev *dev);
 
 int mlx5_devlink_sf_port_new(struct devlink *devlink,
 			     const struct devlink_port_new_attrs *add_attr,
@@ -61,6 +62,11 @@ static inline void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev)
 {
 }
 
+static inline bool mlx5_sf_table_empty(const struct mlx5_core_dev *dev)
+{
+	return true;
+}
+
 #endif
 
 #endif
-- 
2.34.1


  parent reply	other threads:[~2025-08-25 14:35 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-25 14:34 [PATCH net V2 00/11] mlx5 misc fixes 2025-08-25 Mark Bloch
2025-08-25 14:34 ` [PATCH net V2 01/11] net/mlx5: HWS, Fix memory leak in hws_pool_buddy_init error path Mark Bloch
2025-08-25 14:34 ` [PATCH net V2 02/11] net/mlx5: HWS, Fix memory leak in hws_action_get_shared_stc_nic error flow Mark Bloch
2025-08-25 14:34 ` [PATCH net V2 03/11] net/mlx5: HWS, Fix uninitialized variables in mlx5hws_pat_calc_nop " Mark Bloch
2025-08-25 14:34 ` [PATCH net V2 04/11] net/mlx5: HWS, Fix pattern destruction in mlx5hws_pat_get_pattern error path Mark Bloch
2025-08-25 14:34 ` [PATCH net V2 05/11] net/mlx5: Reload auxiliary drivers on fw_activate Mark Bloch
2025-08-25 14:34 ` [PATCH net V2 06/11] net/mlx5: Fix lockdep assertion on sync reset unload event Mark Bloch
2025-08-25 14:34 ` Mark Bloch [this message]
2025-08-25 14:34 ` [PATCH net V2 08/11] net/mlx5: Prevent flow steering mode changes in switchdev mode Mark Bloch
2025-08-25 14:34 ` [PATCH net V2 09/11] net/mlx5e: Update and set Xon/Xoff upon MTU set Mark Bloch
2025-08-25 14:34 ` [PATCH net V2 10/11] net/mlx5e: Update and set Xon/Xoff upon port speed set Mark Bloch
2025-09-11  0:00   ` Jakub Kicinski
2025-09-11 13:47     ` Jakub Kicinski
2025-09-11 14:25       ` Mark Bloch
2025-09-11 14:36         ` Jakub Kicinski
2025-09-15  7:38           ` Tariq Toukan
2025-09-17 10:39             ` Tariq Toukan
2025-09-17 13:00               ` Daniel Zahka
2025-09-17 13:38                 ` Tariq Toukan
2025-08-25 14:34 ` [PATCH net V2 11/11] net/mlx5e: Set local Xoff after FW update Mark Bloch
2025-08-27  1:10 ` [PATCH net V2 00/11] mlx5 misc fixes 2025-08-25 patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250825143435.598584-8-mbloch@nvidia.com \
    --to=mbloch@nvidia.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=gal@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=moshe@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=parav@nvidia.com \
    --cc=saeedm@nvidia.com \
    --cc=shayd@nvidia.com \
    --cc=tariqt@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox